Chin Med J 2010;123(14):1948-1951 1948
Viewpoint
Medical screening: to be or not to be?
WANG Wei-zhong and TANG Jin-ling
Keywords: medical screening; benefits; harms; cost
WHAT IS MEDICAL SCREENING?
Worldwide, chronic diseases have become a major cause
of suffering, disability and mortality. When patients are
diagnosed as a result of the appearance of symptoms, it is
often too late and treatment options are limited. Hoping
that early diagnosis and early treatment can retard or stop
disease progression, medical screening is proposed as a
secondary prevention method in which people without
specific medical complaints are invited to undergo
interventions to identify and modify risk factors, or to
find disease early in its course so that early treatment
prevents further severe complications.1,2
Screening can be defined as “the systematic application
of a test, or inquiry, to identify individuals at sufficient
risk of a specific disorder to warrant further investigation
or direct preventive action, amongst persons who have
not sought medical attention on account of symptoms of
that disorder.”3 In fact, what is screened for could be
either a disease or a risk factor for an important disease.
The purpose of screening is secondary prevention of more
severe complications in the former case and primary
prevention in the latter case. For example, a skin test
called the PPD is widely used to screen for exposure to
tuberculosis,4 mammography to detect breast cancer,5
colonoscopy to detect colorectal cancer,6 and cholesterol
screening to find a high risk population for coronary heart
disease.7 Screening should not be taken as just application
of a test. Figure 1 uses the Papanicolaou (Pap) smear for
cervical cancer screening to show the general flow and
main components of a screening program.
Figure 1. An example of medical screening: Pap smear to detect
in situ cervical cancer.
Early diagnosis and early treatment incur costs and do
harm but may not necessarily lead to a greater benefit.
Useful screening programs are those that can bring about
more good than harm at an acceptable cost. What
screening programs would be potentially useful?
WHEN WOULD SCREENING BE POTENTIALLY
BENEFICIAL?
Major factors related to the potential usefulness of a
screening program include the disease, the screening test
and the treatment.
Disease
The disease to be screened for must have a long
detectable preclinical phase. It must impose a large
burden on the population in terms of suffering,
disability, destitution and death, and its preclinical
phase must be relatively prevalent.
The preclinical or asymptomatic phase of a disease is the
time period from the onset of the disease to the time that
symptoms appear and a diagnosis is made by a doctor. A
long pre-clinical phase is the prerequisite for the disease
to be detected and treated early enough for the treatment
to be more effective. Most infectious and acute diseases
that have a very short asymptomatic phase are not
suitable for screening, while cancers that may have a
preclinical period of many years are often the target of
screening. Although the burden of disease is a relative
concept, diseases such as cancer that cause suffering,
disability, and death are generally more suitable for
screening than illnesses such as minor skin problems.
The prevalence determines the total number of cases that
can be detected by screening. If too few cases are
detected, no matter how effective the treatment would be,
screening would not have an important impact on the
disease burden.8 The prevalence of a disease is positively
related to the prevalence of the risk factors and the
average duration of the preclinical phase. The prevalence
of the undiagnosed but detectable preclinical disease also
depends on whether or not the population has been
screened previously. Recent screening of a population
would reduce the number of undiagnosed asymptomatic
patients. This suggests that, for a screening program to
be cost-effective, two successive rounds of testing should
not be done too close together. For example, in breast
cancer screening the test should be performed again in the
same person every 3–5 years rather than every year.
DOI: 10.3760/cma.j.issn.0366-6999.2010.14.022
Department of Administration and Logistics Management, Institute
of Pathogen Biology, Chinese Academy of Medical Sciences,
Beijing 100730, China (Wang WZ)
Division of Epidemiology, School of Public Health and Primary
Care, The Chinese University of Hong Kong, Hong Kong Special
Administrative Region, China (Tang JL)
Correspondence to: Prof. TANG Jin-ling, Division of
Epidemiology, School of Public Health and Primary Care, The
Chinese University of Hong Kong, Hong Kong Special
Administrative Region, China (Tel: 852-22528779. Fax:
852-26063500. Email: jltang@cuhk.edu.hk)
Screening test
The screening test refers to the test or a series of tests
used in a screening program to identify those who are
likely to have the disease and warrant further
confirmatory diagnostic investigations. Screening is
feasible only if a good screening test is available. The test
should be sufficiently accurate, i.e., it should have
desirable sensitivity and specificity. The ability of a test
to classify people who truly have the pre-clinical disease
as test positive is sensitivity. The ability to classify those
who are truly free of the disease as test negative is
specificity.
If sensitivity and specificity are not perfect (which is
almost always the case in screening), errors will occur in
the test. Those who truly have the disease but are test
negative are false negatives; false negative results would
lead to missing the opportunity of early diagnosis and
treatment. On the other hand, those who are truly free of
the disease but test positive are false positives; false
positive results would cause unnecessary confirmatory
diagnostic investigations and even unnecessary
treatments.
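The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function names and all counts are hypothetical, chosen only to show how sensitivity and specificity are computed from the four possible test outcomes.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with the preclinical disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of disease-free people who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical results for 10,000 screened people (illustrative numbers only).
true_pos, false_neg = 80, 20      # 100 people truly have the disease
true_neg, false_pos = 9405, 495   # 9,900 people are truly disease-free

print(f"sensitivity = {sensitivity(true_pos, false_neg):.2f}")
print(f"specificity = {specificity(true_neg, false_pos):.2f}")
```

In this made-up example the 20 false negatives miss the chance of early treatment, while the 495 false positives are sent for confirmatory investigations they do not need.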
Moreover, the screening test should be able to give
sufficient lead-time. Lead time is the time period from the
time an asymptomatic patient is detected by screening to
the hypothetical time point when symptoms would appear
and diagnosis would be made by a doctor if the screening
test had not been performed. For a group of patients this
lead time can be estimated but for any individual patient
lead time is always a hypothetical concept and cannot be
estimated. For a screening program to be effective in
reducing morbidity and mortality in a population there
must be enough lead time in a sufficient number of cases.
If lead time is insufficient, cases may not be treated in
time to retard or stop the progression of disease.
The test should also be simple, rapid, inexpensive, and
safe so that it can be applied to a large number of people
at an affordable cost and people and physicians will be
willing to do the testing.9-11 Screening must be neither
too expensive to be affordable nor so unattractive that
people will avoid it. The test should be easy to learn and
perform. Tests that can be administered by non-physician
medical personnel will necessarily cost less than those
that need to be performed by a doctor with years of medical
experience. Moreover, screening that requires
hospitalization for a few days to go through
complex and invasive procedures is unlikely to be
welcomed by either doctors or patients.
Treatment
If screening can bring about any good, it will be from the
treatment rather than any other activities of a screening
program. Therefore, for a screening program to be
potentially beneficial there must be treatments available
that are effective and acceptable to asymptomatic patients.
Treatments used for late-stage patients do not necessarily
help early-stage patients, nor would treatments acceptable
to symptomatic patients be equally acceptable to
asymptomatic ones.
How do we know whether early treatment would lead to
better outcomes than late treatment? For cancer screening,
early attempts to evaluate the outcome of screening
compared the survival rate at various time points after
treatment between asymptomatic cases detected by
screening and symptomatic cases diagnosed due to
symptoms. A higher survival rate in patients detected by
screening would suggest that screening could reduce the
death rate and increase survival.
However, this comparison is flawed and the results often
mislead. The major biases include lead time bias, length
bias and selection bias. Due to lead time bias, patients
detected by screening would necessarily survive longer
than those diagnosed due to symptoms simply because
the former patients were diagnosed earlier (by the amount
of lead time) than the latter in the natural history of the
disease. This is true even if early detection and early
treatment do not affect the natural progress of the disease
at all and early detected patients would live as long as the
patients diagnosed due to symptoms after the onset of the
disease (Figure 2A).
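Lead time bias can be made concrete with simple arithmetic. The sketch below assumes a hypothetical patient whose age at death is the same with or without screening, that is, early treatment has no effect at all; every age and the lead time are invented for illustration.

```python
def survival_after_diagnosis(age_at_death: float, age_at_diagnosis: float) -> float:
    """Years lived between diagnosis and death."""
    return age_at_death - age_at_diagnosis


# Hypothetical natural history; treatment is assumed to change nothing.
AGE_AT_DEATH = 70.0          # identical with or without screening
AGE_SYMPTOMATIC_DX = 67.0    # diagnosis when symptoms appear
LEAD_TIME = 2.0              # years by which screening advances diagnosis

age_screen_dx = AGE_SYMPTOMATIC_DX - LEAD_TIME

survival_symptomatic = survival_after_diagnosis(AGE_AT_DEATH, AGE_SYMPTOMATIC_DX)
survival_screened = survival_after_diagnosis(AGE_AT_DEATH, age_screen_dx)
apparent_gain = survival_screened - survival_symptomatic

print(f"survival after symptomatic diagnosis: {survival_symptomatic} years")
print(f"survival after screen detection:      {survival_screened} years")
print(f"apparent gain (equals lead time):     {apparent_gain} years")
```

The apparent gain in survival equals the lead time exactly, although the patient dies at the same age in both scenarios.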
Length bias occurs because the same cancer may progress
at a different rate in different patients (Figure 2B).
Screening tests are likely to find tumors with a moderate
growth rate (type B patients), whereas fast-growing
tumors (type A patients) would cause symptoms and be
diagnosed by a doctor before screening and slow-growing
tumors (type C patients) are too small to be detectable at
the time of testing. Screening, therefore, tends to find
tumors with inherently better prognoses than those found
by doctors. As a result, the survival in patients detected
by screening would be better than those diagnosed due to
symptoms even if screening is ineffective.
Selection bias occurs because not everyone will
participate in a screening program and those who do
participate may differ from those who do not. If people at
a higher risk of the disease are more eager to be screened,
for instance, if women with a family history of breast
cancer are more willing to undergo mammography, a
screening test will look worse than it truly is. This will
cause selection bias in the above comparison. Conversely,
selection bias can make screening look better than it truly
is. If the test is more accessible to the young and healthy,
for instance, if people have to travel a long distance to be
screened, fewer people would die among those screened than
among those who do not participate, and screening
would seem to make a positive difference.
Figure 2. The major biases. A: Lead time bias. Dx is the time of
diagnosis; the light shaded area is the survival after diagnosis if
there is no treatment or treatment is ineffective; the dark shaded
area is the extra survival due to effective treatment. Type A
patients represent those normally diagnosed and treated without
screening. Type B and C patients represent those detected and
treated early, and only type C patients represent those who have a
real benefit from screening. The comparison of type A patients
with type B and C patients is biased by the lead time, which is
unknown in real patients, so that we cannot distinguish between
the situations represented by type B and type C patients.
Thus, we cannot tell whether screening is beneficial or not by
such comparisons. B: Length bias. Rapidly growing tumors (A)
will develop symptoms and be diagnosed by a doctor before
screening is performed, slowly growing tumors (C) are too small
to be detectable at the time of testing, and only tumors with a
medium growth rate (B) can be detected by screening. As a
result, comparison of type A patients with type B patients will be
biased as type B patients are bound to survive longer than type A
patients even if screening is not effective.
The reliable method to evaluate the effectiveness of a
screening program is a cluster randomized controlled trial,
in which groups or communities are randomly allocated
to receive either screening or normal care without
screening. A lower mortality from the screened disease in
the screened group than the control group will suggest
screening is beneficial.
For example, randomized controlled trials show that
mammography can significantly reduce mortality from
breast cancer in women over 50 years old.12 In contrast,
chest X-rays with or without sputum cytology do not
affect mortality from lung cancer.13
BENEFITS, HARMS AND COSTS OF SCREENING
Screening offers a benefit at a cost. The costs include
money and time incurred, and inconvenience, worries and
harm to the screened.
Benefit
Screening can be effective only when the detected
patients can be effectively treated. The size of the benefit
from a screening program will be determined by the total
number of cases detected and the effectiveness of the
treatment. The former is further determined by the
prevalence of the disease and the sensitivity of the test.
The higher the prevalence and the sensitivity are, the
larger the number of cases that will be detected and then treated.
The total years of life gained from screening is the
product of the number of cases treated and the extra
survival gained due to early treatment compared to late
treatment. If the treatment is effective, the benefit can be
increased by screening only persons at a high risk of
disease, by using a test of high sensitivity, by using a
cutoff value that gives rise to a high sensitivity, and/or by
testing the same persons less often.
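The relationship described above can be written as a short calculation. This is only a sketch under stated assumptions: the function names are hypothetical, every detected case is assumed to be treated, and the population size, prevalences, sensitivity and extra survival are invented numbers rather than data from any study.

```python
def cases_detected(n_screened: int, prevalence: float, sensitivity: float) -> float:
    """Expected number of preclinical cases found by the screening test."""
    return n_screened * prevalence * sensitivity


def life_years_gained(cases_treated: float, extra_survival_years: float) -> float:
    """Total benefit: cases treated times extra survival per case."""
    return cases_treated * extra_survival_years


N_SCREENED = 100_000
SENSITIVITY = 0.8
EXTRA_SURVIVAL_YEARS = 2.0   # assumed gain from early versus late treatment

# Compare a general population with a hypothetical high-risk group.
for label, prevalence in (("general population", 0.005), ("high-risk group", 0.02)):
    detected = cases_detected(N_SCREENED, prevalence, SENSITIVITY)
    gained = life_years_gained(detected, EXTRA_SURVIVAL_YEARS)
    print(f"{label}: {detected:.0f} cases detected, {gained:.0f} life-years gained")
```

With the same number of tests, the higher prevalence in the hypothetical high-risk group yields four times as many detected cases and life-years gained, which is why targeting people at high risk increases the benefit.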
Cost
Costs are incurred for the initial test, for the confirmatory
diagnostic investigations, and for treating the detected
patients. The costs of the initial test and of treating
detected patients are necessary and cannot be avoided,
whereas confirmatory diagnostic investigations
are where a high unnecessary cost may occur. These
investigations are normally much more expensive and
more time-consuming per person, and more likely to be
invasive and cause harms than the initial test. For people
whose tests are false positive, these investigations are
unnecessary and should be reduced to a minimum. The
proportion of screened people with a false positive result
is determined by the specificity of the test and the
prevalence of the disease. The lower the prevalence and
the specificity are, the higher this proportion will be.
As the prevalence is mostly low in
screening, the main method to reduce the cost due to false
positives is to increase the specificity of the initial test.
However, for a given test, sensitivity and specificity are
inversely related: moving the cutoff value to increase
specificity will inevitably decrease sensitivity, which
determines the number of cases detected and consequently
the benefit. Therefore, a
good screening program must carefully choose the test
and determine the cutoff value for the test so as to achieve
a desirable combination of sensitivity and specificity.
This will help favorably balance the benefit against the
cost and harm.
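The trade-off in choosing a cutoff can be illustrated with a small sketch. It assumes a hypothetical continuous test marker for which higher values indicate disease; the marker values, the population size and the prevalence are all invented, and the helper names are not taken from any source.

```python
def sens_spec(diseased, healthy, cutoff):
    """Return (sensitivity, specificity) when values >= cutoff count as test positive."""
    true_pos = sum(value >= cutoff for value in diseased)
    true_neg = sum(value < cutoff for value in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


def expected_counts(n_screened, prevalence, sens, spec):
    """Expected (cases detected, false positives) in a screened population."""
    n_diseased = n_screened * prevalence
    n_healthy = n_screened - n_diseased
    return n_diseased * sens, n_healthy * (1 - spec)


# Hypothetical marker values; higher values are assumed to indicate disease.
diseased_values = [4.1, 5.0, 5.6, 6.2, 7.3]
healthy_values = [1.2, 1.8, 2.0, 2.2, 2.8, 3.0, 3.5, 4.4, 4.9, 5.2]

for cutoff in (4.0, 5.0, 6.0):
    sens, spec = sens_spec(diseased_values, healthy_values, cutoff)
    detected, false_pos = expected_counts(100_000, 0.005, sens, spec)
    print(f"cutoff {cutoff}: sensitivity {sens:.1f}, specificity {spec:.1f}, "
          f"cases detected {detected:.0f}, false positives {false_pos:.0f}")
```

In this invented example, each higher cutoff reduces the number of false positives sent for confirmatory investigation but also reduces the number of cases detected, so the cutoff has to be chosen by weighing the two against each other.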
Harm
Harms in screening are mainly in the following forms:
harm and inconvenience caused by the initial test;
negative labeling effects (stress and anxiety caused by a
false positive result); harm and inconvenience caused by
unnecessary investigations in false positive patients; harm,
inconvenience and costs incurred in unnecessary
treatment of those who do not have the disease or have
the disease that would never progress to the clinical stage
and would not have been treated if there were no
screening; prolonged stress and anxiety if detected
patients cannot be effectively treated; a false sense of
security caused by false negative results which could
delay final diagnosis. Unnecessary harm comes mainly
from false positive results.
TO SCREEN OR NOT TO SCREEN?
All screening programs incur cost and do harm; only
some do good as well. Only screening programs that
can do more good than harm are useful. For similar
reasons, periodic health checkups, which are probably
less scrutinized and more widely adopted, should be
evaluated in a similar manner. The benefit of screening
can only be reliably demonstrated by a randomized
controlled trial. Even if effective, only a few people will
eventually benefit from it at a large cost to society.
Even worse, the harm starts immediately but the good, if
any, takes longer to appear. Thus, the decision to
introduce a new screening program should be made as
carefully as the decision to build a large new hospital.
When would the benefit from a screening program be
worth its costs and harm? The answer to this question
would inevitably vary across populations that have
different resources available and different health needs to
address. In order to achieve good value for money from
screening, developing countries should screen only when
there is a desirable benefit:harm ratio, where “desirable” is
not universal but determined by the resources available
locally and the values of the population. The cost-benefit
ratio can be further improved by targeting people at a
high risk of the disease, by performing the test when
people visit physicians for other illnesses, by testing less
often, and/or by using a test with a high specificity.
REFERENCES
1. Eddy DM. Common screening tests. Philadelphia: American
College of Physicians; 1991: 7-10.
2. Morabia A, Zhang FF. History of medical screening: from
concepts to action. Postgrad Med J 2004; 80: 463-469.
3. National Screening Committee. First report of the National
Screening Committee. London: Health Departments of the
United Kingdom; 1998 (Accessed June 5, 2010 at
http://www.nsc.nhs.uk/pdfs/nsc_firstreport.pdf).
4. American Thoracic Society. Diagnostic standards and
classification of tuberculosis in adults and children. Am J
Respir Crit Care Med 2000; 161: 1376-1395.
5. Humphrey LL, Helfand M, Chan BK, Woolf SH. Breast
cancer screening: a summary of the evidence for the US
Preventive Services Task Force. Ann Intern Med 2002; 137:
347-360.
6. Whitlock EP, Lin JS, Liles E, Beil TL, Fu R. Screening for
colorectal cancer: a targeted, updated systematic review for
the US Preventive Services Task Force. Ann Intern Med 2008;
149: 638-658.
7. Pignone M, Phillips C, Atkins D, Teutsch S, Mulrow C, Lohr
K. Screening and treating adults for lipid disorders. Am J Prev
Med 2001; 20 (3 Suppl): 77-89.
8. Morrison AS. Screening in chronic disease. New York: Oxford
University Press; 1985.
9. Wilson JMG, Jungner G. Principles and practice of screening
for disease (Public Health Paper 34). Geneva: World Health
Organization; 1968.
10. Sackett DL, Holland WW. Controversy in the detection of
disease. Lancet 1975; 2: 357-359.
11. Cochrane AL, Holland WW. Validation of screening
procedures. Br Med Bull 1971; 27: 3-8.
12. Wells J. Mammography and the politics of randomized
controlled trials. BMJ 1998; 317: 1224-1230.
13. Bach PB, Kelley MJ, Tate RC, McCrory DC. Screening for
lung cancer: a review of the current literature. Chest 2003; 123:
72-82.
(Received February 3, 2010)
Edited By SUN Jing