Drs. Hensley, Castiel, and Robson address a topic that is of high priority not only to ovarian cancer specialists, but to all those interested in cancer prevention and control. Among cancer sites for which a proven, effective screening tool is not yet available, few hold as much promise for reducing mortality as ovarian cancer would if an effective screening test existed. This optimism is premised upon the improved prognosis of women diagnosed with ovarian cancer in its earlier stages, compared to those diagnosed with late-stage disease.
This line of reasoning may be overly simplistic, however, as much remains to be learned about the natural history of ovarian cancer. For example, little is known about the malignant potential of benign ovarian masses and the likelihood that stage I ovarian cancer will actually develop into late-stage disease. The challenges inherent to screening for ovarian cancer are evident from the fact that possible reductions in mortality have yet to be realized, despite considerable efforts toward this end.
The development of the potential screening tools now available stands as testament to the tremendous strides that have been made in the past few decades. These advances give rise to the hope that an effective ovarian cancer screening program will one day be developed. However, deficiencies in available tests leave us with difficult decisions, namely how best to utilize our current resources.
Screening Tools: A Potential to Harm?
As with the practice of medicine, the primary tenet of screening programs is to do no harm. It is sobering to be reminded that when implemented under inappropriate circumstances, screening programs can actually do more harm than good.
As outlined by Drs. Hensley, Castiel, and Robson, the standards for implementing a screening program are high: The test must be highly accurate in discriminating true cases from true non-cases, and this accuracy is measured on an absolute, rather than relative, scale. Thus, if the case-control approach is used to evaluate a screening test, it matters little if the screening test is statistically significant in discriminating between cases and controls or if it yields what appears to be a strong relative risk or odds ratio.
Furthermore, because the investigator fixes the ratio of cases to controls in a case-control study, the positive predictive value (PPV) cannot be estimated directly from the study data. As illustrated by lysophosphatidic acid (LPA) data in Table 1, even a level of accuracy that may appear promising when a marker is tested in small groups of women with or without ovarian cancer may be insufficient when considered for use on a population-wide scale. When the sensitivity (97.9%) and specificity (89.6%) estimated in the clinical study of 48 women with ovarian cancer and 48 cancer-free women are applied to a hypothetical population of 100,000 women with a disease prevalence of 30 per 100,000 women, the resulting PPV of 0.28% is quite low, particularly when one considers the anxiety and costs of follow-up testing associated with each false-positive test. Thus, extremely high accuracy is required when considering screening for a relatively rare disease such as ovarian cancer.
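The arithmetic behind that PPV can be reproduced in a few lines. The sketch below is a minimal illustration rather than part of the original analysis: the function name `positive_predictive_value` is hypothetical, and the inputs are simply the sensitivity, specificity, and prevalence quoted above.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return PPV: the share of positive tests that are true cases."""
    # Fraction of the whole population that is diseased AND tests positive.
    true_positives = sensitivity * prevalence
    # Fraction of the whole population that is disease-free AND tests positive.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Values quoted in the text: sensitivity 97.9%, specificity 89.6%,
# prevalence 30 per 100,000 women.
ppv = positive_predictive_value(0.979, 0.896, 30 / 100_000)
print(f"PPV = {ppv:.2%}")  # PPV = 0.28%
```

Under these inputs, roughly 10,400 of the 100,000 women would test positive, of whom about 29 would actually have ovarian cancer. Even if specificity were raised to 99.6% with the same sensitivity, the PPV would remain below 7%.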
The ultimate test of an ovarian cancer screening program's effectiveness must come from randomized trials with mortality as an end point; looking for stage shift or differences in survival times can be misleading. Large trials of long duration are required, but we must be patient and make evidence-based decisions.
Combining Tests: Is the Whole Better Than the Sum of Parts?
When faced with a dilemma such as that seen with ovarian cancer (for which more than one potential screening tool exists but none is ideal for population-based screening), it is natural to try to increase the yield by combining two or more tests. Deciding upon combinations ultimately boils down to trade-offs, in terms of both practical constraints and accuracy. Practical issues, such as invasiveness and expense of the test, are exemplified when comparing blood tests vs ultrasound. A tumor marker that can be measured inexpensively and noninvasively in the blood is highly desirable. Obstacles to accuracy in this approach include variability in the assay, intra-individual variability (eg, during the menstrual cycle), positive results due to benign conditions such as endometriosis, and lack of organ specificity. Ultrasound provides results specific to the ovary and a greater degree of accuracy, but is more expensive and invasive than a blood test and is subject to inter- and intraobserver variation.
When tests are combined, any net gain in sensitivity is offset by a net loss in specificity, and vice versa. For example, combining multiple blood markers into a single panel leads to a net increase in sensitivity at the price of reduced specificity. The same holds true for any combination of tests, such as the combination of simultaneous CA-125 and transvaginal ultrasound, which is recommended for women who test positive for germ-line mutations in BRCA1 or BRCA2 (and is currently being tested in the ovarian cancer arm of the Prostate, Lung, Colorectal, and Ovarian Cancer Screening [PLCO] trial).
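The trade-off for simultaneous testing can be made concrete with the standard textbook formulas for two tests used in parallel, in which a woman screens positive if either test is positive. The sketch below assumes that the two tests err independently of one another given true disease status, which real markers may not do; the function name and all input values are hypothetical and are not drawn from any study discussed here.

```python
def parallel_net_accuracy(sens_a, spec_a, sens_b, spec_b):
    """Net accuracy when a positive on EITHER test counts as a positive screen.

    Assumes the two tests are conditionally independent given disease status.
    """
    # A case is missed only if both tests miss it.
    net_sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A non-case is cleared only if both tests are negative.
    net_specificity = spec_a * spec_b
    return net_sensitivity, net_specificity


# Hypothetical accuracies chosen only to illustrate the direction of the effect.
net_sens, net_spec = parallel_net_accuracy(0.80, 0.95, 0.85, 0.90)
print(f"net sensitivity = {net_sens:.3f}, net specificity = {net_spec:.3f}")
# net sensitivity = 0.970, net specificity = 0.855
```

With these illustrative inputs, net sensitivity exceeds that of either test alone, while net specificity falls below that of either test alone.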
Another approach is to combine tests serially, thereby increasing the net specificity of the overall screening program, but reducing the net sensitivity. This approach has been implemented in a randomized trial in which women who test positive on an initial CA-125 assay proceed to further testing by ultrasound. Although estimates of net sensitivity and net specificity of this screening regimen were not reported, of the 9,364 women who participated in at least one of three screens, 468 tested positive for CA-125 (> 30 U/mL) and underwent 781 subsequent ultrasounds (apparently many women had consistently elevated CA-125 levels). Of these, 29 went on to surgery, at which ovarian cancer was detected in 6 women.
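The direction of the serial trade-off follows from the standard textbook formulas for two-stage testing, in which only women who are positive on the first test receive the second, and a screen counts as positive only if both are positive. The sketch below assumes the two tests err independently given true disease status; the function name and input values are hypothetical, and they are not estimates for the CA-125 and ultrasound regimen, whose net accuracy was not reported.

```python
def serial_net_accuracy(sens_first, spec_first, sens_second, spec_second):
    """Net accuracy when a positive screen requires BOTH tests to be positive.

    Assumes the two tests are conditionally independent given disease status.
    """
    # A case is detected only if both stages detect it.
    net_sensitivity = sens_first * sens_second
    # A non-case is falsely flagged only if both stages are falsely positive.
    net_specificity = 1.0 - (1.0 - spec_first) * (1.0 - spec_second)
    return net_sensitivity, net_specificity


# Hypothetical accuracies chosen only to illustrate the direction of the effect.
net_sens, net_spec = serial_net_accuracy(0.80, 0.95, 0.85, 0.90)
print(f"net sensitivity = {net_sens:.3f}, net specificity = {net_spec:.3f}")
# net sensitivity = 0.680, net specificity = 0.995
```

With these illustrative inputs, net specificity rises above that of either test alone, at the cost of a net sensitivity lower than that of either test alone.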
The reported PPV of 21% was based on the 6 women with ovarian cancer, who were detected from the 29 women referred to surgery. The small numbers upon which the point estimate of PPV is based indicate a high degree of variability around this point estimate; ie, the lower limit of the 95% confidence interval is 8%. More importantly, the large proportion of false-positive test results between the CA-125 and ultrasound stages of screening underscores the difficulties of this approach and raises questions about its feasibility.
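The width of the interval around 6 of 29 can be checked directly. The sketch below computes an exact binomial (Clopper-Pearson) interval by bisection using only the standard library. The choice of the exact method is an assumption, since the method behind the quoted lower limit is not stated here, and the function names are hypothetical.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(func, target, iterations=60):
    """Bisect on [0, 1] for p with func(p) == target; func must decrease in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) > target:
            lo = mid  # func is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_interval(successes, trials, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= successes) equals alpha / 2.
    lower = 0.0 if successes == 0 else solve_decreasing(
        lambda p: binomial_cdf(successes - 1, trials, p), 1.0 - alpha / 2.0)
    # Upper limit: the p at which P(X <= successes) equals alpha / 2.
    upper = 1.0 if successes == trials else solve_decreasing(
        lambda p: binomial_cdf(successes, trials, p), alpha / 2.0)
    return lower, upper


# 6 cancers detected among the 29 women referred to surgery.
lower, upper = exact_binomial_interval(6, 29)
print(f"PPV = {6 / 29:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

Under this assumption the lower limit comes out at roughly 8%, in line with the figure quoted above, and the upper limit is close to 40%, which shows how little a point estimate based on 29 surgeries constrains the true PPV.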
Using the rate of change in tumor markers measured longitudinally as the actual screening test, rather than the value at one point in time, is an intriguing approach that deserves more rigorous evaluation. An additional issue that may deserve more careful consideration centers on the cutoff points used for tumor markers measured on a continuous scale, such as CA-125. The use of higher cutoff points will reduce the sensitivity even further, but will achieve needed gains in specificity. This is an option worth evaluating for high-risk women who are regularly screened for ovarian cancer.
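The effect of moving a cutoff point can be sketched with a toy model. Everything in the example below is invented purely for illustration: the marker is in arbitrary units, the two normal distributions are hypothetical and do not describe CA-125 or any real assay, and the function name is an assumption.

```python
from statistics import NormalDist


def accuracy_at_cutoff(cutoff, cases, controls):
    """Sensitivity and specificity when values above `cutoff` count as positive."""
    sensitivity = 1.0 - cases.cdf(cutoff)  # share of cases above the cutoff
    specificity = controls.cdf(cutoff)     # share of non-cases at or below it
    return sensitivity, specificity


# Hypothetical marker distributions in arbitrary units (not real data).
controls = NormalDist(mu=0.0, sigma=1.0)
cases = NormalDist(mu=2.0, sigma=1.0)

cutoffs = [1.0, 1.5, 2.0]
results = [accuracy_at_cutoff(c, cases, controls) for c in cutoffs]
for cutoff, (sens, spec) in zip(cutoffs, results):
    print(f"cutoff {cutoff:.1f}: sensitivity {sens:.3f}, specificity {spec:.3f}")
```

In this toy model each increase in the cutoff lowers sensitivity and raises specificity; how favorable that exchange is in practice depends on the real marker distributions in the population being screened.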
Screening High-Risk Women
Testing for germ-line mutations of BRCA1 and BRCA2 identifies women who are at a significantly elevated risk of ovarian cancer relative to the general population. Drs. Hensley, Castiel, and Robson clearly outline the need to carefully evaluate the implications of implementing an ovarian cancer screening program among these high-risk women in the presence of uncertain evidence of effectiveness. In the interim, it is essential that patients understand the limitations of this screening approach and that they be incorporated into the process of informed decision-making.
While awaiting the results of large-scale randomized trials to evaluate screening strategies based on currently available screening tools, there is an urgent need to invest in research to identify, refine, and evaluate even better screening tools than those presently available. The development of a more accurate method of detecting ovarian cancer in its early, more curable stages will help overcome many of the current obstacles to ovarian cancer screening.
An infrastructure needs to be in place to rapidly test any new markers in the clinical setting. The National Cancer Institute's Early Detection Research Network, an interdisciplinary, nationwide consortium of researchers, is an important step toward this end. In allocating resources, we must not overlook the need to search for etiologic clues that may eventually lead to pathways for primary prevention. The potential return on these investments, in terms of alleviating human suffering, is simply too large to ignore.