The principles guiding clinicians’ decisions about when to recommend adjuvant therapy continue to evolve, though evidence-based lessons from the last 40 years, such as those from the Oxford Overview, must not be forgotten. In the past, use of adjuvant chemotherapy was primarily determined by anatomic prognostic markers of recurrence risk, including tumor size and lymph node status. To reinforce the value of these measures in guiding the decision to offer adjuvant therapy, the 2010 Oxford Overview showed that proportional reductions in early recurrence, any recurrence, and breast cancer mortality with use of adjuvant anthracycline-based and taxane-based chemotherapy were largely independent of age, nodal status, size, estrogen receptor (ER) status, or differentiation (mostly high or intermediate, with relatively few cases of low-grade disease). This latter observation is critical to understanding the Overview data, in that to some degree adjuvant chemotherapy improved outcomes even in strongly ER-positive disease, though not necessarily to the same extent as in less strongly ER-positive cases.
Consequently, efforts have been made to improve on the traditional adjuvant decision-making tools using additional clinicopathologic factors to better select appropriate therapies for patients. The Nottingham Prognostic Index was developed as a first step, by adding tumor size and node status to the Nottingham Grading TNM System (for tubule formation, nuclear pleomorphism, and mitotic count). The addition of ER status allowed prognosis and prediction determinations based on Adjuvant! Online, and now human epidermal growth factor receptor 2 (HER2) status is routinely determined in all tumors. Knowledge of tumor phenotypes, such as the ER-positive/HER2-negative, the HER2-positive (any ER), and the triple-negative phenotypes as determined by standard pathology measures, is of the utmost importance because it can influence therapy decisions. Consequently, accurate predictive biomarker testing is critical. It is also of interest that markers like ER and HER2 (both of which are modest prognostic markers) have strong negative predictive value (likelihood of no benefit if marker-negative), but only a modest positive predictive value (likelihood of benefit if marker-positive).
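For readers who want the arithmetic behind the Nottingham Prognostic Index mentioned above, the commonly cited form is 0.2 × tumor size in centimeters, plus a nodal score (1 to 3), plus histologic grade (1 to 3). The Python sketch below is illustrative only and is not a clinical tool. The function names are hypothetical, and the mapping from the number of positive nodes to a nodal score (0 nodes = 1, 1 to 3 nodes = 2, 4 or more = 3) is a widely used convention assumed here rather than something stated in this article.

```python
def nodal_score(positive_nodes: int) -> int:
    """Map a count of positive lymph nodes to a 1-3 score.

    Assumed convention: 0 nodes -> 1, 1-3 nodes -> 2, 4 or more -> 3.
    """
    if positive_nodes < 0:
        raise ValueError("positive_nodes must be non-negative")
    if positive_nodes == 0:
        return 1
    if positive_nodes <= 3:
        return 2
    return 3


def nottingham_prognostic_index(size_cm: float, positive_nodes: int, grade: int) -> float:
    """Return 0.2 * size (cm) + nodal score (1-3) + histologic grade (1-3)."""
    if size_cm < 0:
        raise ValueError("size_cm must be non-negative")
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return 0.2 * size_cm + nodal_score(positive_nodes) + grade


if __name__ == "__main__":
    # Hypothetical case: 2.0 cm, node-negative, grade 2 tumor.
    print(round(nottingham_prognostic_index(2.0, 0, 2), 1))
```

A higher index corresponds to a worse expected prognosis. Published cutoffs for prognostic groups vary by source and are deliberately left out of this sketch.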
The development of gene expression signatures has recently allowed a more quantitative description of breast cancer as a heterogeneous disease. In the last decade, semi-unsupervised gene expression array analysis of a cohort of breast cancers identified several intrinsic subtypes, including the luminal A, luminal B, HER2-enriched, claudin-low, basal, and normal breast subtypes. In 2009, the St. Gallen panel concluded that much of the prognostic information in these signatures resides in their sampling of proliferative genes, and the panel agreed that validated multi-gene tests, if readily available, could assist clinicians in deciding whether to add chemotherapy in cases in which its use was uncertain after consideration of conventional markers. The St. Gallen panel went one step further in 2011 and suggested using Ki67 or an alternative measure of proliferation to discriminate between the so-called luminal A and B subtypes.
Unfortunately, classifications based on microarray studies only partly reconcile with traditional phenotypes defined by standard pathology methods such as immunohistochemistry assays, which were the measures most commonly used in therapeutic clinical trials thus far. This challenge is further compounded by the lack of robust analytical validation of various assays used in routine clinical practice, such as assessment of proliferation based on Ki67. The lack of sufficient concordance across platforms is the reason for the recommendation that the suffix “-like” be added when microarray nomenclature is ascribed on the basis of standard methods such as immunohistochemistry and in situ hybridization (eg, a triple-negative tumor would be called “basal-like”); new methods are being developed.
In this issue of ONCOLOGY, Zelnak and O’Regan review some of the breast cancer multi-gene signatures currently available or soon to be commercialized. As they discuss, few have enough evidence to support their routine clinical use. MammaPrint was cleared by the US Food and Drug Administration (FDA) after prospective-retrospective validation of its prognostic value. Medium-size prospective-retrospective studies examined both the prognostic and predictive utility of Oncotype DX to identify patients with ER-positive tumors who are most likely to benefit from chemotherapy. Some of these tests have now been widely adopted in clinical practice to identify patients who could be spared from adjuvant chemotherapy. It is therefore of interest to understand how clinicians are using them.
In one of the largest retrospective exercises using prospectively collected data from National Comprehensive Cancer Network (NCCN) institutions, Oncotype DX testing (n = 7,375) was less frequent in African Americans and in patients with less than a college education. Also, physicians appeared to be ordering the test to potentially reinforce their pre-test bias. For instance, testing in patients with small node-negative cancers was associated with higher odds of subsequent chemotherapy use. In contrast, Oncotype DX testing in patients with node-positive or large node-negative breast cancer was associated with lower odds of chemotherapy use. Data like these suggest that clinicians are using existing tools to preselect the patients whose tumors should be further tested. The irony is that this may result in the testing results being less helpful than desired. For instance, while the frequency of low recurrence score results (52%) in the NCCN dataset was similar to that in the initial studies that validated risk cutoffs, the observed number of high-risk results was much smaller, while the number of intermediate-risk results was much higher (eg, intermediate scores accounted for 38% of results in the NCCN cohort vs 21% in the National Surgical Adjuvant Breast and Bowel Project [NSABP] B-20 trial and 28% in the Southwest Oncology Group [SWOG] 8814 trial).
In addition, there is a lack of clarity on how best to integrate molecular assays with standard pathology measures. Several studies have observed a lack of correlation between prognostic estimates of risk derived from Oncotype DX and those derived from routine pathology measures, which suggests that they are independent and in fact could be complementary. Therefore, pending completion of ongoing prospective trials such as TAILORx (NCT00310180), RxPONDER (NCT01272037), and MINDACT (NCT00433589), it may be too simplistic to declare the first generation of predictive markers for ER-positive disease superior to standard measures. MINDACT specifically has a unique trial design. Patients whose estimates of risk by Adjuvant! Online and MammaPrint are concordant are treated with endocrine therapy alone if low-risk by both tools, or with chemotherapy followed by endocrine therapy if high-risk by both. Those with discordant results based on the two tools (low/high or high/low, respectively) are randomized to receive chemotherapy or not.
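The MINDACT allocation logic described above can be sketched in a few lines of Python. This is a simplified illustration that follows the description in this article, not the trial protocol. The names `Risk` and `assign_arm`, the arm labels, and the 1:1 randomization are assumptions made for the example.

```python
import random
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical arm labels used only in this sketch.
ENDOCRINE_ONLY = "endocrine therapy alone"
CHEMO_THEN_ENDOCRINE = "chemotherapy followed by endocrine therapy"
RANDOMIZED_CHEMO = "randomized: chemotherapy"
RANDOMIZED_NO_CHEMO = "randomized: no chemotherapy"


def assign_arm(clinical: Risk, genomic: Risk, rng: random.Random) -> str:
    """Return the treatment arm for one patient.

    clinical: risk estimate from Adjuvant! Online
    genomic:  risk estimate from MammaPrint
    rng:      random source used only for discordant cases
    """
    if clinical is Risk.LOW and genomic is Risk.LOW:
        return ENDOCRINE_ONLY
    if clinical is Risk.HIGH and genomic is Risk.HIGH:
        return CHEMO_THEN_ENDOCRINE
    # Discordant estimates (low/high or high/low): randomize.
    # Equal allocation is an assumption of this sketch.
    return rng.choice([RANDOMIZED_CHEMO, RANDOMIZED_NO_CHEMO])


if __name__ == "__main__":
    rng = random.Random(0)
    for clinical in Risk:
        for genomic in Risk:
            print(clinical.value, genomic.value, "->", assign_arm(clinical, genomic, rng))
```

The point of the design is visible in the code: only the discordant group reaches the randomization step, so the trial answers the chemotherapy question precisely where the two risk estimates disagree.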
The first generation of gene expression signatures or proliferation assays also has no apparent value in HER2-positive or triple-negative disease. In fact, patients with ER-negative disease are also at risk for overtreatment. As shown in NSABP B-13, almost 50% of ER-negative, node-negative patients who did not receive adjuvant chemotherapy (control arm) remained disease-free after 14 years. Therefore, prognostic and predictive markers are critically needed for patients whose tumors are lumped together as “triple-negative,” a term that fails to account for the profound biologic and prognostic diversity observed in those tumors.
The Cancer Genome Atlas recently completed the first extensive “omics” characterization of large numbers of tumors using copy number analysis, exome sequencing, mRNA arrays, and methylation profiling, among other methods. While clinical outcome data were not obtained, analyses across the four main intrinsic subtypes (luminal A, luminal B, basal-like, and HER2-enriched) suggested that much of the clinically observable heterogeneity occurs within these major biological subtypes, and not across them. Data in other tumors also show that cancers can have a high degree of mutational heterogeneity within and across sites of disease (primary sites and metastases). This is an important observation in that technological advances coupled with small therapeutic studies have identified a growing number of so-called driver mutations that can be exploited for therapeutic purposes, such as KRAS mutations in colon cancer and ALK translocations and EGFR mutations in non–small-cell lung cancer. DNA sequencing costs have plummeted in recent years, and we are now approaching a so-called “inflection point,” where we will increasingly move away from descriptive diagnostics to quantitative prediction. However, for this to succeed, clinical outcome and therapeutic data are much needed before we begin routinely using sequencing results in the clinic, in that not all “actionable” or “targetable” mutations are likely to be relevant for the care of individual patients.
As we move into the brave new world of “precision medicine,” the “old world” rules still apply. For instance, critical attention must be paid to pre-analytical factors regarding tissue specimen handling (eg, cold ischemia) and to the standardization of assay development and reporting criteria. Clinicians and bioinformatics experts must also learn to speak the same language, starting with basic principles regarding analytical validity, clinical validity, and clinical utility. More than ever, a true partnership between clinical, laboratory, and bioinformatics scientists as part of a multidisciplinary team is needed to benefit our patients.
Financial Disclosure: The authors have no significant financial interest or other relationship with the manufacturers of any products or providers of any service mentioned in this article.