In clinical trials evaluating the efficacy of oncology drugs, improved survival has been the "gold standard" accepted by the US Food and Drug Administration (FDA) for establishing clinical benefit. Yet survival itself is not an ideal endpoint for clinical trials because it necessitates designing large, long studies; may be affected by crossover therapy; does not capture symptom improvement; and may include noncancer deaths. Thus, over the past decade, clinical trials of oncologic therapies have used surrogate endpoints which, by reducing the duration and size of the trials, facilitate accelerated approval. Tumor shrinkage is a logical endpoint, since advancing tumor burden is the predominant mechanism by which the disease causes morbidity and mortality.
Objective response rate (ORR) has been considered a measure of drug antitumor activity, even in single-agent studies. ORR has been widely used as an endpoint for several reasons. First, tumor shrinkage is likely to be attributable to a direct effect of the agent under investigation. Second, tumor response has been widely accepted as a means of guiding cancer treatment. Finally, ORR would seem likely to predict clinical benefit, provided the response rate (RR) is high enough and the responses are of sufficient duration.
Despite its logical appeal, reduction in tumor burden has proven to be a controversial endpoint. Many trials now assess tumor response using the recently developed Response Evaluation Criteria in Solid Tumors (RECIST). These criteria are derived from a retrospective analysis of measurements obtained from eight clinical trials in which patients were assessed for tumor response. To simplify tumor assessment, RECIST uses unidimensional measurement of the longest diameter of a tumor. Such measurements have reportedly underestimated response to chemotherapy in pleural-based masses, such as malignant pleural mesothelioma,[3,4] in tumors with significant surrounding fibrosis, and in other tumors, especially in light of current imaging technologies and multimodality approaches. The limitations of this unidimensional approach are illustrated in Figure 1.
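The geometric reason a single longest diameter can miss a real change in tumor burden can be sketched with a short calculation. The example below is illustrative only and is not drawn from any of the trials discussed: the lesion dimensions are hypothetical, the sphere and the flat slab are idealized shapes, and the slab is a crude stand-in for a rind-like pleural mass. The 30% cutoff is the RECIST partial-response threshold for a decrease in longest diameter.

```python
import math

# RECIST partial response: at least a 30% decrease in longest diameter.
RECIST_PR_THRESHOLD = -0.30


def fractional_change(before, after):
    """Relative change from baseline (negative values mean shrinkage)."""
    return (after - before) / before


def sphere_volume(diameter_mm):
    """Volume of an idealized spherical lesion."""
    return math.pi * diameter_mm ** 3 / 6.0


def slab_volume(length_mm, width_mm, thickness_mm):
    """Volume of an idealized flat, rind-like lesion (hypothetical model)."""
    return length_mm * width_mm * thickness_mm


def is_partial_response(diameter_change):
    """True when the diameter change meets the RECIST PR threshold."""
    return diameter_change <= RECIST_PR_THRESHOLD


# Hypothetical spherical lesion: longest diameter 50 mm -> 35 mm.
sphere_diameter_change = fractional_change(50.0, 35.0)
sphere_volume_change = fractional_change(sphere_volume(50.0), sphere_volume(35.0))

# Hypothetical rind-like lesion: thickness halves (20 mm -> 10 mm) while
# the in-plane longest diameter (100 mm) does not change.
slab_diameter_change = fractional_change(100.0, 100.0)
slab_volume_change = fractional_change(
    slab_volume(100.0, 50.0, 20.0), slab_volume(100.0, 50.0, 10.0)
)

if __name__ == "__main__":
    print(f"sphere: diameter {sphere_diameter_change:+.1%}, "
          f"volume {sphere_volume_change:+.1%}, "
          f"PR={is_partial_response(sphere_diameter_change)}")
    print(f"slab:   diameter {slab_diameter_change:+.1%}, "
          f"volume {slab_volume_change:+.1%}, "
          f"PR={is_partial_response(slab_diameter_change)}")
```

Under these assumptions, the spherical lesion qualifies as a partial response, whereas the rind-like lesion loses half its volume yet shows no change in longest diameter and would not be scored as a response.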
In addition, tumor response may underestimate treatment effects on clinical endpoints, including survival, by failing to reflect the magnitude, breadth, and duration of effects on tumor burden. Alternatively, tumor response can overestimate impact on survival if the response is brief or if it does not capture unintended harmful mechanisms of action of the tested treatment. The question of whether ORR correlates with overall survival (OS) and, thus, whether it is an appropriate endpoint is still open to significant debate.
A number of factors influence the effect of tumor response on OS in randomized clinical trials. One obvious consideration is the magnitude of the response difference between arms. For example, a 20% response rate in one treatment arm and 10% in the other may be clinically significant, but in a small trial the difference is unlikely to translate into a statistically significant OS benefit. Quality of response is also important: complete response (CR), partial response (PR), and stable disease (SD) may each affect survival differently. The durability of response, which is sometimes overlooked, can affect progression-free survival (PFS) or OS, and it must be evaluated in the context of the rate of disease progression for the specific disease. In renal cell carcinoma (RCC), for example, tumors classified as low-, intermediate-, or high-risk according to the Sloan-Kettering classification system progress at different rates. Thus, a brief response in patients whose disease progresses slowly may not necessarily affect PFS or OS. In contrast, with a rapidly progressing tumor, a large number of responses (even if relatively brief) may influence PFS.
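The point about trial size can be made concrete with a rough power calculation. The sketch below is not an analysis from any study discussed here. It uses a standard normal approximation for a two-sided comparison of two proportions, and the arm sizes (50 and 500 patients) are hypothetical. It addresses only how often a trial would detect the 20% versus 10% response difference itself; any downstream OS difference would generally be smaller and harder still to detect.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Normal approximation with equal arm sizes; the negligible
    contribution of the opposite rejection tail is ignored.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)

    # Standard error under the alternative (arm-specific proportions).
    se_alt = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)

    return std_normal.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    for n in (50, 500):  # hypothetical patients per arm
        print(f"n = {n:>3} per arm: power = {approx_power(0.20, 0.10, n):.2f}")
```

With these assumptions, a trial of 50 patients per arm has less than a 30% chance of detecting the response difference at the 0.05 level, whereas 500 patients per arm gives power above 95%.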
Finally, in assessing whether tumor response influences OS or PFS in randomized clinical trials (RCTs), an important consideration is the type of comparator arms. A study that compares active drug with placebo or observation will yield different information from studies that compare two drugs with similar mechanisms of action or two drugs with markedly different mechanisms of action. Single-arm studies have occasionally provided the basis for FDA approval when no effective therapy is available and spontaneous tumor regressions are rare. In contrast, trials using historical controls have rarely been successful unless the survival outcomes differ markedly in the active comparator arm.
Following is an overview of the efficacy and limitations of using tumor response as an endpoint in key clinical studies of various agents used to treat renal cell carcinoma, melanoma, lung cancer, breast cancer, and other solid tumors.