ABSTRACT: The introduction of prostate-specific antigen (PSA) as a reliable tumor marker for prostate cancer brought significant changes in the end points used for outcome reporting after therapy. With regard to a definition of failure after radiation, a consensus was reached in 1996 that took into account the particular issues of an intact prostate after therapy. Over the next several years, the consensus definition issued by the American Society for Therapeutic Radiology and Oncology (ASTRO) was used and studied. Concerns and criticisms were raised. The sensitivity and specificity of this definition vs other proposals have been investigated, and differences in outcome analyzed and compared. Although the ASTRO definition came from analysis of datasets on external-beam radiation and most of the work on this topic has been with this modality, failure definitions for brachytherapy must be explored as well. The concept of a universal definition of failure that might be applied to multiple modalities, including surgery, should also be investigated, at least for comparative study and research purposes.
Although we are lucky enough to have a marker in the treatment of prostate cancer to predict prognosis before therapy and outcome afterwards, it has taken over 15 years to determine how to best use it, and recent long-term outcome studies have raised several issues along with options for improvement. While little controversy surrounds the prognostic use of this marker in choosing therapy, we are still exploring treatment failure definitions, their application across unlike therapeutic modalities, and the utility of early documentation of recurrence and associated salvage therapy.
History
When prostate-specific antigen (PSA) first came into clinical use in the mid- to late 1980s, there was great enthusiasm for its use in screening and in the follow-up of prostate cancer after therapy.
As studies were conducted, PSA was also found to be a surrogate for tumor burden and proved its value in pretreatment prognosis and therapeutic decision-making. A normal postsurgery PSA level was soon agreed upon, given that with nearly all prostate tissue removed, it had to be very close to zero. Such was not the case with radiation, however, because PSA-secreting prostate was left intact. Thus, the dilemma and debate began.
Some of the first publications using PSA as a measure of outcome defined successful radiation as a posttreatment PSA level ≤ 4 ng/mL; it was soon discovered, however, that this level might be normal before therapy but too high for an irradiated prostate with a markedly changed secretory capacity.[2-4] Much discussion ensued as to what the appropriate level should be. We saw a rapid transition to using this marker as an indicator of treatment efficacy as opposed to waiting long periods for local or distant disease to become clinically evident. Outcome studies were reported with a variety of PSA end points, however. Of note were the differences between the clinical outcome measures previously used (local, regional, and distant failure) and the PSA results that appeared to document treatment failure earlier and in higher proportion.[1,6]
Consensus Guidelines
In an effort to standardize study reporting using this objective PSA parameter, the American Society for Therapeutic Radiology and Oncology (ASTRO) assembled an expert panel in 1996 and charged the members with reaching a consensus on "the significance of the depth of the PSA nadir, the definition of a rising PSA, the optimal PSA surrogate end point for total eradication of tumor or for relapse after irradiation, and guidelines for using PSA end points for reporting (publishing) success or failure after irradiation."
Data on patient outcomes with various PSA characteristics and trends after external-beam radiation, along with illustration of various failure definitions, were supplied by investigators from seven institutions who had made significant contributions to the literature in the PSA era. Following presentations by these individuals and additional information obtained from recursive partitioning techniques, the ASTRO panel agreed on four guidelines:
In addition, the following guidelines were suggested for studies submitted for publication: (1) series should have a minimum observation period of 24 months; (2) PSA determinations should be obtained at 3- or 4-month intervals during the first 2 years after the completion of radiation therapy, and every 6 months thereafter; and (3) patients reported in a series who have had one or two consecutive rises in PSA but not the three consecutive rises necessary for failure should be reported separately in the results.
In the 6 years following the consensus conference, the ASTRO definition has been the most frequently used PSA definition of failure after radiotherapy, providing a consistent standard by which studies can be evaluated and compared. Shortly after the conference, six of the participating investigators pooled data on stage T1/2 prostate cancer patients who had a pretreatment PSA and received external-beam radiation at least 2 years prior to analysis. They reported outcome for 1,765 patients with a median follow-up of 4.1 years using the ASTRO definition of failure. The efficacy and durability of irradiation were established and prognostic factors confirmed. Subsequently, single-institution studies added to the body of outcome data using a consistent end point for reporting.
Deficiencies in the ASTRO Definition
As is not uncommon in any situation where attention is focused on one primary recommendation, the less salient, seemingly minor points that arose from the consensus conference, such as requiring a minimum of 24 months follow-up before reporting a study and in some way indicating patients who had two but not three rises in PSA, were largely overlooked. Additionally, as frequently occurs in piloting a new procedure, more experience with the proposed definition and mature follow-up in the PSA era led to the emergence of deficiencies and interpretative issues.
Major criticisms of the ASTRO definition included the lack of consideration of laboratory variation and standard error, the extensive period of time (which was follow-up interval-dependent) required to document three PSA rises, the bias associated with backdating the failure date, the potential for greater sensitivity and specificity of other definitions in predicting clinical failure, and the substantial difference between this and surgical definitions of failure.
Several methods have been proposed to deal with the areas of concern. To accommodate the variation in laboratory testing of PSA levels, to reduce the problems associated with PSA values near the lower limit of detection appearing to change by large percentages when increases occurred, and to discount minor fluctuations in PSA production in normal prostate tissue, a definition that quantified the minimum amount of each rise in PSA was proposed: for example, three PSA rises of at least 0.5 ng/mL each.
Another method recommended as a way to compensate for small but consecutive increases in PSA level, which also reportedly enhanced the predictive power of the ASTRO definition, was to stipulate a required minimum total PSA level (eg, 1.5 ng/mL) in addition to the requirement of three consecutive rises in PSA.
Proposals to decrease the amount of time necessary to document three PSA rises, especially with follow-up intervals of 6 months or more, included defining failure as fewer rises, such as two but of a certain value each, or any elevation above an absolute nadir value, such as 0.2 ng/mL.[9,11]
Some authors suggested that the bias introduced by backdating could be remedied by allowing for adequate length of follow-up, perhaps an additional 3 years beyond the point chosen for analysis. The bias stems from reporting failure at the backdated date, when it actually takes considerably longer (ie, time for three PSA rises) to declare a treatment failure. If there is not enough follow-up time available to allow for three rises in PSA, the failure rate may be significantly underestimated at the backdated date. This bias could also be dealt with by moving the reported failure date to the date when the failure was actually determined; in the case of the ASTRO definition, to the date of the third PSA rise. This, of course, would remove the backdating aspect of the definition.
Finally, the backdating bias could be handled by a more complicated option that would involve backdating the censor date for patients with one or two PSA rises for whom no additional information was available.
To address the lack of uniformity between the definitions of failure used for differing treatment modalities, a single, surgically oriented definition using a solitary cutoff point was proposed.[11,15] As might be expected, without appropriate sensitivity and specificity testing, enthusiasm was decidedly lacking.
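The proposed modifications can be made concrete with a small sketch. The Python below is illustrative only: the function names and the sample PSA series are hypothetical, and two interpretive assumptions are made that the text does not settle, namely that a rise smaller than the required minimum breaks the run of consecutive rises, and that the minimum total PSA level is checked at the measurement that completes the run.

```python
from typing import Optional, Sequence


def first_failure_index(
    psa: Sequence[float],
    n_rises: int = 3,
    min_rise: float = 0.0,
    min_level: float = 0.0,
) -> Optional[int]:
    """Index of the measurement at which failure is called, or None.

    Failure requires n_rises consecutive rises, each of at least min_rise
    ng/mL, with the PSA at the call reaching at least min_level ng/mL.
    Assumption: a rise below min_rise resets the run.
    """
    run = 0
    for i in range(1, len(psa)):
        delta = psa[i] - psa[i - 1]
        if delta > 0 and delta >= min_rise:
            run += 1
        else:
            run = 0
        if run >= n_rises and psa[i] >= min_level:
            return i
    return None


def first_above(psa: Sequence[float], threshold: float) -> Optional[int]:
    """Index of the first measurement above an absolute cutoff, or None."""
    for i, value in enumerate(psa):
        if value > threshold:
            return i
    return None


# Hypothetical series (ng/mL): small consecutive rises vs. a steady climb.
small_rises = [0.4, 0.5, 0.6, 0.7, 0.6]
steady_climb = [0.5, 1.25, 2.0, 3.0]

print(first_failure_index(small_rises))                   # three rises of any size
print(first_failure_index(small_rises, min_rise=0.5))     # each rise >= 0.5
print(first_failure_index(small_rises, min_level=1.5))    # plus total level >= 1.5
print(first_above(small_rises, 0.2))                      # absolute cutoff of 0.2
print(first_failure_index(steady_climb, n_rises=2, min_rise=0.5))  # two rises
```

With these made-up values, the series of small rises is called a failure by three rises of any size and by the 0.2 ng/mL cutoff, but not by either of the stricter variants, which is the kind of disagreement between definitions that motivates formal sensitivity and specificity testing.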
Comparing Failure Definitions
To expand the work of Shipley et al and the first multi-institutional outcome study, nine institutions recently contributed 4,839 T1/2 prostate cancer patients to a single, combined database. All patients were treated in the PSA era and, therefore, had both pretreatment and a series of posttreatment PSA measurements. These men were treated with definitive external-beam irradiation alone no more recently than 1995, to provide potential follow-up of at least 5 years. Median follow-up was calculated at 6.3 years with 2,049 patients still available for analysis at 5 years, 616 at 8 years, and 179 at 10 years posttreatment. Not only did this database provide the most robust outcome report on external-beam radiation to date, but it also provided a valuable resource with which to test and compare definitions of failure. Although multiple failure definitions have been suggested for various reasons, this large body of data provided a sound basis for the testing and objective comparison of definitions.
Additionally, with the ASTRO definition, three PSA rises must be documented, which also requires an extended period of time. For example, if posttreatment PSA measurements are obtained every 3 months, 9 months is necessary to observe three consecutive PSA rises. For follow-up periods of 4 and 6 months, 12 and 18 months, respectively, would be required to establish a PSA failure. Although yearly hazard rates show that when using the ASTRO definition, the greatest risk of failure is at 1.5 to 3.5 years after radiotherapy, one must bear in mind that this is the backdated failure date (halfway between the nadir and the first PSA rise) and that the failure would have been established or called 7.5 to 15 months after the backdated point, depending on the follow-up interval. Data from the 4,839-patient study showed that 38% of PSA failures were called by 3 years posttherapy, 73% by 5 years, 95% by 8 years, and 99% by 10 years, illustrating further the significant amount of time necessary to document three PSA rises and declare a PSA failure by a definition dependent on this criterion.
The effect of follow-up time on outcome reporting has been illustrated by other authors as well. Vicini and colleagues, in applying the ASTRO definition to patients treated with external-beam radiation, showed that the rate of biochemical failure differed markedly depending on the amount of follow-up information used. If only 3 years of follow-up information was allowed, the 3-year biochemical disease-free survival rate was 71% vs 44% when 6 years of follow-up data was used. This phenomenon was confirmed with the nine-institution, 4,839-patient dataset. Using the intermediate-risk subgroup, follow-up information was truncated to produce median follow-up times of 3, 4, 5, and 6.3 years for the same group of patients. PSA disease-free survival was recalculated four times, each time using one of these median follow-up times (Figure 2A).
As for any PSA-based definition, the greater the length of follow-up, the more failures are documented, so that outcome assessment, even if statistically accurate, may be overly optimistic after a short follow-up. The situation after short follow-up times is even worse when one uses a backdated definition; however, the statistical bias in outcome assessment using backdating becomes more and more apparent the shorter the follow-up time.
Thus, although reporting study results prematurely may provide outcome estimates that are too optimistic (especially when evaluating new treatment methods and technology), backdating introduces an additional factor that statistically alters the survival curves. Caution must be taken when comparing groups of patients in whom the median survival is significantly different.
The bias related to follow-up time, which results from backdating the failure date, can be avoided by including in outcome analyses only patients with enough follow-up time to allow for nearly all failures to be counted or by using a failure definition that scores the failure at the call date, without backdating. Alternatively, a more complicated calculation can be performed that particularly addresses patients with one or two PSA rises who are at considerable risk for failure. In these patients, the censor date (when there is no additional follow-up information available) would be backdated to halfway between the nadir date and the first PSA rise, similar to backdating the PSA failure date. This compensates for the effect of follow-up time when using a backdated failure definition (Figure 2B).
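The gap between the backdated failure date and the date on which failure is actually called can be reproduced with a short sketch. The Python below is a simplified illustration rather than the consensus panel's algorithm: the function name and PSA values are hypothetical, visits are assumed to be evenly spaced, and the nadir is assumed to be the measurement immediately before the first of the three consecutive rises.

```python
from typing import Optional, Sequence, Tuple


def astro_failure(
    times: Sequence[float], psa: Sequence[float]
) -> Optional[Tuple[float, float, float]]:
    """Return (nadir_time, backdated_time, call_time) in months, or None.

    Failure is called at the third of three consecutive PSA rises and
    backdated to halfway between the nadir and the first rise.
    Assumption: the nadir is the measurement just before the first rise.
    """
    run = 0
    for i in range(1, len(psa)):
        if psa[i] > psa[i - 1]:
            run += 1
            if run == 3:
                first_rise = i - 2
                nadir = first_rise - 1
                backdated = (times[nadir] + times[first_rise]) / 2
                return times[nadir], backdated, times[i]
        else:
            run = 0
    return None


# Hypothetical PSA series (ng/mL): decline to a nadir, then three rises.
psa = [4.0, 1.0, 0.5, 0.8, 1.2, 1.9]

for interval in (3, 4, 6):  # follow-up interval in months
    times = [interval * k for k in range(len(psa))]
    nadir_t, backdated_t, call_t = astro_failure(times, psa)
    print(
        f"interval={interval} mo: nadir-to-call={call_t - nadir_t} mo, "
        f"call minus backdated date={call_t - backdated_t} mo"
    )
```

Under these assumptions the time from nadir to call is three follow-up intervals (9, 12, and 18 months), and the call date trails the backdated date by two and a half intervals (7.5, 10, and 15 months), consistent with the 7.5- to 15-month range quoted above.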
Horwitz and colleagues recently illustrated the difference in outcome results related to backdating the failure date to various points in time vs calculating failure from the time it is declared, at the third PSA rise (Figure 3A). This group showed that distortion of the Kaplan-Meier curves would be minimized by moving the failure date closer to the point at which it is declared (Figure 3B). They felt that this would provide a more realistic assessment of an actuarial-based estimate of outcome.
Using the 4,839-patient, multi-institutional dataset, outcome was plotted for three definitions found to be more sensitive and specific than the ASTRO definition: (1) two PSA rises of at least 0.5 ng/mL each, backdated, (2) PSA ≥ current nadir PSA + 2 ng/mL, and (3) PSA ≥ current nadir PSA + 3 ng/mL. Differences in PSA disease-free survival are shown graphically (Figures 4A and 4B). It appears that a backdated failure definition clusters the failures within the first 5 years after treatment, when the first PSA rise occurs, and therefore, PSA disease-free survival is lower in the earlier years of reporting for the ASTRO definition than for the two definitions that use a call date to denote the time of failure, definitions 2 and 3. To illustrate this further, a failure definition defined as three PSA rises backdated can be compared to three PSA rises recorded at the call date (Figure 4C). Biochemical failure rates are greater in the earlier years after therapy when the backdated definition is used. The curves cross at 6 years where there are then fewer failures for the backdated definition, having moved the failures from the time of the third PSA rise back to the earlier years when PSA first starts to rise. In comparing the definition based on two PSA rises of at least 0.5 ng/mL, backdated, to the ASTRO definition, the curves are parallel and separated by only a few percentage points (Figure 4A). These analyses show that outcome results could vary widely and be difficult to compare if the same definition of failure is not consistently applied.
Additionally, it has been suggested that to fairly compare outcomes between treatment modalities, similar definitions of failure must be used.[15,22] Using surgical patients, Amling et al and Gretzer et al demonstrated outcome differences comparing the ASTRO definition, either backdating or at call date, to a typical surgical definition of 0.2 ng/mL (Gretzer) or 0.4 ng/mL (Amling).
As seen in Figure 5A, outcome estimates for the backdated ASTRO definition are lower within the first 4 years after treatment but higher than the surgical definition in later years. One must bear in mind, however, that with radiation therapy the prostate remains intact and still produces at least a small amount of PSA, which can vary with conditions such as prostatitis, the "bounce" phenomenon, gland size, and length of time from treatment. When the ASTRO definition was compared to the surgical definition of failure (PSA > 0.2 ng/mL) for the 4,839 irradiated patients in the multi-institutional study, great disparity in outcome was seen; however, the sensitivity and specificity of the surgical definition were 91% and only 9%, respectively (Figure 5B). This indicates that although the majority of true failures are expeditiously detected, unfortunately, so are all other nonmalignant conditions, indiscriminately. Of note, the multi-institutional study did show that while one PSA nadir level could not predict failure for every patient, the median and mode nadir PSA levels for patients who were disease-free at 8 to 10 years follow-up were very low: 0.5 and 0.1 ng/mL, respectively.
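For readers less familiar with the two measures, the following Python sketch shows how sensitivity and specificity are obtained once each patient has both a call from the failure definition and a reference outcome. The function name is hypothetical, and the cohort of 200 patients is invented purely so that the arithmetic lands on the reported 91% and 9%; it is not the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    called: Sequence[bool], truth: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity and specificity of a failure definition.

    called[i] is True if the definition labels patient i a failure;
    truth[i] is True if patient i is a failure by the reference standard.
    """
    tp = sum(1 for c, t in zip(called, truth) if c and t)
    fn = sum(1 for c, t in zip(called, truth) if not c and t)
    tn = sum(1 for c, t in zip(called, truth) if not c and not t)
    fp = sum(1 for c, t in zip(called, truth) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Invented cohort: 100 true failures and 100 patients free of failure.
truth = [True] * 100 + [False] * 100
# The definition flags 91 of the true failures, but also 91 of the others.
called = [True] * 91 + [False] * 9 + [True] * 91 + [False] * 9

sens, spec = sensitivity_specificity(called, truth)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A definition with this profile catches most true failures but also flags nearly every patient without failure, which is why a cutoff can look attractive on sensitivity alone and still be unusable.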
The ASTRO failure definition was developed using externally irradiated patients, and subsequent testing of other proposed definitions was performed on this population as well. Whether PSA characteristics and failure criteria are the same for brachytherapy patients remains to be determined. From the work of Critz, it appears that patients treated with a combination of implant and external-beam irradiation show a correlation between PSA nadir values and subsequent disease progression that is very different from that seen in patients treated with external irradiation alone.
Critz maintains that a PSA nadir of 0.2 ng/mL is necessary to retain disease-free status as defined by a nonrising PSA. According to his analysis, 99% of patients with this nadir level do so, and he, therefore, proposes that this nadir level be used as the definition of biochemically disease-free. Further strengthening this position is his finding that only 16% of his patients with a PSA nadir level of 0.3 to 1.0 ng/mL remain disease-free at 7 years posttreatment.
The multi-institutional analysis of patients treated by external-beam radiation alone shows a very different association between PSA nadir level and subsequent PSA failure as determined by the ASTRO definition. These data showed that while fewer and fewer patients were disease-free with progressively higher PSA nadir levels, choosing one nadir level, such as 0.2 ng/mL, would label a significant number of patients as failures when, in fact, their PSA level remained stable. The percentages of externally irradiated patients who were disease-free 8 years posttreatment by the ASTRO definition with nadir levels of 0 to 0.49, 0.50 to 0.99, 1.0 to 1.99, 2.0 to 3.99, and ≥ 4.0 ng/mL were 70%, 53%, 42%, 23%, and 12%, respectively. As previously noted, the sensitivity of the 0.2-ng/mL failure definition for this group of patients was 91%, but the specificity was only 9%, as would be expected from the correlation of nadir level and failure presented above. Perhaps the effect of the combination of external-beam radiation and radioisotopic implant on the prostate is different from external-beam radiation alone in terms of the amount of normal PSA-producing prostate that remains after treatment. This may account for the nonrising but higher PSA levels.
To address some of the previous criticisms related to the ASTRO failure definition, Kattan and colleagues used data from 1,213 brachytherapy patients, some of whom also received external-beam irradiation. The investigators compared the ASTRO definition to four definitions of failure that modified it as follows: (1) early censoring of nonrecurrent patients with rising PSA levels, (2) cumulative rather than consecutive rises (without a decrease) as evidence of recurrence, (3) both of the above, and (4) waiting 2 years before performing an analysis of data.
The first definition dealt with the bias introduced by backdating, the second with the potentially lengthy period of time necessary for three consecutive rises, and the fourth with the effect of short follow-up on outcome.
As Figure 6 shows, only minimal differences in outcome were seen as compared to the reports of Thames and Vicini, which showed greater differences with externally irradiated patients.[12,17] These findings point to the need for proper analysis of failure definitions using sensitivity and specificity testing in brachytherapy-treated patients, as has been conducted in externally irradiated patients.
Universal Failure Definition
Although we have learned a great deal about the use of PSA as an early surrogate for treatment failure and disease recurrence in prostate cancer patients, distinctly different parameters have been used to define surgical, external radiation, and brachytherapy failure. In the case of surgery, the prostate is totally removed, while with both external radiation and radioisotopic implant, the gland remains intact although the functional and PSA-producing capabilities after each respective treatment may differ. This may mean that a definition of failure must be established for each modality, but treatment comparisons would surely be simplified if the use of a universal definition of failure were possible.
To date, definitions of failure have been tested and compared most thoroughly for external-beam radiotherapy, and to a lesser extent for brachytherapy and prostatectomy. Only through more extensive comparative testing of failure definitions for the various treatment modalities will the potential for the use of one encompassing definition be learned. Collaborative studies similar to the multi-institutional analysis discussed above[16,17] are in progress.
The authors have no significant financial interest or other relationship with the manufacturers of any products or providers of any service mentioned in this article.
1. Kuban DA, El-Mahdi AM, Schellhammer PF: Prostate-specific antigen for pretreatment prediction and posttreatment evaluation of outcome after definitive irradiation for prostate cancer. Int J Radiat Oncol Biol Phys 32:307-316, 1995.
2. Horwitz EM, Vicini FA, Ziaja E, et al: Assessing the variability of outcome for patients treated with localized prostate irradiation using different definitions of biochemical control. Int J Radiat Oncol Biol Phys 36:565-571, 1996.
3. Kuban DA, El-Mahdi AM, Schellhammer PF: PSA for outcome prediction and posttreatment evaluation following radiation for prostate cancer: Do we know how to use it? Semin Radiat Oncol 8:72-80, 1998.
4. Willett CG, Zietman AL, Shipley WU, et al: The effect of pelvic radiation therapy on serum levels of prostate specific antigen. J Urol 151:1579-1581, 1994.
5. McLaughlin PW, Sandler HM, Jiroutek MR: Prostate-specific antigen following prostate radiotherapy: How low can you go? J Clin Oncol 14:2889-2892, 1996.
6. Zietman AL, Tibbs MK, Dallow KC, et al: Use of PSA nadir to predict subsequent biochemical outcome following external beam radiation therapy for T1-2 adenocarcinoma of the prostate. Radiother Oncol 40:159-162, 1996.
7. American Society for Therapeutic Radiology and Oncology Consensus Panel: Consensus statement: Guidelines for PSA following radiation therapy. Int J Radiat Oncol Biol Phys 37:1035-1041, 1997.
8. Shipley WU, Thames HD, Sandler HM, et al: Radiation therapy for clinically localized prostate cancer: A multi-institutional pooled analysis. JAMA 281:1598-1604, 1999.
9. Taylor JMG, Griffith KA, Sandler HM: Definitions of biochemical failure in prostate cancer following radiation therapy. Int J Radiat Oncol Biol Phys 50:1212-1219, 2001.
10. Pickles T, Duncan GG, Kim-Sing C, et al: PSA relapse definitions: The Vancouver rules show superior predictive power. Int J Radiat Oncol Biol Phys 43:699-700, 1999.
11. Critz FA: A standard definition of disease freedom is needed for prostate cancer: Undetectable prostate specific antigen compared with the American Society of Therapeutic Radiology and Oncology Consensus Definition. J Urol 167:1310-1313, 2002.
12. Vicini FA, Kestin LL, Martinez AA: The importance of adequate follow-up in defining treatment success after external-beam irradiation for prostate cancer. Int J Radiat Oncol Biol Phys 45:553-561, 1999.
13. Horwitz EM, Uzzo RG, Hanlon AL, et al: Modifying the American Society for Therapeutic Radiology and Oncology definition of biochemical failure to minimize the influence of backdating in patients with prostate cancer treated with 3-dimensional conformal radiation therapy alone. J Urol 169:2153-2159, 2003.
14. Kattan MW, Fearn PA, Leibel S, et al: The definition of biochemical failure in patients treated with definitive radiotherapy. Int J Radiat Oncol Biol Phys 48:1469-1474, 2000.
15. Gretzer MB, Trock BJ, Han M, et al: A critical analysis of the interpretation of biochemical failure in surgically treated patients using the American Society for Therapeutic Radiation and Oncology criteria. J Urol 168:1419-1422, 2002.
16. Kuban DA, Thames HD, Levy LB, et al: Long-term multi-institutional analysis of stage T1–T2 prostate cancer treated with radiotherapy in the PSA era. Int J Radiat Oncol Biol Phys 57:915-928, 2003.
17. Thames H, Kuban D, Levy L, et al: Comparison of alternative biochemical failure definitions based on clinical outcome in 4,839 prostate cancer patients treated by external-beam radiotherapy between 1986 and 1995. Int J Radiat Oncol Biol Phys 57:929-943, 2003.
18. Aref I, Eapen L, Agboola O, et al: The relationship between biochemical failure and time to nadir in patients treated with external beam therapy for T1-T3 prostate carcinoma. Radiother Oncol 48:203-207, 1998.
19. Hanlon AL, Diratzouian H, Hanks GE: Posttreatment prostate-specific antigen nadir highly predictive of distant failure and death from prostate cancer. Int J Radiat Oncol Biol Phys 53:297-303, 2002.
20. Hanlon AL, Hanks GE: Scrutiny of the ASTRO consensus definition of biochemical failure in irradiated prostate cancer patients demonstrates its usefulness and robustness. Int J Radiat Oncol Biol Phys 46:559-566, 2000.
21. Jani AB, Chen MH, Vaida F, et al: PSA-based outcome analysis after radiation therapy for prostate cancer: A new definition of biochemical failure after intervention. Urology 54:700-705, 1999.
22. Amling CL, Bergstralh EJ, Blute ML, et al: Defining prostate specific antigen progression after radical prostatectomy: What is the most appropriate cut point? J Urol 165:1146-1151, 2001.