A deep learning approach to assessing cancer outcomes appears feasible, according to results of a study in patients with lung cancer. Machine curation yielded assessments of disease progression and of times to improvement and response that were similar in accuracy to those of human curators. This development could speed up oncology care processes.
“Important clinical end points, such as response to therapy and disease progression, are often recorded in the EHR [electronic health record] only as unstructured text,” wrote authors led by Kenneth L. Kehl, MD, MPH, of the Dana-Farber Cancer Institute in Boston. Standards such as RECIST are not routinely applied outside of clinical trials.
“To create a true learning health system for oncology and facilitate delivery of precision medicine at scale, methods are needed to accelerate curation of cancer-related outcomes from EHRs according to validated standards,” the authors wrote further.
The researchers developed a structured framework that draws on multiple data sources to represent a patient’s clinical trajectory and outcomes, incorporating pathology reports, radiology/imaging reports, molecular markers, signs and symptoms, and medical oncologists’ assessments. Manual curation from this structured record, however, is “resource intensive and often impractical,” the authors wrote.
Thus, they applied deep learning to EHR text organized according to that structured framework, focusing on imaging reports; they hypothesized that deep learning algorithms could use routinely generated radiologic text reports to identify the presence of cancer as well as shifts in disease burden and outcomes. Their report was published online ahead of print on July 25 in JAMA Oncology.
The study included a total of 2,406 patients with lung cancer. Researchers manually reviewed radiologic reports for 1,112 patients, annotating the presence of cancer and changes in disease burden over time. These annotations were used to train the deep learning models, which were first tested on a set of 109 patients. In those patients, the models identified the presence of cancer, improvement or response to therapy, and disease progression with high accuracy (area under the receiver operating characteristic curve >0.90).
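The accuracy figure reported here, area under the receiver operating characteristic curve (AUC), equals the probability that a randomly chosen positive report receives a higher model score than a randomly chosen negative one. As a minimal sketch (not the study's code; the labels and scores below are invented for illustration), it can be computed directly from that rank-based definition:

```python
import itertools

def auc(labels, scores):
    """AUC via the rank interpretation: the fraction of (positive, negative)
    pairs in which the positive report outscores the negative (ties = 0.5).
    Equivalent to a normalized Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n)
               for p, n in itertools.product(pos, neg))
    return wins / (len(pos) * len(neg))

# Toy example: model scores for 6 reports; label 1 = cancer present
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(auc(labels, scores))  # 8 of 9 pairs correctly ranked ≈ 0.889
```

An AUC above 0.90, as reported in the study, means the model ranks a true-positive report above a true-negative one more than 90% of the time.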
Compared with human curation, the machine-based curation yielded similar measurements for disease-free survival, with a hazard ratio for machine versus human of 1.18 (95% CI, 0.71–1.95). The same was true for progression-free survival, with an HR of 1.11 (95% CI, 0.71–1.71), and for time to improvement/response, with an HR of 1.03 (95% CI, 0.65–1.64).
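The hazard ratios above compare event rates between the machine- and human-curated arms; an HR near 1 with a confidence interval spanning 1 indicates no detectable difference. The study would have used a formal survival model, but as a rough illustrative sketch (with invented counts, not study data), under constant-hazard assumptions the HR reduces to a simple incidence-rate ratio:

```python
def rate_ratio(events_a, persontime_a, events_b, persontime_b):
    """Crude incidence-rate ratio between two groups. Under constant
    (exponential) hazards, this equals the hazard ratio."""
    return (events_a / persontime_a) / (events_b / persontime_b)

# Hypothetical counts: events and person-months in the machine-curated
# arm (a) vs. the human-curated arm (b)
print(rate_ratio(50, 400.0, 45, 420.0))  # ≈ 1.17
```

A ratio of roughly 1.17, like the study's DFS HR of 1.18, would mean events accrue about 17% faster in one arm, a difference small enough that its confidence interval can easily include 1.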
The algorithm was tested on an additional 1,294 patients with lung cancer as well. In those patients, algorithm-detected cancer progression was significantly associated with decreased overall survival, with an HR for mortality of 4.04 (95% CI, 2.78–5.85). Similarly, algorithm-detected improvement or response was associated with increased OS, with an HR of 0.41 (95% CI, 0.22–0.77).
“These results suggest that deep natural language processing… can rapidly extract clinically relevant outcomes from the text of radiologic reports,” the authors concluded. They noted that human curators can annotate imaging reports at a rate of approximately 3 patients per hour, meaning a single curator would need 6 months to annotate all the reports in this cohort. The deep learning model, in contrast, could annotate the entire cohort within 10 minutes.
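The throughput comparison is easy to sanity-check. Assuming 8-hour workdays and about 21 workdays per month (assumptions not stated in the article), annotating both cohorts at 3 patients per hour works out to several months of full-time work, broadly consistent with the authors' 6-month estimate once non-annotation time is included:

```python
# Back-of-envelope check of the curation-time comparison.
# Assumptions: 8-hour workdays, ~21 workdays/month (not from the article).
patients = 1112 + 1294        # manually curated + additional cohorts
rate = 3                      # patients annotated per hour by one curator
hours = patients / rate
months = hours / 8 / 21
print(round(hours), round(months, 1))  # → 802 4.8
```

Against roughly 800 hours of human effort, the model's reported 10 minutes for the full cohort is a speedup of more than three orders of magnitude.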
In an accompanying editorial, Andrew D. Trister, MD, PhD, of Oregon Health & Science University in Portland, warned that it is important to consider the generalizability of the algorithm in question.
“The network can only know what it has already seen,” he wrote, meaning that if there are differences at other institutions in how radiologic reports are worded or structured, the model may not work. Among the ways to mitigate this issue in the future might be to make large data sets more widely available, or to use established validation sets held by third parties.
“Among the greatest hopes for artificial intelligence in medicine is the potential to both lower barriers to care and improve outcomes for large populations,” Trister wrote. “Efforts that leverage the vast amount of data already digitized in the health system are reasonable first steps toward this promise.”