Rachel Karchin, PhD, is a professor of biomedical engineering, oncology, and computer science, with joint appointments at the Whiting School of Engineering and School of Medicine at Johns Hopkins University in Baltimore. She is a core member of the Institute for Computational Medicine.
A computational biologist, Dr. Karchin develops algorithms and software to analyze genomic data and interpret its impact on human disease. Her most recent work has focused on cancer and the effects of germline and somatic alterations and their contributions to progression models of tumor evolution. She led the computational efforts to identify driver mutations for the Johns Hopkins Sidney Kimmel Cancer Center’s pioneering cancer sequencing projects, and she co-led The Cancer Genome Atlas (TCGA) PanCancer Atlas Essential Genes and Drivers Analysis Working Group.
Cancer Network asked Dr. Karchin about the contemporary search for cancer-driving gene mutations.
—Interviewed by Bryant Furlow
Cancer Network: How have cancer driver genes and mutations typically been identified in the past?
Dr. Karchin: Prior to the advent of high-throughput DNA sequencing technologies, driver genes were identified by a variety of laboratory experimental techniques. Oncogene or activating driver genes were discovered by their similarity to genes in tumor-causing retroviruses in animal models. Tumor suppressor or loss-of-function driver genes were discovered mainly by genetic studies of individuals with inherited cancer syndromes. After the sequencing of the human reference genome, increasingly efficient sequencing techniques have made it possible to generate DNA profiles of human cancers in large patient cohorts, spanning all human protein-coding genes, and most recently the whole human genome.
Accumulation of large amounts of cancer sequencing data led to the rise of computational and statistical techniques as primary tools in identifying cancer driver genes and mutations. Dozens of such methods have been developed. In the PanSoftware/PanCancer paper published in Cell, we developed an approach to optimally combine a large number of these methods to establish a consensus. Researchers often wonder which methods are the most reliable, and my team also published a protocol to benchmark cancer driver gene prediction methods in Proceedings of the National Academy of Sciences (PNAS) 2 years ago. We used this benchmark in the PanSoftware/PanCancer paper to up-weight and down-weight the contributions of different methods.
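The idea of up-weighting and down-weighting methods can be pictured as a weighted vote. The Python sketch below is only an illustration and is not the procedure used in the Cell paper: the method names, weights, gene calls, and threshold are all hypothetical, and `weighted_consensus` is an invented helper. Each method's gene calls count toward a gene's score in proportion to that method's assumed benchmark weight, and genes whose total reaches the threshold are kept.

```python
from collections import defaultdict


def weighted_consensus(calls, weights, threshold):
    """Return genes whose summed method weights reach the threshold.

    calls:   dict mapping method name -> set of genes that method calls as drivers
    weights: dict mapping method name -> weight (hypothetically from a benchmark)
    """
    scores = defaultdict(float)
    for method, genes in calls.items():
        # Methods with no assigned weight contribute nothing.
        weight = weights.get(method, 0.0)
        for gene in genes:
            scores[gene] += weight
    return {gene: score for gene, score in scores.items() if score >= threshold}


# Hypothetical inputs: three made-up methods and placeholder gene calls.
calls = {
    "method_a": {"TP53", "KRAS", "GENE_X"},
    "method_b": {"TP53", "KRAS"},
    "method_c": {"TP53", "GENE_Y"},
}
# Assumed weights: method_c did best on the benchmark, method_a worst.
weights = {"method_a": 0.5, "method_b": 1.0, "method_c": 1.5}

consensus = weighted_consensus(calls, weights, threshold=1.5)
print(sorted(consensus))
```

In this toy example, a gene called only by the down-weighted `method_a` is dropped, while a gene called only by the up-weighted `method_c` still reaches the threshold.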
Cancer Network: What are PanSoftware and PanCancer, and how did your team use them to conduct a comprehensive search for cancer-driving gene variants across tumor types? How many cancer types were represented and how many tumor exomes were interrogated?
Dr. Karchin: PanSoftware refers to our integration of 26 computational tools to construct a catalog of driver genes and mutations. PanCancer refers to a collection of tumors, across many types of cancer. In our analysis, this included 9,423 sequenced patient tumor exomes in 33 cancer types studied by TCGA projects. Our approach identified 299 cancer driver genes and 3,437 unique missense mutations predicted to be driver mutations.
Cancer Network: How did you validate your findings? What proportion of initially identified drivers were validated experimentally?
Dr. Karchin: We compared our computational predictions with results of an independent dataset of 1,049 somatic mutations, previously identified in cancer patients treated at MD Anderson Cancer Center. The somatic mutations were evaluated for oncogenicity based on cell survival and growth in two cell lines, Ba/F3 and MCF10A, which depend on the presence of stimulating proteins known as “growth factors” to survive. These cell types typically die when the factors are absent, but the introduction of “driver” mutations into their DNA enables them to proliferate even when growth factors are missing. In contrast, the cells die if the mutations introduced are passengers.
We predicted 579 missense mutations to be drivers, and 46 of these were observed in the MD Anderson patients. Eighty-five percent of those were validated by the cell line assays.
Cancer Network: What percentage of tumors harbor actionable gene mutations that can be targeted with existing agents?
Dr. Karchin: With respect to specific gene mutations and copy number alterations, we considered four tiers of targeted agents: those approved by the US Food and Drug Administration, those in clinical trials, those from case reports, and those in preclinical studies. Approximately 30% of the samples in our dataset had at least one actionable alteration, by one of those criteria. We also considered putatively actionable alterations, for which the affected gene is considered an actionable target, and we found that 52% of the samples contained a putatively actionable target. By that criterion, the most common targetable alterations were deletions in CDKN2A (13% of samples), PIK3CA mutations (12%), MYC amplifications (8%), BRAF mutations and amplifications (8%), and KRAS mutations (7%).
Cancer Network: What’s next? How will your team or others build on these findings?
Dr. Karchin: Shifting the focus of research from driver genes to specific driver mutations is an important direction, because driver genes contain a mixture of driver and passenger mutations. For purposes of precision oncology, a clinician wants to know whether particular mutations that appear in patient sequencing results are actionable, not only that the mutations are harbored in actionable driver genes. The field is also moving towards cancer-specific driver identification, because different cancer types are characterized by different driver mutations. A patient’s therapeutic response to drugs targeting a specific gene and optimal assignment to a clinical trial are increasingly understood to depend on both the specific mutation in the gene of interest and the cancer type. Finally, it is likely that most of the common driver genes in the TCGA cohorts have now been discovered, leaving undiscovered and lower frequency driver genes in less-studied groups of patients and tumor types.
Cancer Network: Is there anything else you would like to tell readers?
Dr. Karchin: My group has developed a computational method called CHASMplus that can predict driver missense mutations for specific cancer types. While many computational methods have been developed to predict whether a missense mutation is generally deleterious or pathogenic, there has not previously been a method to score the oncogenic impact of a missense mutation specifically by cancer type. Although it is well known that missense mutations have different impacts in different cancer types, currently available computational methods either do not take this into consideration or fail to distinguish the differences. Lack of cancer type specificity has been an obstacle to adoption of computational missense mutation predictors in the clinic.
We have also modeled the trajectory of discovery of driver missense mutations and found that it differs significantly across the 33 cancer types sequenced by TCGA. For some cancer types, discovery appears to have saturated, while for others additional tumor sequencing is likely to yield new discoveries. TCGA cancer types can be separated into groups with similar trajectories, complementing previous groupings by cell of origin, transcriptome profiles, and others.