Pediatric cancers enter the whole-genome sequencing pipeline with the initiation of the St. Baldrick’s project

March 18, 2010

A collaborative project to sequence the neuroblastoma cancer genome could revolutionize diagnosis and treatment.

It has long been recognized that cancer is a genetic disease that arises when a cell develops mutations in key genes that govern cell behavior. The discovery of such cancer-causing mutations has provided tremendous insight into the process of carcinogenesis and led to numerous important diagnostic, prognostic, and therapeutic applications.

But the search for cancer-causing mutations has been hampered by substantial technologic barriers. For decades investigators have had the tools to search cancer cells for gross genomic aberrations, such as chromosome translocations and gene amplifications, or to search them selectively for mutations in a handful of genes of interest. What has been missing is the ability to assess the entire cancer genome at the sequence level.

Advancements in sequencing technologies, facilitated by the completion of the reference human genome, have led to a new era in cancer investigation in which entire cancer genomes are searched for cancer-causing mutations. Although this remains a costly endeavor, the potential benefit may change the landscape of cancer care in the near future. To date, near-complete sequencing of the coding regions of all known human genes has been achieved for a handful of adult malignancies (breast cancer, colon cancer, and glioblastoma).

Currently, additional adult cancers are being sequenced using "next-generation sequencing" that also includes noncoding genomic regions. These studies are beginning to yield catalogues of somatic mutations for individual cancers, identify the principal biopathways disrupted, and provide insight into the mechanisms of mutagenesis for these cancers.

Pediatric cancers have not been included in the sequencing pipeline even though they are a major cause of death during childhood. Because pediatric cancer is far less common than adult cancer, fewer resources are available to study it. The genetic aberrations and preferred signaling pathways involved in pediatric cancers are likely to differ from those in adult carcinomas, so it has become imperative that global genomic sequencing be applied to pediatric tumors as well.

The global sequencing project for neuroblastoma is a collaboration between the Children's Oncology Group (COG) and investigators at the Children's Hospital of Philadelphia (CHOP), Johns Hopkins University in Baltimore, and Texas Tech University Health Sciences Center in Lubbock. I will serve as the principal investigator for this project. The first phase of this study is supported with a $500,000 grant from St. Baldrick's Foundation (see Did You Know box).

Did You Know? Shaving the day

St. Baldrick's Foundation raises funds for childhood cancer research by hosting worldwide head-shaving events. Participants voluntarily shave their heads to show their solidarity with children undergoing cancer treatment. Visit www.stbaldricks.org for more information.

The mysteries of neuroblastoma

Neuroblastoma is a common solid tumor that arises in the developing peripheral nervous system. Although some children have localized tumors that are curable with surgery and/or outpatient chemotherapy, the majority present with locally invasive or metastatic tumors that behave aggressively. Despite intensive treatments that include chemoradiotherapy, stem cell rescue, and bioimmunotherapy, the prognosis for such children remains dismal, with the majority dying from tumor progression.

Studies of the gross genomic aberrations that occur in neuroblastoma led to the discovery of amplification of the MYCN oncogene in the early 1980s. This anomaly is found in 20% to 25% of all neuroblastomas and is highly correlated with an aggressive tumor and poor treatment outcome.

Further work by investigators worldwide has established other genomic hallmarks of high-risk neuroblastoma, including segmental chromosomal aberrations involving chromosome arms 1p, 3p, 11q, 17q, and others. Despite the recognition that distinct patterns of genomic aberrations correlate with diverse tumor behavior, no causal cancer genes or treatment targets other than MYCN have emerged from these studies.

Instead, a candidate gene approach, informed by linkage studies with rare neuroblastoma-prone families, led to the recent discovery of the ALK kinase as a neuroblastoma oncogene. Germline ALK mutations are responsible for the majority of familial neuroblastomas, while somatic mutations or gene amplifications arise in up to 20% of sporadic tumors. The finding of a potential therapeutic target such as ALK among the fewer than 200 genes sequenced in neuroblastoma to date offers hope that many other cancer genes and potential therapy targets will be discovered from the more than 20,000 genes not yet investigated.

Research plan overview

Through the efforts of the COG clinical trials program, numerous highly annotated tumor specimens, including matched germline DNA, have been biobanked for research. These include neuroblastoma-derived cell lines, which are being sequenced in the first stage, or "discovery screen," of this work. The benefit of working with cell line DNA is the absence of infiltrating normal tissues that, when present, can complicate mutation detection. As many as 24 cell lines with matched normal germline DNA will be sequenced using the Solexa sequencing technology (Illumina Systems), complemented by optimized exon-capture enrichment. The target coverage is 70-fold when whole-genome and exome-enriched sequencing are considered together, and the estimated error rate, given improvements in imaging, is about 0.05%.

The identified mutations will enter a bioinformatics pipeline previously optimized to analyze the genomes of breast and colorectal cancers. The overwhelming majority of sequence variants will represent both rare and common germline polymorphisms (SNPs) and will be identical in the tumor and normal DNA.

Though some of these may impact an individual's risk of developing neuroblastoma, the primary focus will be on identifying the acquired sequence variants that drive the cancer process. Multidimensional bioinformatic and statistical techniques that utilize a machine-learning approach will be used to identify genes mutated at significant frequencies, as well as to identify gene pathways enriched for mutations. These genes and preferred pathways can then be correlated with distinct molecular-genetic high-risk subtypes in subsequent stages.
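The tumor-versus-normal comparison described above can be illustrated with a simple set difference: any variant seen in the matched normal DNA is treated as germline and set aside, and what remains is tallied per gene across samples. This is only a sketch under stated assumptions. The `Variant` record, the function names, and the sample data are hypothetical, and the machine-learning and statistical steps the project plans to use are not represented here.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple


class Variant(NamedTuple):
    """A hypothetical, minimal variant record."""
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str


def somatic_variants(tumor: Iterable[Variant],
                     normal: Iterable[Variant]) -> List[Variant]:
    """Keep tumor variants that are absent from the matched normal DNA."""
    germline = set(normal)
    return [v for v in tumor if v not in germline]


def mutated_sample_counts(somatic_by_sample: Dict[str, List[Variant]]) -> Counter:
    """Count, for each gene, how many samples carry at least one somatic variant."""
    counts: Counter = Counter()
    for variants in somatic_by_sample.values():
        counts.update({v.gene for v in variants})  # one count per sample per gene
    return counts


if __name__ == "__main__":
    # Made-up coordinates and gene labels, for illustration only.
    shared = Variant("chr1", 1000, "A", "G", "GENE_A")
    acquired = Variant("chr2", 2000, "C", "T", "GENE_B")
    somatic = somatic_variants(tumor=[shared, acquired], normal=[shared])
    print(somatic)
    print(mutated_sample_counts({"line_1": somatic, "line_2": [acquired]}))
```

Counting each gene once per sample, rather than once per variant, keeps a single heavily mutated sample from dominating the tally.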

The candidate neuroblastoma genes identified in the discovery screen will next be sequenced in a targeted fashion from an independent set of primary high-risk neuroblastomas. This second stage "validation screen" will give a more accurate estimate of the mutation prevalence for each gene.

Together, the discovery and validation screenings are powered for a more than 70% likelihood of detecting any mutation that is present in at least 5% of tumors. In the third or "tertiary screen" stage, representative tumors of all risk classes and all stages, including the enigmatic 4S stage, will be used to define the mutation prevalence across diverse tumor subsets. This staged study design provides a sensitive approach for somatic mutation detection while lowering costs by an estimated two- to threefold.
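The article does not say how the power figure was derived. One simple reading, offered here only as an assumption, is that a mutation is detected if at least one of the sequenced samples carries it, with each sample carrying it independently at the stated prevalence and with perfect detection sensitivity by default. The function names and the `sensitivity` parameter below are hypothetical.

```python
def detection_probability(n_samples: int, prevalence: float,
                          sensitivity: float = 1.0) -> float:
    """P(mutation is seen in at least one of n independent samples)."""
    p_hit = prevalence * sensitivity  # chance one sample carries and reveals it
    return 1.0 - (1.0 - p_hit) ** n_samples


def samples_needed(prevalence: float, target_power: float,
                   sensitivity: float = 1.0) -> int:
    """Smallest number of samples whose detection probability meets the target."""
    n = 1
    while detection_probability(n, prevalence, sensitivity) < target_power:
        n += 1
    return n


if __name__ == "__main__":
    # 24 cell lines and 5% prevalence are the figures quoted in the article.
    print(f"{detection_probability(24, 0.05):.3f}")
    print(samples_needed(0.05, 0.70))
```

With 24 samples and a prevalence of 5%, this model gives 1 - 0.95^24, or about 0.708, which is consistent with the "more than 70%" figure. The article states the power for the discovery and validation screens together, so the actual calculation may differ.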

Goals and expectations

Results from the discovery phase of this work are anticipated to be available within the next six months. We expect that these studies will identify important diagnostic and therapeutic targets in neuroblastoma. Importantly, because cell lines are used in the discovery phase, candidate cancer mutations that are prioritized for further study can be readily tested for their functional significance, where such assays exist.

This collaborative effort will provide the initial cancer gene survey for pediatric cancer, and any novel genes found to be mutated will become immediate candidate cancer genes for other childhood tumors as well. We expect this undertaking to spur the development of new diagnostic and therapeutic interventions.

The full data set of mutations will be made publicly available following validation, and we anticipate that the efforts of the entire pediatric research community will be needed to fully leverage these discoveries in a timely fashion.