CAMBRIDGE, Mass.—A research team led by scientists from the Broad Institute of MIT and Harvard has completed a large-scale study that has expanded the list of known genes linked to various cancer types by 25 percent. The study has also demonstrated that it is possible to create a comprehensive catalog of cancer genes for different cancer types with as few as 100,000 patient samples.
In the past few decades, evidence has been found for roughly 135 genes that play causal roles in one or more of the 21 tumor types examined in this study. The latest analysis adds significantly to that number, uncovering 33 more genes that play roles in cell death, cell growth, genome stability, immune evasion and a number of other biological processes.
The cancer types included in this study consisted of leukemias such as chronic lymphocytic leukemia and acute myeloid leukemia, childhood cancers such as rhabdoid tumor and neuroblastoma, and more common cancer types such as melanoma and breast, prostate, ovarian, colorectal and lung cancer. These cancers range from types with very few mutations to types with a higher frequency of mutated genes. All told, the genomes of nearly 5,000 cancer samples were analyzed, then compared with matched samples of normal tissue.
Michael Lawrence, first author of the study and a computational biologist at the Broad, notes that most of the newly identified genes were “significant in only one tumor type,” with a few exceptions. Those outliers include ELF3, “an ETS transcription factor not previously known to be significantly somatically mutated in any cancer,” which was found to be significantly mutated in bladder and colorectal cancer. ARHGAP35, also known as GRLF1, was mutated in endometrial cancer, but was also “very close to significance in two types of lung cancer, as well as kidney cancer and head-and-neck cancer.”
“There were some newly discovered genes that weren’t significant in any individual tumor type, but only became significant when combining all the data together and pooling the weak signal from multiple kinds of cancer,” Lawrence explains. “A good example of this is TP53BP1, ‘Tumor protein p53 binding protein 1,’ named because it binds to p53, the most commonly mutated gene across cancer. We saw sporadic mutations across many tumor types, and the gene became significant only when pooling all the evidence.”
“We could tell that our current knowledge was incomplete because we discovered many new cancer genes,” added co-senior author Gad Getz, director of the Broad Institute’s Cancer Genome Computational Analysis group and a Broad associate member. “Moreover, we could tell that there are many genes still to be discovered by measuring how the number of gene discoveries grows as we increase the number of samples in our analysis. The curve is still going up.”
The team plans to continue this research on a larger scale. They estimate that properly cataloging most of the mutations will take some 2,000 samples of each cancer type, or approximately 100,000 samples spanning about 50 tumor types.
“TCGA is continuing to analyze large numbers of tumor-normal pairs that have been collected,” says Lawrence. “Also, the worldwide ICGC project is analyzing large numbers of patients of many cancer types. We aim to combine all of this data together and analyze it through our pipeline, so that the data is harmonized and analyzed together in a consistent way. That will allow us to reach much higher levels of power to discover cancer genes at the 2 percent patient frequency level. However, many more samples (2,000 per tumor type) are needed to identify most of the cancer genes at the 2 percent patient frequency level.”
In addition, more work needs to be done with the genes the team has identified in order to determine whether any have the potential to be targets for drug development. Several initiatives will handle that work, including the Cancer Program’s Target Accelerator at the Broad Institute, as well as the National Institutes of Health’s Cancer Target Discovery and Development project, of which the Broad Institute is a member. The latter seeks to “decode cancer genotypes so as to read out acquired pathway and oncogene addictions of the specific tumor subtypes, and to identify small molecules that target these dependencies.”
“For the first time, we know what it will take to draw the complete genomic picture of human cancer,” said Eric Lander, Broad Institute founding director and a co-senior author of the paper. “That’s tremendously exciting, because the knowledge of genes and their pathways will highlight new, potential drug targets and help lead the way to effective combination therapy.”