DNA provides the instruction manual for making proteins. This fundamental scientific tenet is a staple lesson in introductory biology classes everywhere. It is less widely taught, however, that only approximately one percent of the genome actually serves as a template for making proteins (1). While scientists previously considered the remaining noncoding regions to be useless, they have more recently come to appreciate that this once-termed “junk DNA” in fact regulates the expression of nearby genes. For example, structural variants (large-scale deletions, translocations, and other rearrangements) in noncoding regions of DNA can alter gene expression, opening up a new area of study in cancer genetics.
In a recent Nature Communications paper, researchers at Baylor College of Medicine and the University of Alabama at Birmingham described how they analyzed whole genome sequencing and proteomic datasets to investigate how these structural variants influence gene expression in tumors (2). Their findings highlight a key set of genes of particular interest for cancer treatment, revealing new candidates for precision medicine approaches.
“It’s impressive on the scale that they leveraged these repositories of all this data that’s out there, genomic and proteomic, and put it all together and did some really interesting analyses to cut to the consequences of what happens because of some of these genomic alterations and how they associate with all these different cancers,” said Timothy Griffin, a multiomics bioinformatician at the University of Minnesota who was not involved in the research. “It’s kind of a tour de force of a study.”
The team first compared genomic and proteomic data from 1,307 human tumors. “We see all these genes being overexpressed at the mRNA level, but then we took the next step in this study to say how many of those are actually showing up at the protein level?” said Chad Creighton, a cancer proteogenomics researcher at Baylor College of Medicine and coauthor of the study. The researchers observed that altered expression was conserved from mRNA to protein in approximately 25 percent of the genes.
This imperfect correlation can be largely attributed to regulatory mechanisms, such as post-transcriptional modification of RNA, that influence its translation to protein. However, it also reflects instances in which a protein was not captured in the proteomic data because of technical challenges, yielding a missed association between mRNA and protein. “You're at the mercy of the detection limit with proteomics,” Griffin said. “If they're low abundance proteins, they may be there, and they may be really interesting, but [the researchers] aren't detecting or quantifying them in a way that they can really make any conclusive results.”
Despite its limitations, incorporating protein-level data helps to pinpoint the structural variant-regulated genes that are most relevant in the context of cancer. “A lot of the important genes that we know from other studies have very specific roles in cancer will show up at the protein level,” Creighton said. “So, maybe we focus on the genes that show up at both levels of mRNA and protein.”
Griffin agreed. “The real power of [the proteomic data] is you're getting closer to the functional targets that are being affected by these various genomic alterations that really are maybe the functional players in cancer,” he said.
To explore this connection further, the researchers compiled a list of genes where structural variants altered protein expression and compared them to genes linked to worse patient survival and those that reduced cell viability when knocked out. They observed a significant overlap between their list and both the poor prognosis genes and the knockout-sensitive genes across multiple types of cancer. “If you see all those things, that can whittle it down to a core set of genes that will be very interesting and may be understudied. ...There are a few genes that probably have important roles in cancer that haven't been looked at as deeply,” Creighton said. “Maybe, people can look at our results and think of going after some of these other genes.”
Specifically, Creighton envisions that precision medicine for cancer could expand to include sequencing the noncoding regions of a tumor's DNA to identify structural variants that lead to overexpression of oncogenes. The tumor could then be treated with a drug that knocks down the gene or blocks a related target in the genetic pathway. “A lot of additional people might benefit from a given therapy that maybe would have been missed by precision medicine approaches that just look within the gene versus alterations outside the gene,” Creighton said.
Looking forward, the team hopes to integrate even more data from public sources to analyze a larger cohort of tumors. “In future studies, as we can scale this up further, that gives us more power to [get] more consistent, robust findings and maybe even identify additional genes,” Creighton said. “The resources are constantly expanding.”
- Marian, A.J. Sequencing your genome: What does it mean? Methodist DeBakey Cardiovasc J 10, 3-6 (2014).
- Chen, F., Zhang, Y., Chandrashekar, D.S., Varambally, S., & Creighton, C.J. Global impact of somatic structural variation on the cancer proteome. Nat Commun 14, 5637 (2023).