WASHINGTON, D.C.—National Institutes of Health (NIH) researchers recently announced the completion of a seven-year journey to map out a detailed atlas documenting the stretches of human DNA that influence gene expression—a key way a person’s genome gives rise to an observable trait like hair color or disease risk. The atlas is a critical resource for the scientific community searching for a key to the vast unknown of genomics.
The atlas, available for viewing since Oct. 11, is the culmination of work from the Genotype-Tissue Expression (GTEx) Consortium, launched in 2010 and completed in the summer of 2017, to catalog how genomic variation influences how genes are turned off and on.
“GTEx was unique because its researchers explored how genomic variation affects the expression of genes in individual tissues, across many individuals and even within an individual,” states Simona Volpi, program director for GTEx at the National Human Genome Research Institute (NHGRI).
According to Volpi, no previous resource at GTEx’s scale enabled researchers to study how gene expression in the liver might differ from that in the lung or heart, for example, and how those differences relate to the inherited genomic variation in an individual.
To tackle this massive project, researchers involved in the GTEx Consortium collected data from more than 53 different tissue types (including brain, liver and lung) from autopsy, organ donation and tissue transplant programs.
“GTEx depended entirely on families choosing to donate bio-samples for research after the death of a loved one,” says Susan Koester, deputy director for the Division of Neuroscience and Basic Behavioral Science and GTEx program director at the National Institute of Mental Health (NIMH). “GTEx researchers are deeply grateful for this priceless gift.”
The result is a biobank of collected tissue samples, as well as extracted DNA and RNA, for future studies by independent researchers. The summary-level data are available to the public through the GTEx Portal, and the most recent release of the raw data has been submitted to the Database of Genotypes and Phenotypes (dbGaP), an archive of results from studies investigating the genomic contributions to phenotypes (physical characteristics or disease states).
“The project met all of its goals, primarily to demonstrate that we could collect a large number (up to 44) of different tissues from a large number of postmortem donors and generate high-quality RNA sequence data from these,” Koester told DDNews. “We collected as many as 44 tissues from each of over 960 donors using a carefully developed set of SOPs to ensure that the tissues and nucleic acids derived from them would be of the highest quality, giving reliable data.”
The primary goal “was to provide a reference dataset for researchers to understand how particular loci in the genome affect gene expression in a tissue-specific manner, so-called eQTLs [expression quantitative trait loci], identified by establishing which specific variants are associated with differences in gene expression levels,” Koester says.
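For readers who want to see the idea behind an eQTL in concrete terms, the following is a minimal sketch, not the consortium's pipeline. It assumes a single variant coded as the number of alternate alleles each donor carries (0, 1 or 2) and asks, tissue by tissue, how much expression changes per extra allele. The function name `eqtl_slope`, the tissue labels and all of the numbers are made up for illustration. Real analyses also normalize expression, adjust for covariates and correct for testing many variants at once.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage.

    dosages: alternate-allele counts (0, 1 or 2), one per donor.
    expression: expression level of one gene in one tissue, one per donor.
    """
    if len(dosages) != len(expression):
        raise ValueError("need one expression value per donor")
    d_mean = mean(dosages)
    e_mean = mean(expression)
    sxx = sum((d - d_mean) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("variant does not vary among these donors")
    sxy = sum((d - d_mean) * (e - e_mean) for d, e in zip(dosages, expression))
    return sxy / sxx


# Hypothetical toy data: six donors, one variant, one gene, two tissues.
dosages = [0, 0, 1, 1, 2, 2]
expression_by_tissue = {
    "liver": [5.0, 5.2, 6.1, 5.9, 7.0, 7.2],  # rises with each extra allele
    "lung": [4.0, 4.1, 4.0, 4.1, 4.0, 4.1],   # flat across genotypes
}

slopes = {
    tissue: eqtl_slope(dosages, values)
    for tissue, values in expression_by_tissue.items()
}

for tissue, slope in slopes.items():
    print(f"{tissue}: {slope:.2f} expression units per alternate allele")
```

In this toy example the slope is about 1 in liver and about 0 in lung, which is the pattern a tissue-specific eQTL would show: the same variant tracks with expression in one tissue but not in another.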
“Researchers have been relying on the data to develop a wide variety of tools and methods with applications in cancer, neurological and psychiatric disorders and many other health applications,” she says. “However, all of the data are from adult postmortem donors (age 21 to 70), so we can’t be specific about how these data relate to gene expression during human development.”
Other studies “look at many tissues from a small number of individuals or many individuals, but a small number of tissues,” Koester says. “The data from GTEx combines these approaches, allowing researchers to discern eQTLs across multiple tissue types.”
Although the GTEx project has officially wrapped up, related work continues. The Enhancing GTEx (eGTEx) project, which began in 2013, extends GTEx’s efforts by combining gene expression studies with additional measurements, such as protein expression.
“These eGTEx efforts are underway, with the first data expected soon,” Koester says.
The atlas data are available at no cost from two sources. Summary data, including lists of eQTLs, levels of gene expression across different tissues and images of the actual tissue sections, are available to anyone through the GTEx Portal, which is managed and continually updated by the Broad Institute. Development and maintenance of the portal, including the sharing of the upcoming eGTEx data, are supported by NHGRI, NIMH and the NIH Common Fund. In addition, qualified researchers may request access to individual-level sequence data through dbGaP at the National Library of Medicine.
“By combining data across multiple tissues, we curated a set of gene expression outliers that replicated at higher rates and showed stronger enrichment of rare variants than those from any single tissue,” NIH researchers report. “However, our ability to characterize the genetic basis of multi-tissue outliers remains incomplete.”
The consortium now presents the deepest survey of gene expression across multiple tissues and individuals to date.