A DNA helix in white is drawn over a blue background. The individual DNA bases are shown.

Credit: iStock

A blueprint for a new epigenomic era

A recent effort to sequence the entire human genome also revealed how it can be modified.
Aparna Nathan | 9 min read

Three billion base pairs make up each human genome, and the Telomere-to-Telomere (T2T) Consortium recently took on the Herculean task of sequencing every single one of them. But DNA isn’t the only thing in their crosshairs.

T2T also offers a first look at the epigenome that accompanies each DNA base — information that was missing in several crucial regions in older maps of the human genome. The T2T team hopes that having a fuller picture will shed light on how the epigenome helps cells work and what happens when things go awry.

“Our goal at the T2T Consortium is a complete base-by-base picture of our genome,” said Karen Miga, a genome biologist at the University of California, Santa Cruz and co-chair of the consortium. “Understanding the DNA modifications is a perfect example of information that can be brought to light by having a more complete genome.”

Lost in the genomic desert

While DNA carries the instructions for all cellular functions, not every cell looks or acts the same. This is because of differences in the epigenome: chemical modifications to the DNA and surrounding proteins. These tweaks influence how easily a given cell can read the instructions encoded in the DNA, determining where and when a cell cracks open this expansive instruction manual.

Sophisticated technologies can spot epigenetic molecules coating pieces of DNA. But to figure out exactly which genes are affected by the modifications, scientists need to find the spot in the human genome that matches the coated DNA sequences. This wasn’t always an easy task; older versions of the human genome were choppy, missing sections that were so repetitive that researchers could not tell exactly where a given fragment belonged.

“You’re basically standing on a dune in the Sahara Desert looking in every direction, and you have no idea where you are: it’s all desert,” said Rachel O’Neill, a genome biologist at the University of Connecticut and senior author of one of the T2T papers. “When you’re in the middle of those repeats, you’re lost.”

Rachel O'Neill is interested in how epigenetic variation in repeat sequences evolved.
Credit: Rachel O'Neill

However, some of these dunes could still be biologically important regions of DNA. For example, the repetitive midsection of each chromosome, the centromere, plays an important role in ensuring that cells divide properly, and some scientists think that epigenetic errors in this region can increase the risk for cancer. That makes the task of understanding epigenetic modifications in these regions “pretty fundamental,” said Steven Henikoff, a molecular biologist at the Fred Hutchinson Cancer Research Center who was not involved in the T2T projects.

To navigate repeats and other tricky genome topologies, the T2T Consortium used relatively new long-read sequencing technology. This method reads the genome in longer segments than conventional methods: more than 20,000 bases at a time, instead of a few hundred. Rather than glimpsing a small portion of a repeated sequence without knowing which part of the genome it came from, the technology reads longer stretches of repeats and maps their locations, much like finding an exact match for a whole sentence in a book instead of trying to match a single word.
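The difference between short and long reads in a repetitive region can be illustrated with a toy example. The sketch below is only an illustration under simplifying assumptions: the reference sequence, the reads, and the `find_all` helper are made up for this purpose, matching is exact, and real read mappers are far more sophisticated.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Resume the search one base later so overlapping matches are found.
        start = reference.find(read, start + 1)
    return positions


# Hypothetical toy reference: unique flank, six copies of a repeat, unique flank.
REPEAT_UNIT = "ACGTT"
reference = "GGATCC" + REPEAT_UNIT * 6 + "TTAGCA"

# A short read that lies entirely inside the repeat.
short_read = REPEAT_UNIT * 2
# A long read that spans the whole repeat plus a little of both flanks.
long_read = "ATCC" + REPEAT_UNIT * 6 + "TT"

short_hits = find_all(reference, short_read)
long_hits = find_all(reference, long_read)

print(f"short read matches {len(short_hits)} places: {short_hits}")
print(f"long read matches {len(long_hits)} place: {long_hits}")
```

In this toy case, the short read fits equally well at several positions within the repeat, while the long read reaches into the unique flanking sequence and can be placed at only one position.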

As an added bonus, long-read sequencing automatically detects one epigenetic modification: base methylation. “The epigenetic information comes along for free,” said Winston Timp, a biomedical engineer at Johns Hopkins University and senior author of one of the T2T papers. Measuring other types of epigenetic modifications requires some extra experiments, but it can be tacked on with relative ease, he added.

“I think of the epigenome as the fifth base,” Timp said. “We shouldn’t be ignoring it.”

Creating a blueprint

With the help of long-read sequencing, the T2T Consortium successfully produced a complete sequence of the 22 human autosomes and the X chromosome, simultaneously creating a complete map of methylation on each base (1). Using Oxford Nanopore and PacBio’s long-read technologies, they slogged through the eight percent of the genome that hadn’t been sequenced before.

For Ting Wang, a geneticist at Washington University in St. Louis who was not directly involved in the T2T projects, the new research is the beginning of the next phase of epigenome mapping efforts. He worked on the Roadmap Epigenomics Project, an earlier effort supported by the National Institutes of Health to conduct large-scale epigenomic profiling (2). That project, which concluded in 2018, built an epigenome reference on top of an earlier version of the genome, but it was limited by the gaps. “Before we had a complete reference genome, it was impossible to have a complete reference epigenome,” Wang said.

Savannah Hoyt, a graduate student at the University of Connecticut, and the rest of the T2T team used long-read sequencing to read the epigenetic marks on each of the 3 billion bases of DNA.
Credit: Rachel O'Neill

Now, with the new T2T reference, researchers can better interpret results from older epigenomic datasets, such as Roadmap and the Encyclopedia of DNA Elements (ENCODE). The T2T researchers looked for matches between the older sequencing data from these projects and the new complete genome and linked epigenetic modifications to more genes, including those involved in diseases.

For example, epigenetic modifications in the neuroblastoma breakpoint family (NBPF) genes were previously thought to trigger brain tumors, but the genes were so similar to each other that it was difficult to pinpoint the exact culprits. With the T2T reference, the researchers linked a constellation of tumor-specific epigenetic marks to specific NBPF genes (3). Wang expects that with more long-read epigenetic data in the future, the T2T reference will prove even more useful.

T2T’s data analysis software is a key resource for other researchers who want to take a base-by-base walk through a gene or genomic region of interest, Henikoff said. Mapping repetitive long-read sequencing data is still a relatively new challenge, and many of the tools didn’t exist prior to this project (4).

“We could only look at where we could shine the light before, but now we have a map,” Timp said. “We provided a blueprint for how to do this.”

The epigenetic piece of many puzzles

The Consortium has published many papers using the new data, each focusing on a different previously obscured aspect of the genome, and the power of the epigenetic data has earned it a place in almost every story, Miga said. Her lab, for example, studies centromeres, the genomic regions in the middle of chromosomes that serve as central hubs to keep related pieces of DNA organized but separate while a cell divides.

The T2T team noticed that there was less methylation at specific spots on the centromere. The location of these “centromeric dip regions” varied between chromosomes, and when the researchers looked at the X chromosome in people from around the world, the locations also varied between individuals. A closer look showed that these epigenetic changes corresponded to the location where key proteins bind during cell division.
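One simple way to picture how a dip in a methylation track could be flagged is sketched below. This is a toy illustration, not the T2T team's analysis method: the `find_dips` function, the half-of-median cutoff, and the methylation values are all hypothetical.

```python
from statistics import median


def find_dips(methylation: list[float], drop: float = 0.5) -> list[tuple[int, int]]:
    """Return (start, end) window ranges, end exclusive, where the
    methylation fraction falls below `drop` times the track's median."""
    cutoff = drop * median(methylation)
    dips = []
    start = None
    for i, value in enumerate(methylation):
        if value < cutoff and start is None:
            start = i                      # a dip begins here
        elif value >= cutoff and start is not None:
            dips.append((start, i))        # the dip ended at the previous window
            start = None
    if start is not None:                  # a dip that runs to the end of the track
        dips.append((start, len(methylation)))
    return dips


# Hypothetical methylation fractions for ten consecutive windows across a repeat array.
profile = [0.82, 0.85, 0.80, 0.79, 0.30, 0.22, 0.25, 0.81, 0.84, 0.83]
dips = find_dips(profile)
print(dips)
```

Comparing the flagged ranges across chromosomes or individuals would then show whether the dips sit in the same place, which is the kind of variation the team describes.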

Winston Timp's team is interested in how epigenetics, "the fifth base," influences how cells interpret the genome.
Credit: Will Kirk

This was just one of many surprises hidden in the methylation data. Repetitive regions in the centromere and elsewhere in the genome had diverse methylation profiles, even in regions thought to share a critical function (5). “It sounds like finishing a project,” said Henikoff, who also studies centromeres. “But in a way, it's kind of a beginning for those of us who are interested in centromeres.”

Timp was also interested to learn that genes that had duplicated over the course of human evolution didn’t always have the same epigenetic marks. In fact, in some cases, one copy had marks suggesting that it was silenced while the other was active. Similar patterns emerged in repeating sequences of DNA.

O’Neill, who studies the genome from an evolutionary perspective, sees this as an example of how the genome might use epigenetic modifications to defend itself against DNA fragments that could wreak havoc if transcribed. Like many of the other investigators, she didn’t know what to expect when delving into unexplored parts of the genome and was surprised to find a transposable element that was highly transcribed, unlike other repeats (6). These mobile pieces of DNA could induce epigenetic changes in their surrounding regions and form boundaries for different regions of chromosome structure.

“This technology has reinvigorated a field that's actually several decades old,” said O’Neill, who is planning to build off this work to continue studying how repeat regions influence chromosome structure.

Bringing epigenetics to the clinic

As a cancer researcher, Henikoff is particularly interested in how cell division goes wrong in cancer cells. The epigenetic modifications that the T2T team found around centromeric repeat elements might hold the answer. O’Neill also wonders whether the variability of the epigenome between people can explain variability in cancer risks and outcomes. For example, epigenetic modifications that make repeat elements less stable might be dire handicaps for patients who already have common cancer mutations that hamper their cells’ abilities to fix errors. Understanding this will require analyzing this variability in healthy and sick people, Timp said.

“Just like how genomics is going to give us better personalized therapies, I think the epigenome can also play a role there as well,” Timp said.

Epigenetic marks might also make good targets for therapeutics. Some cancer treatments, such as histone deacetylase inhibitors, already target the epigenome.

Bringing that kind of knowledge into clinical practice might still be many years away. Technologies need to improve in accuracy, and being able to sequence genomes from hundreds or thousands of people will require improvements in efficiency and cost.

“It is early days, but this will eventually become the standard for research,” said Christopher Mason, a genome biologist at Weill Cornell Medical Center who was not involved in the T2T projects, in an email. “It is just a matter of time before it is the standard in clinical care as well.”

First of many

Miga is careful to note that this is not “the human genome,” but rather “one human genome.” Not only do DNA sequences vary between people, but the DNA used in this study actually came from a cell line that looks quite different from the average human cell.

For starters, this cell line, dubbed CHM13, comes from a type of molar pregnancy, so it resembles an early embryonic state, rather than the cells with fully developed functions in an adult human. Notably, it is haploid. This made the researchers’ jobs a bit easier because it meant that they didn’t have to figure out which copy of the genome each sequence came from. But this means that the genome and epigenome could look different in adult human cells.

Karen Miga led the T2T Consortium's efforts to sequence a complete human genome with its accompanying methylation.
Credit: Karen Miga

“It’s not your cell, my cell, or anybody’s cell,” Wang said. “It's a valuable tool for building this complete reference, but in terms of the biology, it whets your appetite but is far away from being what we want.”

Miga worried about this too, especially when she saw the intriguing methylation patterns in the centromere. Without prior data on this region, she couldn’t tell if the patterns were unique to CHM13 cells or early developmental stages. To help answer this, she and her team sequenced an X chromosome from a more differentiated diploid human cell line. The methylation patterns were generally the same, which was an encouraging sign for Timp and Miga. But methylation was absent from other parts of CHM13’s genome that are almost always methylated in differentiated cells.

Beyond the 3 billion bases that T2T sequenced, Miga thinks that even more value will come from extending understanding of variation between genomes. Epigenetic studies have long shown that differences in DNA modifications are part of what gives different cell types their specialized functions, and the T2T data suggest that epigenetic patterns in the centromere can even vary between people from different parts of the world.

Wang, who is collaborating with Miga on the Human Pangenome Reference Consortium (HPRC), hopes to do just that. The HPRC aims to sequence 350 whole genomes from people of diverse ancestries, and Wang’s goal is to make a “human pan-epigenome” alongside it.

“The symbolic meaning of [the T2T] epigenome is just enormous,” Wang said. “In my own mind, it’s really much bigger than the actual biology.”

References

  1. Nurk, S.*, Koren, S.*, Rhie, A.* et al. The complete sequence of a human genome. Science 376 (6588), eabj6987 (2022). *authors contributed equally
  2. Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518 (7539), 317-30 (2015).
  3. Altemose, N. et al. Complete genomic and epigenetic maps of human centromeres. Science 376 (6588), eabl4178 (2022).
  4. Jain, C. et al. Weighted minimizer sampling improves long read mapping. Bioinformatics 36 (Suppl. 1), i111–8 (2020).
  5. Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376 (6588), eabj5089 (2022).
  6. Hoyt, S. et al. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science 376 (6588), eabk3112 (2022).

About the Author

  • Aparna Nathan

    Aparna is a freelance science writer pursuing a PhD in bioinformatics and genomics at Harvard University. She uses her multidisciplinary training to find both the cutting-edge science and the human stories in everything from genetic testing to space expeditions. She was recently a 2021 AAAS Mass Media Fellow at the Philadelphia Inquirer. Her writing has also appeared in Popular Science, PBS NOVA, and The Open Notebook.

Published In

Drug Discovery News, July/August 2022 : Volume 18 : Issue 7