Colorful bands of DNA are shown against a black background, representing a DNA sequencing gel.

People carrying two copies of recessive genetic variants that cause severe disease early in life are missing from the general population. By looking for these missing mutations, researchers revealed the causes of rare genetic diseases.

Credit: istock/filo

Missing mutations solve a genetic mystery

In a journey that spanned the genomes of more than 150,000 Icelanders, archival samples, and a fetus in utero, researchers not only identified genetic variants missing from the population, but also proved that these missing mutations cause three rare genetic diseases.
Stephanie DeMarco, PhD

At the height of the Second World War, US military commanders knew they had a major problem in the air. Many of the pilots who left for battle never returned, and the ones who did landed in planes perforated with bullet holes.

The military needed to reinforce the planes with more armor to prevent the enemy forces from shooting them down so easily, but a plane armored from nose to tail would be too heavy to take off. They had to prioritize the most critical parts.

A drawing of a World War II war plane with red dots representing where bullets struck the plane.
The red dots indicate where bullets struck WWII planes that returned to the Allied airbases after a battle. Abraham Wald suggested that the military should reinforce the planes with armor where bullets had not struck.
Credit: Martin Grandjean, McGeddon, and Cameron Moll

Abraham Wald, a mathematician at the wartime Statistical Research Group, handed pilots an outline of a plane and asked them to mark where bullets had struck their aircraft. Based on the pilots’ notes, the military commanders decided to add armor to the locations on the planes that received the most damage: the wings, the tail, and the fuselage (1).

But Wald said no, they should reinforce the areas with no damage. Assuming that bullets strike every location on a plane, he explained, the planes hit in the most vulnerable spots never made it home (2). They were missing from the data set. Even though the returning planes had been hit by enemy fire, they still arrived home in one piece, suggesting that the spots where they’d been hit were not critical.

Unlike the military commanders, Wald took survivorship bias into account when identifying the most vulnerable spots on a plane. Of course, if Wald could have found the missing planes, their damaged sections would have shown him exactly where to put the armor.

About 80 years later, a team of genomics researchers led by Kári Stefánsson and Patrick Sulem at Iceland’s deCODE Genetics searched for the human genetics equivalent of Wald’s missing planes: mutations missing from the population. By combing through the rich sequencing and archival data from the Icelandic population, the researchers identified these missing variants and proved that three of them cause rare genetic diseases, giving patients and their families a long sought-after diagnosis or hope for potential therapeutics (3).

Missing missense mutations

With its remote location and small population, Iceland is uniquely suited to the study of human genetics. Most of the country’s 366,425 inhabitants descended from a small number of ancestors, which led to a higher prevalence of rare genetic variants in the Icelandic population than would occur in larger and more diverse populations.

Because of this, researchers have performed whole genome sequencing on a large portion of the Icelandic population already (4-5). To identify rare genetic variants that cause disease, the team at deCODE Genetics worked with the National Hospital of Iceland to sequence the whole genomes of 764 patients with rare diseases and their families.

“In only about 35% of the cases, we have a solution to the problem,” said Stefánsson. “Then you begin to look at the remaining 65% and begin to ask the question, is it possible among the 65% of cases there are pathogenic variants that have yet to be discovered?”

The easiest disease-causing genetic variants to find are the dominant ones. A child only needs to inherit one disease allele from either their mother or father to present with the disease. Recessive genetic disorders are rarer and more difficult to identify because they require that both the mother and the father give the child the same rare variant.

Stefánsson and Sulem reasoned that if inheriting two copies of a disease allele was so deleterious that the child died during embryonic development or soon after birth, there would be fewer people with both of those alleles present in the population compared to the expected frequency.
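The "expected frequency" in this reasoning can be illustrated with a back-of-the-envelope calculation. The sketch below assumes random mating (Hardy-Weinberg proportions), which is a textbook simplification and not necessarily the model the deCODE team used; the function name, allele frequency, and cohort size are hypothetical values chosen for illustration.

```python
def expected_homozygotes(allele_frequency: float, population_size: int) -> float:
    """Expected number of people carrying two copies of a variant.

    Assumes Hardy-Weinberg proportions: the chance of inheriting the
    variant from both parents is allele_frequency squared.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    if population_size < 0:
        raise ValueError("population_size must be non-negative")
    return population_size * allele_frequency ** 2


# Hypothetical example: a variant found on 1 in 200 chromosomes,
# in a cohort of 150,000 sequenced people.
expected = expected_homozygotes(allele_frequency=0.005, population_size=150_000)
print(f"Expected homozygotes: {expected:.2f}")
```

Under these assumed numbers, a handful of people would be expected to carry two copies, so finding none at all would hint that the homozygous state is rarely compatible with survival.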

Using this approach in a 2015 study, the deCODE Genetics team identified multiple new recessive disease alleles that caused previously undiagnosed rare genetic diseases (6). In this case, they specifically looked for recessive mutations that resulted in the loss of a particular protein.

But not all mutations lead to the complete loss of protein. In the case of missense mutations — where a single amino acid in a protein gets swapped for a different one — the cell still makes a protein, but it makes the wrong protein. Missense mutations don’t always lead to disease, but in rare cases they can.

“For missense mutations, the impact would really depend on the specific genetic change, so that makes everything much, much harder to detect and prove that these missense mutations are really causing a novel genetic recessive disorder,” said Siddharth Banka, a clinical geneticist at the University of Manchester who was not involved in the new study.

With their extensively sequenced and isolated Icelandic population, the deCODE researchers were up for the challenge.

From Iceland to California, the mystery of CPSF3

Armed with sequencing data from 153,054 Icelanders, Sulem and Stefánsson defined a potential missense disease variant as one where they expected at least three people to have both copies of the variant but instead found none. They identified 114 missense variants, 34 of which corresponded to known recessive genetic diseases.
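To get a feel for why "expected at least three, found none" is a meaningful cutoff, one can ask how often zero homozygotes would appear by chance alone. The sketch below assumes the homozygote count is roughly Poisson-distributed around its expectation. That is an illustrative simplification, not the statistical procedure described in the study, and the helper names are hypothetical.

```python
import math


def prob_zero_homozygotes(expected_count: float) -> float:
    """Chance of observing zero homozygotes if the count is Poisson
    with the given mean: P(0) = exp(-expected_count)."""
    if expected_count < 0:
        raise ValueError("expected_count must be non-negative")
    return math.exp(-expected_count)


def is_candidate(expected_count: float, observed_count: int,
                 min_expected: float = 3.0) -> bool:
    """Apply the article's rule: at least `min_expected` homozygotes
    expected, but none observed."""
    return expected_count >= min_expected and observed_count == 0


# Expected counts of 3, 6, and 10 are the thresholds quoted in this article.
for expected in (3, 6, 10):
    p_zero = prob_zero_homozygotes(expected)
    print(f"expected={expected:>2}  P(zero by chance)={p_zero:.5f}  "
          f"candidate={is_candidate(expected, observed_count=0)}")
```

Under this simplified model, the larger the expected count, the less plausible it is that a complete absence of homozygotes is a coincidence.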

“If we had not found something, we would have been really surprised because logic tells us that we have to find variants in this way, and we did,” said Stefánsson.

When Stefánsson and Sulem compared their list of missing missense variants against the National Hospital of Iceland’s rare disease genomic database, they found multiple patients who had inherited two copies of their newly identified missense variants.

The researchers were particularly surprised to find one of the missense disease variants in the gene CPSF3, which is involved in processing mRNA and transporting it out of the nucleus (7). Scientists had never identified genetic variants of CPSF3 that caused a disease before this study.

A photograph of Patrick Sulem and Kári Stefánsson from deCODE Genetics standing side-by-side.
Patrick Sulem (left) and Kári Stefánsson (right) lead the genomics research program at deCODE Genetics.
Credit: deCODE Genetics

Sulem and Stefánsson found two distantly related patients in their clinical database with the same missense mutations in CPSF3 who both had an intellectual disability and microcephaly among other similar features. When they compared the whole genome sequences of both patients, they could find no other genetic explanation for the patients’ disease symptoms.

To find additional patients with missense alleles in CPSF3, they searched their deCODE Genetics genealogical database, which includes relatedness information for almost all Icelanders from the last 100 years (4). They identified three couples where both members carried one missense allele of CPSF3. Of the ten children born to these three couples, four died before age eight and had features similar to the first two patients that Sulem and Stefánsson identified.

The researchers obtained tissue samples from two of the children who had died because Icelandic hospitals have kept a tissue archive of all autopsies and biopsies since 1950. Sulem and Stefánsson found that both children, who happened to be related to one of the patients identified in deCODE Genetics’ rare disease database, expressed two missense copies of the CPSF3 gene.

To definitively prove that CPSF3 caused this rare genetic disorder, Sulem and Stefánsson needed to identify the mutation in a genetically unrelated patient. They uploaded the CPSF3 genetic variant information to the online service GeneMatcher, which helps clinicians and researchers working on the same gene find each other and share information. Very quickly, they found two patients at Children’s Hospital of Orange County (CHOC) in Southern California who had two copies of a different missense variant in CPSF3 but who both had very similar disease features.

“We were very fortunate that the initial contact was very quick after we found this match, and the communication has been very easy within the collaboration,” said Rebekah Barrick, a clinical genetic counselor at CHOC and an author of the study. The deCODE Genetics and CHOC researchers found no other genetic explanation for these two patients’ disease features, which led them to conclude that this missense mutation in CPSF3 caused their disease. Barrick and her team were excited to tell the patients’ family that they had at last found the gene responsible for the disease.

“We've followed them for many years, always looking for answers using available testing and technology. But finally, coming to what we think is the answer for them — they were excited. They have a name or at least a gene to understand what's been going on,” Barrick said.

Identical GNE variants are worse together

Researchers have known since 2001 that recessive missense mutations in the gene GNE cause the rare, muscle-wasting disease GNE myopathy, which typically manifests between ages 20 and 40 (8). While people with GNE myopathy have two missense alleles that cause their disease, these alleles always have mutations in different places in the GNE gene.

When Stefánsson and Sulem discovered that there was no one in the Icelandic population with two copies of the same specific missense mutation in the GNE gene, they were puzzled. They had been working with a couple whose daughter had died soon after birth. Both parents expressed one copy of the same GNE missense allele, and through post-mortem sequencing of tissue from their daughter, Sulem and Stefánsson confirmed that she carried both copies of the GNE missense allele.

“We went back to the clinician, and they said, ‘that doesn't fit because we're expecting something happening in the second decade of life.’ They were not necessarily expecting something that early and that drastic, so we had to do more, get confidence, and convince them that there was this deficit in the population,” said Sulem.

Using their missing mutations approach, Sulem and Stefánsson expected that if the missense mutation was not causing a disease, there should be at least six people with two copies of it in the Icelandic population, but there were none. At the time they investigated this, the daughter’s mother was pregnant with another child. During the mother’s 12-week ultrasound, doctors noticed thickening around the neck of the child, potentially suggesting GNE myopathy. Sulem and Stefánsson sequenced a sample from the developing fetus and confirmed that the fetus also expressed two copies of the same GNE missense mutation as their sister.

“Sometimes this mutation was seen, but together with probably a milder version. It's a combination now of the two” identical missense alleles that cause this more severe version of the disease, Sulem said. “If you think of a continuum, we're probably reaching one end.”

Missing births with GLE1 mutations

Encouraged by their identification of missense disease alleles in CPSF3 and GNE, Stefánsson and Sulem searched for other missing missense disease alleles. They noticed that no one harbored two copies of a specific mutation in the gene GLE1 when at least ten were expected in the population.

Similar to GNE, researchers had shown that when the GLE1 missense allele that Sulem and Stefánsson found was expressed with a different GLE1 missense mutation, it caused death right before or soon after birth (9).

To determine why there were no people with two copies of the same GLE1 missense allele, Sulem and Stefánsson identified 17 couples who each carried one copy of the GLE1 allele. To their surprise, Sulem and Stefánsson found that none of the couples had lost a child soon after birth.

Stefánsson cautioned, “When you're working with extremely rare phenomena, you have to live with the fact that absence of evidence is not evidence of absence.”

They hypothesized that having two copies of this missense mutation caused the fetus to die early during development, likely during the first trimester of pregnancy. In fact, in interviews with the couples and in reviewing their medical records specifically for a note of early miscarriage, Sulem and Stefánsson learned that more than 60% of the women in these couples reported having a miscarriage between 5 and 8 weeks of pregnancy, compared to the 12-24% rate of miscarriage at this timepoint in the general population (10).

While they did not have fetal samples to sequence, Sulem and Stefánsson hypothesized that expressing two of these identical missense mutations leads to very early spontaneous abortions.

“It really shows the power of what can be achieved if [there is] very good coverage of genotyping across a given population,” said Banka. “This approach not only identifies new genetic disorders, but also enables [them] to identify genetic disorders, which in any other way would be probably very challenging to identify.”

Banka was impressed with the thoroughness of the deCODE Genetics team’s study and the number of steps the researchers took to prove the causality of each missense variant they found. As a next step, he is interested in learning more about the function of the missense mutations they found, for example, by assessing the severity of the mutations in patient cells or animal models.

With their team at deCODE Genetics, Sulem and Stefánsson are working on a follow-up study to identify more missing mutations in an even larger population.

“This is a method that can be applied to figuring out the causes of diseases of early childhood, for example, [and] understanding why spontaneous abortions occur,” said Stefánsson.

With a better understanding of the genetic mechanisms that drive some of these diseases, researchers can develop new treatments for them, especially for the diseases that manifest in early childhood. Depending on the underlying disease mechanism, medication may already exist that can be repurposed for a particular genetic disease.

While many genetic diseases have no cure yet, a diagnosis allows patients and their families relief from going through more diagnostic testing. A diagnosis may also help parents with family planning for future pregnancies and may allow for additional social support for children living with rare genetic disorders.

Banka added, “Even if none of this was possible, that it doesn't alter your reproductive choices, it doesn't alter your management, it doesn't alter the ease of accessibility to help — even if all of that was not possible, having an understanding as to what is the reason for someone's medical problems can be quite therapeutic in itself.”

References

  1. Mangel, M. & Samaniego, F.J. Abraham Wald's Work on Aircraft Survivability. Journal of the American Statistical Association 79, 259-267 (1984).
  2. Wallis, W.A. The Statistical Research Group, 1942-1945: Rejoinder. Journal of the American Statistical Association 75, 334-335 (1980).
  3. Arnadottir, G.A. et al. Population-level deficit of homozygosity unveils CPSF3 as an intellectual disability syndrome gene. Nat Commun 13, 705 (2022).
  4. Gudbjartsson, D. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat Genet 47, 435-444 (2015).
  5. Jónsson, H. et al. Whole genome characterization of sequence diversity of 15,220 Icelanders. Sci Data 4, 170115 (2017).
  6. Sulem, P. et al. Identification of a large set of rare complete human knockouts. Nat Genet 47, 448-452 (2015).
  7. Dominski, Z. et al. The Polyadenylation Factor CPSF-73 Is Involved in Histone-Pre-mRNA Processing. Cell 123, 37-48 (2005).
  8. Eisenberg, I. et al. The UDP-N-acetylglucosamine 2-epimerase/N-acetylmannosamine kinase gene is mutated in recessive hereditary inclusion body myopathy. Nat Genet 29, 83-87 (2001).
  9. Said, E. et al. Survival beyond the perinatal period expands the phenotypes caused by mutations in GLE1. Am J Med Genet Part A 173A, 3098-3103 (2017).
  10. Jurkovic, D., Overton, C. & Bender-Atik, R. Diagnosis and management of first trimester miscarriage. BMJ 346, f3676 (2013).

About the Author

  • Stephanie DeMarco, PhD

    Stephanie joined Drug Discovery News as an Assistant Editor in 2021. She earned her PhD from the University of California Los Angeles in 2019 and has written for Discover Magazine, Quanta Magazine, and the Los Angeles Times. As an assistant editor at DDN, she writes about everything from how microbes influence health to how art can change the brain. When not writing, Stephanie enjoys tap dancing and perfecting her pasta carbonara recipe.

Published In

September 2022
Volume 18 - Issue 9 | September 2022
