Under great strain

OpGen partners with U of Maryland to develop a microbial sequence database

Lloyd Dunlap
GAITHERSBURG, Md.—Using technology that may shed light on the nature of the recent E. coli outbreak in Germany, researchers at the University of Maryland Institute for Genome Sciences (IGS) and OpGen Inc., a whole-genome DNA analysis company, will collaborate to develop a database of high-quality, finished, annotated microbial sequences.

Under a collaboration agreement announced last month, IGS will provide clinically characterized microbial samples and sequencing data from microbial genomics studies, including from the U.S. National Institutes of Health's Human Microbiome Project (HMP) and Genomic Sequencing Center for Infectious Diseases (GSCID). OpGen will provide optical maps and sequence finishing technology.

Sequencing and sequence databases are becoming more important in microbiology research and clinical diagnostics, notes OpGen CEO Doug White, so the accuracy of the sequence data in these databases is essential. While next-generation sequencing technologies have enabled rapid and low-cost access to sequence data, they do not provide insight into microbial genome architecture and often give an incomplete or inaccurate view of the complete microbial genome, he states. OpGen's Argus Optical Mapping System brings high-resolution comparative genomics, whole-genome sequence assembly and strain characterization into the laboratory in a cartridge-based, automated platform. The system allows researchers to investigate microbial structure, function, diversity and genetics without the need for amplification, PCR, cloning, paired-end libraries, pure isolates or genome-specific reagents.

"Inclusion of optical mapping for the characterization of genomes will raise the standard of high-quality genome sequence data and will be of extraordinary value given the unprecedented amount of next-generation sequencing of clinically relevant organisms. We are using this technology for validation of our de novo sequencing projects, and anticipate that these will serve as an extraordinary set of reference organism templates to be used by the large number of resequencing efforts worldwide," comments Dr. Claire Fraser-Liggett, director of IGS and professor of medicine, microbiology and immunology at the University of Maryland School of Medicine.

"We believe that our optical mapping whole genome analysis capabilities and bioinformatics products are a perfect complement to next-generation sequencing for assembly and finishing. We look forward to working closely with IGS and Fraser-Liggett's group as we continue to advance the understanding of clinically relevant microorganisms in disease," says OpGen's Doug White.

Using the Argus system, researchers lyse cells to obtain long DNA fragments (250 kilobases to 1 megabase), which are cut with a restriction enzyme, stained with a fluorescent dye and imaged with a fluorescence microscope. Each fragment carries a distinctive pattern, determined by where the cut sites fall. Finally, the fragments are assembled into a whole-genome map (of E. coli, for example) in only 24 hours.
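The idea that a genome can be characterized by where its cut sites fall can be illustrated with a toy in-silico digest. The sketch below is not OpGen's software or method; it is a minimal Python illustration under stated assumptions: the sequences are invented, the recognition site GGATCC is used only as an example, and the cut is placed at the start of each site for simplicity. It produces the ordered list of fragment lengths between cut sites, which is the kind of pattern an optical map records.

```python
def restriction_map(sequence: str, site: str) -> list[int]:
    """Return the ordered fragment lengths obtained by cutting `sequence`
    at the start of every occurrence of the recognition `site`.

    Simplifying assumption: real enzymes cut at a fixed offset inside or
    near the site; here the cut is placed at the first base of the site.
    """
    seq = sequence.upper()
    site = site.upper()

    # Collect the position of every occurrence of the recognition site.
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)

    # Fragment lengths are the gaps between consecutive boundaries.
    boundaries = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Two invented "isolates": the second carries a 4-base insertion
# between the two cut sites.
SITE = "GGATCC"
isolate_a = "AAAA" + SITE + "TTTTTTTT" + SITE + "CC"
isolate_b = "AAAA" + SITE + "TTTTTTTTTTTT" + SITE + "CC"

map_a = restriction_map(isolate_a, SITE)
map_b = restriction_map(isolate_b, SITE)

# Positions in the two maps where the fragment lengths disagree.
differing = [i for i, (a, b) in enumerate(zip(map_a, map_b)) if a != b]

print(map_a)      # [4, 14, 8]
print(map_b)      # [4, 18, 8]
print(differing)  # [1]
```

In this toy case the insertion shows up as a longer middle fragment. Comparing real maps is harder, since fragment sizes are measured with error and maps must be aligned before they can be compared, so the position-by-position comparison above is only an illustration.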

White says the technology is useful in two areas: quick sequencing and finishing, and strain typing and comparative genomics to detect differences between isolates.

Applications to expand optical mapping technology to large genomes and clinical diagnostics are currently in development, and OpGen has active collaborations underway with a number of other partners, including BGI (formerly the Beijing Genomics Institute), where investigations of plant, animal and human DNA have begun. A successful startup is now moving into beta testing. At the Sanger Institute, parasitic microbes are being sequenced. Both OpGen collaborators have appreciated the technology's speed and accuracy, White says.

Finally, at the University of Münster, Germany, the current E. coli outbreak is being studied and six isolates have been identified for assembly and finishing. Asked if the E. coli strain is showing significant variation from typical outbreaks of the bacterium, White demurs, noting that a journal article is in process and should be published soon. The findings, he implies, will be interesting.
 


IGS says Human Microbiome Project data is available to research community

BALTIMORE, Md.—The Institute for Genome Sciences (IGS) also announced last month that the Human Microbiome Project (HMP) is releasing reads and assembled sequences from whole-metagenome shotgun sequencing of 690 microbiomes and about 72 million reads from targeted 16S sequencing of 5034 microbiomes from healthy human subjects for use by the scientific community via the HMP Data Analysis and Coordination Center (DACC).

The DACC is located at the IGS. The HMP is funded through the National Institutes of Health Common Fund's Roadmap for Medical Research.

Since its launch in 2008, the HMP has generated a huge volume of sequence data, annotations and metadata. The HMP DACC is charged with organizing this information and facilitating its use in analysis by researchers outside the HMP.

Now, on behalf of the HMP, the DACC is releasing both processed reads and assemblies from whole-metagenome shotgun sequencing for an initial group of 690 metagenomes from a subset of more than 17,000 samples collected from 300 healthy human volunteers. Fifteen body habitats are represented in this collection. Assembly was carried out by the HMP Assembly Working Group.

"These data provide the research community with a key resource for their human metagenomic studies, facilitating research into complex human diseases," says Dr. Owen White, director of bioinformatics at the IGS and principal investigator of the DACC.

The DACC is in the process of generating a gene index of all proteins predicted from these assemblies, which will be made available to the public upon completion. In addition, about 72 million reads corresponding to deconvoluted, trimmed 16S sequences from 5034 samples are also available. 16S variable region V3-5 was sequenced for all 5034 samples, with variable regions V1-3 and V6-9 also sequenced for subsets of the samples. Eighteen body sites are represented in this 16S collection.

Sampling of the subjects was done at the Baylor College of Medicine and Washington University; the J. Craig Venter Institute and the Broad Institute carried out sequencing.

The whole-metagenome read and assembly data is available at www.hmpdacc.org/HMASM/. The 16S reads are available at www.hmpdacc.org/HM16STR/.

