Entering the proteome: GenePattern reaches past genomics

Scientists at the Broad Institute release GenePattern 2.0, which includes components that permit analysis of proteomic data.

Jeffrey Bouley
CAMBRIDGE, Mass.—Scientists at the Broad Institute of MIT and Harvard recently released GenePattern 2.0, an enhanced version of the integrative software tool for analyzing gene expression data. But more than being a simple update of this tool, this new version of GenePattern for the first time includes components that permit analysis of proteomic data and improve its ability to capture and recall individual steps in the analytic process.
 
"The strengths that GenePattern brings to gene expression analysis can now be similarly realized for proteomic data," says Michael M. Reich, manager of cancer informatics development at the Broad Institute and the group leader for GenePattern. "We have also added features to improve its ability to capture and reproduce analyses, which is vital to researchers both individually and as a community."
 
These functions are important to drug discovery operations, he says, but one cannot discount the fact that the software is also free, which in itself is a potential cost-saver for drug discovery researchers. Currently, there are over 2,300 registered GenePattern users worldwide, including more than 500 institutions and 30 pharmaceutical and biotech companies.
 
"While gene expression analysis was the first foundation area for this software, even from the beginning we designed the infrastructure to be as flexible as possible and as comprehensive as possible down the line," notes Jill P. Mesirov, CIO and director of computational biology and bioinformatics for the Broad Institute.
 
The early focus on gene expression tools reflected the fact that gene expression was the primary data type being analyzed, both at the Broad Institute and by many of the software's potential users. But to meet the needs of drug discovery and drug development researchers, the functionality of the software has broadened over the years.
 
"Our vision is really to make a much more comprehensive computational biology environment that can handle lots of different modalities of data," Mesirov says.
 
GenePattern also reportedly provides multiple user interfaces to accommodate the needs of researchers with programming experience as well as those without such experience.
 
"We did this for ease of functionality and sharing of data, and also because this dual nature of researchers—programmers and non-programmers—really mirrors the community we have here at the Broad," Mesirov says.
 
Going forward, one of the major challenges for GenePattern is dealing with the large size of the data sets being analyzed, Reich says. Future versions of the software will need not only to display data sets that can easily reach the gigabyte range but also to transfer those large data sets from one location to another quickly and accurately.
 
Future enhancements to GenePattern will include tools for analyzing SNP data, such as copy number estimation, loss of heterozygosity determination, and the identification of chromosomal amplifications and deletions.

