Blueprint debuts data curation
SINGAPORE—The Blueprint Initiative Asia recently introduced directed data curation services that are designed to deliver to clients defined biomolecular interaction data to facilitate scientific research efforts. These curation services are expected to help scientists identify important biological molecules and complexes, such as drug targets, diagnostic biomarkers, and metabolic processes critical to healthcare.
Blueprint Asia, an affiliate of Toronto-based Blueprint North America and Mt. Sinai Hospital, is supported by Singapore's Economic Development Board and Sun Microsystems. The Blueprint Initiative began in 1999 to develop and populate the Biomolecular Interaction Network Database (BIND).
BIND comprises more than 177,000 protein-protein, protein-RNA, small-molecule and genetic interaction records. Blueprint curators will work with customers to identify, annotate, and cross-reference molecular interaction information from the peer-reviewed literature as it relates to a specific disease state or biological condition. The launch brings Singapore-based Blueprint Asia, which began operations in spring 2004, up to date with Blueprint North America's curation services, which began in late 2002 or early 2003.
BIND was developed along the same lines as GenBank, a database of nucleotide sequences maintained by the National Institutes of Health, says Eric Andrade, managing director–global for The Blueprint Initiative. "For a while, there were 40 or 50 mom-and-pop databases with particular organisms or mechanisms for protein, small molecule and genetic interactions," he says, "and BIND, like GenBank, was an attempt to cover the waterfront."
Blueprint's Asia-based curation services, like those in North America, are intended both to give customers a proprietary leg up on their competition and to improve the population of BIND with data.
As Andrade notes, early efforts to populate BIND focused on recently published journal data, which meant that "we sort of left behind the volume of historical information, and the effort to get that information into BIND is called backfilling."
"The key to BIND's success has been not just assembling scientifically relevant biomolecular data, but also serving up that data to biologists in an easy-to-use, readily compatible, consistent format," says Dr. Christopher Hogue, Blueprint's principal investigator. "In identifying this information, however, we have traditionally focused more on its content than its context."
To ensure curators keep up with both the newest and the historical information, and provide it in a more relevant manner, Blueprint now also focuses on particular diseases or mechanisms of interest to customers and ultimately uses that data to populate BIND.
"Those customers come to us to assemble for them historical interaction information with regard to, say, diabetes or prion-related data or something else," Andrade says. "We assemble a team of data curators who are peeled out from the general BIND efforts and dedicated to these customer efforts for six months to a year. The customer can then view that information for, say, six months on a privileged basis behind their own firewall—a very economical way for them to get targeted data—before the data is moved into the public version."
"There are a number of databases out there, but one of the hallmarks of what is done at Blueprint is that all the curators are master's and PhD folks who know the information that is critical to researchers doing experimental work," Hogue says. "In a lot of databases out there, it's simply data mining from the literature. We actually understand the science and biology behind what's being done, and that makes for better-quality data."
Although text mining is a valuable part of the efforts for Blueprint Asia and Blueprint North America, it cannot decipher figures, points out Brian Yates, managing director of Blueprint Asia.
"Each record is hand-curated and annotated by two curators to ensure that spurious information doesn't find its way into BIND records," he says.
Although Blueprint Asia's curation efforts will focus on the needs of Asian companies and researchers, the data goes into the same common BIND database as information generated through Blueprint North America.