The idea is to combine the analytical capabilities of both companies to efficiently extract complex life science knowledge in a computable, structured, biological expression language format that can be used to interpret large-scale experimental data in the context of published literature.
That format is BEL, a structured language designed to represent scientific findings in a computable form with supporting contextual information, such as tissue, disease, species and publication. BEL is use-neutral, notes Dr. David Milward, chief technology officer at Linguamatics, and it articulates an idea in a manner that is "unambiguous, terse and conveys the facts and associated contexts without loss or ambiguity." BEL, along with the BEL Framework, is available through a portal to the scientific community to promote the collection, sharing and interchange of structured scientific knowledge. Selventa's discovery platform operates on top of a scientific knowledge base made up of a set of BEL statements.
"The Selventa and Linguamatics collaboration shows how precise, detailed information can be automatically extracted from the literature and provided in a format suitable for further analysis and reasoning," says Milward. "This will allow reuse of knowledge from the literature, at greater scale and speed."
Much of the business and the future of Selventa are tied into biomarker discovery and personalized healthcare, David de Graaf, CEO of Selventa, tells ddn.
"The way we get there is having qualified knowledge available to us and to other users and comparing it to patient data sets. It's a matter of pulling together prior knowledge in a usable manner with the analytics on top," he says.
He notes that well-structured knowledge is already being customized within the scientific realm by organizations that are able to generate knowledge bases in specific areas, but in many cases they are using resources in China and India, "and we can't directly compete with that," he adds. "But well-quantified and organized knowledge is something our clients need and that drew us to Linguamatics so that this general knowledge out there could be put in a more terse and usable form."
By using NLP-based capabilities to efficiently identify and extract relationships hidden in unstructured text and generate structured data for comprehensive biological investigation and analysis, the I2E platform is said to offer dramatically increased speed, scale and reproducibility, and the possibility to efficiently go back into a textual data source to pull out additional information that has become relevant.
"This partnership is a great strategic fit to facilitate the representation of complex biological knowledge that can be recycled and maximized through our analytical platform," said de Graaf in the news release about the deal. "Collaborating with Linguamatics will enable rapid yet comprehensive investigation of new areas of biology by extracting computable knowledge from unstructured text. This will lead to innovation on many fronts, such as next-generation sequencing, where well-structured information for reasoning has been limited."
One of the primary goals for Selventa in this partnership is to be able to stratify patients using biomarkers.
"We're doing a lot of this kind of work through the BEL initiative with Pfizer and along with Linguamatics as well," de Graaf says. He tells ddn that his company has also talked to other people in the knowledge space, whether publishers or other makers of knowledge bases, to get the best data possible.
"When customers acquire their data sets they often acquire broad assets that are shallow. They cover a lot of territory, like everything about clinical trials or a particular kind of chemistry, but what they don't do is get what they often really need, which might be everything relevant in a particular area, like multiple sclerosis or breast cancer," de Graaf says. "So you want to go relatively narrow but much deeper by integrating resources, and this is where Linguamatics helps us meet customers' needs. We have a set of analytic tools that analyze prior experimental data and compare to your current set of experiments, and Linguamatics provides us with a platform for generating well-quantified knowledge from that."
Having worked at companies like AstraZeneca and Boehringer Ingelheim, de Graaf had previously been involved in the evaluation and acquisition of NLP tools, and he came into Selventa already knowing folks at Linguamatics. "As far as I'm concerned, they are a premier provider of NLP solutions," he says. "Unlike with other NLP platforms, where they are not expandable, Linguamatics meets our needs because it's flexible. We knew our platform would require lots of tweaking, and knocking on their door to get technology that could handle that was just logical."
Looking toward the future, de Graaf says he plans to work with the "best of the best" to implement not only a set of tools to feed into BEL but also to discover and analyze biomarkers, better stratify patients and more.