As any scientist stepping into a new field knows, catching up on what can be decades' worth of literature is a daunting task. Chasing down protocols from methods sections that cite progressively older papers is an art in itself. I learned very quickly in my first graduate school rotation that if I tried to print out every journal article I wanted to read, I was going to run out of printer paper, fast.
It doesn’t help that over the years, the number of scientific studies published has grown considerably. In 2022, researchers published 5.4 million academic papers compared to just 4.18 million in 2018 — an increase of more than 20 percent in just five years. As more journal articles get added to the public record, it can be hard to keep up with all the advances.
Sometimes, new findings don’t even make it into published papers. Investor support could move to a different research area, or a student could graduate with an unfinished project, leaving potentially important results sitting in spreadsheets and databases. In this way, promising discoveries can hang in a kind of scientific stasis, making it very easy for people to forget about them or even miss them altogether.
I was thinking about this idea of scientific memory loss as I put together my recent DDN Dialogues podcast episode about researchers who search for forgotten antibiotics in medieval and early modern medical books. Forget about digging into papers from the past 30 years; they went back more than 1,000 years. Their work uncovered a historical remedy for an eye stye that killed modern-day bacteria, and it revealed the biofilm-killing power of oxymel, an ancient combination of honey and pomegranate vinegar. They showed just how important it is to look back at older work to see what we might have missed.
While it’s not feasible to pore over millions of papers and historical texts manually, new tools incorporating artificial intelligence and machine learning can make that process much faster. In fact, the same researchers testing medieval remedies recently used data mining and network analysis to screen a 15th-century text and match medicinal ingredients with microbial infection symptoms. Now, machine learning models trained on data that scientists have already collected can predict molecules that will have beneficial drug properties, suggest already-approved drugs for repurposing, or even create brand new ones.
With these new tools, it will be easier to uncover discoveries that may have gotten buried or forgotten about over time. Languishing data sets could get a new analysis, and passed-over pipelines could get a second look. It’s exciting to see that even as research pushes forward, cutting-edge technologies are giving scientists a way to look back and find something new.
Reference
- Connelly, E., del Genio, C.I., and Harrison, F. Data Mining a Medieval Medical Text Reveals Patterns in Ingredient Choice That Reflect Biological Activity against Infectious Agents. mBio 11, e03136-19 (2020).