Speaking the same molecular language
in the age of complex therapeutics
When molecules outgrow the limits of sketches and strings, researchers need a new way to
describe and communicate them.
Every discipline has its own language for sharing ideas. Musicians
use sheet music to turn sound into symbols any performer can
read. Programmers write code to transform logic into instructions
a computer can execute. In biopharmaceutical research, chemists
and biologists have long relied on sketches and formulas that turn
molecules into lines, letters, and symbols. This approach worked
well when most medicines were relatively simple small molecules.
Today, however, the therapeutic landscape is far more diverse.
Researchers design drugs made of peptides, proteins, oligonucleotides,
and complex conjugates. Conventional notations capture
fragments of these structures but fall short of representing their
full complexity in a way that everyone can understand. HELM —
the hierarchical editing language for macromolecules — steps in
to close this gap, giving scientists a precise shared language for
modern therapeutics.
Why scientists needed a new molecular script
For decades, scientists had reliable tools for describing molecules.
Structural formulas, connection tables, and string-based
notations recorded the essentials of small molecules with precision
and economy (1,2).
As therapeutics grew more complex, these familiar notations
began to show their limits. For antibody–drug conjugates (ADCs),
conventional notations can depict the small-molecule payload and
the linker, but they cannot represent the protein’s full structure, the
precise attachment site, or the connectivity between components.
In practice, researchers describe each fragment separately. The
same challenge arises with cyclic peptides or chemically modified
nucleic acids, where branching, crosslinks, or unnatural monomers
lie beyond the reach of standard representations.
This gap creates more than just an inconvenience for multidisciplinary
teams. Chemists may describe a drug candidate atom by
atom, while biologists think in sequences of amino acids or nucleotides,
and bioinformaticians may treat it as digital sequence data
stripped of chemical context. Without a shared notation, critical
details are easily lost in translation.
Faced with these challenges, pharmaceutical researchers
designed a new way of writing molecules that could keep up with
the expanding universe of therapeutics. In 2012, they introduced
HELM to the scientific community (3). This solution quickly grew
into a shared standard, offering scientists a common script for
describing modern therapeutics in full detail.
How HELM works
Much like a set of LEGO® blocks, HELM breaks down molecules into a
hierarchy of components. At the foundation are monomers — building
blocks such as amino acids and nucleotides. These combine into
polymers, forming peptides, proteins, or strands of nucleic acids.
Connections then define how polymers link together — whether
in linear chains, cyclic loops, or complex branched architectures.
Layer by layer, HELM assembles molecules in a way that preserves
every level of detail.
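
The hierarchy described above can be made concrete with a short sketch. The Python below is a minimal, illustrative parser, not a reference implementation: it assumes the `$`-separated layout of a simple HELM string, reads only the first two sections (polymers and connections), and ignores the rest. The names `Connection` and `parse_helm` are hypothetical, and the oxytocin string is an illustrative example whose monomer IDs (such as `am` for the C-terminal amide) depend on the monomer library in use.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """One bond between two monomers, as written in the connection section."""
    source_polymer: str
    target_polymer: str
    source_position: int    # 1-based monomer index in the source polymer
    source_attachment: str  # attachment point label, e.g. "R3"
    target_position: int
    target_attachment: str


def parse_helm(helm: str):
    """Split a simple HELM string into polymers and connections.

    Simplification: assumes no '$', '|', '.' or ',' inside monomer IDs
    (so no inline SMILES) and ignores every section after the connections.
    """
    sections = helm.split("$")

    # Section 1: simple polymers, e.g. PEPTIDE1{C.Y.I}, separated by '|'.
    polymers = {}
    for chunk in sections[0].split("|"):
        polymer_id, _, body = chunk.partition("{")
        polymers[polymer_id] = body.rstrip("}").split(".")

    # Section 2: connections, e.g. PEPTIDE1,PEPTIDE1,1:R3-6:R3.
    connections = []
    if len(sections) > 1 and sections[1]:
        for chunk in sections[1].split("|"):
            source, target, bond = chunk.split(",")
            left, right = bond.split("-")
            source_pos, source_att = left.split(":")
            target_pos, target_att = right.split(":")
            connections.append(
                Connection(source, target, int(source_pos), source_att,
                           int(target_pos), target_att)
            )
    return polymers, connections


# Illustrative string for oxytocin; monomer IDs depend on the monomer library.
OXYTOCIN = "PEPTIDE1{C.Y.I.Q.N.C.P.L.G.[am]}$PEPTIDE1,PEPTIDE1,1:R3-6:R3$$$"

polymers, connections = parse_helm(OXYTOCIN)
print(polymers["PEPTIDE1"])  # the monomers, in chain order
print(connections[0])        # the disulfide bridge between positions 1 and 6
```

The three levels appear directly in the result: individual monomers, the polymer that orders them, and the connection that closes the ring.
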
This framework makes HELM flexible enough to represent not
just natural biomolecules but also modified or synthetic ones. A
cyclic peptide can be represented with a closing connection that
captures its ring structure. A small interfering RNA (siRNA) can be
written as two complementary RNA chains, each annotated with
backbone or sugar substitutions at specific positions, together with
the hydrogen bonds that pair the two strands. An ADC can be described
in its entirety by defining the protein backbone, the small-molecule
payload, and the chemical linker, including the precise attachment
points that connect them (3).
Various representations of a single monomer (cysteine) and of the cyclic peptide oxytocin. The HELM representation can depict structural elements that the FASTA notation
cannot show (cyclization, interconnection, non-natural monomers, etc.).
CREDIT: REVVITY SIGNALS
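
As a rough illustration of how a closing connection or an attachment point is written down, the sketch below assembles HELM-style strings from monomer lists. It is a simplified sketch under stated assumptions: the helper names `polymer`, `connection`, and `to_helm` are hypothetical, the attachment labels (R1 for the N-terminus, R3 for a side chain) follow the usual amino-acid convention, and `hypothetical_payload` is a made-up monomer ID standing in for a linker-payload entry in a monomer library.

```python
def polymer(polymer_id, monomers):
    """Serialize one simple polymer; multi-character monomer IDs get brackets."""
    body = ".".join(m if len(m) == 1 else f"[{m}]" for m in monomers)
    return f"{polymer_id}{{{body}}}"


def connection(source, source_pos, source_r, target, target_pos, target_r):
    """Serialize one bond between two attachment points (1-based positions)."""
    return f"{source},{target},{source_pos}:{source_r}-{target_pos}:{target_r}"


def to_helm(polymers, connections=()):
    """Join polymers and connections; the remaining sections are left empty."""
    return f"{'|'.join(polymers)}${'|'.join(connections)}$$$"


oxytocin_monomers = ["C", "Y", "I", "Q", "N", "C", "P", "L", "G", "am"]

# A plain one-letter sequence keeps the order of residues but drops the
# disulfide bridge and the C-terminal amide.
sequence_only = "".join(m for m in oxytocin_monomers if len(m) == 1)

# The same chain without and with the ring-closing connection (Cys1 to Cys6).
linear = to_helm([polymer("PEPTIDE1", oxytocin_monomers)])
cyclic = to_helm(
    [polymer("PEPTIDE1", oxytocin_monomers)],
    [connection("PEPTIDE1", 1, "R3", "PEPTIDE1", 6, "R3")],
)

# A toy conjugate: a tripeptide whose cysteine side chain carries a payload.
# "hypothetical_payload" is NOT a real monomer ID; it is a placeholder.
conjugate = to_helm(
    [polymer("PEPTIDE1", ["A", "C", "K"]),
     polymer("CHEM1", ["hypothetical_payload"])],
    [connection("PEPTIDE1", 2, "R3", "CHEM1", 1, "R1")],
)

print(sequence_only)
print(linear)
print(cyclic)
print(conjugate)
```

The only difference between the linear and cyclic records is the connection entry, which is exactly the information a plain sequence string has no place for.
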
HELM was also designed for dual readability. Scientists can
interpret the notation to understand the logical structure of a
molecule, while bioinformatics tools can process the same record
for visualization, analysis, or database storage, making HELM a
universal script for complex therapeutics.
Bringing HELM to life in everyday research
HELM is most effective when it becomes part of a researcher’s
daily workflow. Tools like ChemDraw™ make this transition easy
(4). Instead of learning a new system from scratch, scientists
can work in the same environment they already use for small
molecules, now extended to macromolecules.
Signals ChemDraw includes a dedicated HELM editor, complete
with curated libraries of amino acids, nucleotides, sugars,
and common chemical modifications. Researchers can drag
and drop monomers, assemble them into polymers, and define
linkages or crosslinks graphically. Behind the scenes, ChemDraw
automatically generates the HELM notation and creates
a precise digital record.
The integration works both ways. An existing HELM string can
be imported to produce an editable molecular diagram, while a
peptide, siRNA, or antisense oligonucleotide drawn in ChemDraw
can be exported as HELM notation for use in databases or electronic
lab notebooks. This bidirectional flow allows molecules to
move seamlessly between graphical sketches and machine-readable
records with accuracy and consistency. Signals One™ further
extends this capability by allowing scientists in multidisciplinary
teams to store, search, and analyze drug candidates within a
shared environment.
In an era defined by complex therapeutics and cross-disciplinary
collaboration, HELM enables discovery teams to speak
the same molecular language. Embedded into everyday tools
like ChemDraw, it ensures innovations move forward clearly and
consistently, from the first sketch to the finished drug.
References
1. Nguyen-Vo, T.-H., Teesdale-Spittle, P., Harvey, J. E. & Nguyen, B. P. Molecular representations
in bio-cheminformatics. Memetic Comp. 16, 519–536 (2024).
2. Evans, D. A. History of the Harvard ChemDraw Project. Angewandte Chemie
International Edition 53, 11140–11145 (2014).
3. Zhang, T., Li, H., Xi, H., Stanton, R. V. & Rotstein, S. H. HELM: A Hierarchical Notation
Language for Complex Biomolecule Structure Representation. J. Chem. Inf.
Model. 52, 2796–2806 (2012).
4. ChemDraw | Revvity Signals Software at <https://revvitysignals.com/products/research/chemdraw>
HELM and ChemDraw representations of the antisense oligonucleotide drug mipomersen (used to treat homozygous familial hypercholesterolemia, now withdrawn). Non-natural
molecular features are highlighted in green.
CREDIT: REVVITY SIGNALS