For example, one of the best-known rules is Lipinski's Rule of Five (RoF), which is based on four easily calculated properties:
- Molecular weight (MW) < 500
- Logarithm of the octanol:water partition coefficient (logP) < 5
- Number of hydrogen bond donors (HBD) < 5
- Number of hydrogen bond acceptors (HBA) < 10
However, it is important to note that this rule was defined for compounds absorbed through the human intestine, and it was never intended as a general definition of drug-likeness. The rules for compounds intended for other routes such as inhalation, topical or intravenous administration might be quite different.
Many other, similar rules have been proposed that define drug-like properties. For example, Veber, et al. (Journal of Medicinal Chemistry, 2002) found that the majority of compounds with good oral bioavailability in rats had fewer than 10 rotatable bonds (ROTB) and polar surface area (PSA) less than 140 Å². Others have explored different drug discovery objectives: Hughes, et al. (Bioorganic and Medicinal Chemistry, 2008) found that compounds with logP less than 3 and PSA greater than 75 Å² were six times less likely to exhibit adverse events in in vivo tolerance studies than compounds failing to meet both of these criteria; Ritchie, et al. (Drug Discovery Today, 2009) looked at several "developability" requirements such as solubility, serum albumin binding and inhibition of the hERG ion channel and found relationships with the number of aromatic rings (AROM) in a compound, suggesting that AROM greater than 3 significantly increases the risk of compound attrition; and finally, Lovering, et al. (Journal of Medicinal Chemistry, 2009) related the "flatness" of compounds, as defined by the fraction of carbons that are sp3 hybridized (fSP3), to their success in clinical development.
These rules for drug-like properties are appealing because they are very simple to apply. The properties to which they relate are easily calculated, and it is easy to see when a compound meets the criteria in each case. They provide guidance regarding potential risk factors and indicate strategies for improvement, if an issue should be encountered.
However, due to the apparent simplicity of these rules, there may be a tendency to apply them as filters or hard cutoffs when selecting compounds. This is a risky approach because the simple compound properties on which the rules depend have only a weak correlation with the objectives to which they relate, such as oral bioavailability or toxicity.
For example, does a compound with a logP of 5.1 have a significantly lower chance of success than one with a logP of 4.9? By rigidly applying these criteria, a project team runs the risk of rejecting good compounds that represent potentially valuable opportunities. The RoF states that 80 percent of compounds with good oral absorption fail no more than one of the four criteria noted above, i.e., failing only one criterion should not be enough to reject a compound. Of course, this also means that 20 percent of orally absorbed drugs fail even this more relaxed definition. Similarly, the majority of non-oral drugs also meet the RoF. Therefore, the RoF helps to improve the odds of success when looking for an orally absorbed compound, but is far from a guarantee of success.
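The "fail no more than one criterion" reading can be sketched in a few lines of Python. This is only an illustration: the `Compound` class, the function name and the example property values are hypothetical, and the thresholds are simply the four listed above.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical container for the four RoF properties."""
    name: str
    mw: float    # molecular weight
    logp: float  # octanol:water partition coefficient (log scale)
    hbd: int     # hydrogen bond donors
    hba: int     # hydrogen bond acceptors


def rof_failures(c: Compound) -> list[str]:
    """Return the names of the RoF criteria (as listed above) that c fails."""
    limits = [("MW", c.mw, 500), ("logP", c.logp, 5),
              ("HBD", c.hbd, 5), ("HBA", c.hba, 10)]
    return [name for name, value, limit in limits if not value < limit]


# Made-up compound that sits just above the logP threshold.
example = Compound("example", mw=450.0, logp=5.1, hbd=2, hba=6)
failures = rof_failures(example)

# Relaxed reading: one failure alone is not grounds for rejection.
acceptable = len(failures) <= 1
print(failures, acceptable)
```

Reporting which criteria fail, rather than a bare pass or fail, keeps the rule in its intended role as guidance about risk factors.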
The dangers of hard filters are further exacerbated by the fact that some of the properties on which these rules are based have significant uncertainty; for example, a prediction from a good model of logP has an uncertainty of approximately 0.4 log units. Therefore, drawing conclusions based on differences of less than this range would be inappropriate.
Moving away from hard cutoffs
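To make the effect of this uncertainty concrete, the sketch below estimates the probability that a compound's true logP lies below the cutoff of 5, given its predicted value. It assumes that the prediction error is normally distributed and that the quoted 0.4 log units is its standard deviation; both are assumptions made here for illustration, and the function name is hypothetical.

```python
import math


def prob_below(predicted: float, threshold: float = 5.0, sigma: float = 0.4) -> float:
    """P(true value < threshold), assuming a Gaussian error with std dev sigma."""
    z = (threshold - predicted) / sigma
    # Standard normal cumulative distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for logp in (4.9, 5.1):
    print(f"predicted logP {logp}: P(true logP < 5) = {prob_below(logp):.2f}")
```

Under these assumptions the two compounds have roughly a 60 percent and a 40 percent chance, respectively, of truly meeting the criterion, so a hard cutoff that keeps one and rejects the other draws a much sharper line than the data support.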
Instead of applying hard cutoffs, it would be better to rank compounds according to the similarity of their properties to the majority of drugs. In a recent paper, Bickerton, et al. (Nature Chemistry, 2012) proposed a new metric, the Quantitative Estimate of Drug-likeness, or QED. To derive this, the authors generated "desirability functions" for eight properties commonly used to define drug-likeness: MW, logP, HBA, HBD, PSA, AROM, ROTB and ALERTS (the number of matches to undesirable functionalities). A desirability function maps the value of a property onto a scale between 0 and 1, where a desirability of 1 indicates an ideal value of the property and a desirability of 0 corresponds to a completely unacceptable outcome. The desirability functions used in QED were fitted to the distributions of the eight properties for a set of oral drugs, so that a desirability of 1 was assigned to the property values of oral drugs that occur most commonly, and 0 to property values that are not observed. The QED for a new compound can then be calculated from the desirability of the eight properties by taking their geometric mean, giving an overall value between 0 and 1 that indicates the similarity of the compound to the majority of oral drugs. Using this, compounds can be ranked according to their drug-likeness, instead of simply being labeled drug-like or non-drug-like.
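The general idea of combining desirabilities by a geometric mean can be sketched as follows. This is not the published QED: the bell-shaped desirability function, the three-property profile and its numeric parameters are placeholders invented for illustration, not the functions fitted in the QED paper.

```python
import math


def desirability(value: float, ideal: float, width: float) -> float:
    """Placeholder desirability: 1 at `ideal`, decaying towards 0 away from it."""
    return math.exp(-0.5 * ((value - ideal) / width) ** 2)


def geometric_mean(values: list[float]) -> float:
    """Geometric mean; any zero desirability makes the overall score zero."""
    if any(v <= 0.0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up (ideal, width) pairs for three of the eight properties.
PROFILE = {"MW": (300.0, 100.0), "logP": (2.5, 1.5), "PSA": (70.0, 40.0)}


def drug_likeness(props: dict[str, float]) -> float:
    """Score between 0 and 1 from the placeholder desirability profile."""
    return geometric_mean(
        [desirability(props[name], *PROFILE[name]) for name in PROFILE]
    )


typical = {"MW": 300.0, "logP": 2.5, "PSA": 70.0}
outlier = {"MW": 650.0, "logP": 2.5, "PSA": 70.0}
print(drug_likeness(typical), drug_likeness(outlier))
```

Because the mean is geometric rather than arithmetic, a single very undesirable property pulls the overall score down sharply, yet the result is still a continuous score for ranking rather than a pass or fail label.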
The authors of the QED paper showed that the QED metric correlated well with the subjective opinions of medicinal chemists regarding the attractiveness of compounds for undertaking further chemistry. They also showed that the QED performed well in distinguishing oral drugs from other small-molecule ligands taken from the Protein Data Bank.
Drug-like or likely to be a drug?
QED, like many other definitions of drug-likeness, considers the properties that drugs have in common. The assumption is that a compound with similar properties to successful drugs will have a lower risk than compounds with one or more significantly different properties. This makes intuitive sense, because there is no precedent for the success of a compound with radically different properties. However, having a similar value of a property does not necessarily increase the chance of a compound being a successful drug; if the distribution of a property for successful drugs is the same as that for all compounds that have been explored, the property will not provide any information about the chance of success of a compound. In other words, we would like to identify the properties that make successful drugs different from other compounds and hence increase the chance of success.
To achieve this, the property distributions of successful drugs can be compared with those of unsuccessful compounds explored in the search for a drug. A branch of mathematics called Bayesian probability allows a rigorous comparison of the probability of a drug having a given property value with the probability of an unsuccessful compound having the same value, and hence an estimate of the relative likelihood of success of a compound with that property value.
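One simple way to picture this comparison is as a ratio of two binned frequencies: the fraction of drugs falling in a property range divided by the fraction of non-drugs falling in the same range. The sketch below is only an illustration of that ratio. The bin edges, the add-one smoothing and the molecular weight values are all made up, and nothing here is taken from the analysis behind the results described in this article.

```python
from bisect import bisect_right


def bin_fractions(values: list[float], edges: list[float]) -> list[float]:
    """Smoothed fraction of values in each bin defined by sorted `edges`."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    # Add-one smoothing (an assumption) so empty bins never give a zero ratio.
    return [(c + 1) / (len(values) + len(counts)) for c in counts]


def relative_likelihood(value: float, drugs: list[float],
                        non_drugs: list[float], edges: list[float]) -> float:
    """P(value's bin | drug) / P(value's bin | non-drug)."""
    i = bisect_right(edges, value)
    return bin_fractions(drugs, edges)[i] / bin_fractions(non_drugs, edges)[i]


# Invented molecular weights, for illustration only.
drug_mw = [250, 280, 320, 350, 410, 450, 480, 520]
non_drug_mw = [310, 420, 460, 510, 540, 580, 610, 650]
mw_edges = [300, 500]  # bins: below 300, 300 up to 500, 500 and above

for mw in (250, 400, 600):
    print(mw, round(relative_likelihood(mw, drug_mw, non_drug_mw, mw_edges), 2))
```

A ratio above 1 means the property value is more typical of the drugs than of the non-drugs in the toy data, a ratio below 1 means the opposite, and a ratio near 1 means the property carries little information.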
The relative likelihood can identify the properties that are most important in distinguishing drugs from non-drugs. For example, if oral drugs are compared with other compounds explored in the course of drug discovery projects, then of the eight properties listed above for QED, the properties that best distinguish these sets of compounds are MW, PSA and ROTB. Low values of these properties can increase the likelihood of success by more than a factor of two. Conversely, this analysis indicates that HBA and HBD have a limited impact on the chance of success of a compound as an oral drug.
Furthermore, the relative likelihoods derived from the eight properties may be combined by taking their geometric mean to give a single new metric, the Relative Drug Likelihood (RDL), which can be used to rank compounds in a similar manner to QED. When trained to distinguish oral drugs, as described above, the RDL outperforms rules and metrics based only on similarity of properties.
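Ranking by such a combined score might look like the sketch below. The per-property relative likelihoods are invented numbers, only three properties are shown instead of eight, and the function name is hypothetical; it illustrates the geometric-mean combination described above rather than any particular implementation of RDL.

```python
import math


def combined_likelihood(likelihoods: list[float]) -> float:
    """Geometric mean of per-property relative likelihoods (all must be > 0)."""
    return math.exp(sum(math.log(x) for x in likelihoods) / len(likelihoods))


# Invented relative likelihoods for three properties of three compounds.
compounds = {
    "A": [2.0, 1.5, 1.0],
    "B": [0.5, 1.0, 1.0],
    "C": [1.0, 1.0, 1.0],
}

# Highest combined score first.
ranked = sorted(compounds, key=lambda n: combined_likelihood(compounds[n]),
                reverse=True)
print(ranked)
```

A combined value of 1 indicates a compound whose properties are, on balance, no more typical of drugs than of non-drugs, which gives the ranking a natural reference point.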
Words of caution
These analyses of property trends across a wide range of chemistries can provide some general guidance on appropriate compound properties. However, the most relevant information will come from comparison of successful and unsuccessful compounds explored for a specific objective, such as a target or therapeutic class. The properties of compounds intended for use as an antibiotic provide little information about what makes a successful kinase inhibitor and may even be misleading. Therefore, when sufficient data is available, metrics such as QED or RDL should be "trained" to identify the specific requirements for a given project.
Finally, it must be emphasized that high similarity to successful drugs or a high relative likelihood is far from a guarantee of success, and in some cases, it may be necessary to explore "outside of the box," particularly for new target classes such as protein-protein interactions. All the evidence suggests that the absolute chance of any single compound becoming a successful drug is very low. Therefore, all of these rules and metrics should be used only as guidelines, and they should be given appropriate weight when making decisions. They are most useful early in a project when there are many compounds from which to choose. Once data on structure-activity relationships is available, this provides much better information to guide the design and selection of compounds. Combining these data to identify compounds with the optimal balance of potency, physicochemical, ADMET and safety characteristics, for example using a multiparameter optimization approach (Current Pharmaceutical Design, 2012), will enhance the chance of success much more than considering simple drug-like properties alone.
Dr. Matthew Segall is the CEO of Optibrium Ltd., a software developer based in Cambridge, England. Segall's prior positions include the associate director of Amitro, ArQule Inc. and Inpharmatica and senior director of BioFocus DPI's ADMET division. He has an M.Sc. degree in computation from the University of Oxford and a Ph.D. in theoretical physics from University of Cambridge.