New machine-learning approach promises to speed drug design

By Leah Sherwood, The Science Advisory Board assistant editor

December 1, 2021 -- A new approach to machine learning (ML) outperformed current ML methods in drug design, demonstrating its potential to speed up the drug discovery process, according to research published online November 29 in the Proceedings of the National Academy of Sciences.

The new approach, dubbed "transformation machine learning" (TML) by the research team, made better predictions than traditional ML in three domains that address scientific problems, including drug design.

"In drug design, we found that TML provided insight into drug target specificity, the relationships between drugs, and the relationships between target proteins," wrote the authors, led by Ivan Olier of the School of Computer Science and Mathematics at Liverpool John Moores University in the U.K.

Traditional ML vs. TML

Traditional supervised ML algorithms are trained on labeled examples (for example, labeled photographs of different animals), from which they learn to recognize intrinsic features (for example, "furry" and "small"). TML instead relies on extrinsic features derived from predictions from ML models trained on other related tasks.

For example, to train a TML model to recognize all known species of animals, with new ones expected to be added, one would start by applying existing prediction models for known species such as cats, rabbits, and donkeys. The outputs of these models would generate new extrinsic features such as "catness," "rabbitness," and "donkeyness," which would then be used to train a metalevel ML model to make predictions using this level of representation. The approach enables TML models to capture attributes of animals not originally encoded such as cuteness (shared by cats and rabbits) and having eyes at the side of the head (shared by rabbits and donkeys).
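
The two-level idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the three intrinsic features, the toy data, the distance-based base scorers, and the nearest-neighbor metalevel model are all hypothetical choices made to keep the example small.

```python
import math

# Hypothetical intrinsic features: [furriness, body size, ear length]
KNOWN = {
    "cat":    [[0.9, 0.2, 0.2], [0.8, 0.3, 0.2]],
    "rabbit": [[0.9, 0.1, 0.8], [0.8, 0.2, 0.9]],
    "donkey": [[0.4, 0.9, 0.8], [0.5, 0.8, 0.9]],
}

def centroid(rows):
    """Mean feature vector of a list of examples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def make_base_model(examples):
    """Return a scorer; a higher output means 'more like this species'."""
    c = centroid(examples)
    def score(x):
        return math.exp(-math.dist(x, c))
    return score

# One previously trained model per known species (stand-ins for real ML models).
BASE_MODELS = {name: make_base_model(rows) for name, rows in KNOWN.items()}

def extrinsic(x):
    """Map intrinsic features to [catness, rabbitness, donkeyness]."""
    return [BASE_MODELS[name](x) for name in ("cat", "rabbit", "donkey")]

def meta_predict(x, labeled):
    """Metalevel model: 1-nearest neighbor in the extrinsic feature space."""
    ex = extrinsic(x)
    return min(labeled, key=lambda item: math.dist(ex, extrinsic(item[0])))[1]

# A new class arrives with only a couple of labeled examples (made-up data).
LABELED = [(x, name) for name, rows in KNOWN.items() for x in rows]
LABELED += [([0.9, 0.05, 0.15], "kitten"), ([0.85, 0.1, 0.1], "kitten")]
```

With this toy data, `meta_predict([0.88, 0.06, 0.15], LABELED)` returns `"kitten"`: the new class is recognized from how cat-like, rabbit-like, and donkey-like the example looks, without retraining any of the base models.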

"Where a typical ML system has to start from scratch when learning to identify a new type of animal -- say a kitten -- TML can use the similarity to existing animals: kittens are cute like rabbits, but don't have long ears like rabbits and donkeys," said Ross King, a professor in the University of Cambridge's Department of Chemical Engineering and Biotechnology, who led the research, in a statement. "This makes TML a much more powerful approach to machine learning."

Promise for drug discovery

The researchers said that TML shows particular promise in the area of drug discovery. Whereas a typical ML approach will search for drug molecules based on intrinsic features such as molecular shape and structure, TML speeds up the process by drawing on what other ML models predict about a particular molecule.

The paper includes a case study using TML to predict quantitative structure-activity relationships (QSAR), a common step in early-phase drug discovery. Given a target (usually a protein) and a set of chemical compounds (small molecules) with associated activities (e.g., inhibition of the target protein), the QSAR task is to learn a predictive mapping from molecular representations to activities. In the TML approach, standard ML methods based on intrinsic descriptors are first applied to existing QSAR prediction tasks, and then their outputs are used as the extrinsic features for a new TML model that can be applied to a new QSAR task.

To evaluate the TML approach on QSAR learning, the researchers trained a variety of ML methods on 2,219 QSAR problems using 1,024-bit molecular fingerprint representations as intrinsic features. They then used the predicted compound activities from the previously learned ML models as extrinsic attributes for the TML QSAR model.
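
The QSAR workflow described above might look roughly like the following Python sketch. It rests on several assumptions that are not from the paper: the 8-bit fingerprints (the study used 1,024 bits), the activity values, the Tanimoto-weighted base predictors, and the nearest-neighbor TML model are hypothetical stand-ins for the learning methods the researchers actually used.

```python
import math

def tanimoto(a, b):
    """Tanimoto similarity between two equal-length bit lists."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0

def make_base_model(task):
    """Similarity-weighted activity predictor for one existing QSAR task."""
    def predict(fp):
        weights = [tanimoto(fp, other) for other, _ in task]
        total = sum(weights)
        if total == 0:  # no overlap with any compound: fall back to the mean
            return sum(act for _, act in task) / len(task)
        return sum(w * act for w, (_, act) in zip(weights, task)) / total
    return predict

# Made-up (fingerprint, activity) pairs for two existing QSAR tasks.
TASK_A = [([1, 1, 0, 0, 1, 0, 0, 0], 7.1), ([1, 0, 0, 0, 1, 1, 0, 0], 6.4),
          ([0, 0, 1, 1, 0, 0, 1, 0], 4.0)]
TASK_B = [([0, 0, 1, 1, 0, 0, 0, 1], 8.0), ([0, 1, 1, 0, 0, 0, 1, 0], 7.2),
          ([1, 1, 0, 0, 0, 0, 0, 0], 3.5)]
BASE_MODELS = [make_base_model(TASK_A), make_base_model(TASK_B)]

def extrinsic(fp):
    """Extrinsic features: predicted activity on each existing task."""
    return [model(fp) for model in BASE_MODELS]

def make_tml_model(new_task):
    """TML model for a new task: 1-nearest neighbor on extrinsic features."""
    train = [(extrinsic(fp), act) for fp, act in new_task]
    def predict(fp):
        ex = extrinsic(fp)
        return min(train, key=lambda row: math.dist(ex, row[0]))[1]
    return predict

# Made-up training data for the new QSAR task.
NEW_TASK = [([1, 1, 0, 0, 1, 1, 0, 0], 5.5), ([0, 0, 1, 1, 0, 0, 1, 1], 6.8)]
tml = make_tml_model(NEW_TASK)
```

The design point the sketch tries to show is that the new-task model never looks at the fingerprint bits directly; it only sees each compound's predicted activities on the earlier tasks. For instance, `tml([1, 1, 0, 0, 1, 0, 0, 0])` returns the activity of the new-task compound whose predicted profile across the earlier tasks is closest.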

Even though the QSAR datasets had been extensively studied in the literature (using a total of 18 learning methods and six molecular representations, according to the authors), a comparison showed that TML significantly outperformed the best previous results.

"I was surprised how well it works -- better than anything else we know for drug design," said King. "It's better at choosing drugs than humans are -- and without the best science, we won't get the best results."



