Machine learning predicts if a COVID-19 clinical trial will be successful

By Samantha Black, PhD, ScienceBoard editor in chief

July 23, 2021 -- Using machine learning, researchers have teased out the underlying factors of success for COVID-19 clinical trials, according to a PLOS One article published on July 12. The analysis showed that drug features and study keywords are the most informative features of COVID-19 trials.

Randomized clinical trials have provided the evidence behind emergency use authorizations for several COVID-19 drugs and vaccines. As of July 15, more than 6,180 COVID-19 clinical trials had been registered through ClinicalTrials.gov, the U.S. registry and database for privately and publicly funded clinical studies conducted worldwide.

Not all studies are expected to be completed or to achieve their intended goals, according to researchers from Florida Atlantic University. Around 10% to 12% of trials are terminated, for reasons including the following:

  • Insufficient enrollment
  • Scientific data from trial
  • Safety/efficacy concerns
  • Administrative reasons
  • External information from a different study
  • Lack of funding/resources
  • Business/sponsor decisions

It is imperative to understand which types of trials are likely to succeed in the face of a global pandemic, as they provide critical safety and efficacy data leading to regulatory approvals. In the current study, Xingquan Zhu, PhD, senior author and a professor in the department of computer and electrical engineering and computer science at Florida Atlantic University, and Magdalyn Elkin, a second-year doctoral student in the department, used machine learning to predict COVID-19 trial completion or cessation (terminated, withdrawn, or suspended).

"The main purpose of our research was to predict whether a COVID-19 clinical trial will be completed or terminated, withdrawn, or suspended," said Zhu in a statement. "Clinical trials involve a great deal of resources and time, including planning and recruiting human subjects."

Because COVID-19 is still a relatively novel disease, most clinical trials are still actively recruiting, and only around 14% have been completed. The researchers selected 772 COVID-19 clinical trials from the dataset in which the status was marked as "completed," "terminated," "withdrawn," or "suspended." Of these, 81.3% were completed and 18.7% ended in cessation.
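The uneven split between completed and ceased trials affects how a predictive model should be scored. As a rough illustration, the sketch below assumes counts of 628 completed and 144 ceased trials (obtained by rounding the reported percentages of 772; the exact counts are not stated here) and shows that a trivial predictor labeling every trial "completed" scores well on plain accuracy but no better than chance on balanced accuracy.

```python
# Assumed counts, derived by rounding 81.3% and 18.7% of 772 trials.
N_COMPLETED = 628
N_CEASED = 144


def accuracy(tp, tn, fp, fn):
    """Fraction of all trials labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of the recall on each of the two classes."""
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


# Treat "ceased" as the positive class. The baseline predicts "completed"
# for every trial, so it never finds a ceased trial.
tp, fn = 0, N_CEASED
tn, fp = N_COMPLETED, 0

baseline_acc = accuracy(tp, tn, fp, fn)
baseline_bal = balanced_accuracy(tp, tn, fp, fn)
print(f"accuracy: {baseline_acc:.3f}, balanced accuracy: {baseline_bal:.3f}")
```

Under these assumed counts, the baseline reaches roughly 81% accuracy while its balanced accuracy stays at 0.5, which is why balanced accuracy is the more informative number for a dataset like this one.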

Each trial was represented by a selected set of clinical trial features. Statistics features, which model clinical trials with respect to administrative information, study information, study design, and eligibility criteria, were considered first. Next, the researchers considered drug features, which are derived from the Intervention Medical Subject Heading field in the clinical trial report. Of the trials in the study, 48.83% were interventional.

Keyword features capture important terms describing a clinical trial's summary and were the third category included in the analysis. Finally, embedding features, which provide textual descriptions of the trial, such as study objective, expected participants, and process and procedures, were considered. Cumulatively, 693 features were represented in each of the 772 trials included in the analysis.

As of July 15, more than 6,180 COVID-19 clinical trials have been registered through ClinicalTrials.gov. Image courtesy of Florida Atlantic University/College of Engineering and Computer Science.

The researchers used the ReliefF similarity-based feature selection method to study each feature's impact on completion or cessation. Using training and testing samples, the pair then compared four predictive models: neural network, random forest, XGBoost, and logistic regression.
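To give a sense of how a similarity-based method ranks features, here is a minimal sketch of a simplified Relief-style scorer for two classes. It is not the authors' implementation and omits parts of the full ReliefF method, such as handling of missing values and more than two classes. The function name `relief_scores`, the parameter `k`, and the toy data are all hypothetical. The idea is that a feature earns weight when it differs between a trial and its nearest neighbors of the other class, and loses weight when it differs between a trial and its nearest neighbors of the same class.

```python
def relief_scores(X, y, k=1):
    """Relief-style feature weights for binary labels.

    X: list of rows, each a list of numeric features scaled to [0, 1]
    y: list of class labels, one per row
    k: number of nearest same-class and other-class neighbors to use
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance over the scaled features.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        others = [j for j in range(n) if j != i]
        by_distance = sorted(others, key=lambda j: dist(X[i], X[j]))
        hits = [j for j in by_distance if y[j] == y[i]][:k]
        misses = [j for j in by_distance if y[j] != y[i]][:k]
        if not hits or not misses:
            continue  # cannot score a row without both kinds of neighbor
        for f in range(d):
            hit_diff = sum(abs(X[i][f] - X[j][f]) for j in hits) / len(hits)
            miss_diff = sum(abs(X[i][f] - X[j][f]) for j in misses) / len(misses)
            weights[f] += (miss_diff - hit_diff) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.2], [0.1, 0.9], [0.2, 0.5], [0.8, 0.3], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y, k=1)
ranking = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
print("weights:", scores)
print("features ranked by weight:", ranking)
```

On this toy data the class-separating feature receives the larger weight, which is the behavior a ranking of real trial features would rely on.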

"If we can predict the likelihood of whether a trial might be terminated or not down the road, it will help stakeholders better plan their resources and procedures," Zhu stated. "Eventually, such computational approaches may help our society save time and sources to combat the global COVID-19 pandemic."

The authors noted that the current study focused on a single disease (COVID-19), so the trials compared shared more features in common, which provides a large improvement in the predictive power of modeling clinical trials. Segregating clinical trial data by research area or disease resulted in balanced accuracy as high as 70% and an F1 score (harmonic mean of precision and recall) of 50%. The study achieved area under the curve (AUC) scores of over 0.87 and balanced accuracy scores of over 0.81, indicating high efficacy of using computational methods for COVID-19 clinical trial prediction.
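For readers less familiar with these metrics, the sketch below shows one common way to compute an F1 score and an AUC by hand. The labels and model scores are hypothetical and are not taken from the study. AUC is computed here as the fraction of (ceased, completed) pairs in which the ceased trial received the higher score, with ties counted as half.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = ceased, 0 = completed; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Turn scores into predictions at an assumed threshold of 0.5.
preds = [1 if s >= 0.5 else 0 for s in scores]
tp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 1)
fp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 0)
fn = sum(1 for p, l in zip(preds, labels) if p == 0 and l == 1)

toy_f1 = f1_score(tp, fp, fn)
toy_auc = auc(scores, labels)
print(f"F1: {toy_f1:.3f}, AUC: {toy_auc:.3f}")
```

Unlike F1, AUC does not depend on the chosen threshold, which is one reason the two numbers are often reported together.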

Feature selection and ranking showed that keyword features were the most informative for COVID-19 trial prediction, followed by drug features, statistics features, and embedding features. Because most of the trials were interventional, it is logical that drug intervention (a drug feature) is a key component of a trial's success.

"Clinical trials that have stopped for various reasons are costly and often represent a tremendous loss of resources," said Stella Batalama, PhD, dean of the College of Engineering and Computer Science at Florida Atlantic University. "As future outbreaks of COVID-19 are likely even after the current pandemic has declined, it is critical to optimize efficient research efforts.

"The new approach ... will be helpful to design computational approaches to predict whether or not a COVID-19 clinical trial will be completed so that stakeholders can leverage the predictions to plan resources, reduce costs, and minimize the time of the clinical study," Batalama concluded.


