Machine learning helps identify antimicrobial-resistant bacteria

By Samantha Black, PhD, ScienceBoard editor in chief

July 7, 2020 -- A new machine learning-based approach, driven by a graphical user interface, has successfully identified antimicrobial resistance genes in both gram-positive and gram-negative bacteria. This work, presented in Scientific Reports on July 3, may make it easier to identify deadly antimicrobial-resistant bacteria.

Antimicrobial resistance occurs when bacteria become less susceptible to an antimicrobial agent. They can achieve this by overexpressing or duplicating genes, undergoing chromosomal mutation, or obtaining resistance genes from other bacteria. Infection by antimicrobial-resistant organisms results in over 35,000 deaths per year and could add up to $3.5 billion in healthcare costs per year in the U.S. alone. These resistant bacteria pose a worldwide threat, and preclinical tools to identify them are urgently needed.

Conventional strategies for determining genetically encoded mechanisms for antimicrobial resistance involve sequence assembly and read-based techniques that map sequence data to reference databases. These techniques are reliable for well-known conserved resistance genes but tend to produce a large number of false-positive results for gram-negative bacteria.

Machine learning can be applied to predict putative antimicrobial resistance genes. Rather than comparing similar sequences, machine learning models detect features that are unique to antimicrobial resistance genes.
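
To make the idea of a sequence "feature" concrete, here is a minimal Python sketch that turns a protein sequence into a fixed-length numeric vector. The article does not say which features the researchers' model uses; amino acid composition is assumed here only because it is one of the simplest protein features, and the function name and toy sequence are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    sequence = sequence.upper()
    # Count only standard residues; ambiguous codes such as X are ignored.
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence, made up for illustration only (not a real resistance gene).
features = amino_acid_composition("MKKLLAAGG")
```

A vector like this can be fed to a classifier regardless of sequence length, which is what lets a model compare genes that would not align well with one another.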

Washington State University researchers recently introduced a game theory-based feature selection approach to predict genes that encode resistance with an accuracy of 93% to 99% in gram-negative bacteria. Game theory is commonly used to model strategic interactions between players, and the researchers applied it to selecting the features used to identify resistance genes.
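
The article does not describe the researchers' game-theoretic formulation. As a rough illustration of how game theory can rank features, the Python sketch below treats each feature as a player and computes the Shapley value, a standard game-theory measure of a player's average contribution to a coalition. This is a swapped-in textbook technique, not the researchers' method, and the feature names and coalition scores are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player for a coalition value function."""
    n = len(players)
    result = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding the player.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = frozenset(coalition)
                # Marginal gain from adding the player to this coalition.
                total += weight * (value(members | {player}) - value(members))
        result[player] = total
    return result


def toy_value(coalition):
    """Hypothetical score of a feature subset (stands in for model accuracy)."""
    score = 0.0
    if "feature_a" in coalition:
        score += 0.5
    if "feature_b" in coalition:
        score += 0.25
    # feature_c contributes nothing in this toy game.
    return score


scores = shapley_values(["feature_a", "feature_b", "feature_c"], toy_value)
```

In this toy game the scores rank feature_a above feature_b, with feature_c contributing nothing, so a selection step could drop feature_c. Exact computation grows exponentially with the number of features, so real feature sets would need an approximation.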

A medical illustration of Clostridioides difficile bacteria, formerly known as Clostridium difficile, presented in the Centers for Disease Control and Prevention (CDC) publication "Antibiotic Resistance Threats in the United States, 2019." Image courtesy of the CDC.

Now, the researchers have applied the tool to gram-positive bacteria and introduced software for identifying antimicrobial resistance genes for both gram-positive and gram-negative bacteria.

"Our software can be employed to analyze metagenomic data in greater depth than would be achieved by simple sequence matching algorithms," explained author Abu Sayed Chowdhury, a doctoral candidate in computer science at Washington State University, in a statement. "This can be an important tool to identify novel antimicrobial resistance genes that eventually could become clinically important."

In their previous work, the researchers used amino acid sequences of the resistance genes aac, bla, and dfr from gram-negative bacteria, including Acinetobacter, Klebsiella, Campylobacter, Salmonella, and Escherichia, to train the machine learning model. They then tested the model on sequences from Pseudomonas, Vibrio, and Enterobacter.

Subsequently, the model was validated in gram-positive bacteria, with the antimicrobial resistance sequences bac and van from Clostridium and Enterococcus used as the training dataset. The model was then used to identify antimicrobial resistance genes from the common food-borne pathogens Staphylococcus, Streptococcus, and Listeria. The researchers found that the system can predict antimicrobial resistance sequences for gram-positive bacteria with accuracy ranging from 87% to 90%.
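
The evaluation design described above, training on some genera and testing on others, can be sketched in a few lines of Python. The genus names come from the article; the record layout, function names, sequences, and labels are hypothetical placeholders, and the researchers' actual pipeline may differ.

```python
# Genera named in the article for the gram-positive experiment.
TRAIN_GENERA = {"Clostridium", "Enterococcus"}
TEST_GENERA = {"Staphylococcus", "Streptococcus", "Listeria"}


def split_by_genus(records, train_genera, test_genera):
    """Split records so that no genus appears in both training and test sets."""
    if train_genera & test_genera:
        raise ValueError("training and test genera must not overlap")
    train = [r for r in records if r["genus"] in train_genera]
    test = [r for r in records if r["genus"] in test_genera]
    return train, test


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder records: the sequences and labels are made up for illustration.
records = [
    {"genus": "Clostridium", "sequence": "MKKLLAAGG", "resistant": True},
    {"genus": "Enterococcus", "sequence": "MSTNPKPQR", "resistant": False},
    {"genus": "Listeria", "sequence": "MKKLIAAGG", "resistant": True},
    {"genus": "Staphylococcus", "sequence": "MSTNPRPQR", "resistant": False},
]
train, test = split_by_genus(records, TRAIN_GENERA, TEST_GENERA)
```

Keeping the test genera entirely out of the training set is what makes the reported accuracy a measure of how well the model generalizes to organisms it has not seen.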

"The virtue of this program is that we can actually detect antimicrobial resistant bacteria in newly sequenced genomes," said author Shira Broschat, PhD, from the School of Electrical Engineering and Computer Science at Washington State University. "It's a way of identifying antimicrobial resistant bacteria genes and their prevalence that might not otherwise have been found. That's really important."

The researchers developed open-source software packages (in Python 3 and R) that other scientists can download and use to predict antimicrobial resistance genes in bacteria. The easy-to-use software includes all the bioinformatic tools and scripts needed to generate the protein features used by the machine learning model. Researchers can also retrain the algorithm as more data and sequences become available to improve predictions.



