October 16, 2019 -- Computational biologists at Carnegie Mellon University and seven other institutions have developed a software tool that can play a high-speed "Match Game" to identify bioactive molecules and the microbial genes that produce them, so they can be evaluated as new antibiotics and other therapeutic agents. The work was published in Cell Systems on October 16.
Some microbes produce molecules that protect their host and, thus, are candidates to become therapeutic drugs. In the last decade, microbiologists have generated a number of large databases of microbial DNA, such as the microbial genomes database maintained by the National Institutes of Health (NIH) or EnsemblBacteria, hosted by the European Bioinformatics Institute (EMBL-EBI). But microbial communities consist of hundreds or thousands of different types of microbes -- and millions of different molecular products -- and each microbe tends to die quickly if removed individually for study.
The researchers used genome mining to identify microbial gene clusters and infer what molecules those genes produce. This process is normally error-prone, so they used a technique called Viterbi decoding to find the best match in a noisy signal. They used the algorithm to build an error-tolerant search engine that finds matches between databases of microbial DNA and databases that identify products by their mass spectra. The resulting software, called MetaMiner, is designed to tolerate the errors that hamper traditional methods and can handle large-scale screening.
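To make the idea of error-tolerant matching between the two kinds of databases concrete, here is a minimal Python sketch. It is not MetaMiner's algorithm and does not use Viterbi decoding; it only illustrates a simple form of tolerance, accepting a match when a predicted peptide mass agrees with an observed mass within a small window, with or without one modification. The peptide strings, the observed masses, the 0.01 Da tolerance, and the function names are all hypothetical choices made for illustration.

```python
# Toy illustration only: match peptide masses predicted from (hypothetical)
# genome-mined sequences against observed masses from mass spectrometry.

# Monoisotopic residue masses in daltons for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "T": 101.04768,
    "C": 103.00919,
}
WATER = 18.01056

# Mass shifts to try. Dehydration (loss of water) is one well-known
# post-translational modification; the list is an assumption of this sketch.
MODS = {"none": 0.0, "dehydration": -WATER}


def peptide_mass(seq):
    """Mass of a linear peptide: residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in seq) + WATER


def find_matches(peptides, observed, tol=0.01):
    """Return (peptide, modification, observed_mass) triples within tol Da."""
    matches = []
    for seq in peptides:
        base = peptide_mass(seq)
        for name, delta in MODS.items():
            for mass in observed:
                if abs(base + delta - mass) <= tol:
                    matches.append((seq, name, mass))
    return matches


if __name__ == "__main__":
    candidates = ["GAS", "CT"]            # hypothetical predicted peptides
    spectra = [233.101, 215.091, 300.0]   # hypothetical observed masses
    for hit in find_matches(candidates, spectra):
        print(hit)
```

A real search engine would score whole fragmentation spectra rather than single masses and would have to cope with modifications that are not known in advance, which is the harder problem the article describes.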
Global Natural Products Social Molecular Networking (GNPS) is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry data.
The team applied MetaMiner to ribosomally synthesized and post-translationally modified peptides (RiPPs), a family of natural products that have found applications in pharmaceuticals and the food industry. Current methods for discovering RiPPs combine genome mining with computational mass spectrometry, but they are limited to specific classes of RiPPs and small datasets, and they fail to handle unknown post-translational modifications.
To date, around 20,000 gene clusters that encode RiPPs have been discovered, but only a handful of RiPPs have been matched to one of those clusters. MetaMiner led to the discovery of seven previously unknown molecules of biological interest from environments including the human gut, the deep ocean, and the International Space Station. Moreover, the team showed that their software could identify bioactive molecules at least 100 times faster than previous methods.
"Normally, you'd be happy to find one match," said Hosein Mohimani, an assistant professor in CMU's Computational Biology Department. "Obtaining these results with manual methods likely would take decades."