July 22, 2021 -- Google Health's DeepMind announced a partnership with the European Molecular Biology Laboratory (EMBL) to offer the scientific community open access to the most comprehensive database developed to date of predicted 3D protein structures.
The free database includes more than 350,000 protein structure predictions, among them predictions for the roughly 20,000 proteins expressed by the human genome. Determining how proteins "fold" into their 3D structures is an essential step in determining their functions, which underpin every biological process.
The database was developed with artificial intelligence (AI) technology and is intended to help researchers across a variety of fields accelerate their work. A team at the University of Colorado, Boulder, for instance, is using predicted protein structures to study antibiotic resistance, while a group at the University of California, San Francisco, is using them to elucidate SARS-CoV-2 biology, DeepMind said in a release.
"Our goal at DeepMind has always been to build AI and then use it as a tool to help accelerate the pace of scientific discovery itself, thereby advancing our understanding of the world around us," said Demis Hassabis, PhD, DeepMind's founder and CEO, in the statement.
The AI technology driving the database is AlphaFold, which was developed by DeepMind and recognized last year by the Critical Assessment of Protein Structure Prediction (CASP) group as a major breakthrough in the 50-year-old challenge of determining how proteins fold from amino acid sequences into their 3D structures.
The AlphaFold database is an example of the virtuous circle of open science, according to Edith Heard, PhD, director general of EMBL.
"AlphaFold was trained using data from public resources built by the scientific community, so it makes sense for its predictions to be public. Sharing AlphaFold predictions openly and freely will empower researchers everywhere to gain new insights and drive discovery," stated Heard.
In addition to the human proteome, the database launches with structures for 20 other biologically significant organisms, such as Escherichia coli, the fruit fly, the mouse, the zebrafish, the malaria parasite, and tuberculosis bacteria.
Last week, Nature published the methodology and open-source code behind the latest version of AlphaFold.