New genomics database increases research collaboration

November 4, 2019 -- An international research team has launched a new open-source database, MaveDB, as a public repository for large-scale measurements of sequence variant impact from multiplex assays of variant effect (MAVEs). The database is designed for interoperability with applications that interpret these datasets. Details of the database are presented in the November 3 issue of Genome Biology.

This experimental data will provide valuable information about how an individual variant affects the function of the protein a gene produces, how variants in a gene may contribute to disease, and how to engineer synthetic versions of naturally occurring proteins.

Sharing datasets that reveal the function of genomic variants in health and disease has become easier with the launch of a new, open-source database called MaveDB.

The database was developed by Alan Rubin from the Walter and Eliza Hall Institute, Australia, Douglas Fowler from the University of Washington, US, and Frederick Roth from the University of Toronto, Canada.

At a glance

A newly developed database, MaveDB, enhances the sharing of complex functional genomic datasets.
MaveDB is an easy-to-use repository for data from multiplex assays of variant effect (MAVEs), experiments that exhaustively measure the impact of different variants of a gene.
MaveDB enhances researchers' ability to access and interpret complex functional genomic data, accelerating research into the basic biology of genes, their role in disease and how proteins can be engineered to create more effective variants.

"In the past, researchers had to focus on a handful of changes in a gene to understand its function," Rubin said. "It was too complex to generate the data from an exhaustive scan of variants of a gene that might be hundreds or thousands of bases in length.

"The development of MAVEs provided a way for researchers to experimentally measure every single genetic change in a gene with its functional consequence. These assays can handle tens of thousands of genetic variants, allowing researchers to home in on the relevant changes and place them in context."

Previous work has relied on isolated experiments, with a limited dataset published in a journal, which sometimes made it difficult for other researchers to access the data. "MaveDB makes it easier for scientists to share their datasets in a single location, using a flexible format that is applicable to multiple research fields, and enables other scientists to easily access this data to enhance their research," said Rubin. "We've also ensured MaveDB can 'talk' to other databases to add an extra level of collaborative capacity. For the growing field of MAVE research, this database is an important step towards open science and reproducibility by ensuring data is made available."

In addition to developing the database, the team of researchers created data visualization software, called MaveVis, which makes it easier for scientists to understand and interpret the results of MAVE experiments.

"We envision that as MaveDB becomes more widely used within the bioinformatics community, other applications will be added that provide new ways to visualize and interpret complex genomics data, leading to new discoveries that enhance biomedical research," said Rubin. "This could underpin the development of new medicines, or the understanding of how a patient's genomic variants contribute to a disease."
