Improved tool for long-read RNA sequencing unveiled

By Elissa Wolfson, The Science Advisory Board assistant editor

January 20, 2023 -- Researchers at Children's Hospital of Philadelphia (CHOP) have developed a new, more accurate computational tool for long-read RNA sequencing. The tool, called Error Statistics Promoted Evaluator of Splice Site Options (Espresso), described January 20 in Science Advances, may allow for better diagnosis of rare genetic diseases caused by disrupted RNA and the discovery of potential therapeutic targets in disease.

An RNA molecule from a gene can be cut and joined, or spliced, in different ways before being translated into a protein. This alternative splicing process allows a single gene to encode several different proteins and occurs in many biological processes, including when stem cells differentiate. In diseases, however, alternative splicing can be dysregulated. Examining the transcriptome -- all RNA molecules stemming from genes -- can help reveal a condition's root causes.

Historically, it has been difficult to "read" entire RNA molecules because they are usually thousands of bases long. Instead, researchers have used short-read RNA sequencing, which breaks RNA molecules into much shorter pieces before sequencing them. Computer programs are then used to reconstruct the full sequences. Short-read RNA sequencing can provide highly accurate sequencing data with a low per-base error rate. Nevertheless, the information it can provide is limited.

More recently available long-read platforms can sequence RNA molecules over 10,000 bases in length. These platforms do not require RNA molecules to be broken up before sequencing, but they have a much higher per-base error rate, making it difficult to determine the validity of previously unknown RNA molecules discovered in rare genetic diseases and cancers. This limitation has hampered the widespread adoption of long-read RNA sequencing.

The new computational tool Espresso can more accurately discover and quantify RNA molecules from the same gene -- called RNA isoforms -- using error-prone long-read RNA sequencing data. To do so, Espresso compares all long RNA sequencing reads of a given gene to its corresponding genomic DNA and then uses the error patterns of individual long reads to confidently identify splice junctions, along with their corresponding full-length RNA isoforms. By finding perfect match areas between long RNA sequencing reads and genomic DNA and borrowing information across all long RNA sequencing reads of a gene, the tool can identify highly reliable splice junctions and RNA isoforms, including those not previously documented in existing databases.
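
The idea of trusting cleanly aligned junctions and borrowing them to correct noisy reads can be illustrated with a toy sketch. This is not Espresso's actual code or algorithm: the data structure (JunctionCall), the thresholds (MIN_FLANK, MAX_SHIFT), and the function names are all hypothetical, chosen only to show how information shared across reads of one gene might rescue error-prone junction calls.

```python
from collections import Counter
from typing import NamedTuple, Optional


class JunctionCall(NamedTuple):
    """One splice junction as observed in a single long read (hypothetical model)."""
    start: int          # genomic coordinate of the last exonic base before the intron
    end: int            # genomic coordinate of the first exonic base after the intron
    flank_matches: int  # perfectly matched bases next to the junction (shorter side)


MIN_FLANK = 10  # assumed: flank length needed to call a junction reliable
MAX_SHIFT = 5   # assumed: how far a noisy junction may be moved to a reliable one


def reliable_junctions(reads):
    """Collect junctions that at least one read supports with a clean flank."""
    return {
        (j.start, j.end)
        for read in reads
        for j in read
        if j.flank_matches >= MIN_FLANK
    }


def correct_read(read, reliable) -> Optional[tuple]:
    """Map each junction in a read onto a reliable junction, or give up on the read."""
    chain = []
    for j in read:
        key = (j.start, j.end)
        if key in reliable:
            chain.append(key)
            continue
        # Borrow information from other reads: look for a nearby reliable junction.
        candidates = [
            r for r in reliable
            if abs(r[0] - j.start) <= MAX_SHIFT and abs(r[1] - j.end) <= MAX_SHIFT
        ]
        if not candidates:
            return None  # junction cannot be rescued, so the read is not assigned
        chain.append(
            min(candidates, key=lambda r: (abs(r[0] - j.start) + abs(r[1] - j.end), r))
        )
    return tuple(chain)


def count_isoforms(reads) -> Counter:
    """Group reads by their corrected junction chain and count each chain."""
    reliable = reliable_junctions(reads)
    chains = (correct_read(read, reliable) for read in reads)
    return Counter(chain for chain in chains if chain is not None)


example_reads = [
    [JunctionCall(100, 200, 15), JunctionCall(300, 400, 12)],  # clean read
    [JunctionCall(102, 200, 3), JunctionCall(300, 400, 20)],   # noisy first junction
    [JunctionCall(100, 200, 11)],                              # shorter isoform
    [JunctionCall(150, 250, 2)],                               # unsupported junction
]

if __name__ == "__main__":
    for chain, count in count_isoforms(example_reads).most_common():
        print(chain, count)
```

In this example the second read's noisy junction is moved onto the junction that the first read supports cleanly, so both reads count toward the same two-junction isoform, while the last read is left unassigned because no reliable junction lies nearby. A real tool would work from actual alignments and model error patterns statistically rather than with fixed cutoffs.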

The researchers evaluated Espresso's performance using both simulated data and real biological data. They found Espresso performed better than many current tools, both in discovering RNA isoforms and quantifying them. They also generated and analyzed over 1 billion long RNA sequencing reads covering 30 human tissue types and three human cell lines, providing a useful resource for studying human transcriptome variation.

"The transition from short-read to long-read RNA sequencing represents an exciting technological transformation," Yi Xing, PhD, CHOP senior author, said in a statement. "We envision that ESPRESSO will be a useful tool for researchers to explore the RNA repertoire of cells in various biomedical and clinical settings."

