Current genetic sequencing techniques lack the sensitivity required to detect rare gene mutations in a large pool of cells. Investigational gene therapy platforms such as CRISPR-Cas9 rely on detection of gene mutations to produce on-target changes to treat cancers and other diseases.
Third-generation sequencing technologies (long-read sequencing) frequently produce reads in excess of 10,000 base pairs, allowing for improved de novo assembly, mapping, and detection of structural variants compared to earlier sequencing techniques. However, these single-molecule sequencing technologies provide lower per-read accuracy and lower resolution than short-read sequencing.
Detecting rare variants
Scientists at King Abdullah University of Science and Technology (KAUST) have developed a highly accurate strategy called targeted individual DNA molecule sequencing (IDMseq). This type of sequencing involves attaching molecular barcodes (or unique molecular identifiers, UMIs) to every DNA molecule in a sample of cells and then making a large number of copies using polymerase chain reaction (PCR).
The technique guarantees that each original DNA molecule is uniquely represented by one molecular barcode after sequencing, thereby preventing false barcodes and allowing quantification of allele frequency in the original population. It is able to sensitively detect genetic variants, including single nucleotide variants, indels, large deletions, and complex rearrangements.
The researchers also developed a bioinformatics toolkit called variant analysis with UMIs for long-read technology (VAULT) to analyze sequencing data from IDMseq. VAULT uses a combination of algorithms to detect mutations based on the unique barcodes. In the system, every barcode represents one of the original DNA molecules. VAULT works well with third-generation, long-read sequencing technologies and helped the researchers determine the frequency of all types of mutations in the original DNA molecules, from changes in a single DNA nucleotide to large deletions and insertions.
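To make the barcode idea concrete, here is a minimal Python sketch of how reads that share a UMI could be collapsed into one call per original molecule, and how allele frequency could then be counted per molecule instead of per read. This is an illustration only, not VAULT's actual algorithm: the function name, the `min_reads_per_umi` threshold, the strict-majority rule, and the toy `WT`/`VAR` labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def allele_frequency_by_umi(reads, min_reads_per_umi=3):
    """Estimate allele frequencies per original molecule (hypothetical helper).

    reads: iterable of (umi, call) pairs, where call is a label such as
    "WT" or "VAR" already assigned to each read.
    """
    # Group read-level calls by barcode: one group per original molecule.
    groups = defaultdict(list)
    for umi, call in reads:
        groups[umi].append(call)

    # Collapse each group to a single consensus call.
    consensus = {}
    for umi, calls in groups.items():
        if len(calls) < min_reads_per_umi:
            continue  # too few reads to trust a consensus (assumed cutoff)
        top_call, top_count = Counter(calls).most_common(1)[0]
        if top_count * 2 > len(calls):  # require a strict majority
            consensus[umi] = top_call

    if not consensus:
        return {}

    # Frequency = fraction of molecules (UMI groups), not fraction of reads.
    totals = Counter(consensus.values())
    n_molecules = len(consensus)
    return {call: count / n_molecules for call, count in totals.items()}


# Toy data: molecule B carries the variant but has one erroneous WT read;
# molecule D has a single read and is dropped by the cutoff.
toy_reads = (
    [("A", "WT")] * 3
    + [("B", "VAR"), ("B", "VAR"), ("B", "WT")]
    + [("C", "WT")] * 4
    + [("D", "VAR")]
)
freqs = allele_frequency_by_umi(toy_reads)
print(freqs)
```

Counting per barcode means that a molecule amplified many times by PCR still contributes only once, which is why this kind of grouping can reduce amplification bias and per-read sequencing errors in the frequency estimate.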
In the study, the team was able to detect rare variants and accurately estimate the variant frequency of a knock-in gene mutation mixed with a group of wild-type cells at ratios of 1:100, 1:1,000, and 1:10,000. Due to the large size and low frequency of the genetic variant, it would have been missed by alternative short-read sequencing or ensemble long-read sequencing techniques.
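A rough back-of-the-envelope calculation shows why such dilutions are demanding. The Python sketch below estimates how many original molecules would need to be sampled to see at least one variant molecule with a given probability. It rests on simplifying assumptions that are not from the study: a ratio of 1:N is read as one mutant cell per N wild-type cells, every cell contributes the same number of target molecules, molecules are sampled independently, and the function names are made up for this example.

```python
import math


def variant_fraction(wild_type_per_mutant):
    """Fraction of variant molecules for a 1:N mix (assumed: 1 mutant per N wild-type)."""
    return 1.0 / (wild_type_per_mutant + 1)


def detection_probability(fraction, n_molecules):
    """Probability that at least one of n sampled molecules carries the variant."""
    return 1.0 - (1.0 - fraction) ** n_molecules


def molecules_needed(fraction, target_probability=0.95):
    """Smallest number of sampled molecules that reaches the target probability."""
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - fraction))


for ratio in (100, 1_000, 10_000):
    f = variant_fraction(ratio)
    n = molecules_needed(f)
    print(f"1:{ratio:,}  fraction={f:.6f}  molecules for 95% detection={n:,}")
```

Under these assumptions, the number of molecules required grows roughly in proportion to the dilution, so each tenfold rarer variant needs about ten times as many uniquely barcoded molecules to be sequenced.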
Searching for mutations caused by CRISPR/Cas9
Recent studies have shown that genome editing by CRISPR-Cas9 can cause harmful on-target mutagenesis, including large deletions and complex rearrangements, in various cell types. In the past, these large deletions were detected by ensemble amplicon sequencing, which is prone to amplification bias, or by whole-genome sequencing, which cannot adequately detect large and complex variants due to limited read length.
"Several recent studies have reported that Cas9 introduces unexpected, large DNA deletions around the edited genes, leading to safety concerns. These deletions are difficult to detect and quantitate using current DNA sequencing strategies. But our approach, in combination with various sequencing platforms, can analyze these large DNA mutations with high accuracy and sensitivity," noted co-first author, Chongwei Bi, a PhD student at KAUST.
The analysis showed that large deletions occurred in 2.8% to 5.4% of Cas9 repair editing outcomes. The authors suggested that Cas9 cutting may not be random and there are likely hotspots for Cas9-induced large deletions or insertions.
"Our study revealed potential risks associated with CRISPR/Cas9 editing and provides tools to better study genome editing outcomes," explained lead author Mo Li, PhD, bioscientist at KAUST.
The researchers also detected a 300% increase in the number of somatic single nucleotide variants after CRISPR-Cas9 editing.
"This shows that there is a lot that we need to learn about CRISPR/Cas9 before it can be safely used in the clinic," said co-author Yanyi Huang, PhD, of Peking University.
Copyright © 2020 scienceboard.net