February 16, 2023 -- A machine learning algorithm has predicted the chances of successfully inserting a gene-edited sequence of DNA into a cell using a next-generation CRISPR-Cas9 approach.
The next-generation approach, prime editing, is still new: the first paper describing the technique was published in 2019, and researchers are still optimizing it. Prime editing expands on the limited changes that are possible through base editing, an earlier CRISPR invention, by enabling scientists to replace any DNA nucleotide with any of the other three nucleotides.
Although prime editing opens up new opportunities for treating genetic diseases, researchers have lacked a clear picture of the factors that determine whether a particular edit will succeed. That gap has forced scientists to rely on trial and error and has slowed their progress.
Writing in Nature Biotechnology, researchers at the Wellcome Sanger Institute describe their attempt to fill the knowledge gap. The team inserted 3,604 DNA sequences, each between one and 69 DNA bases in length, into three different human cell lines using various prime editor delivery systems. After one week, the researchers sequenced the genomes of the cells to determine if the edits were successful.
The study revealed that sequence length and the type of DNA repair mechanism were key factors, but that the interplay between the different factors is complex, as Jonas Koeppel, the first author of the study, explained in a statement.
"We're beginning to discover what factors improve the chances of success. Length of sequence is one of these factors, but it's not as simple as the longer the sequence the more difficult it is to insert. We also found that one type of DNA repair prevented the insertion of short sequences, whereas another type of repair prevented the insertion of long sequences," Koeppel said.
That complexity led the researchers to apply a machine learning algorithm to the problem. The scientists trained the algorithm to detect patterns that recur in successful edits, then used it to predict how likely a given insertion was to succeed. The algorithm ranked the potential gene edits made possible by prime editing according to how likely they were to work.
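To make the train-then-rank idea concrete, here is a minimal sketch in Python. It is not the researchers' model: the article does not describe the algorithm or its inputs, so the logistic regression, the toy data, and every name below (`TOY_EDITS`, `train`, `rank_candidates`) are assumptions made for illustration. It also uses insert length as its only feature, which the study itself suggests is too simple.

```python
import math

# Hypothetical toy data: (insert length in bases, 1 if the edit succeeded else 0).
# These values are invented for illustration and are NOT from the study.
TOY_EDITS = [(1, 1), (3, 1), (5, 1), (10, 1), (15, 1),
             (20, 0), (30, 1), (40, 0), (55, 0), (69, 0)]


def features(length):
    # Bias term plus length scaled to roughly [0, 1];
    # 69 bases is the longest insert mentioned in the article.
    return [1.0, length / 69.0]


def predict(weights, length):
    # Logistic model: estimated probability that an insert of this length succeeds.
    z = sum(w * x for w, x in zip(weights, features(length)))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, steps=5000, lr=0.5):
    # Plain batch gradient descent on the logistic loss.
    weights = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for length, label in data:
            err = predict(weights, length) - label
            for i, x in enumerate(features(length)):
                grad[i] += err * x
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad)]
    return weights


def rank_candidates(weights, lengths):
    # Order candidate inserts from most to least likely to succeed.
    return sorted(lengths, key=lambda n: predict(weights, n), reverse=True)


weights = train(TOY_EDITS)
ranking = rank_candidates(weights, [60, 4, 25])
print(ranking)
```

A model closer to the one described would need further inputs, such as which DNA repair pathways are active in the cell line, because the study found that the effect of length depends on them.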
Next, the collaborators plan to create models for all known human genetic diseases. The work, which will involve other research groups, could help to determine the safest and most efficient ways to make edits that correct the causes of genetic diseases and thereby help to realize the therapeutic potential of prime editing.