An NDORMS team has developed a new approach to significantly improve the accuracy of RNA sequencing. They identified the main source of inaccurate quantification in short- and long-read RNA sequencing, and introduced the concept of “majority vote” error correction, leading to a substantial improvement in RNA molecular counting.
Precise sequencing of genetic material is crucial in modern biology, particularly for understanding and treating diseases linked to genetic abnormalities. However, current methodologies face significant constraints.
In a landmark study, an international consortium of researchers, led by Adam Cribbs, associate professor of computational biology, and Jianfeng Sun, postdoctoral research associate at the Botnar Institute at the University of Oxford, developed an innovative method to correct errors introduced by PCR amplification, a technique widely used in high-throughput sequencing.
By identifying PCR artifacts as the primary source of inaccurate quantification, the researchers address a long-standing challenge of generating accurate absolute counts of RNA molecules, which is crucial for various applications in genomics research. The study is published in the journal Nature Methods.
The researchers focused on unique molecular identifiers (UMIs), which are random oligonucleotide sequences used to eliminate bias introduced during PCR amplification. Although UMIs have been widely adopted in sequencing methods, the study reveals that PCR errors can harm the accuracy of molecular quantification, especially across different sequencing platforms.
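To illustrate why an error inside a UMI distorts molecular counts, here is a minimal Python sketch. It assumes the simplest possible counting rule, in which every distinct UMI seen for a gene is treated as one original molecule. The function name `count_molecules`, the gene label and the short UMI strings are hypothetical and chosen only for this example; they are not taken from the study.

```python
def count_molecules(reads):
    """Count distinct UMIs per gene.

    `reads` is an iterable of (gene, umi) pairs. Each distinct UMI is
    assumed to represent one original molecule (an illustrative rule only).
    """
    umis_per_gene = {}
    for gene, umi in reads:
        umis_per_gene.setdefault(gene, set()).add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Two original molecules; the first was copied three times by PCR.
clean_reads = [
    ("GENE_A", "ACGT"),
    ("GENE_A", "ACGT"),
    ("GENE_A", "ACGT"),
    ("GENE_A", "TTGC"),
]

# The same reads, but one PCR copy carries a substitution in its UMI.
noisy_reads = [
    ("GENE_A", "ACGT"),
    ("GENE_A", "ACGA"),  # hypothetical PCR error: T -> A
    ("GENE_A", "ACGT"),
    ("GENE_A", "TTGC"),
]

clean_counts = count_molecules(clean_reads)
noisy_counts = count_molecules(noisy_reads)
print(clean_counts)  # {'GENE_A': 2}
print(noisy_counts)  # {'GENE_A': 3}
```

In this toy case a single erroneous base makes one molecule look like two, so the gene's count rises from 2 to 3.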
Jianfeng explained: “PCR amplification, essential for most RNA sequencing techniques, can introduce errors, compromising data integrity. We addressed this problem by synthesizing UMI barcodes using homotrimeric nucleotide blocks, improving error correction and enabling near-absolute quantification of RNA molecules, thereby significantly improving molecular counting precision.”
Homotrimers are nucleotide sequences consisting of three identical bases, for example AAA, CCC, GGG. By comparing the three bases within each homotrimer block, errors are detected and corrected using a “majority vote” method, in which a single erroneous base is outvoted by the other two (see figure above).
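The majority-vote idea can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. The function name `correct_homotrimer_umi` is hypothetical, and so is the choice to mark a block with three different bases as "N"; how the published method handles such ties is not described here.

```python
from collections import Counter


def correct_homotrimer_umi(umi):
    """Collapse a homotrimer-encoded UMI to one base per block by majority vote.

    Each block of three bases should be identical (e.g. "AAA"). If one base
    differs, the other two outvote it. If all three differ, no majority
    exists and the block is marked "N" (an assumption made for this sketch).
    """
    if len(umi) % 3 != 0:
        raise ValueError("homotrimer UMI length must be a multiple of 3")

    corrected = []
    for start in range(0, len(umi), 3):
        block = umi[start:start + 3]
        base, votes = Counter(block).most_common(1)[0]
        corrected.append(base if votes >= 2 else "N")
    return "".join(corrected)


print(correct_homotrimer_umi("AAACCCGGG"))  # ACG (no errors)
print(correct_homotrimer_umi("AATCCCGGG"))  # ACG (one error, outvoted)
print(correct_homotrimer_umi("ATCCCCGGG"))  # NCG (first block has no majority)
```

Once collapsed, two reads whose UMIs differ only by a single error per block map to the same corrected sequence and are counted as one molecule.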
The study demonstrates that homotrimeric UMIs significantly outperform traditional monomeric UMIs in reducing false-positive fold enrichment when analyzing differentially expressed genes and transcripts (DEGs and DETs). This improvement is vital for the accurate identification and quantification of DEGs or DETs, especially in bulk sequencing approaches.
Additionally, in single-cell sequencing, where extensive PCR amplification is often required, homotrimeric UMIs have been shown to be effective in mitigating the effects of PCR artifacts, thereby significantly improving the reliability of sequencing data.
“By constructing UMIs from homogeneous blocks of nucleosides, we sought to improve error correction in short- and long-read sequencing, demonstrating our commitment to improving sequencing technology applications,” says Associate Professor Adam Cribbs, lead author of the article and group leader in computational biology.
This research has profound implications. By rectifying PCR errors in UMIs, it significantly improves the accuracy of molecular quantification in various sequencing applications. It is an essential tool for bulk RNA, single-cell RNA and DNA sequencing researchers, enabling precise analyses of gene expression and molecular profiles.
Enhanced UMI error correction not only reduces the incidence of false positives, but also opens up multiple diagnostic applications, especially in scenarios requiring longitudinal sample analysis.
More information:
Jianfeng Sun et al, Correcting PCR amplification errors in unique molecular identifiers to generate accurate numbers of sequencing molecules, Nature Methods (2024). DOI: 10.1038/s41592-024-02168-y
Provided by the University of Oxford
Citation: Improving the Accuracy of Molecular Quantification in High-Throughput Sequencing (February 5, 2024) retrieved February 5, 2024 from