Researchers at the Donnelly Centre for Cellular and Biomolecular Research at the University of Toronto have discovered nearly a million new exons (stretches of DNA that are expressed in mature RNA) in the human genome.
The results were published in the journal Genome Research.
There are approximately 20,000 protein-coding genes in humans containing approximately 180,000 known internal exons. These protein-coding regions represent only one percent of the entire human genome. The vast majority of what remains is a mystery – aptly called the “dark genome”.
“We have begun chipping away at the dark genome by finding nearly a million previously unknown exons through a method called exon trapping,” said Timothy Hughes, principal investigator of the study and professor and chair of molecular genetics at the University of Toronto's Temerty Faculty of Medicine.
“The technique involves an assay with plasmids to find exons in DNA fragments of unknown composition,” said Hughes, who is also Canada Research Chair in Decoding Gene Regulation and the John W. Billes Chair of Medical Research at the University of Toronto. Exon trapping is no longer widely used, but it has been shown to be effective when combined with high-throughput sequencing to analyze the entire human genome.
Exons are segments of the genome that can encode proteins to direct tissue development and biological processes within the body. They are considered autonomous if they do not require external assistance to form a mature RNA transcript, which is then translated into protein.
The team behind the study set out to test the exon definition model that guides molecular genetics research after questioning one of its assumptions: that the precise removal of introns, the non-protein-coding regions within genes, is facilitated by clear and consistent signals marking where exons start and end. This assumption does not appear to hold in all cases, because splicing does not always proceed cleanly, sometimes resulting in mature RNA transcripts that contain nonfunctional components.
“Almost none of the newly discovered exons are found consistently across the genomes of different species,” Hughes said. “They seem to appear in the human genome mainly due to random mutations and are unlikely to play a significant role in our biology. This proves that evolution in humans involves a lot of trial and error, probably made possible by the vast size of our genome.”
It is useful to document exons that arise from random mutation in the human genome because their translation could potentially be harmful. Long noncoding RNA exons, which are autonomous but often have no known function, have been associated with cancer development. Of the approximately 1.25 million exons, both known and previously unknown, that the team identified through exon trapping, nearly four percent were long noncoding RNA exons.
Additionally, exons residing within noncoding introns, called pseudoexons, can mutate to strengthen a weak splice site. This results in the exon being included in a mature RNA transcript, potentially leading to disease.
“This is an exciting study that expands our knowledge of human genome sequences that may be recognized as exons in transcribed RNA,” said Benjamin Blencowe, professor of molecular genetics at the University of Toronto's Temerty Faculty of Medicine, who was not involved in the study.
“Although the importance of the majority of newly detected exons is unclear, some of them may be activated in certain contexts, for example by disease mutations, and it is therefore important to catalog them. This study will additionally serve as a valuable resource facilitating ongoing efforts to decipher the splicing code.”
A better understanding of the factors impacting exon inclusion in mature RNA can help improve programs like SpliceAI, a widely used tool for predicting splice sites and aberrant splicing. SpliceAI can be trained on new data such as that produced in this study to refine its prediction capabilities.
“SpliceAI often does not provide details about exon characteristics and has poor ability to predict splicing of exons that are not already cataloged,” Hughes said.
“Our exon trapping data contains biologically meaningful information that can be fed into SpliceAI and other splicing predictors to open new avenues for exploring the dark genome.”
More information:
Nicholas Stepankiw et al., The human genome contains more than a million autonomous exons, Genome Research (2023). DOI: 10.1101/gr.277792.123
Provided by University of Toronto
Citation: Researchers discover a million new components of the human genome (February 9, 2024) retrieved February 9, 2024 from