DNA, often considered “the blueprint of life,” contains instructions for building the proteins cells need to survive and function properly. But DNA is not perfect and errors can occur during replication. Sometimes this can result in repeating snippets of the building blocks of DNA called nucleotides – G (guanine), A (adenine), T (thymine), C (cytosine) – too many times in a row.
This can lead to a type of mutation, known as nucleotide repeat expansions, that can alter the function and structure of vital proteins and give rise to rare neurodegenerative diseases like Huntington’s disease and amyotrophic lateral sclerosis (ALS).
New research led by Whitehead Institute member Ankur Jain, graduate student Rachel Anderson, and colleagues takes a closer look at how the repeat sequence involved in Huntington’s disease (a CAG repeat) leads to the production of abnormal proteins that fold poorly and clump together in cells, blocking important cellular processes.
Their findings, published in the journal Molecular Cell on January 30, reveal that the expanded CAG repeat can interfere with splicing. As shown in the illustration below, this is the process by which parts of RNA that do not code for proteins, called introns, are removed. The remaining sections, called exons, are then joined together to form the final messenger RNA that carries the instructions needed to build a protein.
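As a rough illustration of splicing, the sketch below removes an intron from a toy transcript and joins the flanking exons. The sequence, the intron coordinates, and the function name `splice` are made up for this example; real splicing is carried out by the cell's splicing machinery, which recognizes sequence signals rather than fixed positions.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns (half-open [start, end) intervals) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Toy transcript (hypothetical): exon 1 + intron + exon 2.
# The intron starts with GU and ends with AG, as most introns do.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "GAUUAA"
pre_mrna = exon_1 + intron + exon_2
mature_mrna = splice(pre_mrna, [(len(exon_1), len(exon_1) + len(intron))])
print(mature_mrna)
```

If the cut at the end of the intron lands at a different AG than usual, the joined message changes, which is the kind of shifted junction the researchers describe.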
According to the researchers, the expanded CAG repeat creates new markers, or splice acceptor sites, which leads to the copying and pasting of genetic information at different junctions than usual.
“The question of why the brains of patients with repeat expansion disorders contain stray proteins has baffled scientists for some time,” says Jain, who is also an assistant professor of biology and the Thomas D. and Virginia W. Cabot Career Development Professor at the Massachusetts Institute of Technology. “Now, because we understand the molecular mechanism, we can try to target the splicing pathway and decrease the production of these proteins.”
Unfolding RNA Hairpins
RNA is less stable than DNA, and common approaches to analyzing RNA rely on an enzyme called reverse transcriptase. Whereas cells normally transcribe DNA into RNA, this enzyme copies RNA molecules into a complementary DNA (cDNA) strand. This allows researchers to closely analyze RNA sequences without risking degradation of the genetic information.
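In sequence terms, reverse transcription amounts to writing the complementary DNA strand for an RNA. The short sketch below shows only that base-pairing rule; the function name `reverse_transcribe` is an assumption for illustration, and it says nothing about how the enzyme itself behaves.

```python
# Base-pairing rule for copying RNA into DNA: A pairs with T, U with A, G with C, C with G.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand complementary to an RNA, both written 5' to 3'."""
    # The new strand runs antiparallel to the template, so read the RNA backwards.
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


print(reverse_transcribe("CAGCAGCAG"))  # a short CAG repeat becomes a CTG repeat in the cDNA
```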
But reverse transcription of repeat-containing RNAs comes with its own challenges: These molecules tend to fold on themselves, forming hairpin loops, and when these loops don’t fully unwind during reverse transcription, researchers end up with gaps and errors in the cDNA.
In the new paper, Jain and Anderson used a different approach to sensitively reverse transcribe repeat-containing RNAs into cDNA. Specifically, the researchers worked with an enzyme called TGIRT (Thermostable Group II Intron Reverse Transcriptase) that remains active at high temperatures, allowing it to open hairpin structures and capture sequences containing repeats with greater fidelity.
“When you heat an egg, it turns white because the proteins in the egg unfold due to the high temperature. We exploit the same thing but with RNA structures,” says Anderson.
The researchers then began mapping these repeats onto a reference genome, which serves as a guide to genetic information in a human, but they quickly ran into difficulties. The human genome is written with four “letters,” G, A, T, and C, which combine in various sequences to form the DNA strands of our cells.
This means that repeat patterns in the human genome are inevitable (repeat diseases only occur when a single sequence, like CAG, is repeated too many times in a row) and each pattern can appear in multiple places in the genome. Thus, identifying the origin of the RNA containing the repeat amounts to reconstructing a story from fragmented sentences without context.
“That’s when we decided to approach repeat mapping differently,” says Anderson. The researchers developed a new tool, called SATCfinder, which selects RNA sequences with at least three CAG repeats. These repeats are then cut out computationally and the rest of the sequence is mapped to the reference genome. The location, or map coordinates, of the sequence immediately before the CAG repeat is tracked, allowing researchers to determine exactly where the repeats belong.
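The general idea can be sketched in a few lines of Python. This is not SATCfinder's actual code: the function names, the toy reference, and the use of exact string search in place of a real genome aligner are all assumptions made for illustration. Only the threshold of three CAG repeats comes from the description above.

```python
import re

MIN_REPEATS = 3  # threshold described in the article
CAG_RUN = re.compile(r"(?:CAG){%d,}" % MIN_REPEATS)


def trim_repeat(read: str):
    """Split a read into (upstream, repeat_count, downstream) around its first CAG run.

    Returns None when the read has no run of at least MIN_REPEATS CAGs.
    """
    match = CAG_RUN.search(read)
    if match is None:
        return None
    repeat_count = (match.end() - match.start()) // 3
    return read[:match.start()], repeat_count, read[match.end():]


def locate_junction(read: str, reference: str):
    """Return the 0-based reference position of the base just before the repeat.

    Exact string search stands in for an aligner here (hypothetical simplification);
    it reports only the first place the upstream sequence occurs.
    """
    trimmed = trim_repeat(read)
    if trimmed is None or not trimmed[0]:
        return None
    upstream = trimmed[0]
    start = reference.find(upstream)
    if start == -1:
        return None
    return start + len(upstream) - 1


# Toy data (made up): the read carries four CAGs after a unique upstream sequence.
reference = "TTGACCTGATCGGATTACA"
read = "GATCGG" + "CAG" * 4 + "TT"
print(trim_repeat(read))
print(locate_junction(read, reference))
```

Mapping only the non-repeat portion sidesteps the ambiguity of the repeat itself, since the flanking sequence is far more likely to occur at a single place in the genome.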
Taking a Closer Look at Splicing
Previous work from the Jain lab showed that once repeat-containing RNAs leave the nucleus and reach the cell’s cytoplasm, they form gel-like clumps.
Typically, in the cytoplasm, RNAs interact with cellular machinery that looks for a marker on the RNA, called a start codon, to begin translating the instructions for building proteins. The researchers hypothesized that RNAs containing repeats might disrupt the machinery, causing it to translate instructions from different starting points. This process, called repeat-associated non-AUG (RAN) translation, could then lead to the creation of unnecessary proteins that not only tend to clump together, but also contribute to the gumming up of RNA in the cytoplasm.
But this explanation wasn’t entirely satisfying to Jain and Anderson, and they wanted to know more about why RNAs containing repeats led to random translation of instructions in the first place. To study this, they created a set of sequences with the “CAG” motif repeated 240 times consecutively. As they predicted, when these sequences reached the cytoplasm, they began to aggregate.
When the researchers performed RNA sequencing on these cells and analyzed the results using SATCfinder, they found their answer: CAG repeats in the RNA were often stitched to unexpected sequences located farther from the repeat in the DNA, with the intervening regions cut out. This meant that the presence of CAG repeats several times in a row led to the creation of new copy-and-paste sites at the edges of the repeat itself, creating abnormal RNA transcripts which then produced proteins that misfolded and clumped together.
Now, researchers in the Jain lab want to study in more detail how the expanded CAG repeat induces splicing errors. They also hope to learn more about the extent to which these splicing errors contribute to the pathology of diseases such as Huntington’s disease.
“There are a whole host of mechanisms that come together and contribute to cell death in Huntington’s disease. This is a piece of the puzzle that contributes to our molecular understanding of how these repeats distort cellular functions,” explains Jain.
More information:
Rachel Anderson et al., CAG repeat expansions create splice acceptor sites and produce RNAs containing aberrant repeats, Molecular Cell (2024). DOI: 10.1016/j.molcel.2024.01.006
Provided by the Whitehead Institute for Biomedical Research
Citation: Protein production problems in Huntington’s disease revealed (February 19, 2024), retrieved February 19, 2024 from