JOVANA

Capping, Splicing & the Poly-A Tail

In a eukaryote, the RNA that leaves the polymerase is not yet a usable message. Watch it get a protective cap, a long tail, and its interruptions cut out — all while it is still being written — before it earns its passport out of the nucleus.

The raw transcript is not the finished message

In the transcription rung you watched RNA polymerase read a gene and write out an RNA copy. It is tempting to think the job is done — the message exists, send it to be translated. In a bacterium that is roughly true. But you also learned, back in the foundations, about the prokaryote–eukaryote divide: a eukaryotic cell keeps its DNA sealed inside a nucleus, and transcription happens in there, walled off from the ribosomes outside. That wall changes everything. The fresh RNA cannot simply be read where it is made — it has to be prepared, inspected, and given permission to leave.

So a eukaryotic gene's first RNA product is called a pre-mRNA: a precursor, not the finished article. Turning it into a mature, ready-to-read mRNA is the work of [[pre-mrna-processing|RNA processing]], which has three main jobs: adding a protective cap to the front, cutting the back end and adding a long tail to it, and removing internal stretches that do not belong in the final message. Only after all three is the molecule a true messenger. Think of the pre-mRNA as a rough draft fresh off the printer: it has the words, but it still needs a cover sheet, a signature at the bottom, and the editor's deletions before anyone should act on it.

The 5' cap: a hard hat for the front end

The very first thing that happens, before the polymerase has written even a hundred letters, is that the front end of the RNA gets a [[molbio-five-prime-cap|5' cap]]. The cell attaches a guanine nucleotide to the 5' end through an unusual backwards (5'-to-5') linkage, and then adds a methyl group to that guanine. The result is a small chemical knob sitting on the front of the molecule that simply does not look like an ordinary RNA end. Recall that every strand has a 5' end and a 3' end; the cap is bolted onto the 5' end specifically.

That odd knob earns its keep in two ways. First, it is armour. The cell is full of enzymes that chew up loose RNA from its ends, and a naked 5' end is an open invitation; the cap is a cork in the bottle that those enzymes cannot grip, so a capped RNA survives far longer. Second, it is a handle. When the message finally reaches the cytoplasm, the ribosome's loading machinery recognises the cap and uses it to find the front of the message and thread it in the right way round. In short: the cap protects the RNA and helps the ribosome find where to start. An uncapped eukaryotic transcript is both fragile and hard to translate.

The 3' end: a clean cut and a long poly-A tail

While the cap guards the front, the back end gets its own treatment, and it begins with a surprise: the polymerase usually keeps going past the true end of the message. The actual 3' end is not where transcription stops; it is set by cleavage. A protein crew watches the emerging RNA for a signal sequence (in humans, often the letters AAUAAA), and a short distance downstream of it they cut the RNA cleanly in two. The front piece is the one that matters; the trailing piece beyond the cut is discarded and degraded.

The instant that cut is made, a dedicated enzyme — not the RNA polymerase, and using no template — starts adding a long run of adenine nucleotides, one after another, to the new 3' end. This is [[polyadenylation|polyadenylation]], and the result is the poly-A tail: a string of perhaps 50 to 250 A's tacked onto the back, written in plain text as ...AAAAAAA. Notice it is not coded by any gene; the DNA does not say "put 200 A's here." The tail is added afterward, by an enzyme that simply repeats one letter.
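The cut-then-tail logic can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function name, the 20-nucleotide `cut_offset`, and the example transcript are all made up for illustration (the text says only that the cut falls "a short distance downstream" of the signal).

```python
def cleave_and_polyadenylate(pre_mrna, signal="AAUAAA", cut_offset=20, tail_length=200):
    """Cut downstream of the signal, then append an untemplated run of A's."""
    signal_start = pre_mrna.find(signal)
    if signal_start == -1:
        raise ValueError("no polyadenylation signal found")
    # Assumed spacing: cut a fixed number of letters past the end of the signal.
    cut_site = min(signal_start + len(signal) + cut_offset, len(pre_mrna))
    kept = pre_mrna[:cut_site]        # the front piece, which becomes the message
    discarded = pre_mrna[cut_site:]   # the trailing piece, which is degraded
    # The tail is not copied from any template; one letter is simply repeated.
    return kept + "A" * tail_length, discarded


# Hypothetical transcript: some message, the signal, a spacer, then read-through.
transcript = "GGCUAC" + "AAUAAA" + "C" * 20 + "GUGUGUUUCC"
mature, discarded = cleave_and_polyadenylate(transcript, tail_length=10)
print(mature)
print(discarded)
```

Note that the A's at the end of `mature` appear nowhere in `transcript`: they come from the enzyme, not from the gene.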

What is the tail for? Like the cap, it is both shield and signal. It protects the 3' end from the same end-nibbling enzymes, and proteins that bind the tail help carry the message out of the nucleus, help the ribosome get going, and, crucially, set the message's lifespan. The tail is slowly trimmed over time; once it becomes too short, the whole mRNA is marked for destruction. So the poly-A tail acts like a fuse on a sparkler: a long tail means a long-lived message that can be translated many times, and a shortened tail counts down to the end. With a cap on the front and a tail on the back, the message now has protected, recognisable ends on both sides.
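The fuse idea can be made concrete with a deliberately crude countdown. Every number here is an assumption chosen for illustration (the trimming rate, the minimum tail length, and the notion of a fixed "time step" are not measured values); the only point is that a longer starting tail buys more time.

```python
def steps_until_decay(tail_length, trimmed_per_step=10, min_tail=20):
    """Count the time steps an mRNA survives before its tail is too short."""
    steps = 0
    while tail_length >= min_tail:
        steps += 1
        tail_length -= trimmed_per_step  # the tail is trimmed a little each step
    return steps


for starting_tail in (50, 200):
    print(starting_tail, "A's ->", steps_until_decay(starting_tail), "steps")
```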

Splicing: cutting out the interruptions

Now the strangest and most beautiful step. You met it in the genome rung as exon–intron organization: a eukaryotic gene is not one continuous stretch of message. The parts that will end up in the final mRNA, the exons (think: *ex*pressed), are interrupted by stretches that will not, the introns (think: *in*tervening). The pre-mRNA is transcribed straight through both, exons and introns alike, so the raw transcript reads like a sentence with whole irrelevant paragraphs jammed into the middle of it. Before the message can mean anything, those intruding intron paragraphs must be cut out and the exon pieces joined seamlessly. That editing is [[molbio-rna-splicing|splicing]].

How does the cell know exactly where one intron starts and ends, down to the single letter? It reads short marks in the RNA. Almost every intron begins with the letters GU and ends with AG, and a little upstream of the end sits a special A called the branch point. Together, the two splice sites and the branch point mark where to cut. Getting these boundaries exactly right matters enormously: shift the cut by even one nucleotide and every codon downstream is read in the wrong frame, garbling the protein. The machine that performs the surgery is the [[molbio-spliceosome|spliceosome]], a large assembly built from small RNAs and proteins. It is itself partly made of RNA, a hint we will pick up in later guides, where RNA turns out to be able to act as a catalyst, not just a message.

pre-mRNA:  cap-[ exon1 ]--GU~intron~AG--[ exon2 ]--GU~intron~AG--[ exon3 ]-AAAA...
                          \___ cut out ___/         \___ cut out ___/
mature mRNA: cap-[ exon1 ][ exon2 ][ exon3 ]-AAAA...   (exons joined; introns gone)
Introns (marked by GU...AG) are excised and the exons are spliced together into one continuous coding message, capped and tailed.
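Both the excision and the danger of a misplaced cut can be shown with a short Python sketch. It assumes the intron coordinates are already known (a real spliceosome has to find them), and the sequences and function names are invented for illustration.

```python
def splice(pre_mrna, introns):
    """Remove each (start, end) intron span and join the flanking exons."""
    pieces = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Check the boundary marks described above: GU at the start, AG at the end.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"span {start}-{end} lacks GU...AG boundaries")
        pieces.append(pre_mrna[position:start])  # keep the exon before the intron
        position = end                           # skip over the intron itself
    pieces.append(pre_mrna[position:])           # keep the final exon
    return "".join(pieces)


def codons(mrna):
    """Split a message into consecutive three-letter codons (leftovers dropped)."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]


# Hypothetical two-exon gene with one intron in the middle.
exon1, intron, exon2 = "AUGGCU", "GUAAGUCUAACAG", "GAACUGUGA"
pre_mrna = exon1 + intron + exon2
start, end = len(exon1), len(exon1) + len(intron)

correct = splice(pre_mrna, [(start, end)])
# A cut that lands one letter too far swallows the first letter of exon 2.
shifted = pre_mrna[:start] + pre_mrna[end + 1:]

print(codons(correct))
print(codons(shifted))
```

In the correctly spliced message the codons after the junction read GAA, CUG, UGA; with the cut shifted by a single letter they become AAC, UGU, and the stop codon UGA is no longer in frame.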

Here is where an old textbook slogan quietly dies. Splicing does not have to glue the *same* exons together every time. A cell can keep some exons and skip others, producing several different mRNAs — and thus several different proteins — from one gene. This is [[molbio-alternative-splicing|alternative splicing]], and it is the honest reason "one gene, one protein" is outdated. It is also a large part of why humans manage with only about 20,000 protein-coding genes yet build a body of staggering complexity: one gene is often a kit that can be assembled in more than one way, not a single fixed product.
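The "kit" idea can be sketched by listing every message you get from keeping or skipping each optional exon. This is a counting exercise under simplifying assumptions, not a model of regulation: real cells choose among isoforms in controlled ways and need not make every combination, and the exon names and sequences below are hypothetical.

```python
from itertools import combinations


def alternative_isoforms(exons, optional):
    """Return every mRNA made by keeping or skipping each optional exon."""
    isoforms = []
    for how_many in range(len(optional) + 1):
        for skipped in combinations(optional, how_many):
            # Exon order is always preserved; only membership changes.
            kept = [sequence for name, sequence in exons if name not in skipped]
            isoforms.append("".join(kept))
    return isoforms


# Hypothetical gene: first and last exons always kept, two middle exons optional.
exons = [("exon1", "AUGGCU"), ("exon2", "GAACUG"), ("exon3", "CCAGGU"), ("exon4", "UUUUGA")]
isoforms = alternative_isoforms(exons, optional=["exon2", "exon3"])
for isoform in isoforms:
    print(isoform)
```

Two optional exons already give four distinct messages from one gene; each additional optional exon doubles the count in this simplified picture.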

All at once: processing while the RNA is still being written

It is natural to imagine these three steps happening in tidy sequence after transcription finishes: write the whole RNA, then cap it, then splice it, then tail it. That picture is wrong in an instructive way. Most processing is co-transcriptional — it happens *while the polymerase is still transcribing*, on the part of the RNA that has already emerged, before the gene has even been fully read.

  1. Capping comes first, almost immediately: as soon as the new 5' end pokes out of the polymerase — within the first few dozen nucleotides — capping enzymes riding on the polymerase clap the cap onto it.
  2. Splicing begins on the fly: as each intron is fully transcribed and dangles out behind the moving polymerase, the spliceosome can already assemble on it and excise it, long before the downstream exons exist.
  3. Cleavage and tailing finish the back: when the polymerase finally transcribes the AAUAAA signal near the end, the cleavage crew (which has also been hitching a ride) cuts the RNA and the poly-A enzyme adds the tail — and the cut helps trigger transcription to terminate.
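The ordering in the list above can be sketched as a simple timeline keyed to how far the polymerase has travelled. This is a simplified model with assumed numbers: the intron coordinates, the signal position, and the 25-nucleotide `cap_after` threshold are illustrative, and each intron is treated as spliced the moment its far end has been transcribed, which real cells do not guarantee.

```python
def processing_timeline(transcribed_length, intron_spans, polya_signal_end, cap_after=25):
    """List the processing events already triggered once the polymerase has
    transcribed `transcribed_length` nucleotides, in order of occurrence."""
    events = [(cap_after, "5' cap added")]
    for number, (_start, end) in enumerate(sorted(intron_spans), start=1):
        # An intron can be removed only once its far end has emerged.
        events.append((end, f"intron {number} spliced out"))
    events.append((polya_signal_end, "cleavage and poly-A tail"))
    return [name for position, name in sorted(events) if position <= transcribed_length]


introns = [(400, 1200), (1800, 3100)]  # hypothetical coordinates
partway = processing_timeline(2000, introns, polya_signal_end=4800)
finished = processing_timeline(5000, introns, polya_signal_end=4800)
print(partway)
print(finished)
```

With the polymerase only 2,000 letters into this made-up gene, the cap is already on and the first intron is already gone, even though the second intron and the tail signal have not been written yet.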

The thread tying this together is the polymerase's own tail — a long, flexible arm of protein that trails off the back of RNA polymerase and serves as a mobile workbench. Capping enzymes, splicing components, and the cleavage crew all dock onto that arm and are carried right alongside the emerging RNA, so the moment each part of the transcript is ready, its editing crew is already on the spot. Processing is not a later assembly line; it is woven into transcription itself. This coupling is also a safeguard: a transcript that fails to get capped, spliced, and tailed correctly is usually held back and destroyed rather than exported — which is exactly why the nucleus is the right place for all this to happen.