One bacterial machine, three eukaryotic ones
In the previous guide you watched a bacterium transcribe a gene. Its setup is lean: a single RNA polymerase does every job, and it finds a gene by borrowing one swappable subunit — a sigma factor — that reads the promoter and lets go once transcription is underway. One core enzyme, one guide protein, and the cell is off. This guide carries you across the prokaryote–eukaryote divide to see how a cell with a nucleus does the very same chemistry, only far more elaborately — and why that extra machinery is not bureaucratic clutter but the price of control.
The first surprise is that eukaryotes do not have one nuclear RNA polymerase; they have three, and each is a specialist that transcribes a different class of gene. Pol I transcribes the big ribosomal RNA genes, the bulk RNA that becomes the body of ribosomes. Pol II transcribes all the protein-coding genes into messenger RNA (mRNA), plus many regulatory RNAs; it is the one we care about most, because mRNA is the working copy that gets translated into protein. Pol III transcribes the short housekeeping RNAs: transfer RNAs, the small 5S ribosomal RNA, and other small RNAs the cell needs in quantity. Three enzymes, three job descriptions: this division of labour is the three-polymerase system.
EUKARYOTIC RNA POLYMERASES -- three specialists
-------------------------------------------------
Pol I    -->  ribosomal RNA (rRNA)    the bulk of ribosomes
Pol II   -->  messenger RNA (mRNA)    protein-coding genes
                                      + many regulatory RNAs
Pol III  -->  transfer RNA (tRNA)     + other small RNAs
(a bacterium does ALL of this with ONE polymerase)
Pol II cannot find a gene by itself
Here is the second big difference, and it is the heart of this guide. Bacterial polymerase carries its own gene-finder, the sigma factor. Eukaryotic Pol II carries no such thing — left to itself it cannot recognize a promoter, cannot melt the DNA open, and cannot start. It is a superb writing engine with no sense of where to begin. To position it on a gene, the cell assembles a set of separate proteins called the [[tata-box-and-general-transcription-factors|general transcription factors]] (the GTFs). 'General' because the same handful of them are used at almost every Pol II gene, as opposed to the gene-specific regulators we will meet in later rungs.
Many Pol II promoters carry a short, recognizable landmark a little upstream of where transcription starts, typically some 25 to 30 base pairs in animal genes: the TATA box, a run of DNA rich in T and A, often written 5'-TATAAA-3'. Because A-T base pairs are held by only two hydrogen bonds while G-C pairs have three, a T-and-A-rich stretch is among the easiest places on the gene to pry the two strands apart, which is exactly what you want near a starting line. A dedicated protein clamps onto the TATA box, bends the DNA sharply, and plants the flag that says 'build here.' Be honest about the limits, though: not every gene has a TATA box. Many promoters, especially those of housekeeping genes, use other landmarks instead, and the TATA box is best treated as the cleanest textbook example rather than a universal rule.
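The hydrogen-bond argument is easy to check with a few lines of arithmetic. The sketch below is only an illustration: the promoter fragment is invented, the function names are hypothetical, and exact string matching is a simplification, since real TATA boxes vary around the consensus.

```python
# Hydrogen bonds per Watson-Crick base pair: A-T pairs have 2, G-C pairs have 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(seq: str) -> int:
    """Total hydrogen bonds holding this stretch of duplex DNA together."""
    return sum(H_BONDS[base] for base in seq.upper())


def find_tata_boxes(seq: str, motif: str = "TATAAA") -> list[int]:
    """Return the 0-based start position of every exact match to the motif."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


# Hypothetical promoter fragment, invented for illustration only.
promoter = "GGCGCCTATAAAGGCGCGCC"

print(find_tata_boxes(promoter))   # [6]
print(hydrogen_bonds("TATAAA"))    # 12
print(hydrogen_bonds("GGCGCC"))    # 18
```

Six base pairs of TATA box are held by 12 hydrogen bonds, while six base pairs of G-C-rich DNA are held by 18, which is the sense in which the T-and-A-rich stretch is easier to open.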
Building the pre-initiation complex, piece by piece
The general transcription factors and Pol II do not arrive as a finished machine. They assemble on the promoter in a roughly ordered sequence, each piece creating the landing pad for the next, until the whole structure — Pol II plus its GTFs, poised over the start site — is complete. That assembled structure is the [[pre-initiation-complex|pre-initiation complex]], or PIC. Think of it as the difference between a lone driver wandering the streets and a driver, navigator, ignition key, and starting marshal all coordinated at the start line: only the assembled committee can actually begin the race.
- A TATA-binding protein lands on the TATA box and kinks the DNA, marking the spot. (At TATA-less promoters, other factors do the equivalent flag-planting.)
- More general transcription factors dock onto that anchor, building an ordered platform on the promoter.
- Pol II is recruited into the platform, positioned precisely over the start site — the pre-initiation complex is now assembled.
- A factor with helicase activity unwinds a short patch of DNA, opening a transcription bubble; a kinase then phosphorylates Pol II's tail, releasing the polymerase from the promoter.
- Pol II shrugs off most of the GTFs and moves down the gene, leaving the launch crew behind to start the next round.
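The point that each piece creates the landing pad for the next can be sketched as a tiny state machine. This is a toy model under stated assumptions: the step labels and the `Promoter` class are invented for illustration, and real assembly is less strictly linear than the list suggests.

```python
# Step labels are hypothetical shorthand for the bullet points above.
ASSEMBLY_ORDER = [
    "TBP_on_TATA",          # TATA-binding protein lands and kinks the DNA
    "GTF_platform",         # further general transcription factors dock
    "PolII_recruited",      # the pre-initiation complex is now assembled
    "DNA_unwound",          # helicase activity opens the transcription bubble
    "tail_phosphorylated",  # Pol II is released and leaves the promoter
]


class Promoter:
    """Toy promoter that accepts assembly steps only in the expected order."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def add(self, step: str) -> bool:
        """Apply a step if it is the next one due; otherwise refuse it."""
        if len(self.completed) == len(ASSEMBLY_ORDER):
            return False  # nothing left to add in this round
        if step != ASSEMBLY_ORDER[len(self.completed)]:
            return False  # the landing pad for this step does not exist yet
        self.completed.append(step)
        return True

    @property
    def pic_assembled(self) -> bool:
        return "PolII_recruited" in self.completed

    @property
    def elongating(self) -> bool:
        return len(self.completed) == len(ASSEMBLY_ORDER)


p = Promoter()
print(p.add("PolII_recruited"))       # False: no platform to dock onto yet
for step in ASSEMBLY_ORDER:
    p.add(step)
print(p.pic_assembled, p.elongating)  # True True
```

Pol II is refused when it arrives first, which mirrors the claim that the polymerase alone cannot recognize a promoter.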
There is one more crucial player, and it explains why eukaryotes went to all this trouble. A large bridge of proteins called the [[mediator-complex|Mediator complex]] sits between Pol II and the gene-specific regulatory proteins that bind far away. When an activator latches onto an enhancer thousands of base pairs distant, the looping DNA brings it close, and Mediator relays that 'switch me on' message to the pre-initiation complex — helping it assemble faster and fire more often. Mediator is the wiring that lets distant signals reach the start line. Bacteria, with their compact genomes and direct regulators, have nothing like it.
Transcription and processing, hand in hand
There is a further twist that bacteria never face. A bacterium has no nucleus, so its ribosomes start translating an mRNA while it is still being transcribed. A eukaryote walls its DNA inside the nucleus, and the raw transcript Pol II makes — the pre-mRNA — is not yet a finished message. It must be capped at its front end, trimmed and tail-added at its back end, and have its internal non-coding stretches cut out before it can leave the nucleus. This editing is [[pre-mrna-processing|pre-mRNA processing]], the whole subject of the next rung.
What is elegant — and a fairly recent realization — is that this processing is not a separate factory the transcript visits afterwards. It happens largely co-transcriptionally: as Pol II crawls along, the very tail that got phosphorylated to release it now acts as a moving workbench, carrying the capping, splicing, and tail-adding crews and handing them each stretch of fresh RNA the instant it emerges. Transcription and processing are coupled — one continuous, coordinated operation rather than two stations on an assembly line. The polymerase is not just a copier; it is a mobile platform that organizes everything downstream.
Why all this complexity is the gateway to control
Step back and the point of all the extra machinery comes into focus. Every added piece (the dedicated polymerases, the committee of general factors, the Mediator bridge, a start site that fires only once a whole complex has assembled) is one more handle the cell can grab to decide whether a gene fires. A bacterium's single sigma-driven polymerase offers few such handles; a eukaryote's many-part initiation offers far more. This is exactly why transcription is the cell's main control point: it is far cheaper to never start a copy than to make one and destroy it, so the cell concentrates its decisions right here, at initiation.
And these handles do not act alone. Remember from the chromatin guide that the same promoter can be buried in tightly packed heterochromatin or laid open in euchromatin; the pre-initiation complex can only assemble where the DNA is accessible. Layer the packaging state on top of enhancers, activators, and Mediator, and you get combinatorial control: many inputs converging on a single yes-or-no at the start site. That is how one three-polymerase system and roughly 20,000 protein-coding genes (the approximate human count) can build a liver cell and a neuron from the identical genome. The elaborateness you have just met is not an accident of evolution being fussy; it is the substrate on which all of the regulation in the rungs ahead is written.
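Combinatorial control can be pictured as a logic gate. The sketch below is deliberately crude and rests on assumptions: the gene names and cell states are invented, and the all-inputs-must-agree rule stands in for what are really graded, quantitative signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoterState:
    chromatin_open: bool    # euchromatin (True) or heterochromatin (False)
    activator_bound: bool   # an activator is sitting on the gene's enhancer
    mediator_present: bool  # Mediator can relay the activator's signal


def gene_fires(state: PromoterState) -> bool:
    """Toy rule: the start site says yes only if every input says yes."""
    return state.chromatin_open and state.activator_bound and state.mediator_present


def expressed(cell: dict[str, PromoterState]) -> list[str]:
    """Names of the genes that fire in this cell, in sorted order."""
    return sorted(name for name, state in cell.items() if gene_fires(state))


# Same two hypothetical genes, identical genome, different input states.
liver = {
    "gene_A": PromoterState(True, True, True),
    "gene_B": PromoterState(False, True, True),   # packed away in heterochromatin
}
neuron = {
    "gene_A": PromoterState(True, False, True),   # accessible, but no activator
    "gene_B": PromoterState(True, True, True),
}

print(expressed(liver))   # ['gene_A']
print(expressed(neuron))  # ['gene_B']
```

The two cells carry the same genes yet express different ones, because the inputs converging on each promoter differ.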