Reading a Gene: Transcription Overview

To use a gene, a cell first copies it into RNA. Meet RNA polymerase, the machine that finds a gene, reads one DNA strand, and writes a matching RNA copy — and see why this first step is the cell's main control switch.

From the master archive to a working copy

You have already met the central dogma and its everyday traffic, DNA -> RNA -> protein, and in the replication rung you watched a cell copy its entire DNA before dividing. This rung zooms the magnifying glass onto the very first arrow, DNA -> RNA. The act of copying one gene into RNA is called [[transcription-overview|transcription]], and it is where gene expression begins — the moment a quiet stretch of DNA is finally read aloud.

Here is the picture to keep. Think of the DNA as a master cookbook so precious it never leaves the kitchen. When the cell wants to actually cook one recipe, it does not haul the whole heavy book to the counter — it copies out just that one page onto a slip of paper and works from the slip. Transcription is the cell making that slip. The original gene stays locked in the double helix, unharmed; what travels out to the workbench is a fresh, single-stranded RNA copy. Because the master is never spent, the same gene can be transcribed thousands of times, on demand.

Notice the contrast with replication, which you just left behind. Replication copies the *entire* genome, once, before a cell divides, and the new DNA is permanent — it is inherited by every descendant. Transcription copies *single genes*, over and over, and the RNA it makes is meant to be temporary, used and then discarded. One is the cell archiving its whole library; the other is the cell photocopying one page because it needs that page right now.

The machine that does the writing

The hand that does the copying is an enzyme called [[molbio-rna-polymerase|RNA polymerase]]. It is a large molecular machine, shaped a bit like a crab's claw, that clamps onto the DNA, pries the two strands apart over a short stretch, reads one of them, and stitches together a matching RNA chain one building block at a time. Nothing in transcription happens without it. The RNA's building blocks are ribonucleotides, the RNA cousins of the DNA letters, and they arrive as energy-rich triphosphates (ATP, GTP, CTP, UTP). As each one is added, two of its three phosphates are split off as pyrophosphate, and that release supplies the energy to form the new link.

Here is the single difference from replication that surprises most beginners, and it is worth pausing on. Recall from the replication rung that DNA polymerase cannot start a chain from nothing — it can only *extend* an existing 3' end, which is why the cell first lays down a short RNA primer for it. RNA polymerase has no such limitation. It can begin a brand-new chain from scratch, joining the very first two nucleotides on its own. So transcription needs no primer at all. That is one reason the cell keeps two distinct polymerases instead of one: the replication machine is built for fidelity and inheritance and leans on a primer; the transcription machine is built to start anywhere a gene says "begin here" and write a disposable copy.

Which strand is read, and why the RNA looks like the other one

DNA has two strands wound together, but RNA polymerase reads only one of them for any given gene. The strand it actually reads is the template strand; the polymerase builds an RNA complementary to it, pairing A across from T, U across from A, G across from C, and C across from G — the same base-pairing logic you know, with one substitution: RNA carries uracil (U) wherever DNA would have used thymine (T). U pairs with A just as T does, so the rule barely changes; RNA simply spells that letter differently.

Now for the twist that trips people up. The two DNA strands are complementary to each other, so an RNA built as the complement of the template ends up matching not the strand it was read from but the *other* one: the [[template-versus-coding-strand|coding strand]] (also called the sense strand), letter for letter, with U standing in for T. So if the coding strand reads 5'-ATGCCT-3', the template strand is its complement 3'-TACGGA-5', and the RNA the polymerase makes is 5'-AUGCCU-3', identical to the coding strand except that T became U. This is exactly why, when you look up a gene's "sequence" in a database, you are shown the coding strand: it reads like the RNA, even though the polymerase never read it. It read the complement.

coding strand    5'-A T G C C T A G-3'   (matches the RNA, with T -> U)
template strand  3'-T A C G G A T C-5'   (the strand actually read, 3'->5')
                    | | | | | | | |      (base-pairing)
RNA made         5'-A U G C C U A G-3'   (built 5'->3')

The RNA equals the coding strand with U for T, and is complementary to the template strand it was read from.
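
The pairing rule can be checked with a few lines of Python. This is a minimal sketch, not a bioinformatics tool: the function names `template_from_coding` and `transcribe` are invented for illustration, and the strands are plain strings written in the same left-to-right alignment as the diagram (coding 5'->3', template 3'->5', RNA 5'->3').

```python
# Base-pairing tables. Keys are DNA letters; values are the partner letter.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # RNA uses U, not T


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand, written 3'->5' under the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3)


def transcribe(template_3to5: str) -> str:
    """Return the RNA, written 5'->3', paired against a 3'->5' template."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


coding = "ATGCCTAG"
template = template_from_coding(coding)
rna = transcribe(template)

print(template)  # TACGGATC
print(rna)       # AUGCCUAG
print(rna == coding.replace("T", "U"))  # True
```

The last line is the point of the section: the RNA is never compared with the coding strand during synthesis, yet it comes out identical to it apart from U replacing T.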

One honest caveat: which strand serves as template is not fixed for a whole chromosome. It is decided gene by gene. Neighbouring genes can point in opposite directions, so for one gene a given strand is the template, and a little further along the *same physical strand* may be the coding strand of its neighbour. The labels "template" and "coding" describe a role within one gene, not a permanent property of a strand.
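
A small sketch can make the gene-by-gene rule concrete. Everything here is hypothetical: the twelve-letter sequence, the gene coordinates, and the helper `rna_for_gene` are made up for illustration. The only assumption carried over from the text is that the RNA always matches the gene's own coding strand, read 5'->3', with U for T.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq_5to3: str) -> str:
    """Return the opposite strand of a DNA segment, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5to3))


def rna_for_gene(top_5to3: str, start: int, end: int, orientation: str) -> str:
    """RNA (5'->3') for a gene occupying top_5to3[start:end].

    orientation "+": the top strand is coding, the bottom strand is template.
    orientation "-": the bottom strand is coding, the top strand is template.
    """
    segment = top_5to3[start:end]
    if orientation == "+":
        coding = segment
    elif orientation == "-":
        coding = reverse_complement(segment)
    else:
        raise ValueError("orientation must be '+' or '-'")
    return coding.replace("T", "U")


# One physical top strand carrying two made-up neighbouring genes.
top = "ATGCCT" + "TTACAT"

print(rna_for_gene(top, 0, 6, "+"))   # AUGCCU  (top strand acts as coding)
print(rna_for_gene(top, 6, 12, "-"))  # AUGUAA  (top strand acts as template)
```

The same string `top` plays the coding role for the first gene and the template role for the second, which is why the labels belong to a gene rather than to a strand.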

Which way it runs: the directionality

Reading and writing both have a direction, and transcription obeys one strictly. Recall from the nucleic-acid rung that every strand has two chemically different ends, named 5' and 3' after carbon positions on the sugar, and that the two strands of the helix are antiparallel — running opposite ways, like two lanes of traffic heading in opposite directions. RNA polymerase can only add a new nucleotide to the free 3' end of the growing RNA, so the RNA is always built in the 5'-to-3' direction.

Because the template is antiparallel to the RNA being made, the polymerase must scan the template in the *opposite* direction: it reads the template 3'-to-5'. Picture a typist who reads a page from the bottom line upward while typing a fresh page from the top down — the two run opposite ways, yet stay perfectly in step. The newest nucleotide is always at the leading edge of the RNA, where the 3' end keeps growing. A common slip is to say "the polymerase moves 5'-to-3'." Be precise: the *RNA* grows 5'-to-3'; the *enzyme* travels along the template toward the template's 5' end, i.e. it reads 3'-to-5'.
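
The two opposite directions can be traced step by step. The sketch below is an illustration under simple assumptions: the template is stored as a string written 5'->3', so reading it 3'->5' means walking from the last index down to the first, and appending to the RNA string stands in for extending the RNA's 3' end. The generator name `elongate` is invented for this example.

```python
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template DNA base -> RNA base


def elongate(template_5to3: str):
    """Yield (template_index, rna_so_far) after each nucleotide is added."""
    rna = ""
    # Walk the template from its 3' end (last index) toward its 5' end (index 0).
    for index in range(len(template_5to3) - 1, -1, -1):
        rna += PAIR[template_5to3[index]]  # appending extends the RNA's 3' end
        yield index, rna


# The template 3'-TACGGATC-5' from the diagram earlier, rewritten 5'->3'.
template = "CTAGGCAT"

for index, rna in elongate(template):
    print(f"read template position {index}: RNA so far 5'-{rna}-3'")
```

The template index counts down while the RNA string grows to the right, which is the typist reading upward while typing downward. The final RNA printed is AUGCCUAG.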

This fixed direction is not mere bookkeeping. It defines where a gene "starts" (the 5' end of its coding strand, the side where the polymerase first lands) and where it "ends," it settles which strand can serve as template for that gene, and it is why we write sequences 5'-to-3' by convention. The same chemistry, in which new nucleotides can attach only at a 3' end, governs replication. Translation, later, adds amino acids rather than nucleotides, but the ribosome also reads the RNA in one fixed direction, 5'-to-3', so the whole flow of molecular information keeps one consistent grain.

The three acts, and why this is the cell's main switch

The whole job unfolds in three acts, which the next guides in this rung open up one by one. For now, just hold the shape of the story — it is the same arc for any gene, in any organism.

  1. Initiation: RNA polymerase finds the start of a gene by recognizing a road-sign sequence called a promoter, binds there, and melts open a short bubble of DNA to expose the template — then commits to the first few RNA letters.
  2. Elongation: now committed, the polymerase marches steadily along the gene, reading the template 3'-to-5' and adding ribonucleotides to the RNA's 3' end at a rate of tens of nucleotides per second, while the DNA re-zips behind it and the newly made RNA peels away.
  3. Termination: at a stop signal the polymerase releases the finished RNA, lets go of the DNA, and quits — giving each gene a discrete RNA of the right length rather than a read-through into its neighbours.
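
The three acts can be mimicked with a deliberately crude toy model. All of the specifics are assumptions chosen for illustration: the motif strings `PROMOTER` and `TERMINATOR`, the gap `START_OFFSET`, the example sequence, and the 40 nucleotides per second used for timing (one value within "tens per second"). Real start and stop signals are more varied than a single fixed string, and the later guides cover them properly.

```python
from typing import Optional

# Toy signals, chosen only for illustration (assumptions, not real rules).
PROMOTER = "TATAAT"    # stand-in for the road-sign sequence
TERMINATOR = "TTTTTT"  # stand-in for the stop signal
START_OFFSET = 4       # arbitrary gap from promoter end to first copied base


def transcribe_gene(coding_5to3: str) -> Optional[str]:
    """Return the RNA (5'->3') for the first gene found, or None."""
    # Act 1, initiation: without a promoter the gene is never read.
    promoter_at = coding_5to3.find(PROMOTER)
    if promoter_at == -1:
        return None
    start = promoter_at + len(PROMOTER) + START_OFFSET

    # Act 3, termination: stop at the first stop signal downstream,
    # or at the end of the sequence if there is none.
    stop = coding_5to3.find(TERMINATOR, start)
    if stop == -1:
        stop = len(coding_5to3)

    # Act 2, elongation: the RNA matches the coding strand with U for T.
    return coding_5to3[start:stop].replace("T", "U")


def elongation_seconds(length_nt: int, rate_nt_per_s: float = 40.0) -> float:
    """Time to elongate an RNA of length_nt at an assumed constant rate."""
    return length_nt / rate_nt_per_s


dna = "GGC" + "TATAAT" + "GCGC" + "ATGCCTAG" + "TTTTTT" + "GGA"
print(transcribe_gene(dna))            # AUGCCUAG
print(transcribe_gene("GGCATGCCTAG"))  # None
print(elongation_seconds(1200))        # 30.0
```

The second call returns `None` because the sequence has no promoter, so nothing downstream is copied at all. That is the toy version of the idea that controlling initiation controls the gene.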

Of the three, initiation is by far the most decisive, and that brings us to why this rung sits where it does in the ladder. Nearly every cell in your body carries the same DNA, yet a neuron and a skin cell are profoundly different — because they transcribe *different* genes. The cell controls a gene chiefly by controlling whether transcription begins at all, which makes initiation the main control point of gene expression. Activators, repressors, and other regulators we will meet later act largely by speeding up or blocking this first step. Deciding what to copy is, in large part, deciding what kind of cell to be.