Base Pairing, Antiparallel Strands & Directionality

From a shape to its rules

In the last guide you met the double helix as an object — two strands twisting around a shared axis, the sugar-phosphate backbone on the outside, the bases tucked into the middle like rungs of a spiral ladder. That told you what DNA *looks* like. This guide is about the handful of quiet rules that make the shape *work*: which base pairs with which, what holds them together, and which way each strand points. These rules are not decoration. They are the reason the molecule can be copied and read at all.

There is also a lovely piece of detective work behind these rules. Before anyone saw the helix, the chemist Erwin Chargaff measured the four bases in DNA from many species and noticed something odd: within the limits of measurement, the amount of A equalled the amount of T, and the amount of G equalled the amount of C, even though the overall proportions differed from species to species. These regularities, now called [[molbio-chargaffs-rules|Chargaff's rules]], were a clue hiding in plain sight. They only made sense once Watson and Crick realised the bases were pairing up one-to-one across the two strands. The structure *explained* the numbers.
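
To see why one-to-one pairing explains Chargaff's numbers, here is a minimal Python sketch. It takes a made-up single strand, writes the partner base opposite each position, and counts bases over both strands together. The function name `duplex_base_counts` and the example sequence are illustrative assumptions, not real data.

```python
from collections import Counter

# Pairing rule: each base and the base that sits opposite it.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the duplex built on `strand`."""
    partner = "".join(PARTNER[base] for base in strand)
    return Counter(strand) + Counter(partner)

strand = "ATGCAAGGC"  # hypothetical sequence, chosen to be lopsided
single = Counter(strand)
both = duplex_base_counts(strand)

print("one strand  :", dict(single))
print("both strands:", dict(both))

# A single strand is free to have unequal A and T...
assert single["A"] != single["T"]
# ...but once every base has a partner, the totals must balance.
assert both["A"] == both["T"] and both["G"] == both["C"]
```

The balance appears only when both strands are counted, which is what Chargaff was measuring when he analysed whole double-stranded DNA.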

A pairs with T, G pairs with C

Here is the heart of it. Across the middle of the helix, the four bases pair in a strict, lock-like way: A always reaches across to T, and G always pairs with C. (In RNA, T is replaced by U, so the partner of A becomes U.) This is [[watson-crick-base-pairing|Watson-Crick base pairing]], and it is not arbitrary. It comes from two facts working together: the *size* of the bases and the *shape* of the hydrogen bonds they can offer.

Recall from the nucleotide guide that the bases come in two sizes: the bigger two-ring purines (A and G) and the smaller single-ring pyrimidines (C, T, and U). For the helix to stay an even width all the way down — no bulges, no pinches — each rung must be one big base plus one small base. So a purine always faces a pyrimidine. But that alone would still allow A-C or G-T; what forbids those and locks in A-T and G-C is the pattern of hydrogen bonds each base offers. A and T present donors and acceptors that line up perfectly for *two* hydrogen bonds; G and C line up for *three*. Mismatched pairs simply cannot make their hydrogen-bond partners meet, so they do not fit.
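
The two-filter argument above (first size, then hydrogen-bond pattern) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the ring counts and bond counts come from the text, while the names `RINGS`, `H_BONDS`, and `classify_pair` are made up for illustration, and real base chemistry is of course richer than a lookup table.

```python
# Size filter: purines have two rings, pyrimidines have one.
RINGS = {"A": 2, "G": 2, "C": 1, "T": 1}

# Hydrogen-bond filter: only these two pairings line up their donors and acceptors.
H_BONDS = {frozenset("AT"): 2, frozenset("GC"): 3}

def classify_pair(x: str, y: str) -> str:
    """Apply the size rule first, then the hydrogen-bond rule."""
    if RINGS[x] + RINGS[y] != 3:  # a rung needs one big base plus one small base
        return "wrong width"
    bonds = H_BONDS.get(frozenset((x, y)))
    if bonds is None:
        return "right width, H-bond pattern does not match"
    return f"pairs with {bonds} H-bonds"

for x, y in [("A", "T"), ("G", "C"), ("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(f"{x}-{y}: {classify_pair(x, y)}")
```

Note that A-C and G-T pass the size test and fail only at the second filter, which is exactly the point of the paragraph: width narrows the options, hydrogen bonding settles them.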

One honest caveat: hydrogen bonds across the rungs are not the *only* thing holding the helix together, and arguably not even the main one. The flat bases also stack on top of one another like a tidy roll of coins, and this base stacking contributes a great deal of the stability. Hydrogen bonding is what makes pairing *specific* (it decides *who* pairs with *whom*), while stacking does much of the work of keeping the helix stable. Both matter; if a textbook tells you hydrogen bonds alone hold DNA together, it has simplified a bit too far.

Every strand has a direction: 5' and 3'

Now the second rule, which is easy to overlook but just as deep: a strand of DNA is not symmetric. It has a head and a tail. To see why, look back at the backbone. Each sugar-phosphate backbone is a chain of sugar rings linked by phosphate groups, and the linkage is lopsided. The phosphate joins the *5th* carbon of one sugar to the *3rd* carbon of the next. The numbers refer to positions on the sugar ring; chemists write them with a prime mark, so we say five-prime (5') and three-prime (3').

Because every link runs 5'-to-3' the same way, the whole strand inherits a direction. One end is left with a free 5' position (usually carrying a phosphate); the far end is left with a free 3' position (carrying a hydroxyl). This is [[strand-directionality-5-3|5' to 3' directionality]]. By firm convention we read and write sequences in the 5'-to-3' direction, just as English runs left to right, so "5'-ATGC-3'" and "3'-CGTA-5'" describe the very same physical strand written from opposite ends. Whenever you see a bare sequence like ATGC with no labels, assume it means 5'-ATGC-3'.
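
The writing convention is easy to check mechanically. The sketch below assumes sequences are written either bare or with end labels in exactly the form used in this guide (`5'-...-3'` or `3'-...-5'`); the helper name `normalize` is hypothetical.

```python
def normalize(written: str) -> str:
    """Return the bases in 5'->3' order for a bare or end-labelled sequence."""
    text = written.strip()
    if text.startswith("5'-") and text.endswith("-3'"):
        return text[3:-3]        # already written 5'->3'
    if text.startswith("3'-") and text.endswith("-5'"):
        return text[3:-3][::-1]  # written from the other end, so flip it
    return text                  # bare sequence: assumed 5'->3' by convention

# Three ways of writing one and the same physical strand.
assert normalize("5'-ATGC-3'") == "ATGC"
assert normalize("3'-CGTA-5'") == "ATGC"
assert normalize("ATGC") == "ATGC"
```

Reversing the letters changes only where you start reading, not the molecule. Reversing them *without* swapping the labels would describe a different strand altogether.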

Why the two strands must run antiparallel

Put the two rules together and a third one falls out for free. If each strand has a direction, which way do the two strands of a helix point? The answer is that they point *opposite* ways: where one strand runs 5'-to-3' going down the page, its partner runs 3'-to-5' alongside it. The two strands are [[molbio-antiparallel-strands|antiparallel]] — think of two lanes of traffic on a highway, heading in opposite directions, with A in one lane always reaching across to pair with T in the other.

This is not a free choice — the chemistry forces it. For two bases to make their hydrogen bonds meet across a rung, the two backbones have to approach the pair from opposite orientations. If you tried to lay the strands *parallel* (both pointing the same way), the donor and acceptor atoms would no longer line up, the pairs could not form their hydrogen bonds, and the helix would not close. Antiparallel is simply the geometry that lets A-T and G-C pairs sit flat and snug between two backbones. Here is the whole picture in one small sketch:

5'- A   T   G   C   A   A -3'
    |   |   |   |   |   |       A-T : 2 H-bonds
3'- T   A   C   G   T   T -5'   G-C : 3 H-bonds

read the top strand left-to-right (5'->3'):  A T G C A A
read the bottom strand left-to-right:         T A C G T T   <- runs 3'->5'
the bottom strand, read 5'->3', is:           T T G C A T

Two antiparallel strands. Each column is a base pair held by hydrogen bonds (two for A-T, three for G-C). Knowing one strand fixes the other completely.
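
"Knowing one strand fixes the other" can be made concrete with a short Python sketch that reproduces the three lines under the diagram. Swapping each base for its partner gives the bottom strand as drawn (3'->5'), and reversing that gives the same strand in the conventional 5'->3' order. The name `partner_strand` is an illustrative choice, and the sketch assumes the input uses only the letters A, T, G, and C.

```python
# Translation table that swaps each base for its pairing partner.
COMPLEMENT = str.maketrans("ATGC", "TACG")

def partner_strand(strand: str) -> str:
    """Given a strand written 5'->3', return its partner, also written 5'->3'."""
    as_drawn = strand.translate(COMPLEMENT)  # partner read alongside, i.e. 3'->5'
    return as_drawn[::-1]                    # flip to the 5'->3' convention

top = "ATGCAA"
print(top.translate(COMPLEMENT))  # TACGTT  (bottom strand, left to right in the sketch)
print(partner_strand(top))        # TTGCAT  (bottom strand, read 5'->3')

# Applying the rule twice returns the original strand.
assert partner_strand(partner_strand(top)) == top
```

The reversal step is the antiparallel rule in miniature: forgetting it gives the right letters in the wrong direction.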

Complementarity: why this lets DNA be copied and read

Now the payoff — the single most important consequence of everything above. Because the pairing is strict and the strands are antiparallel, the two strands are complementary: each one carries the full information needed to rebuild the other. Wherever you see an A on one strand, you know its partner is a T; wherever a G, you know its partner is a C. The second strand is not extra information — it is the first strand's mirror, written in pairing rules. This redundancy is the deep secret of heredity.

Watson and Crick understood this the moment they got the structure right, and famously noted that the pairing "immediately suggests a possible copying mechanism." The idea became semiconservative replication: to copy the molecule, the cell unzips the two strands and uses each old strand as a template to spell out a fresh complementary partner, base by base. Walk it through:

1. The cell breaks the weak hydrogen bonds between the bases and unzips the two strands, leaving each old strand with its bases exposed: covalent backbones intact, only the rungs split.
2. Each exposed base now demands its one legal partner: an exposed A calls for a T, a G calls for a C. The pairing rules turn each old strand into a precise instruction sheet for its new partner.
3. An enzyme reads each old strand 3'-to-5' and builds the new strand 5'-to-3' against it, so the new partner ends up antiparallel, just as the geometry requires.
4. The result is two identical double helices, each made of one old strand and one brand-new one. The information was never invented; it was simply read off a complementary template.
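
The walk-through above can be sketched as a toy simulation. It is a cartoon under stated assumptions, not a model of the real enzymology: a duplex is represented as a pair of strings, both written 5'->3', and the names `synthesize` and `replicate` are hypothetical.

```python
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize(template: str) -> str:
    """Build a new strand 5'->3' against `template` (itself written 5'->3')."""
    new_strand = []
    for base in reversed(template):        # walk the old strand from its 3' end, i.e. 3'->5'
        new_strand.append(PARTNER[base])   # each exposed base dictates its one legal partner
    return "".join(new_strand)             # bases were appended in 5'->3' order

def replicate(duplex):
    """Unzip a duplex and return two daughters, each with one old and one new strand."""
    old_top, old_bottom = duplex
    daughter_1 = (old_top, synthesize(old_top))        # old top + new bottom
    daughter_2 = (synthesize(old_bottom), old_bottom)  # new top + old bottom
    return daughter_1, daughter_2

parent = ("ATGCAA", "TTGCAT")  # the duplex from the sketch, both strands 5'->3'
d1, d2 = replicate(parent)
print(d1)
print(d2)

# Both daughters carry exactly the parent's information.
assert d1 == parent and d2 == parent
```

Nothing in `replicate` consults the missing strand: each daughter is rebuilt from a single template, which is why the copy is semiconservative.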

Reading, detecting, and two honest reminders

The very same complementarity is what makes DNA *readable*, not just copyable. When a gene is transcribed, the cell opens the helix and uses one strand as a template to build a complementary RNA copy — the same base-pairing logic, with U standing in for T. And complementarity is the workhorse of the lab too: a short single strand will find and bind its complement out of a soup of millions, a trick called hybridization that underlies DNA probes, microarrays, and the primers in PCR. One quiet rule — A with T, G with C, on antiparallel strands — turns out to power copying, reading, and detecting alike.
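
Both uses of complementarity can be sketched with the same few lines of Python. This is a deliberately simplified picture: `transcribe` assumes it is handed the template strand written 5'->3', and `binding_sites` looks only for perfect matches, whereas real hybridization tolerates some mismatches depending on conditions. Both function names are illustrative.

```python
DNA_PARTNER = str.maketrans("ATGC", "TACG")
RNA_PARTNER = str.maketrans("ATGC", "UACG")  # same rule, with U standing in for T

def transcribe(template: str) -> str:
    """RNA (written 5'->3') complementary to a DNA template strand (written 5'->3')."""
    return template.translate(RNA_PARTNER)[::-1]

def binding_sites(probe: str, target: str) -> list[int]:
    """0-based positions on `target` (5'->3') where `probe` (5'->3') pairs perfectly."""
    site = probe.translate(DNA_PARTNER)[::-1]  # the stretch of target the probe can pair with
    last_start = len(target) - len(site)
    return [i for i in range(last_start + 1) if target.startswith(site, i)]

print(transcribe("TTGCAT"))               # AUGCAA
print(binding_sites("TTGC", "CCGCAAGG"))  # [2]
```

The RNA made from the template `TTGCAT` spells `AUGCAA`, which is the partner strand `ATGCAA` with U in place of T. The probe search is the same idea run in reverse: compute what the probe must sit opposite, then look for it.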

Two honest reminders to carry upward. First, the pairing is strict but not flawless: occasionally a wrong base slips in, and that is one origin of mutation — a topic we treat properly two rungs up, where you will see that most such changes are harmless and that this very imperfection is the raw material of evolution. Second, do not picture DNA as a rigid ladder. It is a dynamic, bendable molecule that flexes, breathes open locally, and wraps tightly around proteins; the clean diagrams are a snapshot, not the whole truth. Keep both caveats in mind and the rules in this guide will serve you faithfully all the way up the ladder.