Nucleic Acids

Nothing New, Just a New Arrangement

A protein is a polyamide and a fat is almost a polyester — you saw those echoes in the last guides. A nucleic acid is the third great biopolymer, and the encouraging news is that it hides no new chemistry at all. Pull DNA apart and the only ingredients you find are a functional group toolkit you have already met: an aromatic heterocyclic ring for the base, a five-carbon sugar, and a phosphate ester. The art is entirely in how these three are clipped together and then strung into a chain.

Let us fix the vocabulary up front, because the names trip everyone up at first. A nucleoside is just a base joined to a sugar — two pieces. A nucleotide is a nucleoside plus one or more phosphate groups bolted onto the sugar — three kinds of piece. The nucleotide is the monomer, the single Lego brick; a nucleic acid is thousands of those bricks linked into one long polymer. So the whole subject is: what are the three pieces, what bonds hold a brick together, and what bond joins one brick to the next?
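The vocabulary above can be pinned down as a tiny data model. This is only a sketch for keeping the three terms straight; the class names, field names, and the helper kinds_of_piece are illustrative assumptions, not any standard library's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleoside:
    """Two pieces: a base joined to a sugar."""
    base: str   # e.g. "adenine"
    sugar: str  # "ribose" (RNA) or "2-deoxyribose" (DNA)


@dataclass(frozen=True)
class Nucleotide:
    """A nucleoside plus one or more phosphate groups on the sugar."""
    nucleoside: Nucleoside
    phosphates: int = 1

    def __post_init__(self):
        if self.phosphates < 1:
            raise ValueError("a nucleotide carries at least one phosphate")


def kinds_of_piece(unit) -> int:
    """Base + sugar = 2 kinds of piece; add phosphate and it is 3."""
    return 3 if isinstance(unit, Nucleotide) else 2


# Hypothetical example values, chosen only for illustration.
deoxyadenosine = Nucleoside(base="adenine", sugar="2-deoxyribose")
damp = Nucleotide(deoxyadenosine, phosphates=1)

# A nucleic acid is then just an ordered run of nucleotides (the polymer).
strand = [damp, Nucleotide(Nucleoside("thymine", "2-deoxyribose"))]
```

Modeling the strand as an ordered list is deliberate: the order of the bricks is the message, which the later sections build on.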

The Bases: Aromatic Rings Doing the Talking

The bases are where the information lives, and they are flat, nitrogen-rich aromatic rings — direct relatives of the pyridine and pyrrole heterocycles you met in the aromatics rung. There are two shapes. The pyrimidines (cytosine, thymine, uracil) are a single six-membered ring with two nitrogens. The purines (adenine, guanine) are bigger: a six-membered ring fused to a five-membered ring, four nitrogens in all. Both are genuinely aromatic — flat, fully conjugated, with the right 4n+2 count of delocalized pi electrons — which is why they are chemically tough, planar, and able to stack like coins.
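The 4n+2 count is easy to check by machine. The sketch below tests the pi-electron counts of the two parent ring systems, pyrimidine (6) and purine (10); the function name and the dictionary are assumptions made for illustration, and the counts refer to the unsubstituted parent rings rather than to each individual base.

```python
def is_huckel_count(pi_electrons: int) -> bool:
    """True if pi_electrons equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Delocalized pi electrons in the parent heterocycles (textbook values).
PARENT_RINGS = {"pyrimidine": 6, "purine": 10}

for name, count in PARENT_RINGS.items():
    verdict = "fits 4n+2" if is_huckel_count(count) else "does not fit 4n+2"
    print(f"{name}: {count} pi electrons, {verdict}")
```

Six corresponds to n = 1 and ten to n = 2, so both ring systems pass the count.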

Look closely at the rim of each ring and you find the very functional groups that will later do the base pairing: ring N-H bonds, carbonyl C=O groups, and amino -NH2 groups, all pointing outward. Here is a subtlety worth getting right. A carbonyl on an aromatic ring can in principle sit as a C=O (the keto form) or as an aromatic -OH (the enol form), and similarly an amino group as -NH2 or =NH. For every base, the form that nature uses — and overwhelmingly the one that exists — is the keto/amino form, not the enol/imino one. That choice is not arbitrary: it is what presents the correct pattern of hydrogen-bond donors and acceptors. This is the same keto-enol logic from the carbonyl rung, now quietly deciding whether the genetic code reads correctly.

The Sugar and the Glycosidic Bond

The base does not float free; it rides on a five-carbon sugar, a monosaccharide called ribose (in RNA) or 2-deoxyribose (in DNA). The sugar is in its ring form — and that ring, you will recall, is a cyclic hemiacetal: the chain's aldehyde was captured intramolecularly by one of its own hydroxyls, leaving a special anomeric carbon (C1') that carries both an -OH and the ring oxygen. That anomeric carbon is the reactive handle, exactly as it was when sugars formed glycosidic bonds to each other in the carbohydrate guide.

To attach the base, the sugar's anomeric -OH is replaced by a nitrogen of the base. The result is an N-glycosidic bond: the same acetal-forming step as a normal glycosidic linkage, except that the incoming nucleophile is a ring nitrogen (N1 of a pyrimidine, N9 of a purine) rather than another sugar's oxygen. Mechanistically, the anomeric -OH leaves (it is a hemiacetal, so this position readily forms a stabilized cation), the base nitrogen attacks C1', and the base is locked onto the sugar through a single C-N bond. That bond is robust because it is a full acetal-type linkage, not a free hemiacetal: the anomeric carbon is now capped on both sides and can no longer pop open to the reactive aldehyde.

The two sugars differ by a single oxygen: the 2' position carries an -OH in ribose but only an -H in deoxyribose. Apart from RNA using uracil where DNA uses thymine, that one atom is the entire chemical difference between RNA and DNA, and it has outsized consequences. The 2'-OH of RNA is a built-in nucleophile sitting right next to the backbone; it can swing around and attack the adjacent phosphate, cleaving the chain. DNA, lacking that hydroxyl, has no such internal saboteur, so its backbone is far more stable. Honest takeaway: 'deoxy' is not a minor footnote; it is a major reason why DNA, not RNA, became the long-term archive of the genome.

The Phosphodiester Backbone

Now we string the bricks together, and this is where the phosphate earns its keep. Phosphoric acid, H3PO4, has three -OH groups, so it can form up to three ester bonds. In the backbone it forms exactly two: one ester to the 3'-OH of one sugar and a second ester to the 5'-OH of the next sugar. A phosphate bridging two alcohols through two ester linkages is a phosphodiester ('di' = two esters). Chain these together — sugar, phosphate, sugar, phosphate — and you have built the continuous backbone that runs the length of the molecule.

one strand, read 5' -> 3':

   5'-end
     |
   [sugar]--base
     |  (3')
     O
     |
  -O-P=O          <- phosphodiester:
     |               TWO ester C-O-P bonds,
     O               ONE -O- left over, carrying
     |  (5')         a negative charge at body pH
   [sugar]--base
     |  (3')
     O
     |
  -O-P=O
     |
   ...continues...
     |
   3'-end

  third -OH of each phosphate stays free and ionized: -O(-)

The repeating sugar-phosphate backbone. Each phosphate uses two of its three -OH groups to form ester bonds (to 3' of one sugar and 5' of the next); the third stays as an ionized -O-, giving the backbone its overall negative charge. The chain has a direction: a free 5'-phosphate at one end, a free 3'-OH at the other.

Three honest consequences fall out of this design. First, the leftover third -OH on every phosphate is strongly acidic (pKa near 1-2), so at the pH inside a cell it is deprotonated and the backbone carries one negative charge per nucleotide. That is why these molecules are called nucleic ACIDS, and why the whole strand is a polyanion that repels itself and needs counter-ions and proteins to fold up. Second, the chain has a genuine direction: one end terminates in a free 5'-phosphate, the other in a free 3'-OH, and biology always builds new strands 5' to 3', which is also the direction in which sequences are conventionally written. Third, and this is the family resemblance, building each link formally expels water, so a nucleic acid is, like a protein, a condensation polymer.
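The bookkeeping behind the first and third consequences is simple enough to sketch in code. The function below is a rough illustration, assuming a single linear strand built formally from nucleoside monophosphates, with a 5'-phosphate at one end and a 3'-OH at the other; the name backbone_summary and its dictionary keys are made up for this example. It uses the one-charge-per-nucleotide approximation from the text and ignores any extra ionization of the terminal phosphate.

```python
def backbone_summary(sequence: str) -> dict:
    """Count links, waters, and approximate charge for one linear strand.

    Assumptions (labeled, not authoritative):
      - each letter of `sequence` is one nucleotide carrying one phosphate
      - the strand is linear, so n nucleotides are joined by n - 1 links
      - every phosphate contributes one negative charge at cellular pH
    """
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must contain at least one nucleotide")
    return {
        "nucleotides": n,
        # One phosphodiester bridges each adjacent pair of sugars.
        "phosphodiester_links": n - 1,
        # Formally, each new ester link releases one molecule of water.
        "waters_expelled": n - 1,
        # Approximation from the text: one negative charge per nucleotide.
        "approx_charge": -n,
    }


summary = backbone_summary("GATTACA")
print(summary)
```

For the seven-letter example this gives six links, six waters, and a charge of roughly -7, which is the polyanion character the paragraph describes.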

Base Pairing: Information Held by Hydrogen Bonds

We have one strand. The double helix is two strands held face to face, and the glue is the weakest force in the whole molecule: hydrogen bonding. Recall the rule — a hydrogen bond needs a donor (an H attached to N or O) and an acceptor (a lone pair on N or O). The rims of the bases are studded with exactly these: the N-H and -NH2 groups are donors, the C=O and ring-N lone pairs are acceptors. Two bases pair when their donors and acceptors line up complementarily, like a key fitting a lock across the gap between the strands.

Size first. A purine (big, two rings) always pairs with a pyrimidine (small, one ring). Two purines would be too fat for the gap and two pyrimidines too thin, so only big-with-small keeps every rung the same width — the geometry that lets the helix stay smooth.
Then pattern. Among the big-small options, only A-with-T and G-with-C actually line up donors against acceptors. A and T present a matching two-bond pattern; G and C present a matching three-bond pattern. The wrong pairs would put donor against donor — a clash, not a clasp.
Count the bonds. A=T is held by two hydrogen bonds; G≡C by three. So a stretch of DNA rich in G and C is glued more tightly and needs more heat to pull apart — a real, measurable difference, not a metaphor.
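
The three rules above (size, pattern, count) can be written as a small checker. This is a deliberately simplified sketch: each base's pairing edge is reduced to a string of D (donor) and A (acceptor) letters, the names HBOND_EDGE, size_ok, pattern_ok, and the rest are illustrative assumptions, and real geometries such as wobble pairs are outside the model.

```python
PURINES = {"A", "G"}        # big, two rings
PYRIMIDINES = {"C", "T"}    # small, one ring

# Simplified donor (D) / acceptor (A) pattern along each pairing edge.
# A: amino N-H donor, ring N acceptor       T: C=O acceptor, ring N-H donor
# G: C=O acceptor, ring N-H, amino N-H      C: amino N-H, ring N, C=O
HBOND_EDGE = {"A": "DA", "T": "AD", "G": "ADD", "C": "DAA"}


def size_ok(x: str, y: str) -> bool:
    """Size first: exactly one purine and one pyrimidine."""
    return (x in PURINES and y in PYRIMIDINES) or (
        x in PYRIMIDINES and y in PURINES
    )


def pattern_ok(x: str, y: str) -> bool:
    """Then pattern: every donor must face an acceptor, position by position."""
    ex, ey = HBOND_EDGE[x], HBOND_EDGE[y]
    return len(ex) == len(ey) and all(a != b for a, b in zip(ex, ey))


def pairs(x: str, y: str) -> bool:
    return size_ok(x, y) and pattern_ok(x, y)


def hydrogen_bonds(x: str, y: str) -> int:
    """Count the bonds: 2 for A-T, 3 for G-C, 0 for a mismatch."""
    return len(HBOND_EDGE[x]) if pairs(x, y) else 0


def duplex_hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds when `strand` sits opposite its perfect complement."""
    return sum(len(HBOND_EDGE[base]) for base in strand)


def gc_fraction(strand: str) -> float:
    """Fraction of positions that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)
```

In this model A against C fails because donor faces donor, and two strands of equal length differ in total hydrogen bonds only through their G and C content, which is the measurable difference the text points to.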

Be honest about what makes the helix hold, though, because the hydrogen bonds are only part of the story. Each individual hydrogen bond is weak — a few percent of a real covalent bond — and on its own would be no match for the jostling of water. Two things rescue it. First, there are thousands of them in a row, and many weak grips add up to a firm one (the same cooperative logic that lets a gecko hang from glass). Second, and arguably more important, the flat aromatic bases stack on top of one another like a roll of coins, and that base-stacking — a mix of van der Waals contact and pi-system overlap — contributes much of the real stability. The hydrogen bonds choose the correct partner; the stacking and the sheer count make the structure last.

Why the Design Works

Stand back and the genius of the molecule is a division of labor between strong and weak bonds. The backbone is held by strong covalent phosphodiester bonds — the sequence of bases, the actual message, is written in unbreakable ink and does not fall apart on its own. The two strands, by contrast, are held to each other only by weak, reversible hydrogen bonds. That is exactly what you want: to copy or read the information you must briefly unzip the two strands, and a weak, reversible glue lets an enzyme peel them apart and reseal them without ever cutting the permanent backbone. Strong bonds store; weak bonds let you read.

And complementary pairing is the trick that makes copying possible at all. Because A only fits T and G only fits C, each strand is a perfect negative of the other: if you know one sequence you can deduce the other letter by letter. To replicate, the cell unzips the helix and lets each old strand act as a template, lining up free nucleotides by base pairing and stitching them together with new phosphodiester bonds. This is the very condensation chemistry from the backbone section, now run by an enzyme. The chemistry you have been learning all rung long is, in the end, how a cell remembers what it is.
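Deducing one strand from the other is a one-line lookup. The sketch below assumes DNA letters only and adds one fact not spelled out above: the two strands of a helix run in opposite directions, so the partner strand written 5' to 3' is the reversed complement. The function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Partner base at each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Partner strand written 5' -> 3' (the strands are antiparallel)."""
    return complement(strand)[::-1]


template = "ATGC"                       # hypothetical template, 5' -> 3'
print(complement(template))             # TACG
print(reverse_complement(template))     # GCAT
```

Applying reverse_complement twice returns the original sequence, which is the "perfect negative" property in code: each strand fully determines the other.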

Carry a few honest caveats forward. The bases are aromatic heterocycles, not random rings — their flatness and their fixed keto/amino tautomer are what make reliable pairing possible; a base that flips to the wrong tautomer can mispair and cause a mutation. Hydrogen bonds are weak and reversible by design, and they do not act alone — base stacking carries much of the load. And the single -OH that distinguishes RNA from DNA is not a footnote but the reason one molecule archives the genome while the other is the disposable working copy. Hold those, and you understand nucleic acids as chemistry, not just as a diagram from a biology textbook.