DNA Sequencing with Chain-Terminating Inhibitors
Let DNA copy itself, but spike each base with a “stop” — the lengths spell out the sequence.
To read the order of letters in a strand of DNA, Sanger's trick was to let an enzyme copy it while slipping in a few “stop” letters, so that copies end at every position and their lengths spell out the sequence.
The big idea
DNA is a string of four letters — A, C, G, T. For decades we could see that the letters were there but not read their order. Frederick Sanger found a way. You take the strand you want to read and let a copying enzyme build its complement, one letter at a time. The clever part is what you add to the mix: a few sabotaged letters — “dideoxy” versions — that the enzyme will happily attach, but which then refuse to let any further letter join. Each one is a full stop.
Do this four times, once for each kind of letter, each time sabotaging a fraction of only the A's, or only the C's, and so on. Because normal letters are still in the mix, each copy stops at a different, randomly chosen occurrence of the sabotaged letter. In the “stop-at-A” batch you get copies that end at every A in the sequence; in the “stop-at-C” batch, copies ending at every C. Across the four piles you now have a copy that stops at every single position, and the length of each copy tells you exactly where its stopping letter sits.
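The four-batch idea can be sketched as a small simulation. This is only an illustration under stated assumptions, not the laboratory protocol: the example strand, the function names (`copy_strand`, `run_reaction`, `read_sequence`) and the 10% chance that a given letter is the “stop” version are all made up for the sketch. Note that what gets read is the newly built copy, which is the complement of the original strand.

```python
import random

# Watson-Crick pairing used by the copying enzyme.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def copy_strand(template):
    """Return the full-length copy the enzyme would build from the template."""
    return "".join(COMPLEMENT[base] for base in template)


def run_reaction(new_strand, stop_base, copies=2000, stop_fraction=0.1, seed=0):
    """Simulate one batch and return the set of lengths at which copies stopped.

    stop_fraction is an assumed probability that a given stop_base is the
    dideoxy ("stop") version; it is not a value taken from the source.
    """
    rng = random.Random(seed)
    stopped_lengths = set()
    for _ in range(copies):
        for length, base in enumerate(new_strand, start=1):
            if base == stop_base and rng.random() < stop_fraction:
                stopped_lengths.add(length)  # chain ends right after this letter
                break
    return stopped_lengths


def read_sequence(lanes):
    """Sort all fragment lengths, shortest first, and read off each one's batch."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


template = "ATGCCGTAAGCT"  # made-up example strand
new_strand = copy_strand(template)
lanes = {base: run_reaction(new_strand, base) for base in "ACGT"}
print(read_sequence(lanes) == new_strand)
```

Each entry in `lanes` plays the role of one batch: it holds only the lengths of copies that stopped at that batch's letter. Sorting every length across the four batches and naming the batch it came from recovers the copied strand letter by letter.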
How it came about
By the mid-1970s the chemistry of DNA was understood, but reading a long sequence was painfully slow. Sanger, working quietly at the Medical Research Council's Laboratory of Molecular Biology in Cambridge, had already won a Nobel Prize for working out the sequence of insulin, a protein. He turned the same patience on DNA. With Alan Coulson he first built a clumsier “plus and minus” method; then, in 1977, came the elegant one: the dideoxy, or chain-termination, method. That same year an American pair, Allan Maxam and Walter Gilbert, published a completely different chemical method. For a while both were used, but Sanger's proved easier and was the one machines could eventually be taught to run. In 1980 it brought him a share of the Nobel Prize in Chemistry, his second Nobel.
Why it mattered
Before this, the genome was a closed book. Sanger's method opened it. Once you can read the order of the letters, you can find the gene behind a disease, compare one species' DNA with another's, and check whether an edit did what you intended. Sped up and handed to machines, the very same idea read the entire human genome — three billion letters — and laid the foundation of modern genetics and medicine.
A way to picture it
Imagine photocopying a long sentence, but your copier is rigged so that, now and then, it jams right after copying a particular letter — say, every time it hits an “e”. Run a stack of copies and you'll get pages that stop at the first “e”, the second, the third, and so on. Line them up shortest to longest and the place each one stops tells you exactly where every “e” falls in the sentence. Do it for each letter of the alphabet and you can reconstruct the whole sentence from nothing but the lengths of the jammed copies. The “stop” letters are dideoxy bases; the lengths are read off a gel.
Where it sits
Watson and Crick (1953) showed DNA was a four-letter code paired in a double helix; that pairing is exactly what Sanger's copying enzyme exploits. Mendel's abstract “factors” had by now become readable stretches of letters. Sanger sequencing then joined forces with PCR (1985), which makes enough copies of a gene to read, and led straight to the Human Genome Project and to today's gene-editing medicine — where a CRISPR edit is checked by sequencing the very letters it changed.
A new method for determining nucleotide sequences in DNA is described.
It is similar to the "plus and minus" method but makes use of the 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific chain-terminating inhibitors of DNA polymerase.
The technique has been applied to the DNA of bacteriophage ϕX174 and is more rapid and more accurate than either the plus or the minus method.