Molecular Biology 1958

On Protein Synthesis

Francis Crick

A gene's base sequence is a code for a protein, and once information has passed into a protein it never flows back out.

Five years after he helped find the shape of DNA, Francis Crick wrote down the rules for what that shape does — and predicted a molecule nobody had ever seen.

The idea

DNA spells out proteins. The order of its four chemical 'letters' is a code, and that order alone decides the order of the building blocks (amino acids) strung together to make a protein. Get the sequence right, and the protein folds itself into a working shape.

Crick added a second rule about direction. Information flows from DNA to a working copy to protein, and once it is in a protein it cannot get back out: a finished protein can't dictate the sequence of the gene that made it. He called this the central dogma.

How it came about

In 1957 Crick gave a lecture in London; the next year he published it as a paper, 'On Protein Synthesis'. Molecular biology was then a jumble of half-connected findings, and Crick was its boldest theorist — happy to reason his way to a conclusion and dare the experiments to catch up.

His most daring guess: that a small 'adaptor' molecule must exist to bridge the gap between the gene's code and the amino acids, since the two have no natural chemical attraction. He predicted it on logic alone. That same year, experimenters found exactly such a molecule — what we now call transfer RNA.

Why it mattered

This short paper gave molecular biology its constitution. It turned a pile of observations into a single, one-directional logic, and it named the great puzzle that came next — the 'coding problem' of how three-letter words in DNA map to amino acids. The race to crack that code, which Crick himself helped win, is one of the triumphs of twentieth-century science.

An analogy

Think of a master blueprint locked in a library (the DNA in the cell's nucleus). You never take the original to the workshop; you make a working photocopy (messenger RNA) and carry that. On the workshop floor, tiny translators (the adaptor molecules) each read one three-letter word and fetch exactly the matching part, and the machine is assembled piece by piece. And here's the one-way rule: you can build the machine from the blueprint, but you can never read the blueprint back off the finished machine.

Where it sits

It begins where Watson and Crick's 1953 structure left off (see Watson–Crick, 1953): the double helix showed how DNA could be copied; this paper asked what the copying is for. It set the stage for the cracking of the genetic code in the 1960s, was later refined when reverse transcription was discovered, and lives on today in tools you've heard of — gene editing (see CRISPR, 2012) and the mRNA vaccines that handed your own cells a recipe to follow.

The original document

Original source text

F. H. C. Crick · Symp. Soc. Exp. Biol. XII (1958), 138–163 · from a lecture to the Society for Experimental Biology, University College London, September 1957

Two general principles

Crick frames the whole problem of protein synthesis around the order of the amino acids, and proposes two ideas to govern it. He is candid that both are, at this point, hypotheses — a theorist's scaffold for facts not yet in hand.

The Sequence Hypothesis

In its simplest form it assumes that the specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and that this sequence is a (simple) code for the amino acid sequence of a particular protein.

One linear sequence dictates another: the order of bases along the nucleic acid fixes the order of amino acids along the chain — and that order alone, Crick argues, is enough to determine how the protein folds and what it does.

The Central Dogma

This states that once ‘information’ has passed into protein it cannot get out again. In more detail, the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible.

Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.

The claim is specifically about sequence information. It permits the flows we now call replication, transcription and translation, and forbids any flow out of protein: protein dictating the sequence of a nucleic acid or of another protein.
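The rule as quoted can be written as a tiny check. This is only an illustrative sketch in Python, not anything from the paper: the function name transfer_permitted and the two string labels are invented for the example, and it encodes nothing beyond the quoted statement that sequence information may leave a nucleic acid but may not leave a protein.

```python
# Labels for the two kinds of molecule the 1958 statement talks about.
# Both names are invented for this sketch.
NUCLEIC_ACID = "nucleic acid"
PROTEIN = "protein"


def transfer_permitted(source: str, target: str) -> bool:
    """Return True if the quoted rule does not rule out source -> target.

    True means "may be possible" in Crick's wording, not "known to happen".
    """
    for kind in (source, target):
        if kind not in (NUCLEIC_ACID, PROTEIN):
            raise ValueError(f"unknown kind of molecule: {kind!r}")
    # The whole rule: sequence information can leave a nucleic acid,
    # but once it is in a protein it cannot get out again.
    return source == NUCLEIC_ACID


# Print the verdict for all four combinations.
for src in (NUCLEIC_ACID, PROTEIN):
    for dst in (NUCLEIC_ACID, PROTEIN):
        verdict = "may be possible" if transfer_permitted(src, dst) else "impossible"
        print(f"{src} -> {dst}: {verdict}")
```

Notice that the target never enters the decision. The dogma is a statement about where information starts, which is why it is indifferent to which nucleic acid copies which.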

The adaptor hypothesis

Because an amino acid has no obvious chemical affinity for the bases that supposedly specify it, Crick reasons there must be an intermediary: a set of small ‘adaptor’ molecules, very likely containing nucleotides, each of which base-pairs with the code on one side and carries its own amino acid on the other — with a special enzyme to load each one. He expects about twenty such adaptors. The molecule he is describing, still hypothetical here, is transfer RNA.
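The adaptor idea is easy to mimic in a few lines of Python. The sketch below is a toy, and its assumptions go beyond the 1958 paper: it reads the template three bases at a time, it demands strict complementary pairing, and the three triplet meanings it uses were only worked out in the 1960s. The names Adaptor, assemble and ADAPTORS are invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

# Complementary pairing between RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class Adaptor:
    """Toy adaptor: a pairing site on one side, an amino acid on the other."""

    anticodon: str   # three bases that pair with the template (assumed triplet)
    amino_acid: str  # the residue this adaptor has been loaded with

    def recognises(self, triplet: str) -> bool:
        # Strands pair antiparallel, so reverse the anticodon before
        # complementing it base by base.
        partner = "".join(PAIR[base] for base in reversed(self.anticodon))
        return partner == triplet


def assemble(template: str, adaptors: list[Adaptor]) -> list[str]:
    """Read the template three bases at a time and collect matching residues."""
    chain = []
    for start in range(0, len(template) - 2, 3):
        triplet = template[start:start + 3]
        match = next((a for a in adaptors if a.recognises(triplet)), None)
        if match is None:
            break  # no adaptor pairs with this triplet, so the chain ends here
        chain.append(match.amino_acid)
    return chain


# Three illustrative adaptors, chosen only to make the pairing visible.
ADAPTORS = [
    Adaptor("CAU", "Met"),  # pairs with AUG
    Adaptor("AAA", "Phe"),  # pairs with UUU
    Adaptor("GCC", "Gly"),  # pairs with GGC
]

print(assemble("AUGUUUGGCUAA", ADAPTORS))  # ['Met', 'Phe', 'Gly']
```

The template never touches an amino acid directly: every residue arrives on an adaptor that matched by base pairing alone, which is the gap Crick wanted bridged. Real transfer RNAs pair less strictly than this toy does.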

[ … ]

What is left open

The coding problem — exactly how a run of bases is parsed into amino acids — Crick leaves explicitly unsolved, the central puzzle he hands to the next decade. He is equally frank about how much of the rest is informed guesswork, sketching where in the cell synthesis happens (on RNA-rich particles) while marking the speculation as speculation.

Medical Research Council Unit, Cavendish Laboratory, Cambridge · 1958