Naming Organic Molecules (IUPAC)

Why a shared name matters

From the previous guide you can already turn a molecule into a skeletal drawing and read the zig-zag back. A drawing is wonderful in a notebook, but you cannot read it aloud, type it into a database, or fax it to a supplier in another country. We need a *name* — a string of words and numbers that one chemist can write down and another, who has never seen the drawing, can use to reconstruct the exact same structure. That is what IUPAC nomenclature is: an agreed-upon recipe, published by the International Union of Pure and Applied Chemistry, for turning a structure into one and only one systematic name.

The deeper point is that a systematic name is two-way. Many everyday names — water, aspirin, glucose — are *trivial* names: you simply memorise that the word maps to a structure, and the word itself tells you nothing. An IUPAC name is the opposite. It is built by rules, so it can be *decoded* by the same rules. Hand someone '2-methylbutane' and, even if they have never met the compound, they can draw it without hesitation. The name is not a label stuck on the molecule; it is a compressed set of drawing instructions.

The roots: counting carbons in a straight chain

Every alkane name is built from two pieces: a *root* that says how many carbons are in the main chain, and the ending *-ane*, which says the molecule is a saturated alkane (all single bonds, no rings). The roots for the first four are historical oddities you simply learn: meth- (1), eth- (2), prop- (3), but- (4). From five onward they become friendly number words, mostly Greek with a little Latin: pent- (5), hex- (6), hept- (7), oct- (8), non- (9), dec- (10). So a five-carbon straight chain is *pentane*, an eight-carbon one *octane*. These names march up in a regular ladder, each member differing from the last by one CH2, which is exactly the homologous series idea from the foundation rungs.
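The root table maps directly onto a lookup. Here is a minimal Python sketch; the names `ROOTS` and `alkane_name` are illustrative choices, not part of any chemistry library.

```python
# Roots for straight chains of 1 to 10 carbons, as listed in the text.
ROOTS = {
    1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
    6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec",
}


def alkane_name(n_carbons):
    """Name an unbranched alkane: root for the carbon count plus '-ane'."""
    if n_carbons not in ROOTS:
        raise ValueError("this sketch only covers chains of 1 to 10 carbons")
    return ROOTS[n_carbons] + "ane"


print(alkane_name(5))  # pentane
print(alkane_name(8))  # octane
```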

Memorising the first ten roots is the one genuine act of rote learning the whole system asks of you, and it pays for itself instantly, because these same roots reappear everywhere: in branch names, in cycloalkanes, in alcohols and acids you will name later. Notice that the homologous series is not just a naming convenience. Because each step adds one identical CH2 unit, physical properties like boiling point rise steadily as the chain lengthens, so the name quietly predicts behaviour. Methane and ethane are gases, hexane is a runny liquid, and the long chains in candle wax are soft solids: all the same family, just longer.
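The "one CH2 per step" pattern can be made concrete. The sketch below assumes the general formula CnH2n+2 for unbranched alkanes, which the text does not state outright but which follows from starting at methane (CH4) and adding CH2 each time. The function name `molecular_formula` is made up for illustration.

```python
def molecular_formula(n_carbons):
    """Molecular formula of an alkane with n carbons, assuming CnH(2n+2)."""
    if n_carbons < 1:
        raise ValueError("an alkane needs at least one carbon")
    carbon = "C" if n_carbons == 1 else f"C{n_carbons}"
    return f"{carbon}H{2 * n_carbons + 2}"


# Each step down the list adds one carbon and two hydrogens.
for n in range(1, 7):
    print(n, molecular_formula(n))
```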

Branches: the substituents and their alkyl names

Real molecules are rarely a single straight line; they sprout side chains. A side chain branching off the main chain is a substituent, and when that substituent is itself a piece of carbon chain it is an alkyl group — an alkane that has lost one hydrogen so it can attach somewhere. You build an alkyl name by swapping the *-ane* ending for *-yl*: methane minus one H becomes a methyl group (CH3-), ethane becomes ethyl (CH3CH2-), propane becomes propyl. The branch is named as a unit, then bolted onto the front of the parent name as a prefix.

Three- and four-carbon branches can attach by different carbons, so they earn extra everyday names worth knowing. A propyl group attached by its end carbon is *propyl*; attached by its middle carbon it is *isopropyl*, the familiar branched group of rubbing alcohol. Among the four-carbon groups, *tert-butyl* is the one to recognise: a central carbon bound to three methyls, attaching through that crowded central carbon, written (CH3)3C-. The 'tert' (tertiary) flags that the attaching carbon touches three other carbons — the same primary/secondary/tertiary carbon classification you met earlier, now doing real naming work.

common alkyl groups (the - is the attachment point)

  methyl       CH3-                  from methane
  ethyl        CH3CH2-               from ethane
  propyl       CH3CH2CH2-            attaches by an END carbon
  isopropyl    (CH3)2CH-             attaches by the MIDDLE carbon
  tert-butyl   (CH3)3C-              attaches by a carbon touching 3 carbons

The everyday alkyl groups. The same atoms can give different groups depending on which carbon does the attaching; that is what 'iso' and 'tert' encode.
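Both ideas in this section are mechanical enough to sketch in Python: swapping *-ane* for *-yl*, and classifying a group by how many carbons its attaching carbon touches. The names `alkyl_name`, `ATTACHED_CARBONS` and `attachment_class` are hypothetical helpers, and the small table is typed in by hand from the list above rather than derived from a structure.

```python
def alkyl_name(alkane):
    """Turn an alkane name into its alkyl group name (-ane becomes -yl)."""
    if not alkane.endswith("ane"):
        raise ValueError("expected a name ending in -ane")
    return alkane[: -len("ane")] + "yl"


# Number of carbons, inside the group, bonded to the attaching carbon.
ATTACHED_CARBONS = {
    "ethyl": 1,
    "propyl": 1,
    "isopropyl": 2,
    "tert-butyl": 3,
}

CARBON_CLASS = {1: "primary", 2: "secondary", 3: "tertiary"}


def attachment_class(group):
    """Classify the attaching carbon of a group from the table above."""
    return CARBON_CLASS[ATTACHED_CARBONS[group]]


print(alkyl_name("methane"))           # methyl
print(attachment_class("isopropyl"))   # secondary
print(attachment_class("tert-butyl"))  # tertiary
```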

The naming algorithm, step by step

Now we assemble the pieces. Naming a branched alkane is a short, deterministic procedure — follow it in order and a single correct name falls out every time. The two places beginners stumble are choosing the parent chain (it is the *longest*, even if that means a crooked path through the drawing) and choosing the direction to number (the rule is *lowest locants*, and you sometimes have to test both ends). A parent chain and its locants are the load-bearing decisions; the rest is bookkeeping.

1. Find the longest continuous chain of carbons; this is the parent chain, and it gives the root + -ane ending. If two chains tie in length, pick the one with more branches. The longest chain need not be drawn in a straight line; trace it around corners.
2. Number the chain from one end to the other, choosing the direction that gives the substituents the lowest set of locants. Compare the two possible numberings term by term; the first point of difference decides which wins.
3. Identify each substituent and the number of the carbon it sits on. Name each one as an alkyl group (methyl, ethyl, and so on) carrying its locant, e.g. '3-methyl'.
4. If the same substituent appears more than once, gather them with a multiplying prefix (di, tri, tetra) and give every copy its own locant, e.g. '2,3-dimethyl' (two locants, even for one 'dimethyl').
5. Assemble the full name: list substituent prefixes in alphabetical order, then the parent. Alphabetise by the substituent's base name and IGNORE the multiplying prefixes: 'ethyl' comes before 'dimethyl' because you alphabetise on 'methyl' vs 'ethyl', not on the 'd'.
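The bookkeeping half of the procedure (gathering repeats, adding multiplying prefixes, alphabetising on the base name) can be sketched in a few lines of Python. This assumes the parent chain and locants have already been chosen correctly. The function name `assemble_name` and the input format, a list of (locant, group) pairs, are illustrative choices. It handles simple names such as methyl, ethyl and isopropyl; hyphenated prefixes like 'tert-' would need extra care that this sketch does not attempt.

```python
MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra"}


def assemble_name(parent, substituents):
    """Build a name from a parent and a list of (locant, group) pairs."""
    # Gather the locants of identical substituents together.
    grouped = {}
    for locant, group in substituents:
        grouped.setdefault(group, []).append(locant)

    parts = []
    # Sort on the base name only, so 'ethyl' precedes 'dimethyl'.
    for group in sorted(grouped):
        locants = sorted(grouped[group])
        prefix = MULTIPLIERS[len(locants)]
        parts.append(",".join(str(n) for n in locants) + "-" + prefix + group)

    return "-".join(parts) + parent


print(assemble_name("pentane", [(2, "methyl")]))
# 2-methylpentane
print(assemble_name("pentane", [(2, "methyl"), (2, "methyl"), (3, "ethyl")]))
# 3-ethyl-2,2-dimethylpentane
```

Because the multiplying prefix is attached only after sorting, it never takes part in the alphabetical comparison.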

Watch the algorithm run on one molecule. Suppose the longest chain is five carbons (pentane) with a methyl branch hanging off. Number from the end that reaches the branch sooner: numbering left-to-right might put the methyl on carbon 2, right-to-left on carbon 4. Since 2 beats 4, we keep the left-to-right numbering and the name is 2-methylpentane. Had we numbered the other way we would have written '4-methylpentane' — a name that describes the *same molecule* but breaks the lowest-locant rule, so it is simply wrong. That uniqueness is the entire reason the rules exist.
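The choice of numbering direction can be written as a tiny Python sketch. It relies on the fact that Python compares lists element by element, which is the same as comparing locant sets at the first point of difference. The function name `lowest_locants` is hypothetical, and the sketch covers only an open chain whose substituent positions are already known in one numbering.

```python
def lowest_locants(chain_length, positions):
    """Return the lower of the two locant sets for a chain.

    positions: substituent locants under one numbering direction.
    """
    forward = sorted(positions)
    # Numbering from the other end turns carbon p into carbon (length + 1 - p).
    backward = sorted(chain_length + 1 - p for p in positions)
    # List comparison stops at the first element that differs.
    return min(forward, backward)


print(lowest_locants(5, [4]))         # [2]
print(lowest_locants(10, [3, 4, 9]))  # [2, 7, 8]
```

The first call is the pentane example: a methyl found on carbon 4 in one direction sits on carbon 2 in the other, so the name uses 2. The second call is a decane carrying three methyls, where the winning set is decided by 2 versus 3 in the first position.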

Rings, traps, and honest exceptions

Naming a ring is the same algorithm with one tweak: add the prefix *cyclo-* and treat the ring as the parent. A six-membered carbon ring is cyclohexane; a methyl on it makes methylcyclohexane. With a single substituent you need no number, because every ring carbon is equivalent, so there is no '1' worth writing. With two or more substituents you do number, starting from one of them and going around the ring in whichever direction gives the lowest locants. One more rule, the traditional one taught in introductory courses, applies when a ring and a chain compete: whichever has *more* carbons is the parent, and the smaller piece becomes the substituent.
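Ring numbering has more candidates than chain numbering, because any substituted carbon can be carbon 1 and you can travel in either direction. A minimal Python sketch under simplifying assumptions: positions are given as arbitrary 0-based labels around the ring, and only the locant set is compared, so ties that would be settled alphabetically are not handled. The name `ring_locants` is made up for illustration.

```python
def ring_locants(ring_size, positions):
    """Lowest locant set for substituents on a ring.

    positions: 0-based ring atoms carrying a substituent, one entry each.
    """
    candidates = []
    for start in set(positions):   # numbering must start on a substituent
        for step in (1, -1):       # clockwise or anticlockwise
            locants = sorted(
                ((p - start) * step) % ring_size + 1 for p in positions
            )
            candidates.append(locants)
    # List comparison picks the set that is lower at the first difference.
    return min(candidates)


print(ring_locants(6, [0, 2]))     # [1, 3]
print(ring_locants(6, [0, 1, 3]))  # [1, 2, 4]
```

With a single substituent the function returns [1], matching the point that the number is then left out of the name.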

Be honest with yourself about three traps. First, the longest chain hides: it loves to bend around a corner of the drawing, so a chain that *looks* like a four-carbon parent with two branches is often really a six-carbon parent with one. Always count, never eyeball. Second, alphabetising ignores multiplying prefixes (di, tri) but does *not* ignore the structural prefixes baked into a name like 'isopropyl': 'isopropyl' alphabetises under 'i', whereas 'dimethyl' alphabetises under 'm' because its 'di' is detachable. Third, the lowest-locant rule is applied as a *set* compared at the first point of difference, not as a sum: in 2,7,8-trimethyldecane, {2,7,8} beats {3,4,9} because 2 is less than 3 at the very first comparison, even though {2,7,8} has the larger total (17 against 16).
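The first trap can be checked mechanically by treating the carbon skeleton as a graph and searching for the longest path. The sketch below is a brute-force Python illustration for small, ring-free skeletons. The function name `longest_chain` and the atom labels are invented for the example, and it does not apply the tie-break that prefers the chain with more branches.

```python
def longest_chain(bonds):
    """Return one longest path through an acyclic carbon skeleton.

    bonds: list of (atom, atom) pairs, each a carbon-carbon bond.
    """
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def extend(path):
        # Try every unvisited neighbour and keep the longest extension.
        best = path
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                candidate = extend(path + [nxt])
                if len(candidate) > len(best):
                    best = candidate
        return best

    return max((extend([start]) for start in neighbours), key=len)


# Drawn as a horizontal four-carbon chain (h1 to h4) with two ethyl branches.
skeleton = [
    ("h1", "h2"), ("h2", "h3"), ("h3", "h4"),
    ("h2", "a1"), ("a1", "a2"),
    ("h3", "b1"), ("b1", "b2"),
]
print(len(longest_chain(skeleton)))  # 6
```

The drawing suggests a butane with two ethyl groups, but the longest path runs through both "branches", so the parent is hexane and the leftover carbons h1 and h4 are methyl substituents on carbons 3 and 4.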