Mean and Spread: Your First Two Numbers

Five tries, five answers

Imagine you weigh the same small coin five times on a sensitive balance and write down: 4.01, 4.03, 3.99, 4.02, 4.00 grams. The coin did not change; your readings did. Tiny, unpredictable nudges (a draft of air, the balance settling, where exactly you set the coin down) make each number a little different. This unavoidable jitter is called random error, and how closely the repeats agree with each other is called repeatability.

Reporting all five numbers is honest but clumsy. What we want is two summary numbers: one that says 'where is the middle?' and one that says 'how spread out are they?' Those two numbers, the mean and the standard deviation, sit behind almost every result reported in analytical chemistry.

The mean: add them up, divide

The mean (in everyday speech, the 'average') is the simplest middle. Add the numbers, then divide by how many there are. For our coin: 4.01 + 4.03 + 3.99 + 4.02 + 4.00 = 20.05, and 20.05 ÷ 5 = 4.010 grams. The mean is the balance point of the data: if you set the readings out along a seesaw, the mean is where it tips level.
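The same arithmetic as a minimal Python sketch. The variable names are only illustrative; the readings are the five from the text.

```python
# The five coin readings from the text, in grams.
readings = [4.01, 4.03, 3.99, 4.02, 4.00]

# Mean: add the readings, then divide by how many there are.
total = sum(readings)
mean = total / len(readings)

print(f"sum  = {total:.2f} g")   # sum  = 20.05 g
print(f"mean = {mean:.3f} g")    # mean = 4.010 g
```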

Why does averaging help? Because random error pushes some readings up and others down. When you add them, the high nudges and low nudges partly cancel. The more repeats you average, the better the cancellation — which is the whole reason chemists measure things more than once.
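One way to watch the cancellation happen is to simulate it. The sketch below is an illustration under stated assumptions, not real balance data: it assumes a 'true' mass of 4.010 g and random error that follows a bell curve about 0.02 g wide (both numbers are made up for the demonstration), then checks how much the mean of n repeats wanders from one trial to the next.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MASS = 4.010  # assumed 'true' mass in grams (illustrative only)
NOISE_SD = 0.02    # assumed size of the random error in grams (illustrative only)

def mean_of_repeats(n):
    """Simulate n noisy weighings and return their mean."""
    weighings = [random.gauss(TRUE_MASS, NOISE_SD) for _ in range(n)]
    return statistics.mean(weighings)

def scatter_of_means(n, trials=2000):
    """Standard deviation of the mean of n repeats across many simulated trials."""
    means = [mean_of_repeats(n) for _ in range(trials)]
    return statistics.stdev(means)

# Scatter of the mean for 1, 5, and 25 repeats per trial.
scatter = {n: scatter_of_means(n) for n in (1, 5, 25)}

for n, value in scatter.items():
    print(f"n = {n:2d}: the mean wanders by about {value:.4f} g")
```

Under these assumptions the wander should shrink as n grows, which is the cancellation described above at work.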

When the mean lies: the median

The mean has one weakness: a single wild value drags it. Suppose your fifth weighing had come out as 4.40 instead of 4.00 (you bumped the bench). The mean jumps from 4.010 to 4.090, even though the other four readings still cluster near 4.01. The median, the middle value when you line the numbers up in order, barely moves. Sort the readings: 3.99, 4.01, 4.02, 4.03, 4.40. The middle one is 4.02, only 0.01 above the 4.01 you get from the original five. The median doesn't care how far out the wild value sits, only which side of the middle it falls on.

So the median is robust: it shrugs off one bad point. For a clean set of repeats the mean and median agree closely, and the mean is preferred because it uses every number. But if they disagree a lot, that's a flag — you may have a suspect reading worth investigating (a later guide shows how to test it).
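To see the two summaries side by side, here is a short sketch using Python's standard statistics module. The list names clean and bumped are just labels chosen for this example.

```python
import statistics

clean = [4.01, 4.03, 3.99, 4.02, 4.00]   # the original five readings
bumped = [4.01, 4.03, 3.99, 4.02, 4.40]  # fifth reading replaced by the wild 4.40

for label, data in (("clean", clean), ("bumped", bumped)):
    mean = statistics.mean(data)
    median = statistics.median(data)  # middle value of the sorted readings
    print(f"{label:>6}: mean = {mean:.3f} g, median = {median:.2f} g")
```

The mean moves from 4.010 to 4.090, while the median moves only from 4.01 to 4.02. A gap that large between the two is the flag worth looking into.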

The standard deviation: measuring the wobble

Now for the second number. The standard deviation (written s) answers 'how far, typically, does a reading sit from the mean?' It's almost a plain average of the gaps — with two twists that exist for good mathematical reasons.

1. Find each reading's distance from the mean (4.01 − 4.010 = 0.000, 4.03 − 4.010 = +0.020, and so on).
2. Square each distance. Squaring turns negatives positive (so they don't cancel) and punishes big gaps more than small ones.
3. Add the squares and divide, but by (n − 1), not n. For five readings, divide by 4. This result is the variance. For the coin the squares add up to 0.0010 g², so the variance is 0.0010 ÷ 4 = 0.00025 g².
4. Take the square root to get back to grams. That square root is the standard deviation: here, about 0.0158 g.
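The four steps can be followed one at a time in code. This is a sketch for illustration, with variable names chosen here; the last line cross-checks the hand calculation against statistics.stdev from Python's standard library, which also divides by (n − 1).

```python
import math
import statistics

readings = [4.01, 4.03, 3.99, 4.02, 4.00]  # grams
n = len(readings)
mean = sum(readings) / n

# Step 1: distance of each reading from the mean.
deviations = [x - mean for x in readings]

# Step 2: square each distance so negatives cannot cancel positives.
squares = [d ** 2 for d in deviations]

# Step 3: add the squares and divide by (n - 1) to get the variance.
variance = sum(squares) / (n - 1)

# Step 4: the square root brings the units back to grams.
s = math.sqrt(variance)

print(f"variance = {variance:.6f} g^2")  # variance = 0.000250 g^2
print(f"s        = {s:.4f} g")           # s        = 0.0158 g
print(f"library  = {statistics.stdev(readings):.4f} g")  # library  = 0.0158 g
```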

The variance, the quantity you have just before taking the square root, lives in squared units (grams²), which is awkward to picture, so we usually quote the standard deviation instead. Why divide by (n − 1) and not n? Because you used the data itself to compute the mean, you've already 'spent' a little information; dividing by the smaller number gently corrects for that and stops s from coming out too small. You'll meet this idea again as degrees of freedom.
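To see how much the choice of divisor matters for five readings, here is a small comparison. It leans on two functions from Python's statistics module: stdev divides by (n − 1), while pstdev divides by n. The variable names are illustrative.

```python
import statistics

readings = [4.01, 4.03, 3.99, 4.02, 4.00]  # grams

s_n_minus_1 = statistics.stdev(readings)  # divides the summed squares by n - 1
s_n = statistics.pstdev(readings)         # divides the summed squares by n

print(f"divide by n - 1: s = {s_n_minus_1:.4f} g")  # divide by n - 1: s = 0.0158 g
print(f"divide by n:     s = {s_n:.4f} g")          # divide by n:     s = 0.0141 g
```

Dividing by n gives the smaller answer, which is the 'too small' that the (n − 1) correction guards against.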

Putting spread in context: %RSD

Is a standard deviation of 0.016 g good or bad? It depends on what you're weighing. For a 4-gram coin it's tiny; for a 0.02-gram speck it's a disaster. To judge spread fairly, divide s by the mean and multiply by 100. That's the relative standard deviation (%RSD), also called the coefficient of variation. For the coin: 0.0158 ÷ 4.010 × 100 ≈ 0.39% — excellent.
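As a sketch, the %RSD calculation fits in one small function. The name percent_rsd is made up for this example; it is not a library function.

```python
import statistics

def percent_rsd(values):
    """Relative standard deviation in percent: 100 * s / mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

readings = [4.01, 4.03, 3.99, 4.02, 4.00]  # grams
print(f"%RSD = {percent_rsd(readings):.2f}%")  # %RSD = 0.39%
```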