When the textbook lies: floating-point and backward stability

What Volume I quietly assumed

Throughout Volume I, every Gaussian elimination step, every inverse, every dot product was computed in exact arithmetic. Add two numbers, get their exact sum. Solve a system, get the exact solution. That world does not exist on a computer. A real machine stores numbers in floating-point arithmetic, a finite grid of representable values, and every operation rounds its result onto that grid.

The headline number is the unit roundoff u, about 1.1e-16 in IEEE double precision. The model is simple: when you compute a op b, the machine returns the true result perturbed by a relative error no larger than u, i.e. fl(a op b) = (a op b)(1 + d) with |d| <= u. One operation loses almost nothing. A billion operations, chained, is where trouble can compound.

# The classic shock, in any IEEE-double language:

0.1 + 0.2 == 0.3      ->  False
0.1 + 0.2             ->  0.30000000000000004

# 0.1 is not representable exactly; it is stored as the
# nearest grid point, and the tiny error survives the add.

# Catastrophic cancellation: subtracting near-equal numbers
a = 1.0000000001
b = 1.0000000000
a - b   ->  1.00000008274e-10   (only ~1 good digit left)
# The leading digits agreed and cancelled; what remains is
# dominated by each operand's rounding error.

Two everyday symptoms of finite precision: numbers you cannot store exactly, and subtraction that destroys significant digits.

Backward error: the right way to grade an answer

Asking 'how close is my computed x to the true x?' (forward error) is often the wrong question, because for an ill-conditioned problem even a perfect algorithm cannot get close. The better question is backward error: is my computed x the exact solution of a slightly perturbed problem? An algorithm is backward stable if it always solves a nearby problem — one perturbed by only O(u). That is the gold standard, and it is achievable even when forward error is large.

This split is liberating. You stop blaming your code for a wildly wrong x when the real culprit is a condition number of 1e14 baked into the matrix. And it gives you a concrete test: compute the residual r = b - A x_computed. If ||r|| is tiny relative to ||A|| ||x||, your solver was backward stable, full stop — even if x itself looks nothing like the truth.

Cheap insurance: iterative refinement

When backward stability is not enough and you want the forward error back down too, iterative refinement often buys it for almost free. The idea: you already factored A to get x, so reusing that factorization to solve a correction system is cheap. Compute the residual in higher precision, solve for the correction, add it back, repeat.

Solve A x = b once with your stable factorization to get x_0.
Form the residual r = b - A x_0, ideally in extended precision so it carries real information.
Solve A d = r reusing the same factorization — this is cheap, no new elimination.
Update x_1 = x_0 + d, then loop until the residual stops shrinking.