When small steps are forced for the wrong reason
By now you can step a solution forward several ways: the crude tangent hop of Euler's method, the four-sample polish of RK4, the recycled-history thrift of a predictor-corrector scheme. Every one of them came with the same promise — shrink the step h and the local truncation error shrinks faster, so a smaller h buys more accuracy. This guide is about a nastier situation, where you are forced to take an absurdly tiny step even though the true solution is smooth and boring and a big step would be plenty accurate. The step is not paying for accuracy. It is paying for the method to not blow up.
Here is the cleanest test tube. Take the linear decay y' = -50 y with y(0) = 1. The exact solution is y = e^(-50 x), a curve that drops fast and then flattens into a nearly horizontal line at zero — utterly tame. Now run forward Euler on it. Each step does y_new = y + h * (-50 y) = (1 - 50 h) y. The number 1 - 50 h is an amplification factor: every step multiplies the answer by it. If h = 0.1 then 1 - 50 h = -4, and your numerical solution becomes 1, -4, 16, -64, 256, ... — oscillating and exploding to infinity while the true answer is quietly dying to zero. Nothing is wrong with the equation. The method is the one having a panic attack.
y' = -50 y, y(0) = 1 true: y = e^(-50x) -> 0 forward Euler: y_(n+1) = (1 - 50 h) y_n h = 0.1 factor = -4.0 1, -4, 16, -64, ... EXPLODES h = 0.05 factor = -1.5 1, -1.5, 2.25, ... EXPLODES h = 0.04 factor = -1.0 1, -1, 1, -1, ... oscillates forever h = 0.02 factor = 0.0 1, 0, 0, 0, ... (lucky) decays h = 0.01 factor = 0.5 1, 0.5, 0.25, ... decays, stable
Stability regions: a method's safe zone
Let us name the pattern. Apply a one-step method to the model equation y' = lambda y, where lambda is a constant (here negative, since the solution decays). One step multiplies y by some amplification factor R that depends only on the combination z = h * lambda. For forward Euler, R(z) = 1 + z. The method stays bounded — it does not blow up — exactly when |R(z)| <= 1. The set of complex z for which that holds is the method's region of absolute stability, the stability region. It is a fixed shape in the z-plane, a property of the algorithm, not of your particular equation.
Now the whole story snaps into focus. To get a decaying true solution you need lambda < 0, and to keep the method bounded you need h * lambda to land inside that fixed region. For forward Euler |1 + z| <= 1 is a little disk of radius 1 sitting at z = -1, so on the real axis it demands -2 <= h * lambda <= 0, i.e. h <= 2 / |lambda|. With lambda = -50 that is h <= 0.04 — exactly the threshold we saw. RK4 has a fatter, lobed stability region, so it tolerates a somewhat bigger h, but it is still a bounded region: every explicit method has a finite safe zone, and a large negative lambda will eventually force h down to nothing.
Stiffness: when the fast bit is gone but still in charge
A single decaying exponential rarely traps you; the real pain comes from a mixture. Picture a system whose solution is something like y = e^(-1 x) + e^(-1000 x). The second term has vanished into the noise after x = 0.01; for the rest of the interval the answer is essentially just the gentle e^(-x). Yet to keep an explicit method stable you must keep h * (-1000) inside the stability region, forcing h around 0.001 or smaller — for the entire run, long after the fast term died. You are taking a million baby steps to trace a curve a pocket calculator could sketch. That mismatch is stiffness: widely separated time scales, where the fastest (already-decayed) mode dictates the step size for everyone.
Stiffness is slippery to define precisely — there is no single number that flips on. A useful working description: a problem is stiff over an interval if an explicit method is forced to take steps far smaller than the smoothness of the solution would ever justify, purely for stability. Stiff systems are everywhere chemists and engineers look: chemical kinetics with fast and slow reactions, electrical circuits with tiny capacitors next to big inductors, control systems with fast actuators and slow dynamics. The honest takeaway is that on a stiff problem RK4's accuracy is wasted — its modest stability region, not its error, is the bottleneck.
The implicit cure: backward Euler and BDF
If the problem is the small stability region, the cure is a method with a huge one. Switch from evaluating the slope at the old point to evaluating it at the new point: backward (implicit) Euler reads y_(n+1) = y_n + h * f(x_(n+1), y_(n+1)). On the model y' = lambda y this gives y_(n+1) = y_n / (1 - h * lambda), so the amplification factor is R(z) = 1 / (1 - z). For any z with negative real part, |R(z)| < 1 automatically — the stability region is the entire left half-plane. The method is A-stable: no matter how large the negative lambda or how big the step h, the computation stays bounded and decays, just like the true solution. The stiff term can no longer hold you hostage.
There is no free lunch, only a different bill. Because y_(n+1) appears on both sides, each step of an implicit method is an equation to solve, not a formula to evaluate — for a nonlinear f you run a Newton iteration every step. That makes each step more expensive. But on a stiff problem it is a spectacular trade: backward Euler might take a step h = 0.1 where explicit RK4 was pinned at h = 0.001, so even at a few Newton iterations per step it wins by orders of magnitude. Production stiff solvers use backward differentiation formulas (BDF), the higher-order cousins of backward Euler, which keep the big stability region while lifting the accuracy past first order.
Adaptive steps: letting the solver pick its own pace
There is a second reason a fixed step is wasteful that has nothing to do with stability. Real solutions loaf along flat stretches and then lurch through sharp turns; a step small enough for the turns is overkill on the flats, and a step sized for the flats botches the turns. The fix is adaptive step-size control: at each step, estimate the local error, and if it is comfortably under your tolerance, accept the step and grow h; if it overshoots, reject the step and shrink h, then retry. The solver feels its way along, taking long strides where the curve is lazy and mincing steps where it is busy.
But how do you estimate the error without knowing the true answer? The elegant trick is an embedded pair: run two Runge-Kutta formulas of different order that deliberately share their slope evaluations, so the second answer comes almost free. Their difference is a cheap, reliable estimate of the local error. This is exactly what RKF45 (Runge-Kutta-Fehlberg) does — it pairs a 4th-order and a 5th-order formula reusing the same sampled slopes, and the gap between them drives the step controller. The famous ode45 in MATLAB and SciPy's default RK method are modern descendants of this idea.
- From the current point, compute two estimates of the next value with the embedded pair (one of order p, one of order p+1).
- Take their difference as the local error estimate e, and compare it with your tolerance tol.
- If e <= tol, accept the step (use the higher-order value) and enlarge h for next time; if e > tol, reject it and shrink h, then redo the step.
- Scale the new step by a formula like h_new = h * (tol / e)^(1/(p+1)), with a safety factor, so the controller eases toward the target error.
Putting it together — and what still can bite
These two ideas — stability and adaptivity — are not rivals; the best solvers braid them. An adaptive controller protects you from accuracy disasters by shrinking h around sharp features, while an implicit core (BDF, A-stable) protects you from stability disasters on stiff stretches. That is why mature libraries ship a stiff solver and a non-stiff solver, and some even sniff the problem and switch between them. The art of numerical ODEs is largely matching the method to the equation's temperament rather than reaching for one favorite.
Be honest about the remaining traps. Adaptive control bounds the local error per step, not the accumulated global error — small per-step errors still add up over a long integration, and a tight tolerance does not guarantee a globally faithful curve. Push the tolerance too low and you hit a floor: round-off from finite-precision arithmetic starts to dominate, so shrinking h further makes things worse, not better. And no stepper rescues an ill-posed problem — if the equation itself has sensitive dependence (chaos) or a solution that blows up in finite time, the numbers will faithfully track that instability right up until they cannot. The methods are honest mirrors; they reflect the equation's true character, warts and all.