Order Notation & Asymptotic Series

Why we need a language for "how big"

Welcome to the first rung of asymptotics. The whole game here begins with a confession: most interesting problems have no exact answer in closed form. There is no tidy formula for the period of a real pendulum at large swing, no elementary antiderivative for the bell curve, no finite expression for many sums and integrals that physics throws at you. The asymptotic frame of mind accepts this gracefully and asks a different, more useful question — not "what is the exact value?" but "how does this behave as some variable gets very large, or very small, and how good is my approximation?"

You already used this instinct in Volume I without naming it. When you computed a limit at infinity of (3x^2 + 5x) / (x^2 + 1) and said "it goes to 3 because the x^2 terms dominate," you were doing asymptotics: throwing away everything that becomes negligible and keeping what controls the growth. Order notation is simply the precise, portable vocabulary for that move — a way to say exactly how negligible a discarded term is, so that the bookkeeping never lies to you.

Big-O, little-o, and the tilde

There are three workhorse symbols, and the difference between them is the whole point. Big-O is a ceiling. We write f = O(g) to mean |f| stays bounded by some constant times |g| once we are close enough to the limit point — f grows no faster than g, up to a fixed factor. It says nothing about whether f is actually as big as g; it only forbids f from being bigger. So x^2 = O(x^3) as x to infinity is perfectly true, even though x^2 is much smaller, because the claim is just "x^2 is not larger than a constant times x^3." Read Big-O as "at most this order."

Little-o is much stronger — it is a statement of true negligibility. We write f = o(g) to mean the ratio f/g goes all the way to zero in the limit. Now f is not merely bounded by g, it is utterly swamped by it: f becomes an arbitrarily tiny fraction of g. So x^2 = o(x^3) as x to infinity, because x^2/x^3 = 1/x to 0. The mental picture: little-o is the dust you sweep away without a second thought, while Big-O is a piece of furniture that might be the same size as g or might be smaller, but is definitely not bigger.
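
To see the ceiling-versus-dust distinction in numbers, here is a minimal sketch that tabulates the ratio |f/g| at a few increasingly large x. The helper name ratios and the sample points are illustrative choices, not anything from the text, and a handful of samples only suggests a limit rather than proving one.

```python
def ratios(f, g, xs):
    """Return |f(x)/g(x)| at each sample point x (hypothetical helper)."""
    return [abs(f(x) / g(x)) for x in xs]

xs = [10.0, 1e2, 1e3, 1e4, 1e5]

# Ratio stays bounded but does NOT go to zero: Big-O holds, little-o fails.
big_o_only = ratios(lambda x: 3 * x**2 + 5 * x, lambda x: x**2, xs)

# Ratio goes to zero: little-o holds, and therefore Big-O holds as well.
little_o = ratios(lambda x: x**2, lambda x: x**3, xs)

for x, r1, r2 in zip(xs, big_o_only, little_o):
    print(f"x = {x:8.0f}   (3x^2+5x)/x^2 = {r1:.5f}   x^2/x^3 = {r2:.1e}")
```

The first column settles toward a nonzero constant, which is the furniture; the second keeps shrinking, which is the dust.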

The third symbol is the tilde of asymptotic equivalence, f ~ g, meaning the ratio f/g goes to exactly 1. This is the most informative of the three: f and g become essentially the same size, equal in the leading term. It is what Stirling really claims when we write n! ~ sqrt(2 pi n) (n/e)^n: not that the two agree (they differ by about 1.7% at n = 5 and 0.8% at n = 10) but that their ratio marches to 1 as n grows. Be careful: asymptotic equivalence controls the ratio, not the difference. n! and its Stirling estimate can differ by an enormous absolute amount even while their ratio is nearly 1, because both are astronomically large.
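
The ratio-versus-difference warning is easy to check directly. The sketch below compares n! with the leading Stirling estimate using only the standard library; the function name stirling and the chosen values of n are illustrative assumptions.

```python
import math

def stirling(n):
    """Leading-order Stirling estimate sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

rows = []
for n in (5, 10, 20, 50, 100):
    exact = math.factorial(n)          # exact integer
    approx = stirling(n)               # floating-point estimate
    rows.append((n, exact / approx, exact - approx))
    print(f"n = {n:3d}   ratio = {exact / approx:.6f}   difference = {exact - approx:.3e}")
```

The ratio column creeps down toward 1 while the difference column explodes, which is exactly what f ~ g does and does not promise.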

From Taylor series to asymptotic series

Recall the Taylor series from Volume I. For a function like e^x it converges: pile on more and more terms at a fixed x and the partial sums close in on the true value, the error shrinking to zero. Convergence is a statement about what happens as the number of terms goes to infinity, with x held still. An asymptotic series turns this on its head. It is a series, usually in powers of 1/x, that may not converge at all for any fixed x — yet for each fixed number of terms, the approximation gets better and better as x grows large. The limit being taken is in x, not in the term count.

Here is the precise definition, and it is worth reading slowly. A series sum of a_n x^{-n} is asymptotic to f(x) as x to infinity, written f(x) ~ sum a_n x^{-n}, when for every fixed N the remainder after the term a_N/x^N is little-o of that last kept term: f(x) - (a_0 + a_1/x + ... + a_N/x^N) = o(x^{-N}). In plain words: chop the series off anywhere, and the error you make is negligible compared to the last term you retained, provided x is large enough. That "provided x is large enough" is doing enormous work, and it is exactly what separates an asymptotic series from a convergent one.
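
It may help to see the two limits written side by side. The display below restates the definition above and sets it against ordinary convergence; it adds no new assumption, only notation.

```latex
% Asymptotic series: N is held fixed, x goes to infinity.
\[
  \lim_{x\to\infty} x^{N}\Bigl[\, f(x) - \sum_{n=0}^{N} a_n x^{-n} \Bigr] = 0
  \qquad \text{for each fixed } N .
\]
% Convergent series: x is held fixed, N goes to infinity.
\[
  \lim_{N\to\infty} \Bigl[\, f(x) - \sum_{n=0}^{N} a_n x^{-n} \Bigr] = 0
  \qquad \text{for each fixed } x .
\]
```

Neither statement implies the other: a series can satisfy the first for every N and still fail the second at every x.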

The beautiful paradox, worked out

Let us make this concrete with the cleanest example in the subject. Consider the function f(x) = integral from x to infinity of e^{x - t} / t dt — closely related to the exponential integral, the kind of object that appears whenever you ask how much of an exponential tail is left beyond a point. There is no elementary closed form for it, but we can squeeze an expansion out by repeated integration by parts, each step pulling out one more power of 1/x.
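
For readers who want the integration by parts spelled out, here is one way to organize it. The auxiliary integrals I_k are notation introduced only for this sketch; they do not appear elsewhere in the text.

```latex
% Auxiliary family (notation assumed for this sketch): f is the k = 1 member.
\[
  I_k(x) = \int_x^{\infty} \frac{e^{x-t}}{t^{k}}\,dt , \qquad f(x) = I_1(x).
\]
% One integration by parts, with u = t^{-k} and dv = e^{x-t}\,dt:
\[
  I_k(x) = \Bigl[ -\frac{e^{x-t}}{t^{k}} \Bigr]_{t=x}^{\infty}
           - k \int_x^{\infty} \frac{e^{x-t}}{t^{k+1}}\,dt
         = \frac{1}{x^{k}} - k\, I_{k+1}(x).
\]
% Applying the recursion N + 1 times:
\[
  f(x) = \sum_{n=0}^{N} \frac{(-1)^{n}\, n!}{x^{n+1}} + R_N ,
  \qquad R_N = (-1)^{N+1} (N+1)!\; I_{N+2}(x).
\]
% Since 0 < 1/t^{N+2} \le 1/x^{N+2} on the integration range:
\[
  |R_N| \le \frac{(N+1)!}{x^{N+2}} .
\]
```

The last line says the remainder is no larger than the first term left out, and the sign of R_N shows it has the same sign as that term.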

f(x) = int_x^inf e^{x-t}/t dt

Integrate by parts repeatedly (each step differentiates 1/t):
  f(x) = 1/x - 1/x^2 + 2!/x^3 - 3!/x^4 + ... + (-1)^N N!/x^{N+1} + R_N

Term n is  (-1)^n n! / x^{n+1}.
  Size ratio |term (n+1)| / |term n|  =  (n+1)/x.
  -> terms SHRINK while n + 1 < x, then GROW once n + 1 > x.
  -> for ANY fixed x the series DIVERGES (n! beats x^n eventually).

But the remainder obeys |R_N| <= (N+1)! / x^{N+2}  =  size of first DROPPED term.
  Smallest term is near n = x; stop there for best accuracy.

x = 10:  best truncation error <= 10!/10^11 ~ 3.6e-5   (excellent!)
         keep summing past n=10 and the "approximation" blows up.

Repeated integration by parts gives a series whose terms shrink, bottom out near n = x, then grow without bound — divergent, yet superbly accurate if you stop at the smallest term.

Stare at the term (-1)^n n! / x^{n+1}. The factorial n! eventually outruns any fixed power x^n, so summed to infinity the series is hopelessly divergent, for every x. And yet the magic is in the remainder. Because the error after truncation is bounded by the size of the first term you threw away, and because the terms shrink while n is below x before turning around, the best you can do is to stop near the smallest term, roughly at N close to x. At x = 10 that smallest term is about 3.6 times 10^{-5}: roughly four correct decimal places from a divergent series that no amount of extra terms could improve. Add more and the approximation gets worse, not better.
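
None of this has to be taken on faith. The sketch below compares the partial sums at x = 10 against a brute-force numerical value of the integral. The reference value comes from Simpson's rule after the substitution t = x + s, with the upper limit cut off at s = 60; that quadrature scheme, the cutoff, and the function names are choices made for this illustration, not part of the method in the text.

```python
import math

def f_numeric(x, upper=60.0, steps=60_000):
    """Simpson's rule for f(x) = int_0^inf e^{-s} / (x + s) ds.

    The tail beyond s = upper is smaller than e^{-upper} / x and is ignored.
    steps must be even.
    """
    h = upper / steps
    total = 1.0 / x + math.exp(-upper) / (x + upper)   # endpoint values
    for i in range(1, steps):
        s = i * h
        weight = 4 if i % 2 else 2                     # Simpson weights
        total += weight * math.exp(-s) / (x + s)
    return total * h / 3

def partial_sum(x, N):
    """Sum of (-1)^n n! / x^(n+1) for n = 0..N."""
    return sum((-1) ** n * math.factorial(n) / x ** (n + 1) for n in range(N + 1))

x = 10.0
reference = f_numeric(x)
errors = {N: abs(partial_sum(x, N) - reference) for N in range(31)}

for N in (0, 2, 5, 8, 9, 10, 12, 15, 20, 30):
    first_dropped = math.factorial(N + 1) / x ** (N + 2)
    print(f"N = {N:2d}   |error| = {errors[N]:.3e}   first dropped term = {first_dropped:.3e}")
```

Reading down the table, the error falls until N is near 10 and then climbs without limit, and at every N it stays below the first dropped term.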

This is the central, honest lesson of the whole subject, so let us say it plainly and not soften it. Divergence and accuracy are not enemies here. A convergent series promises you can reach any accuracy if you are patient enough to add terms; an asymptotic series makes no such promise. It has a floor, an irreducible best error set by its smallest term, and trying to push past that floor by adding terms makes things worse. What you buy in return is that the floor is often extraordinarily low (it is reached near N close to x, not at infinity), and that the first two or three terms are often already accurate enough for practical work. For large x it is the better deal, which is why physicists reach for it constantly.

How to actually use it

The single most useful asymptotic series you already half-know is Stirling's series for the factorial. Its leading term gives n! ~ sqrt(2 pi n) (n/e)^n, and the correction factor (1 + 1/(12n) + ...) is itself a divergent asymptotic series. The recipe for using any such expansion is mercifully short, and it is the same recipe whether you are estimating a factorial, an exponential integral, or the tail of a probability distribution.

Identify the large (or small) parameter — call it x — and make sure you are genuinely in the regime where it is large; an asymptotic series is a promise only out there.
Generate terms one at a time (often by repeated integration by parts, or by substituting a trial series), watching their sizes shrink.
Find the smallest term — the place where the next term would stop shrinking and start growing — and truncate THERE, not before and not after.
Estimate your error as roughly the size of the first dropped term; that is your honest error bar, and it is the best this series can do.
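
The four steps above can be sketched as a small routine. This is one possible reading of "truncate at the smallest term": keep adding terms while the next one is still smaller in magnitude, then stop and report the first dropped term as the error bar. The function names and the max_terms safeguard are hypothetical, and the series used to exercise it is the 1/x - 1/x^2 + 2!/x^3 - ... expansion from earlier.

```python
import math

def truncate_at_smallest_term(term, max_terms=200):
    """Sum term(0), term(1), ... and stop just before the terms start growing.

    Returns (estimate, error_bar, last_index_kept), where error_bar is the
    magnitude of the first dropped term.
    """
    total = 0.0
    current = term(0)
    for n in range(max_terms):
        nxt = term(n + 1)
        total += current
        if abs(nxt) >= abs(current):      # next term no longer shrinks: stop
            return total, abs(nxt), n
        current = nxt
    raise RuntimeError("terms never stopped shrinking within max_terms")

def tail_series_term(x):
    """Return a function giving term n of 1/x - 1/x^2 + 2!/x^3 - ..."""
    return lambda n: (-1) ** n * math.factorial(n) / x ** (n + 1)

results = {}
for x in (5.0, 10.0, 20.0):
    estimate, error_bar, n_kept = truncate_at_smallest_term(tail_series_term(x))
    results[x] = (estimate, error_bar, n_kept)
    print(f"x = {x:4.0f}   estimate = {estimate:.10f}   error bar = {error_bar:.2e}   last n kept = {n_kept}")
```

Notice that the routine never needs the true value: the series supplies its own stopping point and its own error bar, and both improve quickly as x grows.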

One last honesty check, because it trips up newcomers. An asymptotic series captures the part of a function that is visible to powers of 1/x, but a function can carry pieces that are invisible to it — terms like e^{-x} that go to zero faster than every power and so leave no trace in the expansion. Two genuinely different functions can share the very same asymptotic series. So the series tells you how a function behaves to high precision for large x, but it does not, by itself, pin the function down uniquely. Knowing what your tool sees, and what it is blind to, is the mark of someone who actually understands asymptotics rather than just reciting it.
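
A quick numerical look makes the "invisible" claim plausible. For e^{-x} to contribute a coefficient at order N, the product x^N e^{-x} would need a nonzero limit; the sketch below samples that product for a few N. The function name and sample points are illustrative, and the samples suggest rather than prove the limit.

```python
import math

def scaled_exponential(x, N):
    """x**N * exp(-x): what e^{-x} would contribute at order x^{-N}."""
    return x ** N * math.exp(-x)

sample_points = (50.0, 100.0, 200.0, 400.0)
table = {}
for N in (1, 5, 10):
    table[N] = [scaled_exponential(x, N) for x in sample_points]
    formatted = "   ".join(f"{v:.2e}" for v in table[N])
    print(f"N = {N:2d}:   {formatted}")
```

Every row collapses toward zero, so f(x) and f(x) + e^{-x} have identical coefficients at every order and the power series cannot tell them apart.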

Where this rung is heading

You now hold the grammar of the whole rung. Big-O, little-o, and the tilde let you say precisely how big and how small things are; the idea of an asymptotic series lets you trust a divergent expansion as long as you stop at the right place. Every method ahead is built on this footing. The next guides will show you machines that manufacture these series automatically from integrals — Laplace's method and Watson's lemma for integrals dominated by a sharp peak, the method of steepest descent for the complex plane — and then perturbation theory, where you treat a hard problem as an easy one nudged by a small parameter.

Keep one image in your pocket as you go: the asymptotic series as a marksman who can put two or three shots dead in the bullseye and then, asked for a fourth, starts spraying the wall. The skill is not in firing more rounds — it is in knowing exactly when to stop. Master that, and a vast territory of problems that have no exact answer suddenly yields answers you can stake your engineering on.