Laplace's Method & Watson's Lemma

The one picture behind it all

The previous guide taught the language of asymptotics — what 'f ~ g' means and why an asymptotic series can be wildly useful even while it diverges. Now we put it to work on a single, enormously common shape of problem: an integral 'I(s) = integral of e^{s*phi(x)} dx' where s is a large positive parameter and phi(x) is some smooth real function. The whole game is the exponential. As s grows, 'e^{s*phi(x)}' magnifies the tallest part of phi out of all proportion to everything else — the integrand becomes a single needle-thin spike sitting over the maximum of phi.

Here is the intuition in numbers. Suppose phi peaks at x = 1 with phi(1) = 1, and just nearby phi = 0.99. The ratio of the two heights of the integrand is 'e^{s*1} / e^{s*0.99} = e^{0.01*s}'. At s = 100 that ratio is already e ~ 2.7; at s = 1000 it is e^{10} ~ 22000. A point that is barely lower in phi is exponentially crushed in the integrand. So almost the entire value of the integral comes from an ever-shrinking window around the single highest point — and Laplace's method is the discipline of estimating the integral using only what happens inside that window.

Gaussian-izing the peak

Suppose phi has a single interior maximum at a point x0, so it is a genuine critical point with phi'(x0) = 0 and phi''(x0) < 0. Near x0 we do the most natural thing in calculus: we replace phi by its second-order Taylor expansion. Because the first-derivative term vanishes at a maximum, that expansion is 'phi(x) ~ phi(x0) + (1/2) phi''(x0) (x - x0)^2'. The exponential then factors into a constant times a Gaussian bump: 'e^{s*phi(x)} ~ e^{s*phi(x0)} * e^{-(1/2) s |phi''(x0)| (x - x0)^2}', since phi''(x0) is negative.

Now the integral is something you can do exactly. Pull the constant 'e^{s*phi(x0)}' out front and you are left with the Gaussian integral 'integral over all x of e^{-(1/2) s |phi''(x0)| (x - x0)^2} dx', which equals 'sqrt(2*pi / (s |phi''(x0)|))'. Two sleights of hand are hiding here, and both are honest. First, we extended the limits to all of x even if the real interval is finite — legitimate because the tails the Gaussian adds are exponentially negligible. Second, we kept only the quadratic term of phi — legitimate because as s grows the peak narrows toward x0 faster than the cubic and higher terms can matter.

I(s) = integral of g(x) e^{s*phi(x)} dx,   s -> +infinity

  1. find x0 where phi'(x0) = 0, phi''(x0) < 0   (the peak)
  2. Taylor expand:  phi(x) = phi(x0) - (1/2)|phi''(x0)|(x-x0)^2 + ...
  3. freeze the slow factor:  g(x) ~ g(x0)
  4. extend limits to +/- infinity, do the Gaussian:
        integral e^{-(1/2) s |phi''(x0)| (x-x0)^2} dx = sqrt( 2*pi / (s|phi''(x0)|) )

  LEADING TERM:
        I(s) ~ g(x0) * e^{s*phi(x0)} * sqrt( 2*pi / (s |phi''(x0)|) )

Laplace's method in four steps, ending in the leading-order formula. The slowly varying prefactor g(x) is just frozen at its value at the peak.

Reading the leading-order formula

Putting it together gives the headline result for an integral with a slowly varying prefactor g(x): 'integral of g(x) e^{s*phi(x)} dx ~ g(x0) e^{s*phi(x0)} sqrt(2*pi / (s |phi''(x0)|))' as s -> infinity. Every piece is telling you something physical. The factor 'e^{s*phi(x0)}' is the height of the spike — it sets the overall scale and grows or shrinks exponentially. The factor g(x0) is just the slow prefactor sampled at the peak (its variation across the narrow window is negligible). And 'sqrt(2*pi / (s |phi''(x0)|))' is the *width* of the spike: a sharper peak (larger |phi''|) or a larger s makes the window narrower, so the integral is smaller, shrinking like 1/sqrt(s).

Worked masterpiece: where Stirling's formula comes from

The most famous payoff is Stirling's approximation for n! when n is large. Start from the integral definition of the Gamma function: 'n! = Gamma(n+1) = integral from 0 to infinity of t^n e^{-t} dt'. This is not yet in Laplace form, because the large parameter n sits in t^n rather than in the exponent. The standard fix is to put everything into one exponential: 't^n e^{-t} = e^{n ln t - t}'. To expose a clean large-s structure, substitute t = n*x, which scales the peak to a fixed location. The integral becomes 'n^{n+1} integral from 0 to infinity of e^{n(ln x - x)} dx', and now it is textbook Laplace's method with s = n and phi(x) = ln x - x.

Now turn the crank. phi'(x) = 1/x - 1 vanishes at x0 = 1, and phi''(x) = -1/x^2 gives phi''(1) = -1, a maximum. The peak value is phi(1) = ln 1 - 1 = -1. Plug into the leading formula with g = 1: the integral is approximately 'e^{n*(-1)} sqrt(2*pi/(n*1)) = e^{-n} sqrt(2*pi/n)'. Multiply back the n^{n+1} out front and you have arrived: 'n! ~ n^{n+1} e^{-n} sqrt(2*pi/n) = sqrt(2*pi*n) (n/e)^n'. That is Stirling's formula, derived from nothing but 'expand around the peak'.

How good is it? At n = 10 the true 10! = 3628800 and Stirling gives about 3598696 — a relative error under 1%, from a one-line approximation. And the method does more than the leading term: keep the cubic and quartic terms of phi in the expansion and you systematically generate the corrections, the famous series 'n! ~ sqrt(2*pi*n)(n/e)^n (1 + 1/(12n) + 1/(288 n^2) - ...)'. That correction series is itself an asymptotic series — it eventually diverges, yet truncated at the right point it is breathtakingly accurate, exactly the behavior the previous guide warned you to expect.

When the action is at the endpoint: Watson's lemma

Laplace's method as stated assumes an interior peak. But a huge family of integrals has the dominant contribution sitting at an endpoint instead, with no interior critical point at all. The canonical shape is 'I(s) = integral from 0 to infinity of f(t) e^{-s*t} dt' — a one-sided exponential decay anchored at t = 0. As s grows, the weight e^{-s*t} dies off so fast that only the behavior of f(t) right at t = 0 survives. This is the natural home of Watson's lemma, and notice the form is exactly a Laplace transform of f, read in the large-s limit.

Watson's lemma is gloriously mechanical. If near t = 0 the function f(t) has a (possibly fractional-power) expansion 'f(t) ~ sum of a_k t^{k+b}' with the exponent b > -1 so the integral converges, then you may integrate it term by term — even though that series may not converge for any t > 0. Each power t^{c} integrated against e^{-s*t} gives a Gamma function: 'integral from 0 to infinity of t^c e^{-s*t} dt = Gamma(c+1) / s^{c+1}'. The result is the asymptotic series 'I(s) ~ sum of a_k Gamma(k+b+1) / s^{k+b+1}'. So the small-t Taylor (or Puiseux) data of f, term by term, becomes the large-s asymptotics of its transform.

Limits, cousins, and what to remember

Be honest about the hypotheses. Laplace's method needs a real exponent, so phi must be real and the peak must genuinely dominate; if two maxima tie, you add a contribution from each. It needs phi''(x0) to be nonzero — a degenerate peak (phi'' = 0 too) makes the spike fatter, the Gaussian is replaced by a quartic, and the decay rate changes from 1/sqrt(s) to 1/s^{1/4}. And it is an asymptotic statement: it tells you the behavior as s -> infinity, not a guaranteed bound at a fixed finite s, though in practice it is often excellent even for modest s (recall the 1% error at n = 10).

The biggest restriction is that phi must be real, so the spike is a real bump you can sit on. The moment the exponent is imaginary — an integral of 'e^{i*s*psi(x)}', the realm of waves and oscillations — there is no peak to expand around; instead rapid oscillation makes neighboring contributions cancel, and the dominant points are where psi'(x) = 0. That is the method of stationary phase. Unify both pictures by letting the exponent be complex and deforming the path of integration to pass through a saddle, and you reach the method of steepest descent — the next guide. Laplace's method is the real-axis, single-peak special case of that grander complex-plane story.