Optimizing Over Functions

From the best number to the best curve

In Volume I you learned to find where an ordinary function f(x) is largest or smallest: take the derivative, set f'(x) = 0, and read off the special inputs. The answer was always a number — the x where the peak or valley sits. The calculus of variations asks a bolder question. What if the thing we are choosing is not a number at all, but an entire shape — a whole curve y(x) stretching from one point to another? What shape should a hanging chain take? Along what path does a bead slide down fastest? Here the unknown is a function, and the playing field is infinite-dimensional.

The machine that turns a whole function into a single number is called a functional. Think of it as a function of a function: you feed in an entire curve y(x), and it hands back one number — the length of the curve, the time of descent, the energy stored. A typical functional looks like J[y] = integral from a to b of L(x, y, y') dx, where the integrand L depends on the position x, the height y(x), and the slope y'(x) at each point. The square brackets J[y] are a deliberate reminder: the input is a function, not a number.
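To make "a function of a function" concrete, here is a minimal sketch that evaluates a functional numerically. It assumes the arc-length integrand L = sqrt(1 + y'^2) as the example; the helper names functional and arc_length_integrand, and the choice of a midpoint rule, are illustrative and not part of the text.

```python
import math

def functional(L, y, dy, a, b, n=10_000):
    """Approximate J[y] = integral_a^b L(x, y(x), y'(x)) dx with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += L(x, y(x), dy(x))
    return total * h

def arc_length_integrand(x, y, yp):
    # Assumed example integrand: length of the curve per unit x.
    return math.sqrt(1.0 + yp * yp)

# Two whole curves joining (0, 0) to (1, 1); each one is turned into a single number.
line_length = functional(arc_length_integrand, lambda x: x, lambda x: 1.0, 0.0, 1.0)
parabola_length = functional(arc_length_integrand, lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)

print("straight line:", line_length)
print("parabola     :", parabola_length)
```

Each call feeds in an entire curve (together with its slope) and gets back one number, which is all a functional does. The straight line comes out shorter than the parabola, as it should.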

Wiggling the curve: the first variation

How do we even define a maximum or minimum when the candidate is a whole curve? Borrow the spirit of Volume I. To test whether a number x0 minimizes f, you nudged it, comparing f(x0 + h) with f(x0), and asked whether the value went up. To test whether a curve y(x) minimizes J, we nudge the entire curve. Pick a small "wiggle" eta(x), any smooth function that vanishes at both endpoints so that the deformed curve still starts and ends where it must, and look at the family y(x) + epsilon eta(x). As the dial epsilon turns away from zero, the curve bends; the question is whether J goes up.

Once the wiggle eta is fixed, J[y + epsilon eta] is just an ordinary function of the single number epsilon. We are back on Volume I ground! Call it Phi(epsilon) = J[y + epsilon eta]. If the original curve y is to be a minimum, then epsilon = 0 must be an ordinary minimum of Phi, so the ordinary derivative dPhi/d epsilon at epsilon = 0 must vanish. That derivative has a name: the first variation, written delta J. It plays exactly the role that f'(x0) played for a single number — it is the slope of the functional in the direction of the wiggle.
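The reduction to an ordinary function of epsilon can be tried numerically. The sketch below assumes the arc-length functional on [0, 1], the wiggle eta(x) = sin(pi x), and a central difference for the derivative at epsilon = 0; the names arc_length, phi, and first_variation are illustrative only.

```python
import math

def arc_length(slope, a=0.0, b=1.0, n=2000):
    """Midpoint-rule value of integral_a^b sqrt(1 + y'(x)^2) dx, given the slope y'."""
    h = (b - a) / n
    return h * sum(math.sqrt(1.0 + slope(a + (i + 0.5) * h) ** 2) for i in range(n))

def phi(eps, base_slope):
    """Phi(eps) = J[y + eps * eta] for the assumed wiggle eta(x) = sin(pi x).

    Only slopes enter arc length, so we pass y' and add eps * eta'(x).
    """
    return arc_length(lambda x: base_slope(x) + eps * math.pi * math.cos(math.pi * x))

def first_variation(base_slope, d=1e-4):
    """Central-difference estimate of dPhi/d(eps) at eps = 0."""
    return (phi(d, base_slope) - phi(-d, base_slope)) / (2.0 * d)

dJ_line = first_variation(lambda x: 1.0)          # y = x, the straight line
dJ_parabola = first_variation(lambda x: 2.0 * x)  # y = x^2, same endpoints

print("line    :", dJ_line)
print("parabola:", dJ_parabola)
```

For the straight line the estimate is zero up to rounding, so this particular wiggle cannot shorten it to first order. For the parabola it is clearly nonzero, which means a small wiggle of the right sign makes the parabola shorter.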

What stationary means for a whole curve

Here is the crucial twist. For a single number, f'(x0) = 0 is one condition. But the first variation must vanish for every allowed wiggle eta at once — there are infinitely many ways to deform the curve, and a true minimum must beat all of them. Setting delta J = 0 and computing it by integration by parts (the variational workhorse, recalled from Volume I) turns the requirement into an integral: integral from a to b of [ partial L / partial y - d/dx ( partial L / partial y' ) ] eta(x) dx = 0, and this must hold for every eta vanishing at the ends.
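For readers who want the integration-by-parts step written out, here is one way to set it down, using the same L, y, eta, and epsilon as above and assuming L and y are smooth enough to differentiate under the integral sign.

```latex
\begin{align*}
\delta J
  &= \left.\frac{d}{d\epsilon}\right|_{\epsilon=0}
     \int_a^b L\bigl(x,\; y+\epsilon\eta,\; y'+\epsilon\eta'\bigr)\,dx
   = \int_a^b \left[\frac{\partial L}{\partial y}\,\eta
                   + \frac{\partial L}{\partial y'}\,\eta'\right] dx \\
  &= \int_a^b \frac{\partial L}{\partial y}\,\eta\,dx
     + \Bigl[\frac{\partial L}{\partial y'}\,\eta\Bigr]_a^b
     - \int_a^b \frac{d}{dx}\!\left(\frac{\partial L}{\partial y'}\right)\eta\,dx \\
  &= \int_a^b \left[\frac{\partial L}{\partial y}
     - \frac{d}{dx}\!\left(\frac{\partial L}{\partial y'}\right)\right]\eta(x)\,dx ,
\end{align*}
```

The boundary term drops out precisely because eta(a) = eta(b) = 0, which is why the wiggle was required to vanish at both ends.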

Now comes a small but decisive piece of logic, the fundamental lemma of the calculus of variations: if a continuous quantity multiplied by every admissible eta (smooth and vanishing at both endpoints) always integrates to zero, then that quantity must itself be zero everywhere. The intuition is honest and simple. If the bracketed expression were positive on some little stretch, we could choose an eta that bulges exactly there and nowhere else, making the integral positive, a contradiction. So the bracket must vanish at every point. Multiplying it by -1, which changes nothing, gives the celebrated Euler-Lagrange equation in its customary form.
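The bulging-wiggle argument can be seen in numbers. In the sketch below the "bracket" is a stand-in g(x) = sin(2 pi x), chosen purely for illustration because it is positive on (0, 0.5); the bump function and the helper names are likewise assumptions, not anything from the text.

```python
import math

def bump(x, lo, hi):
    """Smooth wiggle that is positive inside (lo, hi) and exactly zero outside."""
    t = (2.0 * x - lo - hi) / (hi - lo)   # rescale (lo, hi) to (-1, 1)
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def integrate(f, a, b, n=20_000):
    """Midpoint rule for integral_a^b f(x) dx."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def g(x):
    # Hypothetical bracket: not identically zero, positive on (0, 0.5).
    return math.sin(2.0 * math.pi * x)

# One particular wiggle can fail to notice that g is nonzero ...
against_sine = integrate(lambda x: g(x) * math.sin(math.pi * x), 0.0, 1.0)
# ... but a wiggle that bulges only where g > 0 exposes it.
against_bump = integrate(lambda x: g(x) * bump(x, 0.1, 0.4), 0.0, 1.0)

print("against sin(pi x):", against_sine)
print("against the bump :", against_bump)
```

The first integral vanishes even though g is not zero, which is why a single wiggle proves nothing. The second is strictly positive, so this g could not be the bracket of a stationary curve. Demanding zero for every wiggle leaves no place for the bracket to hide.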

Euler-Lagrange equation:

  d  ( partial L )    partial L
  -- ( -------- )  -  --------  =  0
  dx ( partial y')    partial y

for a functional   J[y] = integral_a^b  L(x, y, y') dx

The Euler-Lagrange equation: a single differential equation the optimal curve must satisfy at every point.

Reading the equation, and what it does not promise

Notice what just happened. The condition delta J = 0 was a statement about infinitely many wiggles, yet it collapsed into a single, ordinary differential equation — usually second order — for the unknown curve y(x). The infinite has become finite. Solving the calculus-of-variations problem is now a familiar task: integrate that differential equation and fix the two constants of integration using the two endpoint values y(a) and y(b). The whole machinery of Volume II's differential-equation methods becomes available the moment the Euler-Lagrange equation is written down.
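As a small illustration of "solve the equation and fix the constants from the endpoints", the sketch below assumes the example integrand L = (y'^2 + y^2)/2 on [0, 1], whose Euler-Lagrange equation is y'' - y = 0, with assumed endpoint values y(0) = 0 and y(1) = 1. It uses a plain finite-difference grid and a tridiagonal solve; this is one convenient numerical route, not a method the text prescribes, and the function name is hypothetical.

```python
import math

def solve_euler_lagrange(n=200, ya=0.0, yb=1.0):
    """Solve y'' - y = 0 on [0, 1] with y(0) = ya, y(1) = yb by finite differences.

    Interior equations: y[i-1] - (2 + h^2) * y[i] + y[i+1] = 0.
    The tridiagonal system is solved with the Thomas algorithm.
    """
    h = 1.0 / n
    m = n - 1                       # number of interior unknowns
    diag = -(2.0 + h * h)           # main diagonal; off-diagonals are all 1
    rhs = [0.0] * m
    rhs[0] -= ya                    # known endpoint values move to the right side
    rhs[-1] -= yb

    c = [0.0] * m                   # modified super-diagonal
    d = [0.0] * m                   # modified right-hand side
    c[0] = 1.0 / diag
    d[0] = rhs[0] / diag
    for i in range(1, m):           # forward elimination
        denom = diag - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - d[i - 1]) / denom

    y = [0.0] * m
    y[-1] = d[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        y[i] = d[i] - c[i] * y[i + 1]
    return [ya] + y + [yb]

n = 200
ys = solve_euler_lagrange(n)
# Exact solution: y = A sinh(x) + B cosh(x); the endpoints force B = 0, A = 1 / sinh(1).
exact = [math.sinh(i / n) / math.sinh(1.0) for i in range(n + 1)]
max_error = max(abs(u - v) for u, v in zip(ys, exact))
print("largest deviation from sinh(x)/sinh(1):", max_error)
```

The two endpoint values play exactly the role described above: they pin down the two constants of integration, here A and B in the general solution.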

Be honest about what the equation guarantees, exactly as Volume I was honest about f'(x) = 0. A solution of Euler-Lagrange is a stationary curve: the first variation vanishes there. That is necessary for a minimum, but not sufficient. Just as f'(x) = 0 can mark a maximum, a minimum, or an inflection, a stationary curve may be a true minimizer, a maximizer, or a saddle in function space — and a genuine minimizer must also clear a second-order test (the analog of f''(x) > 0). Worse, a minimum need not exist at all: some functionals have no smallest value, sliding toward an infimum that no actual curve attains. The equation finds candidates; confirming the winner takes more.
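A quick numerical experiment shows a stationary curve that is not a minimizer. The example is an assumption chosen for simplicity: J[y] = integral from 0 to b of (y'^2 - y^2) dx with y(0) = y(b) = 0. The flat curve y = 0 satisfies its Euler-Lagrange equation y'' + y = 0 and gives J = 0, so the question is whether any wiggle can push J below zero.

```python
import math

def J(eps, b, n=4000):
    """Midpoint-rule value of integral_0^b (y'^2 - y^2) dx for y = eps * sin(pi x / b)."""
    h = b / n
    k = math.pi / b
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        y = eps * math.sin(k * x)
        yp = eps * k * math.cos(k * x)
        total += yp * yp - y * y
    return total * h

J_short = J(0.1, b=2.0)   # interval shorter than pi
J_long = J(0.1, b=4.0)    # interval longer than pi

print("b = 2:", J_short)  # compare with J[0] = 0
print("b = 4:", J_long)
```

On the short interval this wiggle raises J above zero, while on the long one the same family of wiggles drives J below zero. So on the long interval the stationary curve y = 0 is not a minimizer, even though it solves Euler-Lagrange in both cases.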

Two questions this unlocks

The power of one method is best felt on real questions. Take the hanging chain: a flexible chain of fixed length hangs between two posts and settles into the shape that minimizes its gravitational potential energy. Writing that energy as a functional and applying Euler-Lagrange, with a Lagrange multiplier to enforce the fixed length, yields not a parabola, as Galileo first guessed, but a hyperbolic cosine: the catenary, y = c cosh(x/c) once the origin is placed suitably. The shape is not a designer's choice; it is forced by the demand that the energy functional be stationary.
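The catenary can be checked directly against the Euler-Lagrange equation. The sketch below makes two simplifying assumptions: the constant factor (weight per unit length) is dropped, and heights are measured from the level at which the fixed-length multiplier is absorbed, so the integrand is simply L = y sqrt(1 + y'^2).

```latex
\begin{align*}
y &= c\cosh\frac{x}{c}, \qquad
y' = \sinh\frac{x}{c}, \qquad
\sqrt{1+y'^2} = \cosh\frac{x}{c}, \\[4pt]
\frac{\partial L}{\partial y} &= \sqrt{1+y'^2} = \cosh\frac{x}{c}, \\[4pt]
\frac{\partial L}{\partial y'} &= \frac{y\,y'}{\sqrt{1+y'^2}}
   = \frac{c\cosh\frac{x}{c}\,\sinh\frac{x}{c}}{\cosh\frac{x}{c}}
   = c\sinh\frac{x}{c}, \\[4pt]
\frac{d}{dx}\!\left(\frac{\partial L}{\partial y'}\right) - \frac{\partial L}{\partial y}
  &= \cosh\frac{x}{c} - \cosh\frac{x}{c} = 0 .
\end{align*}
```

Under those assumptions the hyperbolic cosine satisfies the equation at every point, for any positive constant c. The chain's length and the positions of the posts then select c.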

Or the question that founded the field, the brachistochrone: shape a wire between a high point and a lower point so a bead, sliding from rest under gravity without friction, arrives in the least time. The descent time is a functional of the wire's shape; Euler-Lagrange returns a cycloid, the curve traced by a point on the rim of a rolling wheel. Strikingly, the fastest path dips below the straight line, trading a longer route for early speed. The same first-variation logic that tamed the chain settles the race.
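The race between the straight wire and the cycloid can be run numerically. The sketch below rests on stated assumptions: no friction, g = 9.81 m/s^2, a cycloid generated by a wheel of radius r = 1 ending at its lowest point, and each wire approximated by a polyline on which the bead keeps its speed at the corners. On a straight segment the acceleration is constant, so the travel time is the segment length divided by the average of the entry and exit speeds. The function descent_time is an illustrative helper.

```python
import math

def descent_time(points, g=9.81):
    """Time for a bead released from rest at points[0] to slide down a polyline.

    points are (x, depth) pairs, with depth measured downward from the start.
    Speed follows from energy conservation: v = sqrt(2 * g * depth).
    """
    total = 0.0
    for (x0, d0), (x1, d1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, d1 - d0)
        v0 = math.sqrt(2.0 * g * d0)
        v1 = math.sqrt(2.0 * g * d1)
        total += 2.0 * length / (v0 + v1)   # length / average speed on the segment
    return total

r, g, n = 1.0, 9.81, 2000

# Cycloid from (0, 0) to (pi * r, 2 * r): x = r (t - sin t), depth = r (1 - cos t).
cycloid = [(r * (t - math.sin(t)), r * (1.0 - math.cos(t)))
           for t in (math.pi * i / n for i in range(n + 1))]

# Straight wire between the same two endpoints.
line = [(math.pi * r * i / n, 2.0 * r * i / n) for i in range(n + 1)]

t_cycloid = descent_time(cycloid, g)
t_line = descent_time(line, g)
print("cycloid :", t_cycloid, "s")
print("straight:", t_line, "s")
```

With these endpoints the closed-form times are pi sqrt(r/g) for the cycloid and sqrt((pi^2 + 4) r/g) for the straight wire, and the computed values can be compared against them. The cycloid wins even though it is the longer path.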

1. Write the quantity you want to optimize as a functional J[y] = integral of L(x, y, y') dx.
2. Form the first variation by wiggling y -> y + epsilon eta with eta zero at both ends, and demand delta J = 0.
3. Use the fundamental lemma to convert delta J = 0 into the Euler-Lagrange differential equation.
4. Solve that equation and fix its constants using the two endpoint conditions y(a) and y(b).
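The four steps above can be walked through on the simplest possible case, assumed here for illustration: the shortest curve joining two given points, so that the functional is arc length.

```latex
\begin{align*}
\text{Step 1:}\quad & J[y] = \int_a^b \sqrt{1+y'^2}\;dx,
   \qquad L(x,y,y') = \sqrt{1+y'^2}. \\[4pt]
\text{Step 2:}\quad & \delta J
   = \left.\frac{d}{d\epsilon}\right|_{\epsilon=0} J[y+\epsilon\eta]
   = \int_a^b \frac{y'}{\sqrt{1+y'^2}}\;\eta'\,dx = 0,
   \qquad \eta(a)=\eta(b)=0. \\[4pt]
\text{Step 3:}\quad & \frac{\partial L}{\partial y} = 0
   \;\Longrightarrow\;
   \frac{d}{dx}\!\left(\frac{y'}{\sqrt{1+y'^2}}\right) = 0. \\[4pt]
\text{Step 4:}\quad & \frac{y'}{\sqrt{1+y'^2}} = \text{const}
   \;\Longrightarrow\; y' = m
   \;\Longrightarrow\; y = mx + k, \\
 & m = \frac{y(b)-y(a)}{b-a}, \qquad k = y(a) - m\,a .
\end{align*}
```

The stationary curve is the straight line through the two endpoints, with both constants fixed by the endpoint values. As always, the equation only nominates the line as a candidate; that it is the true minimizer needs a separate argument.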