The Euler–Lagrange Equation

Demand that the first variation vanish, differentiate under the integral, integrate by parts, and out drops one ordinary differential equation that the optimal shape must satisfy. It is the equation behind the brachistochrone, the catenary, and much of classical mechanics.

From a number to a whole function

In Volume I, optimization meant finding a number: you set the derivative to zero, dy/dx = 0, and solved for the x that minimized a function. Here we want something bolder — the best whole curve y(x) running between two fixed endpoints. The quantity we minimize is a functional, usually an integral J[y] = integral from a to b of L(x, y, y') dx, which eats an entire function and returns one number (a time, a length, an energy). The previous guide introduced J and its first variation; this one turns that variation into a concrete equation you can solve.

The strategy is a faithful copy of single-variable calculus, but lifted up a level. To probe the minimizer y(x), nudge it: replace y by y(x) + epsilon eta(x), where eta is any smooth 'wiggle' that vanishes at both endpoints (so the competitor still passes through the same fixed points) and epsilon is a tiny dial. Plug this into J and you get an ordinary function of the single number epsilon, call it Phi(epsilon) = J[y + epsilon eta]. If y is genuinely optimal, then epsilon = 0 is an ordinary minimum of Phi, so dPhi/depsilon at epsilon = 0 must equal zero — exactly the critical-point condition from Volume I.
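To watch Phi(epsilon) behave like an ordinary one-variable function, here is a minimal numerical sketch. It assumes, purely for illustration, the arc-length integrand L = sqrt(1 + (y')^2) on [0, 1], the straight line y(x) = x as the candidate, and the wiggle eta(x) = sin(pi x); the helper names `integrate` and `phi` are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def phi(eps):
    """Phi(eps) = J[y + eps*eta] for arc length, y(x) = x, eta(x) = sin(pi x)."""
    # Slope of the competitor curve: y'(x) + eps * eta'(x)
    slope = lambda x: 1.0 + eps * math.pi * math.cos(math.pi * x)
    return integrate(lambda x: math.sqrt(1.0 + slope(x) ** 2), 0.0, 1.0)

# Scan the dial: the smallest value should sit at eps = 0.
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"eps = {eps:+.1f}   Phi = {phi(eps):.6f}")
```

Every nonzero setting of the dial lengthens the curve, so epsilon = 0 sits at the bottom of Phi, which is the one-variable picture the argument relies on.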

Differentiating under the integral

Compute dPhi/depsilon by differentiating inside the integral. Because L depends on epsilon only through y + epsilon eta and its derivative y' + epsilon eta', the chain rule gives a clean integrand: at epsilon = 0, dPhi/depsilon = integral from a to b of [ (partial L / partial y) eta + (partial L / partial y') eta' ] dx. This is precisely the first variation delta J. The eta term is harmless, but the eta' term is awkward — it carries the derivative of our arbitrary wiggle, and we cannot conclude anything while eta and eta' both float free.
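The chain-rule formula can be checked against a brute-force derivative of Phi. The sketch below assumes a hypothetical integrand L = (y')^2/2 + y^2/2, a deliberately non-optimal trial curve y = x^2 on [0, 1], and the wiggle eta = sin(pi x); none of these choices come from the guide itself.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical integrand L(x, y, p), with p standing in for y'.
def L(x, y, p):
    return 0.5 * p * p + 0.5 * y * y

def L_y(x, y, p):   # partial L / partial y
    return y

def L_p(x, y, p):   # partial L / partial y'
    return p

y = lambda x: x * x                                # trial curve (not optimal)
dy = lambda x: 2.0 * x
eta = lambda x: math.sin(math.pi * x)              # wiggle, zero at x = 0 and x = 1
deta = lambda x: math.pi * math.cos(math.pi * x)

def phi(eps):
    """Phi(eps) = J[y + eps*eta]."""
    return integrate(lambda x: L(x, y(x) + eps * eta(x), dy(x) + eps * deta(x)), 0.0, 1.0)

def first_variation():
    """Integral of (L_y * eta + L_y' * eta') along the unperturbed curve."""
    return integrate(
        lambda x: L_y(x, y(x), dy(x)) * eta(x) + L_p(x, y(x), dy(x)) * deta(x),
        0.0, 1.0,
    )

h = 1e-4
finite_difference = (phi(h) - phi(-h)) / (2.0 * h)
formula = first_variation()
print(finite_difference, formula)
```

The two numbers agree, and they are not zero: y = x^2 is not stationary for this integrand, so the first variation detects that a better curve exists.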

The fix is the oldest trick in the trade: integration by parts, which trades a derivative on eta' for a derivative on the coefficient in front. Writing F = partial L / partial y', the second piece becomes integral of F eta' dx = [F eta] from a to b minus integral of (dF/dx) eta dx. The boundary term [F eta] vanishes outright, because eta was chosen to be zero at both a and b. That single design choice — fixed endpoints force the wiggle to die at the ends — is what makes the whole method work.
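A quick numerical check of the by-parts step, and of why the endpoint condition matters. As an illustrative assumption, F below is partial L / partial y' for the arc-length integrand L = sqrt(1 + (y')^2) evaluated along the trial curve y = x^2; the function names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# F(x) = y' / sqrt(1 + y'^2) with y' = 2x, and its derivative along the curve.
def F(x):
    return 2.0 * x / math.sqrt(1.0 + 4.0 * x * x)

def dF(x):
    return 2.0 / (1.0 + 4.0 * x * x) ** 1.5

def by_parts_pieces(eta, deta, a=0.0, b=1.0):
    """Return (integral of F*eta', boundary term [F*eta], integral of F'*eta)."""
    lhs = integrate(lambda x: F(x) * deta(x), a, b)
    boundary = F(b) * eta(b) - F(a) * eta(a)
    remaining = integrate(lambda x: dF(x) * eta(x), a, b)
    return lhs, boundary, remaining

# Admissible wiggle: vanishes at both ends, so the boundary term drops out.
pinned = by_parts_pieces(lambda x: math.sin(math.pi * x),
                         lambda x: math.pi * math.cos(math.pi * x))

# Inadmissible wiggle: eta(1) = 1, so the boundary term survives.
loose = by_parts_pieces(lambda x: x, lambda x: 1.0)

for name, (lhs, boundary, remaining) in (("pinned", pinned), ("loose", loose)):
    print(name, lhs, boundary - remaining, "boundary term:", boundary)
```

In both cases the identity holds, but only the pinned wiggle leaves a clean integral against eta with nothing left over at the ends.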

The fundamental lemma seals it

After the integration by parts, the whole integrand multiplies eta alone: delta J = integral from a to b of [ partial L / partial y minus d/dx ( partial L / partial y' ) ] eta(x) dx, and optimality demands this integral equal zero for EVERY admissible eta. Here the fundamental lemma of the calculus of variations does the heavy lifting: if a continuous function g(x) satisfies integral of g(x) eta(x) dx = 0 for all smooth eta vanishing at the ends, then g(x) is identically zero. The intuition is sharp: if g were positive at some point, continuity would keep it positive on a little interval around that point; pick an eta that bumps up only there, and the integral would come out positive, a contradiction. The same argument with the sign flipped rules out negative values.
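The word EVERY is doing real work, and a small experiment shows why. In this sketch the function g(x) = x - 1/2 and both test wiggles are hypothetical choices: one symmetric eta happens to give a zero integral even though g is not zero, while a localized bump placed where g is positive exposes it.

```python
import math

def integrate(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def g(x):
    """A continuous function that is not identically zero on [0, 1]."""
    return x - 0.5

def bump(x, lo, hi):
    """Smooth bump: positive inside (lo, hi), exactly zero outside."""
    if x <= lo or x >= hi:
        return 0.0
    s = (2.0 * x - lo - hi) / (hi - lo)   # rescale (lo, hi) to (-1, 1)
    t = 1.0 - s * s
    if t <= 0.0:                          # guard against rounding at the edges
        return 0.0
    return math.exp(-1.0 / t)

# A single symmetric wiggle cannot see g: the positive and negative halves cancel.
symmetric = integrate(lambda x: g(x) * math.sin(math.pi * x), 0.0, 1.0)

# A bump supported on (0.6, 0.9), where g > 0, gives a strictly positive integral.
localized = integrate(lambda x: g(x) * bump(x, 0.6, 0.9), 0.0, 1.0)

print("symmetric eta:", symmetric)
print("localized eta:", localized)
```

One wiggle passing the test proves nothing; it is the freedom to aim a bump at any interval that forces g to vanish everywhere.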

Setting that bracket to zero is the prize. The Euler–Lagrange equation reads d/dx ( partial L / partial y' ) minus partial L / partial y = 0. Read it carefully: partial L / partial y and partial L / partial y' are partial derivatives taken treating x, y, y' as three independent slots, while d/dx out front is a TOTAL derivative along the curve, which lets x, y(x), and y'(x) all vary together — a distinction worth pausing on, because confusing the two is the single most common error in the subject.

J[y] = integral_a^b  L(x, y, y') dx        (minimize over y, endpoints fixed)

      d  ( dL  )     dL
      -- ( --- )  -  --  = 0          <-  Euler-Lagrange equation
      dx ( dy' )     dy

  (dL/dy, dL/dy'  =  partial derivatives;   d/dx  =  total derivative along y(x))
The functional, then the equation its minimizer must obey.
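
To make the partial-versus-total distinction concrete, here is the total derivative written out with the chain rule. This is a worked expansion, assuming L has continuous second partial derivatives and y is twice differentiable:

```latex
\frac{d}{dx}\left(\frac{\partial L}{\partial y'}\right)
  = \frac{\partial^2 L}{\partial x\,\partial y'}
  + \frac{\partial^2 L}{\partial y\,\partial y'}\,y'
  + \frac{\partial^2 L}{\partial y'^2}\,y''
```

Substituting this into the Euler-Lagrange equation shows that it is, in general, a second-order ordinary differential equation for y(x); the y'' term appears only through the total derivative, never through the partials alone.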

A worked path: the shortest line

Try the friendliest case: the shortest curve between two points. Arc length is J[y] = integral from a to b of sqrt(1 + (y')^2) dx, so L = sqrt(1 + (y')^2). Notice L does not contain y at all, so partial L / partial y = 0. Meanwhile partial L / partial y' = y' / sqrt(1 + (y')^2). The Euler–Lagrange equation then says d/dx of that quantity equals zero, which means y' / sqrt(1 + (y')^2) is a constant. Because p / sqrt(1 + p^2) is strictly increasing in p, a constant value pins down a single slope, so y' itself is constant. The minimizer is a straight line, exactly as it must be. The machine recovered an obvious truth, which is how you trust it on the non-obvious cases.
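As an independent cross-check that does not use the Euler–Lagrange equation at all, one can discretize the curve and shrink its length directly. The sketch below is a swapped-in numerical method (plain gradient descent on a polyline), not the guide's derivation; the grid size, step size, iteration count, and function names are all assumptions chosen for this small example.

```python
import math

def polyline_length(xs, ys):
    """Total length of the polyline through the points (xs[i], ys[i])."""
    return sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
               for i in range(len(xs) - 1))

def shorten(xs, ys, step=0.02, iterations=5000):
    """Gradient descent on the interior heights; the two endpoints stay fixed."""
    ys = list(ys)
    for _ in range(iterations):
        grad = [0.0] * len(ys)
        for i in range(1, len(ys) - 1):
            left = math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
            right = math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
            # Derivative of the two segments touching node i with respect to ys[i]
            grad[i] = (ys[i] - ys[i - 1]) / left - (ys[i + 1] - ys[i]) / right
        for i in range(1, len(ys) - 1):
            ys[i] -= step * grad[i]
    return ys

n = 10
xs = [i / n for i in range(n + 1)]
start = [x ** 2 for x in xs]          # curved guess joining (0, 0) and (1, 1)
final = shorten(xs, start)

print("initial length:", polyline_length(xs, start))
print("final length:  ", polyline_length(xs, final))
print("largest gap from the line y = x:", max(abs(y - x) for x, y in zip(xs, final)))
```

The interior nodes settle onto the straight line y = x, matching what the equation predicted.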

That example exposed a gift: because L did not contain y, the Euler–Lagrange equation collapsed at once to the first integral partial L / partial y' = constant. A companion shortcut covers the case where L has no explicit x (L = L(y, y') only): the Beltrami identity, L minus y' (partial L / partial y') = constant, follows directly from Euler–Lagrange and drops the order by one, sparing you a hard second-order equation. It is exactly this shortcut that cracks the hanging-chain catenary problem and the fastest-descent brachistochrone, the headline puzzles of the next guides. In both, L has no explicit x, and Beltrami turns a fearsome equation into a separable one.
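Here is a small numerical sanity check of the Beltrami identity. It assumes the standard textbook integrand L = y sqrt(1 + (y')^2), which has no explicit x and has y = cosh(x) as a known stationary curve; that integrand and the helper `beltrami` are illustrative assumptions, not taken from this guide.

```python
import math

def L(y, p):
    """Assumed integrand with no explicit x; p stands in for y'."""
    return y * math.sqrt(1.0 + p * p)

def beltrami(L, y, dy, x, h=1e-6):
    """Evaluate L - y' * (partial L / partial y') at the point x on the curve y."""
    p = dy(x)
    # Central finite difference for the partial derivative with respect to y'
    L_p = (L(y(x), p + h) - L(y(x), p - h)) / (2.0 * h)
    return L(y(x), p) - p * L_p

points = [0.0, 0.25, 0.5, 0.75, 1.0]

# Along the stationary curve y = cosh(x) the quantity should not change.
on_solution = [beltrami(L, math.cosh, math.sinh, x) for x in points]

# Along an arbitrary curve, here y = 1 + x^2, it drifts.
off_solution = [beltrami(L, lambda x: 1.0 + x * x, lambda x: 2.0 * x, x) for x in points]

print("y = cosh(x): ", on_solution)
print("y = 1 + x^2: ", off_solution)
```

Constancy of that one quantity is what lets you replace a second-order equation with a first-order one.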

The recipe, and the honest fine print

  1. Write the functional J[y] = integral of L(x, y, y') dx and read off the integrand L.
  2. Compute the two partial derivatives partial L / partial y and partial L / partial y', treating x, y, y' as independent.
  3. Take the total d/dx of partial L / partial y' (use the chain rule — it will produce y'' terms in general).
  4. Set d/dx(partial L / partial y') minus partial L / partial y = 0; if L has no explicit x, use the Beltrami first integral instead.
  5. Solve the resulting ODE and fix the constants with the two endpoint values y(a) and y(b).
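
The recipe can be sketched as a checker that takes any integrand and any candidate curve and reports how far the curve is from satisfying the equation. This is a rough finite-difference sketch under stated assumptions: the integrand L = (y')^2/2 + y^2/2 is a hypothetical example (its Euler–Lagrange equation is y'' - y = 0), and `central_diff` and `el_residual` are names invented here.

```python
import math

def central_diff(f, t, h=1e-4):
    """Central finite-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def el_residual(L, y, x, h=1e-4):
    """Approximate d/dx(partial L / partial y') - partial L / partial y at x."""
    dy = lambda t: central_diff(y, t, h)
    # Step 2: partial derivatives, moving one slot while the others are frozen.
    L_y = lambda t: central_diff(lambda u: L(t, u, dy(t)), y(t), h)
    L_p = lambda t: central_diff(lambda q: L(t, y(t), q), dy(t), h)
    # Step 3: total derivative, letting x, y(x), and y'(x) move together.
    # Step 4: assemble the Euler-Lagrange expression.
    return central_diff(L_p, x, h) - L_y(x)

# Step 1: a hypothetical integrand, with p standing in for y'.
def L(x, y, p):
    return 0.5 * p * p + 0.5 * y * y

# Step 5: the solution of y'' = y with y(0) = 0 and y(1) = 1.
solution = lambda x: math.sinh(x) / math.sinh(1.0)
straight = lambda x: x   # same endpoints, but not a solution

for x in (0.3, 0.5, 0.7):
    print(x, el_residual(L, solution, x), el_residual(L, straight, x))
```

The residual is close to zero along the true solution and equals roughly -x along the straight line, which meets the endpoint conditions but fails the equation.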

Two honest cautions. First, Euler–Lagrange is only a NECESSARY condition: it locates a stationary function the way dy/dx = 0 locates a stationary point, but a solution might be a minimum, a maximum, or a saddle, and proving it truly minimizes needs a second-order test (the analogue of the Volume I second-derivative test). Second, the derivation quietly assumed the minimizer is smooth enough to be twice differentiable and that you may differentiate under the integral; for badly behaved L or non-smooth competitors those steps need more care. The equation is powerful and central, not magic.
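
To see the first caution in action, here is a minimal sketch under an assumed integrand L = (y')^2 - y^2 (a hypothetical choice, not one from this guide). The flat curve y = 0 satisfies its Euler–Lagrange equation y'' + y = 0 on any interval and gives J = 0, yet whether it actually minimizes J depends on how long the interval is.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def J_of_wiggle(length, eps):
    """J[eps * sin(pi x / length)] on [0, length] for L = (y')^2 - y^2."""
    k = math.pi / length
    integrand = lambda x: (eps * k * math.cos(k * x)) ** 2 - (eps * math.sin(k * x)) ** 2
    return integrate(integrand, 0.0, length)

short_interval = J_of_wiggle(1.0, 0.1)           # compare with J[0] = 0
long_interval = J_of_wiggle(2.0 * math.pi, 0.1)  # compare with J[0] = 0

print("interval [0, 1]:    J =", short_interval)
print("interval [0, 2 pi]: J =", long_interval)
```

On the short interval the wiggle raises J above zero, consistent with y = 0 being a minimum there. On the long interval the same kind of wiggle drives J below zero, so y = 0 is stationary without being a minimum.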