From a zoo of eigenvalue problems to one master form
By now you have met the eigenvalue problem several times: a second-order equation with a parameter lambda, plus boundary conditions at two ends, that has nonzero solutions only for a special discrete list of lambda values. In the previous guide you found, for y'' + lambda y = 0 on [0, L] with y(0) = y(L) = 0, the eigenvalues lambda_n = (n pi / L)^2 and the eigenfunctions sin(n pi x / L). Lovely — but it can feel like a lucky accident of that one tidy equation. Sturm-Liouville theory exists to show it is no accident at all.
The key move is to stop looking at each equation's accidental coefficients and instead force every linear second-order equation into one canonical shape. The Sturm-Liouville form writes the eigenvalue problem as (p(x) y')' + q(x) y + lambda w(x) y = 0, or equivalently -(p y')' - q y = lambda w y. Read it slowly: the highest-derivative term is wrapped inside a single outer derivative, (p y')', rather than left as a bare p y''. That wrapping is not cosmetic; it is the entire reason the beautiful properties follow. Here p(x) > 0 and w(x) > 0 are given positive functions, and w is called the weight.
Any second-order equation can be put in this form
Here is the reassuring part: you do not have to be handed an equation already in Sturm-Liouville form. ANY linear second-order equation a(x) y'' + b(x) y' + c(x) y + lambda d(x) y = 0 can be massaged into it by multiplying through by a single carefully chosen function. The trick is the same idea as the integrating factor you used for first-order equations, only now its job is to make the first two terms collapse into one exact derivative (p y')'.
Concretely: divide by a(x) to get y'' + (b/a) y' + ..., then multiply the whole equation by mu(x) = exp(integral of (b/a) dx). Because mu' = mu (b/a), the leading part mu y'' + mu (b/a) y' is exactly (mu y')', the product rule run backwards. So p(x) = mu(x), and the weight w(x) is whatever sits in front of lambda after the same two steps, namely mu d / a. Applied directly to the original equation, the net multiplier is mu / a. The recipe never fails as long as a(x) does not vanish on the interval (and the weight comes out positive when d/a > 0); this is why we can honestly say Sturm-Liouville form is not a special case but a universal disguise that every regular second-order eigenvalue problem can wear.
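As a numerical sanity check of the recipe, here is a minimal sketch that computes the integrating factor exp(integral of (b/a) dx) by Simpson's rule for Legendre's equation and reads off p, q and w. The helper names `simpson` and `sl_coefficients`, the base point x0 = 0 (which fixes the arbitrary constant of integration so that p(0) = 1) and the step count are illustrative choices, not part of the guide.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def sl_coefficients(a, b, c, d, x, x0=0.0):
    """Return (p, q, w) at x for a y'' + b y' + c y + lambda d y = 0.

    Assumes a does not vanish between x0 and x.
    """
    p = math.exp(simpson(lambda t: b(t) / a(t), x0, x))  # integrating factor
    return p, p * c(x) / a(x), p * d(x) / a(x)

# Legendre: (1 - x^2) y'' - 2x y' + lambda y = 0
a = lambda x: 1 - x * x
b = lambda x: -2 * x
c = lambda x: 0.0
d = lambda x: 1.0

p, q, w = sl_coefficients(a, b, c, d, 0.6)
print(p, q, w)  # p should be close to 1 - 0.6**2 = 0.64, q is 0, w close to 1
```

The closed-form integral is of course nicer when you can find it; the numerical version is only meant to show that nothing in the recipe depends on being able to do so.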
start:        a(x) y'' + b(x) y' + c(x) y + lambda d(x) y = 0
multiply by:  mu(x) / a(x),  where mu(x) = exp( integral (b/a) dx )
result:       ( p y' )' + q y + lambda w y = 0
              p = mu
              q = mu * c / a
              w = mu * d / a    (the weight)
example:      Legendre   (1 - x^2) y'' - 2x y' + lambda y = 0
              b/a = -2x/(1 - x^2),  exp(integral) = 1 - x^2
              => ( (1 - x^2) y' )' + lambda y = 0,   p = 1 - x^2,  w = 1
Symmetry, and why orthogonality is forced
Now we cash in. Define the operator L[y] = -(p y')' - q y, so the problem (p y')' + q y + lambda w y = 0 reads L[y] = lambda w y. The whole edifice rests on one identity, Lagrange's identity, which says that for two functions u and v, the combination u L[v] - v L[u] is itself a perfect derivative: the q terms cancel, and what is left equals -d/dx[ p (u v' - u' v) ]. Integrate both sides across the interval [a, b] (from here on a and b are the endpoints, not the coefficient functions used earlier): the right side becomes a pure boundary term, evaluated only at the two ends.
Here is the punchline. For the standard boundary conditions of a regular Sturm-Liouville problem (fixed ends, free ends, or periodic ends with p(a) = p(b)), that boundary term vanishes exactly whenever u and v both satisfy them. What remains is integral of (u L[v] - v L[u]) dx = 0, the statement that L is self-adjoint: it can hop from one slot to the other inside an integral with no leftover, precisely the way a symmetric matrix satisfies u^T A v = v^T A u. Everything good now follows from this one equation, the same way every property of symmetric matrices flows from A = A^T.
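To see the boundary term at work, the sketch below integrates u L[v] - v L[u] numerically on [0, 1]. Everything specific in it is an assumption made for illustration: the coefficient p(x) = 1 + x, the test functions, and the helper names `simpson`, `L` and `swap_defect`. The q y term is left out because it cancels in the combination anyway.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

# Illustrative coefficient with p > 0 on [0, 1], and its derivative.
p = lambda x: 1.0 + x
dp = lambda x: 1.0

def L(f):
    """f = (y, y', y''); return the function x -> -(p y')'(x)."""
    _, dy, d2y = f
    return lambda x: -(dp(x) * dy(x) + p(x) * d2y(x))

def swap_defect(f, g):
    """Integral over [0, 1] of f L[g] - g L[f]."""
    Lf, Lg = L(f), L(g)
    return simpson(lambda x: f[0](x) * Lg(x) - g[0](x) * Lf(x), 0.0, 1.0)

# u and v vanish at both ends (fixed-end conditions); g does not vanish at x = 1.
u = (lambda x: x * (1 - x), lambda x: 1 - 2 * x, lambda x: -2.0)
v = (lambda x: math.sin(math.pi * x),
     lambda x: math.pi * math.cos(math.pi * x),
     lambda x: -math.pi ** 2 * math.sin(math.pi * x))
g = (lambda x: x, lambda x: 1.0, lambda x: 0.0)

good = swap_defect(u, v)  # boundary term vanishes: result close to 0
bad = swap_defect(u, g)   # boundary term survives: close to -2 = -[p (u g' - u' g)] at the ends
print(good, bad)
```

The second number is the point of the exercise: the operator is the same, but a function that ignores the boundary conditions leaves a leftover, so self-adjointness is a property of the operator together with its boundary conditions.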
Apply self-adjointness to two eigenfunctions y_m and y_n with different eigenvalues lambda_m and lambda_n. Put u = y_n and v = y_m, and substitute L[y_m] = lambda_m w y_m and L[y_n] = lambda_n w y_n: the integral of (y_n L[y_m] - y_m L[y_n]) dx becomes (lambda_m - lambda_n) times integral of w y_m y_n dx, and self-adjointness says it is 0. Since the eigenvalues differ, the integral itself must be zero. That is orthogonality with a weight: integral of w(x) y_m(x) y_n(x) dx = 0 whenever m is not n. The eigenfunctions are perpendicular, not in the ordinary sense, but in the inner product weighted by w. This is the central treasure of the theory, and it is what makes the next guide's eigenfunction expansions possible.
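A small example makes the role of the weight concrete. The problem below is not from the guide; it is a standard illustration chosen because its weight is not 1: x^2 y'' + x y' + lambda y = 0 on [1, e] with y(1) = y(e) = 0, whose Sturm-Liouville form is (x y')' + lambda (1/x) y = 0, so p = x and w = 1/x, with eigenfunctions sin(n pi ln x). The helper names are hypothetical.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def y(n):
    """n-th eigenfunction of (x y')' + lambda (1/x) y = 0, y(1) = y(e) = 0."""
    return lambda x: math.sin(n * math.pi * math.log(x))

w = lambda x: 1.0 / x  # the weight read off from the Sturm-Liouville form

def inner(f, g, weight):
    """Weighted inner product on [1, e]."""
    return simpson(lambda x: weight(x) * f(x) * g(x), 1.0, math.e)

weighted = inner(y(1), y(2), w)                # close to 0: orthogonal with the weight
unweighted = inner(y(1), y(2), lambda x: 1.0)  # clearly nonzero without it
norm_sq = inner(y(1), y(1), w)                 # close to 1/2
print(weighted, unweighted, norm_sq)
```

Dropping the weight does not give an approximately zero answer; it gives a plainly nonzero one, which is why the weight has to be read off from the Sturm-Liouville form rather than guessed.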
The guarantees: real, ordered, complete
Self-adjointness hands you a package of guarantees, all proved once and for all rather than rediscovered example by example. First, every eigenvalue is REAL — no surprise complex frequencies sneak in, which is exactly what you want for a physical vibration or temperature. Second, the eigenvalues form an infinite increasing sequence lambda_1 < lambda_2 < lambda_3 < ... marching off to infinity, with no accumulation in the middle and none missing. There is a genuine 'first' eigenvalue and a clean ladder above it.
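The ladder can be watched numerically with a shooting method, sketched here under stated assumptions: fixed ends y(a) = y(b) = 0, eigenvalues that are nonnegative (so the scan may start at 0), and gaps wider than the scan step. The function names, step counts and the RK4 integrator are illustrative choices. The test case is the problem y'' + lambda y = 0 on [0, 1] from the previous guide, where the answers should be close to (n pi)^2.

```python
import math

def end_value(lam, p, q, w, a, b, steps=400):
    """Integrate (p y')' + q y + lam w y = 0 from a with y(a) = 0, p y'(a) = 1
    by classical RK4 and return y(b). Here v stands for p y'."""
    h = (b - a) / steps
    x, y, v = a, 0.0, 1.0

    def rhs(x, y, v):
        return v / p(x), -(q(x) + lam * w(x)) * y

    for _ in range(steps):
        k1 = rhs(x, y, v)
        k2 = rhs(x + h / 2, y + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(x + h / 2, y + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(x + h, y + h * k3[0], v + h * k3[1])
        y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return y

def eigenvalues(p, q, w, a, b, lam_max, scan_step=1.0):
    """Scan lambda upward from 0; bisect wherever y(b) changes sign."""
    found = []
    lo, f_lo = 0.0, end_value(0.0, p, q, w, a, b)
    while lo < lam_max:
        hi = lo + scan_step
        f_hi = end_value(hi, p, q, w, a, b)
        if f_lo * f_hi < 0:
            left, right, f_left = lo, hi, f_lo
            for _ in range(40):
                mid = 0.5 * (left + right)
                f_mid = end_value(mid, p, q, w, a, b)
                if f_left * f_mid <= 0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            found.append(0.5 * (left + right))
        lo, f_lo = hi, f_hi
    return found

one = lambda x: 1.0
zero = lambda x: 0.0
lams = eigenvalues(one, zero, one, 0.0, 1.0, lam_max=100.0)
print(lams)                                      # three values below 100
print([(n * math.pi) ** 2 for n in (1, 2, 3)])   # exact values for comparison
```

The search only ever looks along the real line, and that is enough: the theory guarantees there is nothing off it to miss.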
Third — and this is the deepest gift — comes completeness. The eigenfunctions are not merely orthogonal; they are a complete set, meaning ANY reasonable function on [a, b] can be written as an infinite combination sum c_n y_n(x) of them. This is the abstract engine under Fourier series: sines and cosines work because they are the eigenfunctions of one particular Sturm-Liouville problem, and they span the space because the theory guarantees every such family does. The orthogonality from the previous section is precisely what lets you compute each coefficient c_n by a single integral, with no solving of simultaneous equations.
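Here is what "a single integral per coefficient" looks like in practice, as a minimal sketch. It expands the sample function f(x) = x(1 - x) on [0, 1] in the eigenfunctions sin(n pi x) with weight w = 1, using c_n = (integral of w f y_n) / (integral of w y_n^2). The sample function, the cutoff of 15 terms and the helper names are assumptions made for illustration.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def coefficient(f, y, w, a, b):
    """c = <f, y>_w / <y, y>_w, one integral ratio per eigenfunction."""
    num = simpson(lambda x: w(x) * f(x) * y(x), a, b)
    den = simpson(lambda x: w(x) * y(x) ** 2, a, b)
    return num / den

f = lambda x: x * (1 - x)
w = lambda x: 1.0
modes = [(lambda x, n=n: math.sin(n * math.pi * x)) for n in range(1, 16)]
coeffs = [coefficient(f, y, w, 0.0, 1.0) for y in modes]

def partial_sum(x):
    """Truncated eigenfunction expansion of f."""
    return sum(c * y(x) for c, y in zip(coeffs, modes))

print(coeffs[0], 8 / math.pi ** 3)  # first coefficient vs. its exact value
print(partial_sum(0.3), f(0.3))     # 15-term sum vs. the function itself
```

No linear system is solved anywhere: each coefficient is computed independently of the others, which is exactly what orthogonality buys.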
There is also a structural bonus worth naming: the eigenfunction y_n has exactly n - 1 interior zeros (the Sturm oscillation theorem). The first eigenfunction never crosses zero inside the interval, the second crosses once, the third twice, and so on — higher eigenvalue means a more wiggly shape. Physically this is the overtone series: the fundamental mode of a string is a single arch, the next harmonic has one node, the next two. The mathematics has quietly encoded the picture of vibration before you ever drew it.
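The zero count is easy to check for the eigenfunctions sin(n pi x / L) from the previous guide. The sketch below counts sign changes on a grid of interior sample points; the function name, the choice L = 1 and the midpoint grid (picked so that no sample lands exactly on a zero for these five modes) are illustrative assumptions.

```python
import math

def interior_sign_changes(y, a, b, samples=1000):
    """Count sign changes of y on midpoints of a uniform grid inside (a, b)."""
    xs = [a + (b - a) * (k + 0.5) / samples for k in range(samples)]
    vals = [y(x) for x in xs]
    return sum(1 for s, t in zip(vals, vals[1:]) if s * t < 0)

L = 1.0
counts = [
    interior_sign_changes(lambda x, n=n: math.sin(n * math.pi * x / L), 0.0, L)
    for n in range(1, 6)
]
print(counts)  # [0, 1, 2, 3, 4]: the n-th eigenfunction has n - 1 interior zeros
```

A grid count like this can miss zeros that are closer together than the grid spacing, so for high modes the sample count would need to grow with n.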
The fine print: regular, singular, and what can break
Be honest about the conditions, because the clean guarantees are not unconditional. The full package — real, simple, increasing eigenvalues and orthogonal, complete eigenfunctions — is proved for a regular Sturm-Liouville problem: a finite interval [a, b], with p(x) > 0 and w(x) > 0 holding on the WHOLE closed interval including the endpoints, and 'separated' boundary conditions at each end. Break any of those and the theorem you are quoting may no longer be the theorem that applies.
The most important real-world cases are deliberately NOT regular — they are singular Sturm-Liouville problems. Bessel's equation has p(x) = x, which vanishes at the endpoint x = 0; Legendre's has p(x) = 1 - x^2, vanishing at both x = +/-1. When p hits zero at an end, or the interval is infinite, the tidy boundary condition is replaced by a softer demand — usually that the solution merely stay BOUNDED at the troublesome end. Remarkably, much of the good behaviour survives: you still get orthogonality (with the right weight) and a workable expansion, but the spectrum can change character — sometimes the eigenvalues become a continuum instead of a discrete list, as for the Fourier transform on the whole line.
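Legendre's equation shows the surviving orthogonality nicely. Its bounded solutions on [-1, 1] are the Legendre polynomials P_n, and since w = 1 they should be orthogonal in the plain, unweighted sense even though p = 1 - x^2 vanishes at both ends. The sketch below checks this numerically; building P_n from the three-term (Bonnet) recurrence and the helper names are choices made here for illustration.

```python
def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def legendre(n, x):
    """P_n(x) from (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr

def inner(m, n):
    """Integral of P_m P_n over [-1, 1] with weight w = 1."""
    return simpson(lambda x: legendre(m, x) * legendre(n, x), -1.0, 1.0)

cross = inner(2, 4)    # close to 0, and not merely by odd/even symmetry
norm_sq = inner(3, 3)  # close to 2 / (2*3 + 1) = 2/7
print(cross, norm_sq)
```

Notice that no boundary values were imposed at x = -1 or x = 1: polynomials are automatically bounded there, and boundedness is the whole boundary condition in this singular case.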
One last caution against over-reading the theory. Orthogonality and completeness say nothing about finding the eigenfunctions in CLOSED FORM — for most Sturm-Liouville problems there is no elementary formula, just as most ODEs have no closed-form solution, and the eigenfunctions are special functions defined by the very problem (Bessel, Legendre, Hermite, ...). The theory is a guarantee about STRUCTURE, not a shortcut to formulas. What it promises is that the structure — real spectrum, weighted orthogonality, completeness — is always there to be exploited, whether or not you can write the eigenfunctions down. That structure is exactly what the next guide turns into a working expansion.