Separation of Variables

The bold guess that makes a PDE tractable

You have learned to classify a partial differential equation and to read off whether it is parabolic, hyperbolic, or elliptic. Now comes the first real solving technique, and it is the one you will reach for again and again: separation of variables. The strategy is almost cheeky in its boldness. A PDE couples two independent variables — for the heat equation in a rod, position x and time t, knotted together by the partial derivatives. Rather than untangle them, we simply guess that the knot was never there: we look for solutions that are a product of a function of x alone times a function of t alone, u(x, t) = X(x) T(t). It seems too much to ask. Remarkably, for the right problems, it works.

Make the picture concrete. Take a thin metal rod of length L, laid along the x-axis from 0 to L, with both ends clamped in ice so they are held at temperature zero. At the initial instant t = 0 you hand it some temperature profile f(x), so that u(x, 0) = f(x); maybe the middle is hot and the ends are cold. The heat equation, du/dt = k d^2u/dx^2 with k > 0 the thermal diffusivity (both derivatives are partial derivatives), governs every later moment: the rate of change in time at each point is proportional to the curvature in space there. Where the profile bulges upward (concave down, negative second partial derivative) the temperature falls; where it dips (concave up) it rises. Heat flows from hot to cold and the bumps relax. Our task is to predict the whole future u(x, t) from that one starting shape f(x).

Substituting the guess and prying the variables apart

Take the guess u = X(x) T(t) and feed it into du/dt = k d^2u/dx^2. The time derivative only touches T, so du/dt = X(x) T'(t). The two spatial derivatives only touch X, so d^2u/dx^2 = X''(x) T(t). The heat equation becomes X T' = k X'' T. Every term is now a product of an x-thing and a t-thing, and we can sort them: divide both sides by k X T to gather all the t-dependence on one side and all the x-dependence on the other. Out comes T'/(k T) = X''/X. Stare at this equation, because the next sentence is the whole trick.

The left side depends only on t; the right side depends only on x. Yet they are equal for all x and all t at once. The only way a pure function of t can equal a pure function of x everywhere is for both to be the same constant: hold x fixed and vary t, and the right side stays put, so the left side cannot change either; hold t fixed and vary x, and the same argument freezes the right side. So each side equals a fixed number. We name it minus lambda, written T'/(k T) = X''/X = -lambda, the minus sign chosen with foresight because (as we will see) the boundary conditions force lambda to be positive. In one stroke a single PDE in two variables has fissioned into two separate ordinary differential equations: X'' + lambda X = 0 and T' + k lambda T = 0.

The boundary conditions select the allowed modes

Now we solve the space equation X'' + lambda X = 0 — but not in a vacuum. The ends of the rod are held at zero, u(0, t) = 0 and u(L, t) = 0 for all t, and since u = X T this forces X(0) = 0 and X(L) = 0 (we discard the trivial T = 0). These are homogeneous Dirichlet boundary conditions on X. The ODE X'' + lambda X = 0 has a characteristic equation r^2 + lambda = 0 you know well from second-order linear ODEs. Its solution depends on the sign of lambda, and checking the three cases is what kills off all but a special discrete family.

X'' + lambda X = 0,   X(0) = 0,   X(L) = 0

Case lambda < 0  (write lambda = -mu^2):  X = A cosh(mu x) + B sinh(mu x)
   X(0)=0 => A = 0;  X(L)=0 => B sinh(mu L) = 0 => B = 0, since sinh(mu L) != 0 for mu > 0.   Only X == 0.  Rejected.

Case lambda = 0:                          X = A + B x
   X(0)=0 => A = 0;  X(L)=0 => B L = 0 => B = 0.            Only X == 0.  Rejected.

Case lambda > 0  (write lambda = mu^2):   X = A cos(mu x) + B sin(mu x)
   X(0)=0 => A = 0;  X(L)=0 => B sin(mu L) = 0.
   Nontrivial (B != 0) needs sin(mu L) = 0 => mu L = n pi,  n = 1, 2, 3, ...

   ==> lambda_n = (n pi / L)^2,    X_n(x) = sin(n pi x / L)

Only positive lambda survives. The clamped ends behave like a guitar string fixed at both bridges: only whole numbers of half-waves fit, so lambda is forced onto the discrete ladder lambda_n = (n pi / L)^2 with shapes X_n = sin(n pi x / L).
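
The filtering can also be watched numerically. The sketch below is only an illustration: the helper name X_at_L and the normalization X'(0) = 1 are choices made here, not part of the derivation. For a given lambda it takes the solution of X'' + lambda X = 0 that already satisfies X(0) = 0 and reports its value at the far end x = L. The second boundary condition holds only where that value vanishes.

```python
import math

def X_at_L(lam, L=1.0):
    """Value at x = L of the solution of X'' + lam X = 0 with X(0) = 0, X'(0) = 1.

    The normalization X'(0) = 1 is an assumption made for this sketch; any
    nonzero slope would give the same zeros.
    """
    if lam > 0:
        mu = math.sqrt(lam)
        return math.sin(mu * L) / mu       # oscillatory case
    if lam < 0:
        mu = math.sqrt(-lam)
        return math.sinh(mu * L) / mu      # exponential case, never zero for L > 0
    return L                               # lam = 0: X = x

L = 1.0
for lam in (-4.0, 0.0, 5.0, (math.pi / L) ** 2, (2 * math.pi / L) ** 2):
    print(lam, X_at_L(lam, L))
```

Only the last two trial values, which sit on the ladder (n pi / L)^2, give an end value that is zero up to rounding.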

Read what just happened. The boundary conditions acted as a filter: out of a continuum of possible lambda, only the discrete ladder lambda_n = (n pi / L)^2 for n = 1, 2, 3, ... passes through, each with its own spatial shape X_n(x) = sin(n pi x / L). These special numbers are the eigenvalues and the sine shapes are the eigenfunctions of the problem — the natural modes of the rod, fixed entirely by its geometry and how its ends are held. They are precisely the standing-wave shapes a string of length L can hold: one hump, two humps, three. With each X_n in hand, the time equation T' + k lambda_n T = 0 is a one-line separable first-order ODE whose solution is a pure exponential decay, T_n(t) = e^(-k lambda_n t). So one separated solution is u_n(x, t) = sin(n pi x / L) e^(-k (n pi / L)^2 t).
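
A quick sanity check on a separated solution, written as a minimal Python sketch. The function names, the sample point, and the step size h are assumptions chosen for illustration. It estimates du/dt - k d^2u/dx^2 for u_n by central differences and also evaluates u_n at both ends of the rod.

```python
import math

def mode(n, x, t, L=1.0, k=1.0):
    """Separated solution u_n(x, t) = sin(n pi x / L) * exp(-k (n pi / L)^2 t)."""
    lam = (n * math.pi / L) ** 2           # eigenvalue lambda_n
    return math.sin(n * math.pi * x / L) * math.exp(-k * lam * t)

def pde_residual(n, x, t, L=1.0, k=1.0, h=1e-4):
    """Central-difference estimate of u_t - k u_xx for the n-th mode."""
    u_t = (mode(n, x, t + h, L, k) - mode(n, x, t - h, L, k)) / (2 * h)
    u_xx = (mode(n, x + h, t, L, k) - 2 * mode(n, x, t, L, k)
            + mode(n, x - h, t, L, k)) / h ** 2
    return u_t - k * u_xx

for n in (1, 2, 3):
    # residual should be small; the two end values should be zero up to rounding
    print(n, pde_residual(n, x=0.3, t=0.05), mode(n, 0.0, 0.05), mode(n, 1.0, 0.05))
```

The residual is not exactly zero because finite differences carry truncation and rounding error, but it should be tiny compared with the size of the individual terms.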

Superposition: building any shape from the modes

Each u_n alone is a genuine solution, but a single sine almost never matches the messy initial profile f(x) you were handed. Here the linearity of the heat equation rescues us. Because the equation is linear and homogeneous, the superposition principle applies: any sum of solutions is again a solution. So we form the most general combination, u(x, t) = sum over n of b_n sin(n pi x / L) e^(-k (n pi / L)^2 t), with constants b_n still free. This sum automatically satisfies the PDE and both boundary conditions for any choice of the b_n. Only one condition remains unused — the initial shape — and it is exactly enough to fix every b_n.

Set t = 0. Every decay factor becomes e^0 = 1, and the requirement u(x, 0) = f(x) reads f(x) = sum of b_n sin(n pi x / L). That is a Fourier sine series for f on the interval [0, L] — the half-range expansion from the Fourier guides, now arriving with a purpose. The coefficients are not guessed; they are computed by the orthogonality of the eigenfunctions. The sines sin(n pi x / L) for different n are mutually orthogonal: the definite integral of sin(n pi x / L) sin(m pi x / L) over [0, L] is zero whenever n is not m, and equals L/2 when they match. Multiply both sides by sin(m pi x / L), integrate from 0 to L, and the entire sum collapses — every term but the m-th integrates to zero — leaving b_m = (2/L) integral from 0 to L of f(x) sin(m pi x / L) dx.
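
The orthogonality relation and the coefficient formula can both be checked with a few lines of numerical integration. This is a sketch under stated assumptions: the Simpson-rule helper, the function names, and the sample profile f(x) = x (L - x) are all introduced here for illustration and do not come from the text above.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_overlap(n, m, L=1.0):
    """Integral over [0, L] of sin(n pi x / L) * sin(m pi x / L)."""
    return simpson(lambda x: math.sin(n * math.pi * x / L)
                   * math.sin(m * math.pi * x / L), 0.0, L)

def sine_coefficient(f, n, L=1.0):
    """b_n = (2/L) * integral over [0, L] of f(x) sin(n pi x / L) dx."""
    return (2.0 / L) * simpson(
        lambda x: f(x) * math.sin(n * math.pi * x / L), 0.0, L)

L = 1.0
f = lambda x: x * (L - x)      # hypothetical profile: hot middle, cold ends

print(sine_overlap(2, 3, L), sine_overlap(2, 2, L))    # about 0 and about L/2
print([sine_coefficient(f, n, L) for n in (1, 2, 3)])
```

For this particular f the even-numbered coefficients come out as zero up to rounding, because the profile is symmetric about the middle of the rod while the even sine modes are antisymmetric about it.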

1. Separate: substitute u = X(x) T(t), divide by k X T, and set both sides equal to the constant -lambda. The PDE splits into X'' + lambda X = 0 and T' + k lambda T = 0.
2. Solve the eigenvalue problem: impose X(0) = X(L) = 0 on the space ODE; only lambda_n = (n pi / L)^2 survive, with eigenfunctions X_n = sin(n pi x / L).
3. Solve the time ODE for each mode: T_n(t) = e^(-k lambda_n t), a pure exponential decay with rate set by lambda_n.
4. Superpose and fit the data: sum b_n X_n(x) T_n(t), then set t = 0 and pick the b_n so the sum equals f(x). They are the Fourier sine coefficients b_n = (2/L) integral from 0 to L of f(x) sin(n pi x / L) dx.
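
The recipe translates almost line for line into code. What follows is a minimal sketch, not a production solver. The truncation at N = 50 terms, the Simpson-rule quadrature, the function names, and the sample profile f(x) = x (L - x) are all assumptions made here for illustration.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_coefficients(f, L, N):
    """Fit the data: Fourier sine coefficients b_1..b_N of the initial profile f."""
    return [(2.0 / L) * simpson(
                lambda x, n=n: f(x) * math.sin(n * math.pi * x / L), 0.0, L)
            for n in range(1, N + 1)]

def heat_series(b, x, t, L, k):
    """Superpose: truncated sum of b_n sin(n pi x / L) exp(-k (n pi / L)^2 t)."""
    total = 0.0
    for n, b_n in enumerate(b, start=1):
        lam = (n * math.pi / L) ** 2       # eigenvalue lambda_n
        total += b_n * math.sin(n * math.pi * x / L) * math.exp(-k * lam * t)
    return total

L, k, N = 1.0, 1.0, 50
f = lambda x: x * (L - x)                  # hypothetical initial profile
b = sine_coefficients(f, L, N)             # computed once, reused for every (x, t)

for t in (0.0, 0.01, 0.1):
    print(t, heat_series(b, 0.5 * L, t, L, k))   # temperature at the midpoint
```

Computing the coefficients once and reusing them is a deliberate choice: the b_n depend only on f, so evaluating u at a new point or time costs just one pass over the N terms.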

Reading the answer, and reading its fine print

The finished solution is u(x, t) = sum of b_n sin(n pi x / L) e^(-k (n pi / L)^2 t), and it tells a vivid physical story. Each mode decays exponentially, but the decay rate is k (n pi / L)^2, proportional to n squared. The wiggly high-n modes, with their sharp curvature, melt away ferociously fast; the gentle first mode, sin(pi x / L), fades the slowest. Relative to the first mode, mode n shrinks by the factor e^(-k (n^2 - 1) (pi / L)^2 t). So whatever jagged profile you start with (provided b_1 is not zero), after a time of order L^2 / (k pi^2) the rod has smoothed into the shape of that lowest single hump, which then sags quietly toward zero. This is the mathematical face of a fact you know in your bones: sharp temperature differences even out first, and heat always blurs detail. The eigenfunction expansion has turned a single PDE into an infinite orchestra of independently decaying notes.
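
To put numbers on the n-squared scaling, here is a small Python sketch. The function names and the reference time t_star = L^2 / (k pi^2), which is one e-folding time of the first mode, are choices made here for illustration.

```python
import math

def decay_rate(n, L=1.0, k=1.0):
    """Decay rate k * lambda_n = k (n pi / L)^2 of the n-th mode."""
    return k * (n * math.pi / L) ** 2

def relative_amplitude(n, t, L=1.0, k=1.0):
    """Factor by which mode n has shrunk relative to mode 1 after time t."""
    return math.exp(-(decay_rate(n, L, k) - decay_rate(1, L, k)) * t)

t_star = 1.0 / decay_rate(1)     # assumed reference time: L^2 / (k pi^2)
for n in (1, 2, 3, 5):
    print(n, decay_rate(n), relative_amplitude(n, t_star))
```

By the time the first mode has fallen by a single factor of e, the second mode has lost a further factor of e^3 relative to it, and the higher modes far more.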

A word of honesty about what the series does and does not promise. The b_n are exactly the Fourier sine coefficients, so if f has a jump, that initial slice of the series carries the same Gibbs overshoot you met before: at t = 0 the truncated sum overshoots next to the jump. But here the physics is forgiving: for any t > 0, the factor e^(-k (n pi / L)^2 t) crushes the high-n terms so violently that the series becomes infinitely smooth the instant heat starts flowing. The Gibbs ringing lives only at the initial instant and is gone for all later time. This 'instant smoothing' is special to the parabolic heat equation; the wave equation, solved by the very same separation method, has no such damping factor. Its modes oscillate forever instead of decaying, so a jump or corner in the initial data rides along undimmed. Same method, very different physics.
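
The contrast between t = 0 and t > 0 is easy to see numerically. The sketch below assumes a rod that starts uniformly at temperature 1, so f = 1 on (0, L). That profile disagrees with the zero boundary values, which puts a jump at each end, and its sine coefficients are b_n = 4 / (n pi) for odd n and 0 for even n. The truncation N = 199, the grid size, and the later time 1e-3 are illustrative choices, not values from the text.

```python
import math

def partial_sum(x, t, N, L=1.0, k=1.0):
    """Truncated series for f = 1 on (0, L): b_n = 4/(n pi) for odd n, 0 for even n."""
    total = 0.0
    for n in range(1, N + 1, 2):           # odd n only
        lam = (n * math.pi / L) ** 2
        total += (4.0 / (n * math.pi)) * math.sin(n * math.pi * x / L) \
                 * math.exp(-k * lam * t)
    return total

def peak(t, N=199, points=2001, L=1.0, k=1.0):
    """Largest value of the truncated series on a uniform grid over [0, L]."""
    return max(partial_sum(i * L / (points - 1), t, N, L, k)
               for i in range(points))

overshoot = peak(0.0)     # Gibbs: exceeds the true maximum of 1 near the ends
later = peak(1e-3)        # after a short time the overshoot is gone
print(overshoot, later)
```

Adding more terms at t = 0 squeezes the overshoot closer to the ends of the rod but does not shrink its height, whereas at any positive time the exponential factors make the extra terms negligible.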

Two more honest caveats before you carry this off. First, the recipe leaned on homogeneous boundary conditions, both ends at zero. If instead the ends are held at fixed nonzero temperatures, the recipe fails as it stands: a condition such as X(0) T(t) = a with a nonzero would force T to be constant, and a sum of functions that each satisfy a nonzero boundary condition no longer satisfies it, so superposition breaks. The standard fix is to first subtract off the steady-state solution (the straight line the rod settles to) so the leftover problem has zero boundaries again, then separate that. Second, that this product-then-superpose construction reproduces every reasonable f, and is the only solution, is not obvious by waving hands; it rests on real theorems. The completeness of the sine eigenfunctions guarantees the expansion exists, and the maximum principle guarantees that the solution is unique and depends continuously on the data, so the answer you built is the answer. The method is a faithful servant precisely because that theory stands behind it.
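
The steady-state subtraction can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the end temperatures Ta = 10 and Tb = 30, the initially cold rod f = 0, the truncation N = 50, and the helper names. The idea is to write u = s + v, where s is the straight line joining the end temperatures and v solves the zero-boundary problem with initial profile f - s.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def steady_state(x, Ta, Tb, L):
    """Straight line joining the end temperatures: the profile the rod settles to."""
    return Ta + (Tb - Ta) * x / L

def solve_with_fixed_ends(f, Ta, Tb, L, k, N=50):
    """Return a function u(x, t) for ends held at Ta and Tb, via v = u - steady_state."""
    # v has zero boundary values and initial profile f - steady_state,
    # so the sine-series recipe applies to it unchanged.
    b = [(2.0 / L) * simpson(
            lambda x, n=n: (f(x) - steady_state(x, Ta, Tb, L))
                           * math.sin(n * math.pi * x / L), 0.0, L)
         for n in range(1, N + 1)]

    def u(x, t):
        v = sum(b_n * math.sin(n * math.pi * x / L)
                * math.exp(-k * (n * math.pi / L) ** 2 * t)
                for n, b_n in enumerate(b, start=1))
        return steady_state(x, Ta, Tb, L) + v

    return u

# Hypothetical data: rod initially at 0 everywhere, ends then held at 10 and 30.
u = solve_with_fixed_ends(lambda x: 0.0, Ta=10.0, Tb=30.0, L=1.0, k=1.0)
print(u(0.5, 0.05), u(0.5, 2.0))     # the midpoint relaxes toward the steady value
```

At large t the decaying part v is negligible and u approaches the straight line. Near t = 0 the truncated series converges slowly for this data, because f - s does not vanish at the ends.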