First-Order PDEs & Characteristics

A PDE is a statement about a slope

Take the simplest possible partial differential equation of first order: du/dt + c du/dx = 0, with c a constant. Stare at the left side and ask what it is really measuring. Recall from Volume I that the partial derivative du/dt is how fast u changes as you step forward in time holding x fixed, and du/dx is how fast u changes as you step rightward in space holding t fixed. The equation glues them together: it says u is NOT free to wiggle however it likes — at every point of the (x, t) plane, a particular combination of its two slopes must cancel to zero. That is a constraint on the shape of the whole surface u(x, t), and our whole job is to decode what shape obeys it.

Here is the spark. The combination du/dt + c du/dx is exactly the chain rule in disguise. Suppose you are not standing still but walking through the plane along some path x = x(t), and you watch u change as you go. The total rate you see is d/dt of u(x(t), t) = du/dt + (dx/dt) du/dx, where du/dt and du/dx on the right are still the partial derivatives. Compare that with the PDE: the two expressions become identical the moment you choose to walk at speed dx/dt = c. So along the special path moving rightward at speed c (taking c > 0), the PDE is no longer saying 'a mysterious slope combination is zero'; it is saying the plain, ordinary thing du/dt-along-the-path = 0. The function does not change at all as you ride that path.

Characteristics: the curves that carry the answer

Those special paths have a name: they are the characteristics, and the whole technique is the method of characteristics. For our equation the characteristics solve the tiny ordinary differential equation dx/dt = c, whose solutions are the straight lines x = c t + x0, one line for each starting point x0. Picture the (x, t) plane combed through by a family of parallel lines, all with the same tilt dx/dt = c. The PDE has just told us that u is CONSTANT along each of these lines. So if you know u at one point of a characteristic, you know it everywhere on that whole line for free: the value simply slides along, unchanged.

Now feed in an initial profile. Suppose at time t = 0 the function is some given shape u(x, 0) = f(x). Each characteristic line carries the value it starts with. The line through the starting point x0 satisfies x0 = x - c t, so the value sitting at (x, t) is whatever started at x0 = x - c t, namely f(x - c t). That is the entire solution: u(x, t) = f(x - c t). Read it physically and it is gorgeous — it is the initial shape f sliding rightward, rigid and undistorted, at speed c. Our abstract slope constraint turned out to mean nothing more exotic than 'the picture travels'. This equation is, fittingly, called the transport or advection equation.

PDE:   u_t + c u_x = 0,    u(x,0) = f(x)

Step 1  Characteristic ODE:   dx/dt = c     ->   x = c t + x0
Step 2  Along it:  d/dt[ u(x(t),t) ] = u_t + (dx/dt) u_x = u_t + c u_x = 0
                   so u is CONSTANT on each line.
Step 3  Label the line by its start:   x0 = x - c t
Step 4  Constant value = its initial value:   u = f(x0)

        ===>   u(x,t) = f(x - c t)      (initial shape, rigid, moving right at speed c)

   t
   ^        /    /    /     each / is a characteristic x = c t + x0 (drawn for c > 0);
   |       /    /    /      u keeps its starting value all the way up the line
   +------/----/----/-----> x

The four-step recipe in miniature: turn the PDE into the ODE dx/dt = c, ride the line, carry the value, read off f(x - c t).
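The recipe above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the Gaussian profile, the speed c = 2, and the helper names `transport_solution` and `pde_residual` are illustrative assumptions that do not come from the text.

```python
import math

def transport_solution(f, c):
    """Return u(x, t) = f(x - c*t), the solution built from characteristics."""
    def u(x, t):
        x0 = x - c * t      # Step 3: label of the characteristic through (x, t)
        return f(x0)        # Step 4: the value carried along that line
    return u

def pde_residual(u, c, x, t, h=1e-5):
    """Central-difference estimate of u_t + c*u_x at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + c * u_x

# Illustrative choices (assumptions): a Gaussian bump and c = 2.
f = lambda x: math.exp(-x * x)
c = 2.0
u = transport_solution(f, c)

# Sample u at several times along the characteristic that starts at x0 = 0.3.
values_on_line = [u(c * t + 0.3, t) for t in (0.0, 0.5, 1.0, 2.0)]

# Estimate the PDE residual at an arbitrary point off that line.
residual = pde_residual(u, c, x=0.7, t=0.4)
```

Every entry of `values_on_line` should agree with f(0.3) up to rounding, and `residual` should be zero up to finite-difference error, which is the statement that u is constant on each characteristic.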

When the speed depends on u, and where it breaks

The method does not need the equation to be linear, and that is where it earns its keep. Consider du/dt + u du/dx = 0: the same form, but now the transport speed is u itself, the very thing you are solving for. The chain-rule argument is untouched: along a path with dx/dt = u, the equation again says du/dt-along-the-path = 0, so u is still constant on each characteristic. But now the speed of a characteristic equals the constant value it carries, so tall parts of the wave race ahead and short parts lag. The characteristics are still straight lines, but they tilt at DIFFERENT slopes instead of marching in parallel, spreading apart where u increases with x and closing in on each other where u decreases.
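While the characteristics have not yet crossed, the value at (x, t) can be found by tracing the line through that point back to its starting position x0, which satisfies x0 + f(x0) t = x. A minimal sketch under stated assumptions: the initial profile f, the bisection bracket [-50, 50], and the function name `burgers_before_breaking` are hypothetical, and the time t = 0.5 is chosen small enough that no crossing has occurred for this particular profile.

```python
import math

def burgers_before_breaking(f, x, t, lo=-50.0, hi=50.0, iters=200):
    """Solve x0 + f(x0)*t = x for x0 by bisection and return u = f(x0).

    Valid only while the map x0 -> x0 + f(x0)*t is increasing, i.e. before
    any characteristics cross. The bracket [lo, hi] is an assumption.
    """
    foot = lambda x0: x0 + f(x0) * t - x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foot(mid) > 0.0:
            hi = mid        # the starting point lies to the left of mid
        else:
            lo = mid        # the starting point lies to the right of mid
    return f(0.5 * (lo + hi))

# Illustrative data (assumption): a Gaussian hump, evaluated at t = 0.5.
f = lambda x: math.exp(-x * x)
t = 0.5

# Ride the characteristic that starts at x0 = 0.2; its speed is f(0.2).
x0 = 0.2
x_on_line = x0 + f(x0) * t
u_on_line = burgers_before_breaking(f, x_on_line, t)

# At any other point the result satisfies the implicit relation u = f(x - u*t).
u_elsewhere = burgers_before_breaking(f, 0.8, t)
```

Here `u_on_line` should reproduce f(0.2): the value has been carried unchanged along a line whose slope it sets itself.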

Picture a smooth hump of water. The crest moves faster than the leading edge, so the front face steepens, leans forward, and eventually goes vertical — exactly an ocean wave curling before it breaks. Mathematically, two characteristics carrying different values collide; at that crossing the would-be solution is asked to take two values at once, and a smooth single-valued u(x, t) simply ceases to exist. This is a shock: the genuine birth of a discontinuity out of perfectly smooth initial data. It is not a flaw in the method — the method is precisely what predicts, to the instant, when and where smoothness must die.
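The instant of breaking can be written down explicitly. A short derivation, assuming initial data u(x, 0) = f(x) for the nonlinear equation, with f differentiable and f' negative somewhere (the symbol t_b for the breaking time is introduced here for illustration):

```latex
\begin{align*}
x(t; x_0) &= x_0 + f(x_0)\,t
  && \text{characteristic starting at } x_0 \text{, moving at speed } f(x_0) \\
\frac{\partial x}{\partial x_0} &= 1 + f'(x_0)\,t
  && \text{neighbouring characteristics meet when this vanishes} \\
t_b &= \frac{-1}{\min_{x_0} f'(x_0)}
  && \text{earliest such time; finite only if } f' < 0 \text{ somewhere}
\end{align*}
```

As a worked instance, for the hump f(x) = exp(-x^2) the steepest descent of the profile is f' = -sqrt(2) exp(-1/2), attained at x0 = 1/sqrt(2), so t_b = sqrt(e/2), roughly 1.17. If f' is never negative, the characteristics only spread apart and no shock forms.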

Factoring the wave equation into two transports

Now turn the same idea on the crown jewel of this rung, the second-order wave equation d^2u/dt^2 = c^2 d^2u/dx^2. It looks like a different animal — second order, two time derivatives — but there is a beautiful sleight of hand. Treat the derivative operators as algebra: the equation is (d/dt)^2 u - c^2 (d/dx)^2 u = 0, and a difference of squares factors. So it reads (d/dt - c d/dx)(d/dt + c d/dx) u = 0. The single second-order wave equation has split into a PRODUCT of two of the first-order transport operators we just mastered — one carrying things right at speed c, one carrying them left.
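The factoring can be confirmed by multiplying it back out. The check below assumes c is constant and u is twice continuously differentiable, so that the mixed partial derivatives agree:

```latex
\begin{align*}
(\partial_t - c\,\partial_x)(\partial_t + c\,\partial_x)\,u
  &= \partial_t\,(u_t + c\,u_x) - c\,\partial_x\,(u_t + c\,u_x) \\
  &= u_{tt} + c\,u_{xt} - c\,u_{tx} - c^2\,u_{xx} \\
  &= u_{tt} - c^2\,u_{xx}
\end{align*}
```

The cross terms cancel only because c does not depend on x or t. With a variable wave speed the two operators no longer commute and extra first-order terms appear.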

This is why the wave equation is the model hyperbolic equation: it has two real, distinct families of characteristics, the right-movers x - c t = const and the left-movers x + c t = const. The cleanest way to see it is to change coordinates to xi = x - c t and eta = x + c t — the characteristic coordinates. Grind the chain rule through (this is the one piece of honest bookkeeping) and the wave equation transforms into the strikingly simple d^2u/(d xi d eta) = 0. Read that literally: the eta-derivative of (du/d xi) is zero, so du/d xi depends on xi alone, and integrating once more, u must be a function of xi plus a function of eta. Two arbitrary functions, one per characteristic family.
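For reference, the chain-rule bookkeeping behind that change of coordinates runs as follows, using xi = x - c t and eta = x + c t and assuming u is twice continuously differentiable:

```latex
\begin{align*}
u_x &= u_\xi + u_\eta \\
u_t &= -c\,u_\xi + c\,u_\eta \\
u_{xx} &= u_{\xi\xi} + 2\,u_{\xi\eta} + u_{\eta\eta} \\
u_{tt} &= c^2\left(u_{\xi\xi} - 2\,u_{\xi\eta} + u_{\eta\eta}\right) \\
u_{tt} - c^2\,u_{xx} &= -4c^2\,u_{\xi\eta}
\end{align*}
```

Setting the last line to zero and dividing by -4c^2 (with c nonzero) leaves the mixed derivative of u with respect to xi and eta equal to zero.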

d'Alembert: two waves passing through each other

Undo the substitution and that two-function answer becomes the celebrated d'Alembert solution: u(x, t) = F(x - c t) + G(x + c t). In plain words, every solution of the wave equation is a right-travelling shape F plus a left-travelling shape G, each gliding rigidly at speed c and passing clean through the other. A guitar string plucked and released from rest is exactly this: the initial bump splits into two half-height copies that run apart in opposite directions. Unlike separation of variables, which builds the answer slowly out of infinitely many standing-wave modes, d'Alembert hands you the complete general solution in closed form, no series required.

To finish a concrete problem you fit F and G to the two initial conditions the wave equation needs: the starting shape u(x, 0) = f(x) and the starting velocity du/dt(x, 0) = g(x). Matching them and solving the little pair of equations gives d'Alembert's explicit formula u(x, t) = (1/2)[ f(x - c t) + f(x + c t) ] + (1/(2c)) times the integral from x - c t to x + c t of g(s) ds. Look at what the formula confesses: the value at (x, t) depends only on the initial data between x - c t and x + c t. That interval is the domain of dependence, and its edges are exactly the two characteristics through (x, t). It is direct proof that no signal governed by this equation travels faster than the wave speed c.
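The explicit formula is short enough to evaluate directly. A minimal sketch, assuming the velocity integral is approximated by composite Simpson quadrature. The helper names `simpson` and `dalembert`, the panel count, and the test data are illustrative choices, not part of the text.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson estimate of the integral of g from a to b (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def dalembert(f, g, c, x, t):
    """Evaluate d'Alembert's formula at (x, t) on the infinite line."""
    left, right = x - c * t, x + c * t      # edges of the domain of dependence
    return 0.5 * (f(left) + f(right)) + simpson(g, left, right) / (2.0 * c)

# Illustrative check (assumed data): zero initial shape, initial velocity cos(x).
# The exact solution in that case is cos(x) * sin(c*t) / c.
c = 1.5
u_velocity_only = dalembert(lambda x: 0.0, math.cos, c, x=0.4, t=0.7)

# Second check: initial shape sin(x), zero velocity, exact solution sin(x)*cos(c*t).
u_shape_only = dalembert(math.sin, lambda x: 0.0, c, x=0.4, t=0.7)
```

Note that `dalembert` only ever samples f and g on the interval between `left` and `right`, which is the domain of dependence in code form.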

Step 1  Write the general solution from the factoring: u(x, t) = F(x - c t) + G(x + c t), one travelling wave per characteristic family.
Step 2  Impose the initial shape: at t = 0, F(x) + G(x) = f(x).
Step 3  Impose the initial velocity: differentiate in t and set t = 0 to get -c F'(x) + c G'(x) = g(x).
Step 4  Solve the two relations for F and G (one division, one integration) and reassemble to get d'Alembert's formula.
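The final solve, written out. The integration constant K and the lower limit 0 are arbitrary choices introduced here; both drop out of the final answer:

```latex
\begin{align*}
F(x) + G(x) &= f(x), \qquad
G'(x) - F'(x) = \frac{g(x)}{c}
  \;\Longrightarrow\;
G(x) - F(x) = \frac{1}{c}\int_0^x g(s)\,ds + K \\
F(x) &= \frac{1}{2} f(x) - \frac{1}{2c}\int_0^x g(s)\,ds - \frac{K}{2} \\
G(x) &= \frac{1}{2} f(x) + \frac{1}{2c}\int_0^x g(s)\,ds + \frac{K}{2} \\
u(x,t) &= F(x - ct) + G(x + ct)
  = \frac{1}{2}\bigl[f(x - ct) + f(x + ct)\bigr]
  + \frac{1}{2c}\int_{x - ct}^{x + ct} g(s)\,ds
\end{align*}
```

The two half-integrals combine because the integral from 0 to x + c t minus the integral from 0 to x - c t is the integral over the interval between them.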

Why this method belongs in your toolkit

Two honest boundaries on the power you have just gained. First, the d'Alembert formula in this clean form is the free-space, infinite-line answer. The moment you have walls, say a string clamped at both ends, the travelling waves must reflect, and you stitch that in by extending f and g as odd functions about each wall (which makes them periodic, with period twice the string length), or you switch back to separation of variables and standing waves. The two pictures, travelling waves and standing modes, are the same solution viewed two ways, and which is easier depends entirely on the geometry. Second, the clean factoring into two real characteristic families is special to the hyperbolic case.
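The reflection trick can be sketched in a few lines. This is an illustration under stated assumptions: the string occupies [0, L], it starts from rest (g = 0), the initial shape satisfies f(0) = f(L) = 0, and the names `odd_periodic_extension` and `clamped_string` are hypothetical.

```python
import math

def odd_periodic_extension(f, L):
    """Extend f from [0, L] to the whole line as an odd, 2L-periodic function."""
    def f_ext(x):
        y = math.fmod(x, 2.0 * L)      # fold into (-2L, 2L), sign follows x
        if y < 0.0:
            y += 2.0 * L               # shift into [0, 2L]
        # On (L, 2L] use the odd reflection about the wall at x = L.
        return f(y) if y <= L else -f(2.0 * L - y)
    return f_ext

def clamped_string(f, c, L):
    """Zero-initial-velocity solution on [0, L] with u(0, t) = u(L, t) = 0."""
    f_ext = odd_periodic_extension(f, L)
    return lambda x, t: 0.5 * (f_ext(x - c * t) + f_ext(x + c * t))

# Illustrative data (assumption): the fundamental mode on a string of length 1.
L, c = 1.0, 1.0
u = clamped_string(lambda x: math.sin(math.pi * x / L), c, L)

end_values = [u(0.0, t) for t in (0.1, 0.37, 1.9)] + [u(L, t) for t in (0.1, 0.37, 1.9)]
travelling = u(0.3, 0.45)
standing = math.sin(math.pi * 0.3) * math.cos(math.pi * 0.45)
```

For this initial shape the sum of two reflected travelling waves, `travelling`, should match the single standing mode, `standing`, which is the "same solution viewed two ways" in miniature.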

That last point ties the whole rung together. A second-order equation is classified by its characteristics: a hyperbolic equation like the wave equation has two real characteristic families and information travels along them at finite speed; a parabolic one like the heat equation has a single (repeated) family and smooths data instantly; an elliptic one like Laplace's equation has no real characteristics at all, which is why a steady-state potential feels every part of its boundary at once. Characteristics are not a trick reserved for first-order problems — they are the very thing the type of a PDE is built from, the skeleton that decides how a disturbance is allowed to propagate.
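The count of real characteristic families can be read off from a discriminant. The sketch below relies on a standard fact that the text does not state: for a second-order equation with principal part A u_xx + B u_xy + C u_yy, the characteristic directions are the roots of a quadratic whose discriminant is B^2 - 4AC. The function name and the coefficient convention are assumptions made for illustration.

```python
def classify_second_order(A, B, C):
    """Classify A*u_xx + B*u_xy + C*u_yy + (lower-order terms) = 0 at a point."""
    disc = B * B - 4.0 * A * C
    if disc > 0:
        return "hyperbolic"   # two real characteristic families
    if disc == 0:
        return "parabolic"    # one repeated family
    return "elliptic"         # no real characteristics

# The three model equations, with y playing the role of t where relevant.
c, k = 2.0, 0.5                                  # illustrative constants
wave = classify_second_order(-c * c, 0.0, 1.0)   # u_tt - c^2 u_xx = 0
heat = classify_second_order(-k, 0.0, 0.0)       # u_t - k u_xx = 0
laplace = classify_second_order(1.0, 0.0, 1.0)   # u_xx + u_yy = 0
```

With variable coefficients the same test applies point by point, so an equation can change type from one region to another. The exact comparison with zero is only reliable for exactly representable coefficients; a tolerance would be needed for computed ones.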

Step back and the unity is striking. A first-order PDE is solved by riding curves on which it becomes an ODE; the wave equation is solved by factoring it into two such first-order pieces; and the d'Alembert formula that drops out is not just an answer but a proof of well-posedness — a unique solution that depends continuously on the data, with a strict, finite speed limit baked in. From one humble observation — that du/dt + c du/dx is a chain rule waiting to happen — you have reached travelling waves, shock formation, the classification of all second-order PDEs, and the principle of causality itself.