JOVANA
Library Glossary Getting Started Three Levels Fields How it works Mission
Join the mission
All guides

The Chain Rule and Equality of Mixed Partials

Composition of differentiable maps multiplies their Jacobians. Then we tackle higher derivatives and Clairaut's theorem: when can you swap the order of partial differentiation?

The chain rule is matrix multiplication

Once the derivative is a linear map, the multivariable chain rule becomes beautifully simple: the derivative of a composition is the composition of the derivatives. In matrix terms, the Jacobian of g∘f is the product of the Jacobians. This is the whole reason for defining the total derivative as a linear map in the first place.

Chain rule:  if f differentiable at a, g differentiable at f(a), then

   D(g . f)(a) = Dg(f(a)) * Df(a)     (matrix product).

Entrywise, for h = g(f(x_1,...,x_n)) with f having components u_k:

   dh/dx_j = sum_k  (dg/du_k) * (du_k/dx_j).

Worked: z = g(u, v) with u = x^2 y, v = x + y. Then

   dz/dx = g_u * (2xy) + g_v * (1)
   dz/dy = g_u * (x^2) + g_v * (1).

As matrices, Df = [ 2xy   x^2 ;  1   1 ] and Dg = [ g_u  g_v ];
the row-times-matrix product reproduces both lines above.
The chain rule as a Jacobian product, then unpacked entry by entry.

Higher derivatives and mixed partials

Differentiate a partial again and you get a second-order higher-order derivative. The interesting ones are the mixed partial derivatives, where you differentiate with respect to different variables. Collecting all second partials into a matrix gives the Hessian, the second-order analogue of the gradient. A natural worry: does the order of differentiation matter — is d/dx d/dy f the same as d/dy d/dx f?

Usually it does not. Clairaut's theorem (also called Schwarz's theorem) says: if the mixed partials d/dx d/dy f and d/dy d/dx f exist and are continuous on a neighborhood of a, they are equal at a. This is why the Hessian is symmetric for any well-behaved C^2 function.

When the order does matter

Practically, every smooth function you meet is C^2 or better, so you may freely swap the order of partials. The counterexample is a reminder that the hypothesis is doing real work, not a piece of pedantry.