The Total Derivative as a Linear Map

Differentiability done right

On the line, f is differentiable at a if f(a + h) = f(a) + f'(a) h + o(h) as h -> 0. Reread that: the increment f'(a) h is linear in h, and the remainder is small compared to h. We copy this exactly. For f: R^n -> R^m, the total derivative of f at a is a linear map L: R^n -> R^m with f(a + h) = f(a) + L(h) + o(|h|), that is, the error term is negligible relative to |h|.

Definition ([[differentiability-in-rn|differentiability in R^n]]).

Let f be defined on a neighborhood of a in R^n, with values in R^m. Then f is differentiable at a if there is a linear map L: R^n -> R^m with

   lim_{h -> 0}  | f(a + h) - f(a) - L(h) |  /  |h|  =  0.

L is unique; we write L = Df(a), the total derivative.

Key consequences when f is differentiable at a:

  (1) every partial derivative D_j f_i(a) exists, and the m x n matrix of L has (i, j) entry D_j f_i(a);
  (2) every directional derivative exists, with  D_v f(a) = L(v);
  (3) f is continuous at a.

Contrast with anl-mul-1: partials existing does NOT give (3),
but a single linear approximation valid in EVERY direction does.

The o(|h|) definition forces approximation in all directions at once.
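To see how partials can exist while the limit condition fails, here is a minimal numerical sketch. The function g(x, y) = xy / (x^2 + y^2) with g(0, 0) = 0 is a standard counterexample chosen for illustration, not one taken from these notes; the helper names are hypothetical. Both partials at the origin are 0, so the only candidate for L is the zero map, yet along the diagonal h = (t, t) the quotient grows without bound.

```python
import math

def g(x, y):
    # Illustrative counterexample (an assumption, not from the notes):
    # g(x, y) = xy / (x^2 + y^2), extended by g(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial_at_origin(f, axis, t=1e-6):
    # Symmetric difference quotient along one coordinate axis at (0, 0).
    if axis == 0:
        return (f(t, 0.0) - f(-t, 0.0)) / (2 * t)
    return (f(0.0, t) - f(0.0, -t)) / (2 * t)

def diagonal_ratio(f, t):
    # |f(h) - f(0) - L(h)| / |h| with h = (t, t) and L = 0,
    # the only candidate because both partials vanish.
    return abs(f(t, t) - f(0.0, 0.0)) / math.hypot(t, t)

partials = (partial_at_origin(g, 0), partial_at_origin(g, 1))
ratios = [diagonal_ratio(g, 10.0 ** -k) for k in range(1, 6)]

print("partials at origin:", partials)
print("ratios along the diagonal:", ratios)
```

On the diagonal g is constantly 1/2, so g is not even continuous at the origin, which is consistent with consequence (3) failing when only the partials are known to exist.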

Gradient and Jacobian

When the target is R (so m = 1), the linear map L(h) is just a dot product g . h, and the vector g is the gradient grad f(a) = (D_1 f(a), ..., D_n f(a)). So D_v f(a) = grad f(a) . v. By Cauchy-Schwarz, among unit vectors v this is largest when v points along grad f(a), so a nonzero gradient is the direction of steepest ascent. When the target is R^m, stacking the m gradients of the components f_1, ..., f_m as rows gives the m x n Jacobian matrix J, the matrix of the total derivative.
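As a rough check of "the Jacobian is the matrix of the total derivative", the sketch below approximates J by central differences and compares J v with a difference quotient in the direction v. The map F(x, y) = (xy, x + y^2), the point a = (1, 2), the vector v, and the helper names are all assumptions made for illustration.

```python
def jacobian(F, a, eps=1e-6):
    # Central differences: column j approximates D_j F(a),
    # so J[i][j] approximates D_j F_i(a).
    m, n = len(F(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(a), list(a)
        plus[j] += eps
        minus[j] -= eps
        Fp, Fm = F(plus), F(minus)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * eps)
    return J

def apply_matrix(J, v):
    # Matrix-vector product J v, i.e. the linear map L applied to v.
    return [sum(row[j] * v[j] for j in range(len(v))) for row in J]

def directional(F, a, v, t=1e-6):
    # Central difference quotient of s -> F(a + s v) at s = 0.
    Fp = F([a[k] + t * v[k] for k in range(len(a))])
    Fm = F([a[k] - t * v[k] for k in range(len(a))])
    return [(p - q) / (2 * t) for p, q in zip(Fp, Fm)]

def F(p):
    # Hypothetical example map R^2 -> R^2.
    x, y = p
    return [x * y, x + y * y]

a = [1.0, 2.0]
v = [3.0, -1.0]
J = jacobian(F, a)            # by hand: rows (y, x) and (1, 2y) at a
Jv = apply_matrix(J, v)
Dv = directional(F, a, v)
print(J)
print(Jv, Dv)
```

By hand, the rows of J at a are the gradients (2, 1) and (1, 4), so J v should be close to (5, -1); the two printed vectors should agree up to rounding.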

Verify f(x, y) = x^2 + y^2 is differentiable at a = (1, 2).

Partials at a: D_1 f(a) = 2x = 2,  D_2 f(a) = 2y = 4, so guess L(h) = (2, 4) . h.

Let h = (h1, h2). Compute the remainder R(h):

  f(a+h) = (1+h1)^2 + (2+h2)^2
         = 1 + 2h1 + h1^2 + 4 + 4h2 + h2^2
         = f(a) + (2 h1 + 4 h2) + (h1^2 + h2^2)
         = f(a) + L(h) + |h|^2.

So  |R(h)| / |h| = |h|^2 / |h| = |h| -> 0  as h -> 0.   QED.

Gradient: grad f(1,2) = (2, 4). Steepest ascent points that way.

Verifying the limit-zero condition directly; the remainder is |h|^2.
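The computation above can also be checked numerically. This is only a sanity-check sketch: f, a = (1, 2) and L(h) = 2 h1 + 4 h2 come from the example, while the function names and the particular direction (0.6, -0.8) are arbitrary choices made here.

```python
import math

def f(x, y):
    return x * x + y * y

def L(h1, h2):
    # Candidate total derivative at a = (1, 2): L(h) = (2, 4) . h
    return 2.0 * h1 + 4.0 * h2

def remainder_ratio(h1, h2):
    # |f(a + h) - f(a) - L(h)| / |h| at a = (1, 2)
    r = f(1.0 + h1, 2.0 + h2) - f(1.0, 2.0) - L(h1, h2)
    return abs(r) / math.hypot(h1, h2)

for k in range(1, 5):
    s = 10.0 ** -k
    # h = s * (0.6, -0.8) has |h| = s, so the ratio should be close to s.
    print(s, remainder_ratio(0.6 * s, -0.8 * s))
```

Each printed ratio should track |h| itself, matching |R(h)| / |h| = |h|. For much smaller h, floating-point cancellation in the subtraction starts to dominate, so the check is only meaningful for moderate step sizes.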

A clean sufficient condition

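If every partial derivative of f exists near a and is continuous at a (in particular, if f is C^1), then f is differentiable at a. Be careful with the converses: differentiability does not require the partials to be continuous, and existence of the partials alone does not give differentiability. The chain of strength is: C^1 ⟹ differentiable ⟹ partials exist and f continuous. Neither arrow reverses in general.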
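The first arrow failing to reverse can be seen already on the line. The sketch below uses the standard example f(x) = x^2 sin(1/x) with f(0) = 0, which is an illustration chosen here and not part of these notes. The remainder quotient at 0 with L = 0 is bounded by |h|, so f is differentiable at 0 with f'(0) = 0, yet f' keeps taking values near +1 and -1 at points approaching 0.

```python
import math

def f(x):
    # Illustrative example (an assumption): x^2 sin(1/x), with f(0) = 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def fprime(x):
    # Derivative for x != 0 by the product and chain rules.
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

# Remainder quotient at 0 with L = 0: |f(h) - f(0) - 0| / |h| <= |h|.
steps = [10.0 ** -k for k in range(1, 5)]
ratios = [abs(f(h)) / abs(h) for h in steps]

# At x_k = 1 / (k pi), sin(1/x_k) is about 0 and cos(1/x_k) = (-1)^k,
# so f'(x_k) is about (-1)^(k+1): it does not tend to f'(0) = 0.
samples = [fprime(1.0 / (k * math.pi)) for k in range(1, 7)]

print(ratios)
print(samples)
```

So f is differentiable at 0 without being C^1 there, since f' has no limit at 0.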