Computing the Matrix Exponential

Why we never sum the series

The previous guide defined the matrix exponential e^(At) as the same infinite series as the ordinary exponential, with the matrix A t slotted into every power: e^(At) = I + A t + (A t)^2 / 2! + (A t)^3 / 3! + and so on forever. That definition is beautiful and it is honest — but try to actually add it up for a generic 3-by-3 matrix and you will be computing matrix powers and dividing by factorials until your patience runs out, with no closed form in sight. The series tells you what the object IS; it almost never tells you what the object EQUALS.

There is one happy exception worth keeping in your pocket: if A is diagonal, say A = [a, 0; 0, b], then every power A^k is just the diagonal of a^k and b^k, the series splits into two independent scalar exponential series, and you read off e^(At) = [e^(at), 0; 0, e^(bt)] with no work at all. That tiny miracle is the seed of the first real method — because most matrices are not diagonal, but many can be MADE diagonal by changing coordinates.
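A small numerical check of the diagonal case can make this concrete. The sketch below uses plain Python lists, and the helper names (mat_mul, expm_series) and the sample values of a, b and t are illustrative assumptions, not anything from the previous guide. It sums a truncated series and compares the result with the diagonal of scalar exponentials.

```python
import math

def mat_mul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, t, terms=30):
    """Partial sum I + At + (At)^2/2! + ... up to the (At)^terms term."""
    n = len(A)
    At = [[A[i][j] * t for j in range(n)] for i in range(n)]
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]      # holds (At)^k / k!, starting at k = 0
    for k in range(1, terms + 1):
        # (At)^(k-1)/(k-1)! times At, divided by k, is (At)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, At)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Sample values, chosen only for illustration.
a, b, t = 1.0, -2.0, 0.5
A = [[a, 0.0], [0.0, b]]

S = expm_series(A, t)
closed = [[math.exp(a * t), 0.0], [0.0, math.exp(b * t)]]

print("series partial sum:", S)
print("diag of exponentials:", closed)
```

For a small diagonal matrix like this one the two printouts should agree to roughly machine precision, and the off-diagonal entries never leave zero.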

Method 1: diagonalize and ride the eigenbasis

Suppose A has a full set of independent eigenvectors — collect them as the columns of a matrix P, and put the matching eigenvalues on the diagonal of D. Then A = P D P^(-1), and here is the magic: every power obeys A^k = P D^k P^(-1), because the inner P^(-1) P pairs cancel in the middle. Feed that into the series term by term and the P and P^(-1) factor cleanly out to the two ends, leaving the series for e^(Dt) trapped in the middle. Since D is diagonal, e^(Dt) is the easy diagonal-of-exponentials matrix from a moment ago.

That gives the whole formula for computing e^(At) by diagonalization: e^(At) = P e^(Dt) P^(-1). In words, change coordinates into the eigenbasis where the system decouples into independent scalar pieces, exponentiate each scalar, then change back. This is exactly the same idea as the eigenvalue method you already met for systems — solving each straight-line mode by hand — repackaged as one clean matrix. The eigenvalues live in the exponents, the eigenvectors live in P, and the matrix exponential just bookkeeps all of them at once.

e^(At) = P e^(Dt) P^(-1),     where  A = P D P^(-1)

   D = [lambda1,   0   ]        e^(Dt) = [ e^(lambda1 t),     0       ]
       [   0,   lambda2]                 [     0,        e^(lambda2 t) ]

   columns of P  = eigenvectors
   diagonal of D = eigenvalues

Diagonalization recipe: only the easy diagonal block e^(Dt) ever gets exponentiated; P and P^(-1) carry it back to the original coordinates.
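Here is a minimal sketch of the recipe for one 2-by-2 case. The matrix A = [0, 1; -2, -3] is an example chosen for illustration, not one from the text: its eigenvalues are -1 and -2, with eigenvectors (1, -1) and (1, -2). The helper names mat_mul, inv2 and expm_diag are likewise hypothetical.

```python
import math

def mat_mul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def expm_diag(P, eigenvalues, t):
    """e^(At) = P e^(Dt) P^(-1) for a diagonalizable 2x2 A = P D P^(-1)."""
    exp_Dt = [[math.exp(eigenvalues[0] * t), 0.0],
              [0.0, math.exp(eigenvalues[1] * t)]]
    return mat_mul(mat_mul(P, exp_Dt), inv2(P))

# Assumed example: A = [0, 1; -2, -3].
P = [[1.0, 1.0], [-1.0, -2.0]]     # columns are the eigenvectors
lams = [-1.0, -2.0]                # matching eigenvalues, in the same order

# Sanity check that P D P^(-1) really rebuilds A.
D = [[lams[0], 0.0], [0.0, lams[1]]]
A_rebuilt = mat_mul(mat_mul(P, D), inv2(P))

E = expm_diag(P, lams, 1.0)        # e^(At) at t = 1
print("A rebuilt from P and D:", A_rebuilt)
print("e^(A*1):", E)
```

The order of the eigenvalues in lams has to match the order of the columns of P; swapping one without the other silently gives the wrong matrix.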

Method 2: when diagonalization fails, climb to Jordan

Diagonalization has a real limit, and it is the same one that haunts the repeated-eigenvalue case: some matrices simply do not have enough independent eigenvectors to fill P. A classic culprit is [2, 1; 0, 2] — its only eigenvalue is 2, but it offers just one eigenvector direction, so no invertible P of eigenvectors exists and the diagonalization route stalls. You cannot wish the second eigenvector into existence; you have to widen the toolkit.

The fix is the Jordan form. Any square matrix can be written A = P J P^(-1), allowing complex entries when the eigenvalues are complex, where J is block-diagonal and each block has a single eigenvalue lambda repeated down its diagonal with a chain of 1's just above it. The columns of P are now eigenvectors padded out with generalized eigenvectors to complete the basis. The same factoring trick gives e^(At) = P e^(Jt) P^(-1), so once again you only need to exponentiate J, one block at a time. And here a second small miracle saves you. Split a block as lambda I + N, where N holds only the 1's above the diagonal. N is nilpotent (some power of it is the zero matrix), so the series for e^(Nt) TERMINATES after a few terms instead of running forever, and because lambda I commutes with N, the block's exponential is just e^(lambda t) times that finite sum.

Concretely, for the block [lambda, 1; 0, lambda] the Jordan form method yields e^(Jt) = e^(lambda t) [1, t; 0, 1]. That stray factor of t is exactly where the familiar t e^(lambda t) term in repeated-root solutions comes from — the polynomial-times-exponential signature you saw in second-order equations is the matrix exponential of a Jordan block, wearing a different hat. Same mathematics, finally unified.
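The 2-by-2 block formula can be checked numerically. In the sketch below, N denotes the part of the block above the diagonal (so the block is lambda I + N), lam = 2 echoes the defective example [2, 1; 0, 2], and t = 0.7 is an arbitrary sample time. The helper names are assumptions made for illustration.

```python
import math

def mat_mul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def series_expm(M, terms=40):
    """Partial sum of I + M + M^2/2! + ... used only as an independent check."""
    n = len(M)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

lam, t = 2.0, 0.7                      # sample values, chosen for illustration

# Nilpotent part of the block: squaring it gives the zero matrix,
# so the series for e^(Nt) stops at I + N t.
N = [[0.0, 1.0], [0.0, 0.0]]
N_squared = mat_mul(N, N)

# Closed form: e^(Jt) = e^(lam t) * (I + N t) = e^(lam t) * [1, t; 0, 1]
growth = math.exp(lam * t)
closed_form = [[growth, growth * t], [0.0, growth]]

# Independent check: sum the series for the full block J t directly.
J_t = [[lam * t, t], [0.0, lam * t]]
from_series = series_expm(J_t)

print("N^2 =", N_squared)
print("closed form:", closed_form)
print("from series:", from_series)
```

The upper-right entry is the t e^(lambda t) term; for a 3-by-3 block the same argument would stop one term later and add a t^2 / 2 entry in the corner.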

Methods 3 and 4: skip the eigenvectors entirely

Finding eigenvectors (and worse, generalized ones) is tedious and error-prone, so two further methods sidestep them. The first leans on the Cayley-Hamilton theorem: a matrix satisfies its own characteristic equation. The deep consequence is that A^n, and therefore every higher power of an n-by-n matrix, and hence e^(At) itself, can be written as a polynomial in A of degree at most n-1. So e^(At) = c0(t) I + c1(t) A + ... + c_(n-1)(t) A^(n-1), and you only have to find the scalar coefficient functions, not any eigenvectors. A slick organized way to grind such coefficients out is the Putzer algorithm, which expands e^(At) in the products I, (A - lambda1 I), (A - lambda1 I)(A - lambda2 I), ... instead of raw powers of A, and builds the coefficient functions from the eigenvalues alone via a small recursive system of scalar ODEs.
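For a 2-by-2 matrix the Putzer recursion is short enough to solve in closed form. The sketch below assumes the standard 2-by-2 statement of the algorithm: e^(At) = r1(t) I + r2(t) (A - lambda1 I), with r1' = lambda1 r1, r1(0) = 1 and r2' = lambda2 r2 + r1, r2(0) = 0. The function name putzer_2x2 and the test matrices are illustrative choices. It uses cmath so that complex eigenvalues work too, which means the returned entries are complex numbers whose imaginary parts are zero (up to rounding) when A is real.

```python
import cmath

def putzer_2x2(A, t):
    """e^(At) for a 2x2 matrix via Putzer's algorithm; no eigenvectors needed."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c

    # Eigenvalues from the characteristic polynomial lambda^2 - tr*lambda + det.
    root = cmath.sqrt(tr * tr - 4 * det)
    l1, l2 = (tr + root) / 2, (tr - root) / 2

    # Solve r1' = l1 r1, r1(0) = 1 and r2' = l2 r2 + r1, r2(0) = 0 by hand.
    r1 = cmath.exp(l1 * t)
    if abs(l1 - l2) < 1e-12:
        r2 = t * r1                                   # repeated eigenvalue
    else:
        r2 = (r1 - cmath.exp(l2 * t)) / (l1 - l2)     # distinct eigenvalues

    # e^(At) = r1 * I + r2 * (A - l1 * I)
    return [[r1 + r2 * (a - l1), r2 * b],
            [r2 * c, r1 + r2 * (d - l1)]]

# Three sample matrices: distinct real, repeated (defective), complex eigenvalues.
E_distinct = putzer_2x2([[0.0, 1.0], [-2.0, -3.0]], 1.0)
E_defective = putzer_2x2([[2.0, 1.0], [0.0, 2.0]], 0.7)
E_rotation = putzer_2x2([[0.0, 1.0], [-1.0, 0.0]], 0.5)

for name, E in [("distinct", E_distinct), ("defective", E_defective),
                ("rotation", E_rotation)]:
    print(name, [[z.real for z in row] for row in E])
```

The defective case is handled by the same code path as the others, which is the practical appeal of the method: the repeated eigenvalue only changes the formula for r2.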

The fourth route reuses a tool you already trust. Because e^(At) is the principal fundamental matrix, the unique solution of X' = A X with X(0) = I, you can solve that matrix initial-value problem with the Laplace transform. Writing Y(s) for the Laplace transform of X(t), transforming X' = A X turns the derivative into multiplication: s Y(s) - I = A Y(s), that is, (s I - A) Y(s) = I, so the transform of e^(At) is the matrix inverse (s I - A)^(-1). Invert each entry back to the time domain, usually after a partial-fraction expansion, and you have your answer. This is computing e^(At) by Laplace transform: no eigenvectors, just one matrix inverse and a table lookup per entry.
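A short worked example may help show how mechanical this route is. The matrix A = [0, 1; -2, -3] is an assumption chosen for illustration (it does not come from the text); each step below is just a 2-by-2 inverse, a partial-fraction split, and the table entry 1/(s - a) <-> e^(at).

```latex
% Illustrative matrix, chosen for this example only.
\[
A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix},
\qquad
sI - A = \begin{pmatrix} s & -1 \\ 2 & s+3 \end{pmatrix},
\qquad
\det(sI - A) = s^2 + 3s + 2 = (s+1)(s+2).
\]

% Invert the 2x2 matrix with the adjugate formula.
\[
(sI - A)^{-1}
= \frac{1}{(s+1)(s+2)}
  \begin{pmatrix} s+3 & 1 \\ -2 & s \end{pmatrix}.
\]

% Partial fractions, entry by entry.
\[
(sI - A)^{-1}
= \begin{pmatrix}
    \dfrac{2}{s+1} - \dfrac{1}{s+2} & \dfrac{1}{s+1} - \dfrac{1}{s+2} \\[2ex]
    -\dfrac{2}{s+1} + \dfrac{2}{s+2} & -\dfrac{1}{s+1} + \dfrac{2}{s+2}
  \end{pmatrix}.
\]

% Invert each entry with the table.
\[
e^{At}
= \begin{pmatrix}
    2e^{-t} - e^{-2t} & e^{-t} - e^{-2t} \\
    -2e^{-t} + 2e^{-2t} & -e^{-t} + 2e^{-2t}
  \end{pmatrix}.
\]
```

Two quick sanity checks: setting t = 0 gives the identity matrix, and differentiating each entry at t = 0 gives back A, as X' = A X with X(0) = I requires.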

Diagonalizable (distinct eigenvalues, or repeated ones that still supply enough independent eigenvectors)? Use e^(At) = P e^(Dt) P^(-1), usually the fastest by hand.
Defective (repeated eigenvalue, missing eigenvectors)? Use the Jordan form, e^(At) = P e^(Jt) P^(-1), and remember each block's series stops early.
Want to dodge eigenvectors but know the eigenvalues? Run Putzer / Cayley-Hamilton to get e^(At) as a polynomial in A.
Comfortable with transforms, or the matrix is messy? Compute (s I - A)^(-1) and invert each entry with the Laplace table.

Same matrix, four routes, honest caveats

All four methods compute the identical e^(At) — they are different doors into the same room, and which door is easiest depends on the matrix in front of you and which tools feel natural in your hands. Once you have e^(At), the entire homogeneous system is solved in one stroke: x(t) = e^(At) x(0). And because e^(At) is the genuine fundamental matrix anchored at the identity, it also obeys the flow law e^(A(t+s)) = e^(At) e^(As), the semigroup property that lets you compose evolutions over successive time intervals — a fact the next two guides will exploit when forcing enters the picture.
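The flow law is easy to watch in action on a matrix whose exponential is known in closed form. The sketch below assumes the example A = [0, 1; -1, 0], for which e^(At) is the rotation [cos t, sin t; -sin t, cos t]; the matrix, the sample times and the helper names are illustrative choices, not part of this guide.

```python
import math

def exp_At(t):
    """Closed form of e^(At) for the assumed example A = [0, 1; -1, 0]."""
    return [[math.cos(t), math.sin(t)],
            [-math.sin(t), math.cos(t)]]

def mat_mul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(X, v):
    """Apply a square matrix to a vector."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

t, s = 0.4, 1.1                       # two successive time intervals
x0 = [1.0, 0.0]                       # initial condition x(0)

# Flow law: evolving for t + s equals evolving for t, then for s.
one_shot = exp_At(t + s)
composed = mat_mul(exp_At(s), exp_At(t))

# Same statement at the level of solutions x(t) = e^(At) x(0).
x_direct = mat_vec(exp_At(t + s), x0)
x_two_steps = mat_vec(exp_At(s), mat_vec(exp_At(t), x0))

print("e^(A(t+s)):     ", one_shot)
print("e^(As) e^(At):  ", composed)
print("x(t+s) direct:  ", x_direct)
print("x(t+s) two steps:", x_two_steps)
```

For this example the flow law is just the angle-addition formulas for sine and cosine, which is a useful way to remember why it holds.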

Two honest limits before you celebrate. First, every hand method above secretly needs the eigenvalues (even the Laplace route needs the roots of det(s I - A) for its partial fractions), which means solving the characteristic polynomial. For a 5-by-5 or larger matrix that polynomial has degree five or more, and there is in general no formula in radicals for its roots, so 'compute e^(At) exactly' quietly becomes impossible by hand and the work moves to a computer. Second, even on a computer the matrix exponential is famously delicate: the naive series can lose all its accuracy to cancellation when the entries of A t are large, and diagonalization becomes unstable when eigenvectors are nearly parallel, which is why specialized algorithms (scaling-and-squaring for dense matrices, Krylov methods for large sparse ones) exist. The four routes here are exact-arithmetic tools for human-sized problems; real large-scale computation is its own craft.
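The delicacy of the naive series shows up even on a tiny matrix. The sketch below is a toy illustration, not a production algorithm: it sums the series directly for a matrix with entries near -30, then repeats the computation with a bare-bones scaling-and-squaring step, e^M = (e^(M / 2^s))^(2^s). The matrix, the term counts and the 0.5 norm threshold are assumptions chosen for the demonstration; real libraries such as SciPy's scipy.linalg.expm pair scaling and squaring with Pade approximants rather than a truncated Taylor series.

```python
import math

def mat_mul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def taylor_expm(M, terms):
    """Partial sum of I + M + M^2/2! + ... (the naive series)."""
    n = len(M)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def expm_scaled(M, terms=20):
    """Toy scaling-and-squaring: shrink M, exponentiate, then square back up."""
    norm = max(sum(abs(x) for x in row) for row in M)   # max row sum
    s = 0
    while norm / 2 ** s > 0.5:                          # halve until M is small
        s += 1
    small = [[x / 2 ** s for x in row] for row in M]
    result = taylor_expm(small, terms)                  # series is safe here
    for _ in range(s):
        result = mat_mul(result, result)                # undo the scaling
    return result

# M plays the role of A t; its exact exponential is e^(-30) * [1, 1; 0, 1].
M = [[-30.0, 1.0], [0.0, -30.0]]
exact = math.exp(-30.0)

naive = taylor_expm(M, 150)
stable = expm_scaled(M)

print("exact top-left entry: ", exact)
print("naive series:         ", naive[0][0])
print("scaling and squaring: ", stable[0][0])
```

The naive sum fails because its intermediate terms grow to around 10^11 before cancelling down to an answer near 10^(-13), so rounding errors in the large terms swamp the result. Scaling first keeps every term small, and the repeated squaring then only multiplies positive, well-behaved numbers.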