When diagonalization fails
Volume I taught diagonalization: A = P D P^-1 when A has a full set of linearly independent eigenvectors. But many matrices are *defective*: they don't have enough eigenvectors, so no eigenbasis exists and no invertible P can be built. Worse, even when P exists it can be nearly singular (its columns far from orthogonal), so computing with P^-1 amplifies rounding errors and the factorization becomes numerically unreliable. We need a decomposition that *always* exists and whose transforming factor is perfectly conditioned, that is, orthogonal or unitary.
The Schur factorization is that decomposition: every square matrix (over the complex numbers) can be written A = Q T Q^* with Q unitary and T upper-triangular. (In the real case Q can be taken orthogonal, at the cost of allowing 2x2 diagonal blocks in T when A has complex eigenvalues.) No defectiveness can stop it. And because Q is unitary, Q^* = Q^-1, so T is similar to A and shares A's eigenvalues: they sit right on T's diagonal, in plain sight.
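As a small numerical illustration of the claim above, the sketch below applies SciPy's `scipy.linalg.schur` to a defective 2x2 matrix. The matrix and the variable names are chosen here for illustration only. It has the double eigenvalue 2 but a single independent eigenvector, so it cannot be diagonalized, yet the Schur factors come out without trouble.

```python
import numpy as np
from scipy.linalg import schur

# Illustrative defective matrix: trace 4, determinant 4, so the
# eigenvalue 2 is repeated, but A - 2I has rank 1.
A = np.array([[3.0, 1.0],
              [-1.0, 1.0]])

# Number of independent eigenvectors for eigenvalue 2.
geometric_multiplicity = 2 - np.linalg.matrix_rank(A - 2.0 * np.eye(2))

# Complex Schur form: A = Q T Q^* with Q unitary, T upper-triangular.
T, Q = schur(A, output="complex")

unitary_error = np.linalg.norm(Q.conj().T @ Q - np.eye(2))
reconstruction_error = np.linalg.norm(Q @ T @ Q.conj().T - A)
schur_eigenvalues = np.diag(T)

print("independent eigenvectors:", geometric_multiplicity)
print("||Q^* Q - I||     =", unitary_error)
print("||Q T Q^* - A||   =", reconstruction_error)
print("diag(T)           =", schur_eigenvalues)
```

Because the eigenvalue is defective, the computed diagonal entries may differ from 2 by roughly the square root of machine precision. That sensitivity belongs to the eigenvalue problem itself, not to the Schur factorization.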
Reading the Schur form
The diagonal of T gives every eigenvalue without ever solving the characteristic polynomial. This is how serious software finds eigenvalues: it computes a Schur form iteratively (the QR algorithm), because the roots of a polynomial can be extremely sensitive to small errors in its coefficients as n grows. The strictly upper-triangular part of T records how far A is from having orthogonal eigenvectors (it is zero precisely when A is normal). You can also reorder T's diagonal to bring a chosen cluster of eigenvalues to the top; the leading columns of Q then span the corresponding invariant subspace, which is the basis of invariant-subspace and control-theory computations.
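Both readings of T can be checked numerically. The sketch below is one possible illustration, not a definitive recipe. The helper `departure_from_normality` and the two test matrices are hypothetical names and data introduced here. The reordering uses the `sort="rhp"` option of `scipy.linalg.schur`, which moves eigenvalues with positive real part to the top-left of T and also returns how many were selected.

```python
import numpy as np
from scipy.linalg import schur

def departure_from_normality(A):
    """Frobenius norm of the strictly upper-triangular part of the
    complex Schur factor T (zero exactly when A is normal)."""
    T, _ = schur(A, output="complex")
    return np.linalg.norm(np.triu(T, k=1))

# Symmetric, hence normal: T should be diagonal up to rounding.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Non-normal, with eigenvalues 2 and -3.
N = np.array([[1.0, 4.0],
              [1.0, -2.0]])

print("departure, symmetric S :", departure_from_normality(S))
print("departure, non-normal N:", departure_from_normality(N))

# Reorder so eigenvalues in the right half-plane come first.
T_sorted, Q_sorted, sdim = schur(N, output="complex", sort="rhp")

# The first sdim columns of Q span the invariant subspace of the
# selected eigenvalues; here that is one eigenvector for eigenvalue 2.
q1 = Q_sorted[:, 0]
print("selected eigenvalues:", sdim, "leading entry of T:", T_sorted[0, 0])
print("N q1 == 2 q1:", np.allclose(N @ q1, 2.0 * q1))
```

The departure for N can be cross-checked by hand: the squared Frobenius norm of N is 22 and the squared eigenvalues sum to 13, so the strictly upper part of T must have norm 3.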
A = [ 1  1 ;     Q orthogonal,     T = [ 3  * ;
      2  2 ]                             0  0 ]
# A is singular (rank 1) but not defective; its eigenvectors (1,2) and (1,-1) are not orthogonal. Schur exists regardless.
# eigenvalues of A are 3 and 0 -> they appear on diag(T).
# the off-diagonal '*' encodes the non-orthogonality of A's eigenvectors.
# (The real Schur form keeps 2x2 blocks for complex-conjugate pairs.)
Polar: rotation times stretch
The polar decomposition writes any square matrix as A = U P, where U is orthogonal (a pure rotation/reflection) and P is symmetric positive semidefinite (a pure non-negative stretch along orthogonal axes). It is the matrix analogue of writing a complex number as z = e^{i theta} r — a phase times a magnitude. The factorization splits *what the operator does* into 'turn' and 'scale'.
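The 'turn' and 'scale' factors can be computed directly with `scipy.linalg.polar`, whose default (`side="right"`) returns U and P with A = U P. The sketch below uses an illustrative shear-plus-scale matrix chosen here and checks the three defining properties: U orthogonal, P symmetric positive semidefinite, and the product reproducing A.

```python
import numpy as np
from scipy.linalg import polar

# Illustrative transform: neither a pure rotation nor a pure stretch.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

# Right polar decomposition: A = U @ P.
U, P = polar(A)

orthogonality_error = np.linalg.norm(U.T @ U - np.eye(2))
symmetry_error = np.linalg.norm(P - P.T)
# Symmetrize before eigvalsh so rounding asymmetry cannot matter.
stretch_eigenvalues = np.linalg.eigvalsh((P + P.T) / 2.0)
reconstruction_error = np.linalg.norm(U @ P - A)

print("||U^T U - I|| =", orthogonality_error)
print("||P - P^T||   =", symmetry_error)
print("eig(P)        =", stretch_eigenvalues)
print("||U P - A||   =", reconstruction_error)
```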
Polar and SVD are two views of the same truth: if A = W Sigma V^T is the SVD, then U = W V^T and P = V Sigma V^T. So the eigenvalues of the stretch P are exactly A's singular values. Engineers love the polar form because U is the nearest orthogonal matrix to A in the Frobenius norm. It answers 'what is the closest pure rotation to this slightly-distorted transform?' (provided det A > 0, so that U is a rotation rather than a reflection), the heart of orthogonalizing measured rotations in graphics, robotics, and crystallography.
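The SVD formulas can be turned into a few lines of NumPy. The sketch below is a minimal illustration under stated assumptions. The helper name `polar_from_svd`, the rotation angle, and the small distortion added to the rotation are all made up here to stand in for a "measured" rotation. It builds U and P from the SVD, confirms that the eigenvalues of P match the singular values, and checks that U is at least as close to the distorted matrix as the original rotation is.

```python
import numpy as np

def polar_from_svd(A):
    """Polar factors from the SVD A = W Sigma V^T:
    U = W V^T and P = V Sigma V^T."""
    W, sigma, Vt = np.linalg.svd(A)
    U = W @ Vt
    P = Vt.T @ np.diag(sigma) @ Vt
    return U, P, sigma

# A true rotation by 0.3 rad (illustrative value).
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Hypothetical measurement distortion, made up for this example.
A = R + np.array([[0.02, -0.01],
                  [0.01,  0.03]])

U, P, sigma = polar_from_svd(A)

print("eig(P)          =", np.sort(np.linalg.eigvalsh(P)))
print("singular values =", np.sort(sigma))
print("||A - U||_F     =", np.linalg.norm(A - U))
print("||A - R||_F     =", np.linalg.norm(A - R))
print("det(U)          =", np.linalg.det(U))
```

Here det A > 0, so U is a proper rotation. For a matrix with negative determinant the same formulas return a reflection, and recovering the nearest rotation needs an extra sign correction that this sketch does not attempt.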