Symmetry should buy you something
Many matrices that arise in practice — covariance matrices, the normal equations A^T A of least squares, stiffness matrices in physics — are symmetric and positive definite (SPD): A = A^T, and x^T A x > 0 for every nonzero x. Running general LU on such a matrix wastes effort, because L and U are mirror images of each other up to a diagonal scaling. The factorization should reflect the symmetry, not ignore it.
The Cholesky decomposition does exactly that: it writes A = LL^T with L lower-triangular and *positive* diagonal entries. One triangular factor and its transpose carry all the information — no separate U, no pivoting needed, because positive definiteness guarantees the pivots are always positive. It is the most efficient and most stable factorization in all of dense linear algebra, and you should reach for it whenever you can prove A is SPD.
Computing it column by column
Cholesky has a clean recursive recipe. Match entries of LL^T to A: the (1,1) entry forces L_11 = sqrt(A_11); the rest of column 1 of L is the first column of A divided by L_11; then you subtract that column's outer product from the remaining block and repeat. Each diagonal step takes a square root — and if any value under the root goes non-positive, A was *not* positive definite. That failure is a feature.
A = [ 4 12 -16 ; L = [ 2 0 0 ;
12 37 -43 ; 6 1 0 ;
-16 -43 98 ] -8 5 3 ]
# L_11 = sqrt(4) = 2
# L_21 = 12/2 = 6 , L_31 = -16/2 = -8
# L_22 = sqrt(37 - 6^2) = sqrt(1) = 1
# L_32 = (-43 - (-8)(6)) / 1 = 5
# L_33 = sqrt(98 - (-8)^2 - 5^2) = sqrt(9) = 3
# check: L @ L^T == AWith positive diagonal entries pinned down, Cholesky is unique — there is exactly one such L. Compare LU, where you had to fix L's diagonal to 1 by convention; here the positivity of A does the pinning for free. This is uniqueness earned by structure rather than imposed by fiat.
LDL^T: Cholesky without the square roots
Square roots are mildly expensive and undefined for symmetric matrices that are *not* positive definite (the indefinite case, common in optimization). The LDL^T decomposition sidesteps both problems: write A = LDL^T with L unit-lower-triangular and D a diagonal matrix. The signs in D now record indefiniteness instead of crashing the algorithm — by Sylvester's law of inertia, the count of positive, negative, and zero entries on D is an invariant of A.