Why A^T A is the right object
For any real m-by-n matrix A, the matrix A^T A is square (n-by-n), symmetric, and positive semidefinite: x^T (A^T A) x = ||A x||^2 >= 0 for every x. By the spectral theorem it has an orthonormal eigenbasis with nonnegative eigenvalues. That is the seed of everything.
Write those eigenpairs as A^T A v_i = lambda_i v_i with lambda_1 >= lambda_2 >= ... >= lambda_n >= 0. Define the singular values as sigma_i = sqrt(lambda_i). Because the lambda_i are nonnegative, the square root is real. This is the precise reason singular values are always real and nonnegative, even when A is square and its own eigenvalues are complex or negative. See singular values vs eigenvalues.
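As a quick numerical check of this definition, the sketch below takes the eigenvalues of A^T A, takes their square roots, and compares them with the singular values returned by NumPy's own SVD routine. The random 5-by-3 matrix and the names `lam`, `sigma`, and `sigma_ref` are illustrative choices, not part of the argument above.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # arbitrary 5-by-3 example matrix

# Eigenpairs of the symmetric matrix A^T A; eigh returns eigenvalues in ascending order.
lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]    # reorder so that lambda_1 >= lambda_2 >= ...

# Rounding can push a zero eigenvalue slightly negative, so clip before the square root.
sigma = np.sqrt(np.clip(lam, 0.0, None))

# Reference singular values from the library routine (already in descending order).
sigma_ref = np.linalg.svd(A, compute_uv=False)

print(sigma)
print(sigma_ref)
```

For a well-conditioned matrix like this one, the two printed vectors should agree to many digits.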
Manufacturing the left singular vectors
The v_i are in hand. For each i with sigma_i > 0, define u_i = (A v_i) / sigma_i. Two short calculations show these are exactly the left singular vectors: they are unit vectors, and they are mutually orthogonal.
Unit length:
||u_i||^2 = (A v_i)^T (A v_i) / sigma_i^2
= v_i^T (A^T A) v_i / sigma_i^2
= v_i^T (lambda_i v_i) / sigma_i^2
= lambda_i / sigma_i^2 = 1 (since sigma_i^2 = lambda_i)
Orthogonality (i != j):
<u_i, u_j> = (A v_i)^T (A v_j) / (sigma_i sigma_j)
= v_i^T (A^T A) v_j / (sigma_i sigma_j)
= v_i^T (lambda_j v_j) / (sigma_i sigma_j)
= (lambda_j / (sigma_i sigma_j)) <v_i, v_j>
= 0 (the v's are orthonormal)
=> the u_i form an orthonormal set, and A v_i = sigma_i u_i.
The existence proof, assembled
- Form A^T A; by the spectral theorem get orthonormal eigenvectors v_1, ..., v_n with eigenvalues lambda_1 >= ... >= lambda_n >= 0.
- Set sigma_i = sqrt(lambda_i); let r be the number of strictly positive sigma_i (this r is the rank of A).
- For i <= r set u_i = A v_i / sigma_i; extend u_1, ..., u_r to a full orthonormal basis of R^m.
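- Let V have columns v_1, ..., v_n, let U have columns u_1, ..., u_m, and let Sigma be the m-by-n matrix with sigma_1, ..., sigma_r on its diagonal and zeros elsewhere. For i <= r, A v_i = sigma_i u_i by construction; for i > r, ||A v_i||^2 = lambda_i = 0, so A v_i = 0. Together these say A V = U Sigma, and since V is orthogonal, A = U Sigma V^T holds exactly. Since this construction never required A to be square or invertible, the SVD exists for every matrix.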
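The four steps above can be sketched in NumPy as follows. This is an illustration of the proof, not a recommended way to compute an SVD. The function name `svd_via_gram` and the rank threshold `tol` are hypothetical. The sketch assumes a real matrix with at least one nonzero singular value, and it uses a complete QR factorization as one convenient way to extend u_1, ..., u_r to a basis of R^m.

```python
import numpy as np

def svd_via_gram(A, tol=1e-6):
    """Build U, Sigma, V from the eigenpairs of A^T A (illustration only)."""
    m, n = A.shape

    # Step 1: orthonormal eigenvectors of A^T A, sorted by descending eigenvalue.
    lam, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]

    # Step 2: singular values and numerical rank r.
    # The threshold is deliberately loose: a zero singular value recovered
    # through A^T A is only accurate to about the square root of machine epsilon.
    sigma = np.sqrt(np.clip(lam, 0.0, None))
    r = int(np.sum(sigma > tol * sigma[0]))

    # Step 3: u_i = A v_i / sigma_i for i <= r, then extend to a basis of R^m.
    U_r = (A @ V[:, :r]) / sigma[:r]
    Q, _ = np.linalg.qr(U_r, mode="complete")  # Q[:, r:] spans the complement
    U = np.hstack([U_r, Q[:, r:]])

    # Step 4: Sigma is m-by-n with sigma_1, ..., sigma_r on its diagonal.
    Sigma = np.zeros((m, n))
    Sigma[np.arange(r), np.arange(r)] = sigma[:r]
    return U, Sigma, V, r

# Example: a 5-by-4 matrix of rank 2, neither square nor invertible.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
U, Sigma, V, r = svd_via_gram(A)
recon_error = np.max(np.abs(U @ Sigma @ V.T - A))
print(r, recon_error)
```

The reconstruction error is small but usually not at machine precision, because the vectors v_i with i > r are only approximately in the null space of A when they are obtained from A^T A.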
In practice, careful numerical code does not form A^T A: doing so squares the condition number (in the 2-norm, cond(A^T A) = cond(A)^2 when A has full column rank) and loses precision in the small singular values. The proof tells you the SVD *exists*; real algorithms compute it far more stably. But for understanding, A^T A is the master key.
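The effect on the condition number can be seen in a small experiment. The sketch below builds a matrix with prescribed singular values 1, 1e-3, and 1e-6 (an arbitrary illustrative choice), then compares the singular values recovered through A^T A with those from `np.linalg.svd`. All variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random orthonormal factors, used to build A with known singular values.
Q1, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # 6-by-3, orthonormal columns
Q2, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # 3-by-3, orthogonal
s_true = np.array([1.0, 1e-3, 1e-6])
A = Q1 @ np.diag(s_true) @ Q2.T

# 2-norm condition numbers of A and of A^T A.
kappa_A = np.linalg.cond(A)
kappa_gram = np.linalg.cond(A.T @ A)

# Singular values computed directly, and via the eigenvalues of A^T A.
s_direct = np.linalg.svd(A, compute_uv=False)
lam = np.linalg.eigvalsh(A.T @ A)[::-1]
s_gram = np.sqrt(np.clip(lam, 0.0, None))

# Relative error of each approach against the known singular values.
err_direct = np.abs(s_direct - s_true) / s_true
err_gram = np.abs(s_gram - s_true) / s_true

print(kappa_A, kappa_gram)
print(err_direct)
print(err_gram)
```

The condition number of A^T A should come out close to the square of that of A. The relative error in the smallest singular value is typically several orders of magnitude larger on the A^T A route than on the direct one.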