JOVANA

PCA and Dimensionality Reduction

[[la-principal-component-analysis|PCA]] finds the directions along which your data varies the most: the top right-singular vectors from the [[singular-value-decomposition|SVD]] of the centered data. Keep a few and you can compress and visualize high-dimensional data. We will be honest about the catch: high variance is not the same as importance, and PCA only sees linear structure.

The idea: rotate to the axes of greatest variance

Imagine a cloud of data points. It is wider in some directions than others. Principal component analysis finds the direction of greatest spread (the first principal component), then the next-greatest direction perpendicular to it, and so on. It is nothing more than choosing a better set of axes — a rotation aligned with the shape of your data.

Computationally, PCA is just the SVD of your data matrix (one row per sample, one column per feature) after centering it, that is, subtracting the mean of each feature. The principal components are the top right-singular vectors; the squared singular values, divided by n - 1 for n samples, give the variance each one captures.
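The link between singular values and variance can be checked numerically. The sketch below uses made-up random data (the array `X` and its scaling are purely illustrative assumptions) and compares the squared singular values of the centered matrix, divided by n - 1, with the eigenvalues of the sample covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 samples, 3 features with unequal spread.
X = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])

# Center each feature (column) so the cloud sits at the origin.
Xc = X - X.mean(axis=0)

# Singular values of the centered data, returned in descending order.
sigma = np.linalg.svd(Xc, compute_uv=False)
var_from_svd = sigma**2 / (Xc.shape[0] - 1)

# Eigenvalues of the sample covariance matrix (np.cov divides by n - 1).
cov = np.cov(Xc, rowvar=False)
var_from_cov = np.sort(np.linalg.eigvalsh(cov))[::-1]

print(np.allclose(var_from_svd, var_from_cov))
```

The two routes should agree up to floating-point error, which is why PCA can be described either through the SVD or through the covariance matrix.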

  1. Center the data: subtract each feature's mean so the cloud sits at the origin.
  2. Take the SVD of the centered data matrix A (rows are samples, columns are features), A = U*S*V^T.
  3. The columns of V are the principal components; sigma_i^2 / (n - 1) is the variance along the i-th one, where n is the number of samples.
  4. Project the data onto the top k components to reduce it to k dimensions.
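The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pca_svd`, its return values, and the random test data are assumptions made for the example.

```python
import numpy as np

def pca_svd(X, k):
    """Hypothetical helper: PCA of X (rows = samples) via SVD, keeping k components."""
    X = np.asarray(X, dtype=float)
    # Step 1: center each feature.
    mean = X.mean(axis=0)
    Xc = X - mean
    # Step 2: thin SVD of the centered data, Xc = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Step 3: rows of Vt (columns of V) are the principal components.
    components = Vt[:k]
    variances = s[:k] ** 2 / (X.shape[0] - 1)
    # Step 4: project onto the top k components.
    Z = Xc @ components.T
    return Z, components, variances, mean

rng = np.random.default_rng(0)
# Made-up correlated data: 100 samples, 5 features.
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))
Z, components, variances, mean = pca_svd(X, k=2)
print(Z.shape)  # (100, 2)
```

The projected columns of `Z` are uncorrelated, and their sample variances match `variances`, which is a quick sanity check on any PCA code.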

Compress and visualize

Keeping only the top few components is a projection onto the directions that matter most for variance. When features are strongly correlated, a 1000-feature dataset can have most of its variance captured by a handful of components, letting you plot it in 2D or feed a smaller, faster model. This is the same move as low-rank approximation: throw away the small singular values, keep the big ones.
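To make the low-rank connection concrete, the sketch below truncates the SVD of a centered matrix to rank k and checks that the squared reconstruction error (Frobenius norm) equals the sum of the discarded squared singular values. The data and the choice k = 3 are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up data: 50 samples, 8 features.
X = rng.normal(size=(50, 8))
Xc = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3  # number of components kept (arbitrary for this example)
# Rank-k reconstruction: keep the k largest singular values, drop the rest.
Xk = (U[:, :k] * s[:k]) @ Vt[:k]

# Squared Frobenius error versus the energy in the discarded singular values.
err = np.linalg.norm(Xc - Xk, "fro") ** 2
discarded = np.sum(s[k:] ** 2)
print(np.isclose(err, discarded))
```

What you lose by dropping components is exactly the variance carried by the singular values you threw away, up to the shared factor of n - 1.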

variance kept = (sigma_1^2 + ... + sigma_k^2) / (sigma_1^2 + ... + sigma_r^2)
# pick the smallest k that keeps, say, 95% of the variance
The explained-variance ratio guides how many components to keep. Here r is the number of nonzero singular values (the rank of the centered data).
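The rule for picking k can be written as a cumulative sum over squared singular values. The function name `choose_k`, the 0.95 default, and the example singular values are assumptions for illustration only.

```python
import numpy as np

def choose_k(singular_values, threshold=0.95):
    """Hypothetical helper: smallest k whose components keep >= threshold of the variance."""
    s2 = np.asarray(singular_values, dtype=float) ** 2
    # Cumulative explained-variance ratio for k = 1, 2, ..., r.
    cumulative = np.cumsum(s2) / s2.sum()
    # First position where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is 1.0.
    return min(k, len(s2)), cumulative

# Example singular values (made up), in descending order.
k, cumulative = choose_k([3.0, 2.0, 1.0, 0.5])
print(k)  # 3
```

Here the squared values are 9, 4, 1 and 0.25, so two components keep about 91% of the variance and three keep about 98%, which is why k = 3 is the first to clear 95%.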

Honest caveats