The disappointment of the first real FFT
After rung 4 you can compute an FFT and read its magnitude spectrum. So you run an experiment that *should* be the easiest possible case: a pure sine wave at, say, 1000 Hz, sampled cleanly, fed into a 1024-point transform. Textbook theory promises a single delta — one bin lit up, everything else dead silent. You hit run, and instead of a needle you get a lumpy hill: the peak is near 1000 Hz, sure, but there are visible skirts spilling 30, 40, 50 dB down into neighbouring bins on both sides. It looks like noise. It is not.
The clue is hiding in *which* frequency you chose. Pick a frequency that lands exactly on a bin centre and the spike comes back, clean as a whistle. Nudge it a fraction of a bin off-centre and the skirts explode back into view. The transform is not lying about your signal — it is honestly reporting on a signal you did not realise you created the moment you grabbed a finite chunk of samples.
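You can reproduce both cases in a few lines. The sketch below is a minimal experiment, assuming a 48 kHz sample rate; the helper name `bins_above` and the −60 dB threshold are illustrative choices, not part of any library. It counts how many bins rise to within 60 dB of the peak for a tone that sits exactly on bin 21 and for a 1000 Hz tone that falls between bins.

```python
import numpy as np

fs = 48_000          # assumed sample rate in Hz
N = 1024             # block length
n = np.arange(N)

def bins_above(freq_hz, floor_db=-60.0):
    """Count rfft bins that lie within floor_db of the spectral peak."""
    mag = np.abs(np.fft.rfft(np.sin(2 * np.pi * freq_hz * n / fs)))
    mag_db = 20 * np.log10(np.maximum(mag, 1e-300) / mag.max())
    return int(np.sum(mag_db > floor_db))

on_bin = bins_above(21 * fs / N)   # 984.375 Hz, exactly on bin 21
off_bin = bins_above(1000.0)       # bin 21.33, between bins

print("bins within 60 dB of the peak:", on_bin, "(on-bin) vs", off_bin, "(off-bin)")
```

The on-bin tone should light a single bin, while the off-bin tone spreads measurable energy over a large part of the spectrum.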
Why a finite block has invisible edges
Here is the mental model that fixes everything. When you take N samples and run a DFT, the maths implicitly assumes those N samples are one period of a signal that repeats forever. The transform tiles your block end-to-end into an infinite loop. If your sine completes a whole number of cycles inside the block, the last sample flows smoothly into the first sample of the next copy — the loop is seamless and you get a clean line. This is called a bin-aligned or coherent frequency.
But a real-world tone almost never lands on a bin centre. Say your 1024-point block at 48 kHz spans about 21.3 ms, and the bin spacing is 48000/1024 ≈ 46.9 Hz. A 1000 Hz tone sits at bin 21.33 — *between* bins. Now the block contains 21.33 cycles. The last sample ends partway up the waveform while the next copy starts at zero, so the seam has a sudden jump, a vertical cliff repeated every block. The transform sees that cliff and faithfully decomposes it. A sharp discontinuity is rich in harmonics, so the energy of your single tone is sprayed across a wide swath of bins. That spray *is* the leakage.
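The arithmetic in this paragraph is easy to check. The sketch below recomputes the bin spacing and the tone's fractional bin, then measures the mismatch at the seam: the sample the real tone would have produced at index N versus the sample the repeating loop puts there (a copy of sample 0). Variable names are illustrative.

```python
import numpy as np

fs = 48_000        # sample rate in Hz
N = 1024           # block length
tone_hz = 1000.0

bin_spacing = fs / N                    # 46.875 Hz per bin
block_ms = 1000 * N / fs                # about 21.33 ms
bin_position = tone_hz / bin_spacing    # about 21.33, between bins 21 and 22

# What the periodic loop places after the block (sample 0 again) versus
# what the real tone would have produced at index N.
wrapped_next = np.sin(2 * np.pi * tone_hz * 0 / fs)
true_next = np.sin(2 * np.pi * tone_hz * N / fs)
seam_mismatch = abs(true_next - wrapped_next)

print(f"spacing {bin_spacing:.3f} Hz, tone at bin {bin_position:.2f}, "
      f"seam mismatch {seam_mismatch:.3f} of full scale")
```

For a bin-aligned tone the same calculation gives a mismatch of essentially zero, which is the seamless case.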
Bin-aligned (5.0 cycles in block)       Off-bin (5.3 cycles in block)
   copy A | copy B                             copy A | copy B
 /\  /\  /| /\  /\                       /\  /\  /\  /|
/  \/  \/ |/  \/  \/ \                  /  \/  \/  \/ |\
----------+----------                   --------------+  +--------
          ^                                           ^  ^
   seam is SMOOTH                             seam has a JUMP
   -> one clean bin                        -> energy leaks everywhere

The rectangular window and its ugly side-lobes
Whether you realise it or not, every plain FFT already applies a window: the rectangular (boxcar) window, which multiplies each in-block sample by 1 and everything outside by 0. The problem is what a box looks like in the frequency domain. Its transform is a sinc-shaped pattern with one tall main lobe flanked by a parade of side-lobes that decay slowly: the first side-lobe of a rectangular window sits only about 13 dB below the peak (a level of −13 dB), and the lobes fall off at a lazy 6 dB per octave. Those side-lobes are exactly the skirts you saw.
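Both numbers can be checked against the closed-form response of an N-point box, which has the periodic-sinc (Dirichlet) shape sin(πx) / (N·sin(πx/N)) at an offset of x bins from the peak. The sketch below assumes N = 1024 and uses an illustrative helper name, `rect_response`. It finds the highest point between the first two nulls and compares two side-lobe peaks roughly one octave apart.

```python
import numpy as np

N = 1024  # block length (assumed; the result barely changes with N)

def rect_response(offset_bins):
    """Magnitude response of an N-point rectangular window, 1.0 at the peak."""
    x = np.asarray(offset_bins, dtype=float)
    return np.abs(np.sin(np.pi * x) / (N * np.sin(np.pi * x / N)))

# Highest point between the first and second nulls (1 and 2 bins from the peak).
offsets = np.linspace(1.0, 2.0, 10_001)
first_sidelobe_db = 20 * np.log10(rect_response(offsets).max())

# Side-lobe peaks sit near half-integer offsets; 10.5 -> 20.5 bins is
# roughly one octave further from the peak.
drop_db = 20 * np.log10(rect_response(10.5) / rect_response(20.5))

print(f"first side-lobe {first_sidelobe_db:.1f} dB, "
      f"drop over about one octave {drop_db:.1f} dB")
```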
Why does −13 dB matter so much? Imagine two tones in your signal: a loud one at 1000 Hz and a quiet one at 1200 Hz that is 25 dB weaker. The loud tone's side-lobes start only about 13 dB down and decay slowly, so a few bins away they can still stand as tall as the quiet tone. The quiet tone is then buried under the leakage skirt of its loud neighbour and becomes hard or impossible to see. In audio, RF, vibration and radar work this is a daily hazard: leakage from a strong carrier masks the weak signal you actually care about.
Taper the edges: Hann and Hamming windows
The cure follows directly from the diagnosis. The cliff at the seam came from the signal being chopped off abruptly. So instead of multiplying by a hard box, multiply your N samples by a smooth window function that fades to zero at both ends. Now the first and last samples are gently pulled to nothing, the seam in the repeating loop becomes continuous, the cliff vanishes — and with it, most of the leakage. This is windowing: a sample-by-sample multiply, dirt cheap, applied before the FFT.
Two raised-cosine windows dominate everyday practice. The Hann window (often misnamed 'Hanning') is a single cosine bump, w[n] = 0.5·(1 − cos(2πn/(N−1))) for n = 0 … N−1, touching exactly zero at both ends. Its highest side-lobe drops to about −31 dB and the lobes roll off fast at 18 dB/octave, which is superb for hunting weak tones. The Hamming window, w[n] = 0.54 − 0.46·cos(2πn/(N−1)), is almost the same shape but does not quite reach zero at the ends: it sits on a small pedestal of 0.08. That pedestal is tuned to nearly cancel the first side-lobe, which pushes the highest side-lobe down to about −43 dB, the better figure of the two, at the price of a slower 6 dB/octave far-out roll-off.
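The two formulas translate directly into NumPy. The sketch below builds both windows from the expressions above, confirms that they match NumPy's built-in `np.hanning` and `np.hamming` (which use the same symmetric N−1 definition), and prints the end values so the Hamming pedestal is visible.

```python
import numpy as np

N = 1024
n = np.arange(N)

hann = 0.5 * (1 - np.cos(2 * np.pi * n / (N - 1)))
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))

# NumPy's built-ins follow the same symmetric definitions.
assert np.allclose(hann, np.hanning(N))
assert np.allclose(hamming, np.hamming(N))

print("Hann ends:   ", hann[0], hann[-1])        # touches zero at both ends
print("Hamming ends:", hamming[0], hamming[-1])  # small pedestal, about 0.08
```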
# Apply a Hann window before the FFT (Python / NumPy)
import numpy as np

N = 1024
sig = np.sin(2*np.pi*1000*np.arange(N)/48000)   # 1 kHz at 48 kHz, off-bin
win = np.hanning(N)                     # raised-cosine taper, 0 -> 1 -> 0

spec_r = np.abs(np.fft.rfft(sig))       # rectangular: ugly skirts
spec_w = np.abs(np.fft.rfft(sig*win))   # windowed: skirts collapse

# Coherent-gain fix: a Hann window has mean(win) ~= 0.5, so it halves a
# tone's measured amplitude. Dividing by sum(win) = N*mean(win) undoes that;
# the factor 2 folds in the negative-frequency half that rfft leaves out.
spec_r *= 2.0 / N                       # same scaling for the unwindowed case
spec_w *= 2.0 / np.sum(win)
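To connect this back to the loud-tone, quiet-tone hazard described earlier, here is a hedged two-tone sketch. It assumes the same 48 kHz, 1024-point setup, with a 1000 Hz tone and a 1200 Hz tone 25 dB weaker; the names `spec_rect`, `spec_hann` and `to_db` are illustrative. It prints the bins around 1200 Hz (bin 25.6) under both windows so the two can be compared side by side.

```python
import numpy as np

fs, N = 48_000, 1024
n = np.arange(N)
quiet_amp = 10 ** (-25 / 20)                      # 25 dB below the loud tone
sig = (np.sin(2 * np.pi * 1000 * n / fs)
       + quiet_amp * np.sin(2 * np.pi * 1200 * n / fs))

win = np.hanning(N)
spec_rect = np.abs(np.fft.rfft(sig)) * 2 / N              # amplitude scaling
spec_hann = np.abs(np.fft.rfft(sig * win)) * 2 / np.sum(win)

def to_db(x):
    """Amplitude to dB, clipped so log10 never sees zero."""
    return 20 * np.log10(np.maximum(x, 1e-12))

# 1200 Hz falls at bin 25.6, so inspect bins 23..28 under each window.
for k in range(23, 29):
    print(k, round(float(to_db(spec_rect[k])), 1), round(float(to_db(spec_hann[k])), 1))

# Strongest bin near the quiet tone in the Hann spectrum.
quiet_bin = 24 + int(np.argmax(spec_hann[24:29]))
quiet_level_db = float(to_db(spec_hann[quiet_bin]))
```

Under the Hann window the quiet tone should stand out as its own peak close to its true level. Under the rectangular window the same bins also carry leakage from the loud tone of comparable size, so the quiet tone is much harder to pick out.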
The trade-off you can never escape
Windowing is not free magic; it is a transaction. Tapering the edges suppresses side-lobes, but it also widens the main lobe. Measured between its first nulls, a rectangular window's main lobe is 2 bins wide, while a Hann or Hamming main lobe is 4 bins wide, twice as much. (The table below quotes a smaller effective width for each window, roughly its equivalent noise bandwidth, which is why the Hann figure there is 1.5 times the rectangular one rather than 2.) So the very act of killing leakage blurs the peak, and two tones that the rectangular window could *just* resolve may merge into one fat bump under a Hann. You buy dynamic range by spending frequency resolution.
Window       Main-lobe width    Highest side-lobe   Side-lobe roll-off
-----------  -----------------  ------------------  ------------------
Rectangular  1.0 bins (best)    -13 dB (worst)      6 dB/oct
Hamming      ~1.4 bins          -43 dB              6 dB/oct
Hann         ~1.5 bins          -31 dB              18 dB/oct (best)
Blackman     ~1.7 bins (worst)  -58 dB              18 dB/oct

narrow main lobe    <---------------->    low side-lobes
(sharp resolution)                        (wide dynamic range)

- Just looking for tones in a clean signal? Start with Hann, a great all-rounder with fast-decaying side-lobes.
- Hunting a weak tone hiding next to a strong one? Reach for a low-side-lobe window like Blackman or a Kaiser, and accept the wider main lobe.
- Need to split two tones that are extremely close in frequency? Keep the main lobe narrow — stay rectangular (or near it) and live with the skirts.
- Measuring the absolute power of a tone? Window first to stop leakage stealing energy into neighbours, then apply the window's coherent-gain correction.
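The figures in the table can be measured rather than memorised. The sketch below is one way to do it, under stated assumptions: a 256-point window, a zero-padded FFT to sample each window's response finely, and the equivalent noise bandwidth N·Σw²/(Σw)² as the width measure (the table's width column appears close to this quantity). The helper name `window_stats` is hypothetical.

```python
import numpy as np

N = 256            # window length (assumed)
PAD = 64           # zero-padding factor, to sample the response finely

def window_stats(w):
    """Return (highest side-lobe in dB, equivalent noise bandwidth in bins)."""
    mag = np.abs(np.fft.rfft(w, n=PAD * len(w)))
    rising = np.diff(mag) > 0
    main_lobe_end = int(np.argmax(rising))      # first null after the peak
    sidelobe_db = 20 * np.log10(mag[main_lobe_end:].max() / mag[0])
    enbw_bins = len(w) * np.sum(w ** 2) / np.sum(w) ** 2
    return float(sidelobe_db), float(enbw_bins)

windows = {
    "Rectangular": np.ones(N),
    "Hamming": np.hamming(N),
    "Hann": np.hanning(N),
    "Blackman": np.blackman(N),
}
stats = {name: window_stats(w) for name, w in windows.items()}

for name, (sidelobe_db, enbw_bins) in stats.items():
    print(f"{name:12s} side-lobe {sidelobe_db:6.1f} dB   width {enbw_bins:4.2f} bins")
```

Swapping in another window, for example `np.kaiser(N, beta)` with a `beta` of your choosing, lets you see where it lands on the same resolution versus dynamic-range scale.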
Putting it together: read a spectrum like a pro
Step back and the whole chain makes sense. A continuous wave becomes a discrete-time signal by sampling; sampling already obliged you to respect Nyquist (rung 3). Then you grab a finite block — and that grab silently multiplies by a window. The FFT reports the spectrum of *signal × window*, which in the frequency domain is your true spectrum convolved with the window's response. Every peak you see is the window's shape, parked on top of a real tone. Once you internalise that, a 'messy' spectrum stops being a mystery and becomes a measurement you can read and trust.
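The "multiply in time, convolve in frequency" statement can be verified numerically. The sketch below uses a short 64-point block and an off-bin tone (both arbitrary choices) and checks that the DFT of signal × window equals the circular convolution of the two individual DFTs divided by N. The names `via_convolution` and `direct` are illustrative.

```python
import numpy as np

N = 64
n = np.arange(N)
sig = np.sin(2 * np.pi * 5.3 * n / N)   # 5.3 cycles in the block: off-bin
win = np.hanning(N)

X = np.fft.fft(sig)
W = np.fft.fft(win)

# Circular convolution: Y[k] = (1/N) * sum over m of X[m] * W[(k - m) mod N]
k = np.arange(N)
idx = (k[:, None] - k[None, :]) % N     # idx[k, m] = (k - m) mod N
via_convolution = (X[None, :] * W[idx]).sum(axis=1) / N

direct = np.fft.fft(sig * win)
print("match:", np.allclose(via_convolution, direct))
```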
This intuition is also the foundation of the next rung. Designing a windowed-sinc lowpass filter is *exactly the same operation* run in reverse: you take an ideal brick-wall response, inverse-transform it to an infinitely long impulse response, then window it to a finite length — and the side-lobes you fought here reappear as ripple in the filter's stopband. Master windows now and FIR filter design will feel like an old friend.
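As a small preview, the recipe in this paragraph can be sketched in a few lines. This is a minimal illustration, not a finished filter design: the tap count (101), the cutoff (0.1 cycles per sample), the stopband edge and the helper name `stopband_peak_db` are all assumptions chosen for the example. It truncates the ideal sinc impulse response, once with a plain box and once with a Hamming window, and compares the worst stopband response.

```python
import numpy as np

num_taps = 101     # filter length (assumed)
fc = 0.1           # cutoff in cycles per sample (assumed)

m = np.arange(num_taps) - (num_taps - 1) / 2
h_ideal = 2 * fc * np.sinc(2 * fc * m)       # ideal response, cut to 101 taps

def stopband_peak_db(h, edge=fc + 0.05, nfft=8192):
    """Largest response at or beyond `edge`, relative to the DC gain."""
    H = np.abs(np.fft.rfft(h, n=nfft))
    freqs = np.arange(H.size) / nfft          # cycles per sample, 0 to 0.5
    return float(20 * np.log10(H[freqs >= edge].max() / H[0]))

rect_db = stopband_peak_db(h_ideal)                            # plain truncation
hamming_db = stopband_peak_db(h_ideal * np.hamming(num_taps))  # tapered

print(f"stopband peak: rectangular {rect_db:.1f} dB, Hamming {hamming_db:.1f} dB")
```

The tapered version should show much lower stopband ripple, bought with a wider transition band: the same trade-off as in the window table.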