The z-Transform: How DSP Systems Are Described

From the s-plane to the z-plane

If you have met the Laplace transform, you already know its central trick: take a messy differential equation in time and turn it into ordinary algebra in a complex variable s. Capacitors and inductors stop being calculus and become simple impedances; a whole circuit collapses to a ratio of polynomials. The z-transform is the exact same idea, ported to the discrete world — the world of samples, not continuous voltages. Where Laplace lives in the s-plane, the z-transform lives in the z-plane, and once you learn to read that plane you can describe any DSP system on the back of a napkin.

Why a *new* transform at all? Because a discrete-time signal x[n] is not a smooth function — it is a list of numbers spaced one sample apart. The z-transform meets it on its own terms. We define X(z) as a sum over those samples, each multiplied by a power of z⁻¹: X(z) = Σ x[n]·z⁻ⁿ for n from 0 upward (for the causal signals DSP usually cares about). That single line is the whole definition. The variable z is complex, and the magic is entirely in what its powers *mean*.
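The definition can be tried directly in a few lines. The sketch below is an illustration rather than a library routine: the function name `z_transform`, the test signal x[n] = 0.5ⁿ and the test point z are all assumptions chosen for the example. It sums a truncated sequence and compares the result with the standard geometric-series closed form 1/(1 − a·z⁻¹), which holds when |z| > |a|.

```python
def z_transform(x, z):
    """Evaluate X(z) = sum over n >= 0 of x[n] * z**(-n) for a finite causal list x."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))

# Illustrative signal: x[n] = a**n, truncated to 60 samples.
a = 0.5
x = [a ** n for n in range(60)]

# Illustrative test point; it lies on the unit circle because 0.8**2 + 0.6**2 = 1.
z = complex(0.8, 0.6)

numeric = z_transform(x, z)
closed_form = 1 / (1 - a / z)  # geometric series, valid for |z| > |a|

print(abs(numeric - closed_form))  # difference is at rounding-error level
```

The truncation is harmless here because the terms shrink like 0.5ⁿ; a sequence that does not decay would need a z further from the origin for the sum to converge.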

z⁻¹ is a memory cell

Picture a conveyor belt carrying your samples past you, one per clock tick. A unit delay z⁻¹ is a single bin on that belt: a sample drops in, and exactly one tick later it pops out the other side. In hardware that bin is literally a register — a flip-flop or a memory cell — clocked by the sample rate. In software it is one slot of an array you read from. This is why DSP engineers say z⁻¹ *is* memory: every delay you draw in a block diagram is a place where the system remembers a past value.

Now watch a *system*, anything that takes an input x[n] and produces an output y[n], get written down. Most classic DSP filters are linear and time-invariant (LTI systems): scaling the input scales the output, adding two inputs adds their outputs, and delaying the input just delays the output. For such systems the rule is always the same shape: today's output is a weighted sum of recent inputs and recent outputs. That sentence, written as math, is a difference equation.

Difference equation (what the code actually does each tick):

    y[n] = b0*x[n] + b1*x[n-1]  -  a1*y[n-1]

Delay-line / block-diagram view (arrows show signal flow; each [z^-1] box delays by one sample):

   x[n] --->(x b0)----+
      |               |
    [z^-1]            (+)---> y[n] ---+----> out
      |               |               |
   x[n-1]-->(x b1)----+             [z^-1]
                                      |
                      (x -a1)<------ y[n-1]
            (-a1 * y[n-1] fed back into the sum)

The two [z^-1] boxes are the only memory in the system.

A first-order system as both an equation and a delay-line diagram. Each [z⁻¹] box is one register holding one past sample.
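The difference equation above translates almost word for word into code. This is a minimal sketch, assuming both registers start at zero; the function name `first_order_filter` and the coefficient values in the example are illustrative, not taken from any particular filter design.

```python
def first_order_filter(x, b0, b1, a1):
    """Run y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1] over a list of samples."""
    x_prev = 0.0  # register holding x[n-1]; assumed to start at zero
    y_prev = 0.0  # register holding y[n-1]; assumed to start at zero
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x_prev - a1 * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn  # clock tick: both delay cells take new values
    return y

# Feed in a single unit impulse and watch the feedback path keep it alive.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
h = first_order_filter(impulse, b0=1.0, b1=0.5, a1=-0.5)
print(h)  # [1.0, 1.0, 0.5, 0.25, 0.125]
```

The two local variables `x_prev` and `y_prev` are the two [z⁻¹] boxes of the diagram: nothing else in the loop carries state from one tick to the next.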

Difference equation → transfer function H(z)

Here is where the whole machine pays off. We take the difference equation and apply the z-transform to every term. Because multiplying by z⁻¹ means 'delay by one sample' (assuming the delay registers start at zero), the substitution is mechanical: wherever you see x[n−k], write z⁻ᵏ·X(z); wherever you see y[n−k], write z⁻ᵏ·Y(z). The time delays turn into plain multiplications by powers of z⁻¹. Then we gather terms and ask for the ratio Y(z)/X(z). That ratio is the system's [[ee-transfer-function|transfer function]], written H(z), the discrete cousin of the transfer function you may know from continuous control.

Start from the difference equation: y[n] = b0·x[n] + b1·x[n−1] − a1·y[n−1].
Replace each delayed term: y[n−1] → z⁻¹·Y(z), x[n−1] → z⁻¹·X(z). The equation becomes Y = b0·X + b1·z⁻¹·X − a1·z⁻¹·Y.
Collect all Y on the left: Y·(1 + a1·z⁻¹) = X·(b0 + b1·z⁻¹).
Divide to get the transfer function: H(z) = Y(z)/X(z) = (b0 + b1·z⁻¹) / (1 + a1·z⁻¹).
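One way to gain confidence in the algebra is to check it numerically. A standard fact is that H(z) equals the z-transform of the system's impulse response, so the sketch below runs the difference equation on a unit impulse, sums h[n]·z⁻ⁿ, and compares that with the closed-form ratio. The coefficient values, the test point z and the function names are assumptions made for illustration.

```python
def impulse_response(b0, b1, a1, n_samples):
    """Impulse response of y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1], zero initial state."""
    x_prev = y_prev = 0.0
    h = []
    for n in range(n_samples):
        xn = 1.0 if n == 0 else 0.0
        yn = b0 * xn + b1 * x_prev - a1 * y_prev
        h.append(yn)
        x_prev, y_prev = xn, yn
    return h

def transfer_function(z, b0, b1, a1):
    """H(z) = (b0 + b1*z^-1) / (1 + a1*z^-1)."""
    return (b0 + b1 / z) / (1 + a1 / z)

b0, b1, a1 = 1.0, 0.5, -0.5   # illustrative coefficients (pole at z = 0.5)
z = complex(0.6, 0.8)         # illustrative test point on the unit circle

h = impulse_response(b0, b1, a1, 80)
series = sum(hn * z ** (-n) for n, hn in enumerate(h))  # time-domain route
closed = transfer_function(z, b0, b1, a1)               # algebra route

print(abs(series - closed))  # difference is at rounding-error level
```

The agreement depends on the impulse response decaying, which it does here because the pole sits inside the unit circle.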

Look at the shape: a ratio of two polynomials in z⁻¹. The numerator came from the input weights (the b's), the denominator from the feedback weights (the a's). Solving numerator = 0 gives the zeros, the values of z where the output vanishes; a zero sitting on the unit circle kills that input frequency outright. Solving denominator = 0 gives the poles, the system's natural modes: the frequencies it amplifies and the patterns it rings at. For the first-order system above, the zero is at z = −b1/b0 and the pole is at z = −a1. Together with one overall gain constant, poles and zeros are the entire fingerprint of an LTI system: tell me where they sit and I can reconstruct everything it does.
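For the first-order H(z) the roots can be found by hand: multiplying numerator and denominator by z turns the ratio into (b0·z + b1)/(z + a1). The short sketch below does exactly that; the function name `zero_and_pole` and the example coefficients are assumptions picked so that the zero lands at z = −1.

```python
def zero_and_pole(b0, b1, a1):
    """Roots of H(z) = (b0 + b1/z) / (1 + a1/z), rewritten as (b0*z + b1) / (z + a1)."""
    zero = -b1 / b0  # solves b0*z + b1 = 0
    pole = -a1       # solves z + a1 = 0
    return zero, pole

b0, b1, a1 = 1.0, 1.0, -0.9  # illustrative coefficients
zero, pole = zero_and_pole(b0, b1, a1)
print(zero, pole)  # -1.0 0.9

# Sanity check: the numerator really does vanish at the zero.
print(b0 + b1 / zero)  # 0.0
```

Higher-order filters need a polynomial root finder instead of this one-line algebra, but the meaning of the roots is the same.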

Reading the z-plane: the unit circle is the boundary of stability

Now plot the poles and zeros on the complex z-plane and a story appears. The single most important landmark is the unit circle — the set of points exactly distance 1 from the origin. It is the discrete-time equivalent of the imaginary axis in the s-plane, and it carries enormous meaning. For a causal system, the rule is blunt: every pole must lie strictly inside the unit circle, or the system is unstable. A pole inside means a mode that decays sample by sample; a pole on the circle rings forever without dying; a pole outside means a response that grows without bound and blows up.

             Im(z)
               |        . . . .
           .       unit circle (|z| = 1)
         .           |           .
        .            |            .
       .   x <-- pole INSIDE      .
      .    (stable, decays)       .
   ---+--------o--------+-------+----- Re(z)
      .       zero    (origin)  .
       .                       .
        .   x  <-- pole ON circle = rings forever
         .         (marginally stable) .
           .                       .
               . . . .   x <-- pole OUTSIDE = blows up

  Inside circle  -> stable
  On  circle     -> oscillates, never settles
  Outside circle -> unstable (output grows without limit)

The unit circle splits the z-plane into stable (inside) and unstable (outside). Poles (x) decide stability; zeros (o) carve notches in the response.
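The three cases can be watched sample by sample. This sketch assumes the simplest possible recursive system, y[n] = x[n] + p·y[n−1], whose single pole is at z = p; the function name and the three pole values are illustrative.

```python
def one_pole_impulse_response(pole, n_samples):
    """Impulse response of y[n] = x[n] + pole*y[n-1], i.e. H(z) = 1 / (1 - pole*z^-1)."""
    y_prev = 0.0
    h = []
    for n in range(n_samples):
        xn = 1.0 if n == 0 else 0.0
        yn = xn + pole * y_prev
        h.append(yn)
        y_prev = yn
    return h

# Compare the size of the response 100 samples after the impulse.
for pole in (0.9, 1.0, 1.1):
    h = one_pole_impulse_response(pole, 101)
    print(pole, abs(h[100]))
```

With the pole at 0.9 the response has faded to almost nothing, at 1.0 it is still exactly as large as the impulse that started it, and at 1.1 it has grown by several orders of magnitude.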

The same picture also tells you the filter shape, and this is the part that feels like a superpower. To find the frequency response (how the filter treats each frequency) you walk a point along the upper half of the unit circle, from z = 1 (DC, the lowest frequency) around to z = −1 (the highest, the Nyquist frequency at half the sample rate). At each spot, the gain is, up to a constant scale factor, *the product of distances to the zeros, divided by the product of distances to the poles*. Slide near a pole and the denominator shrinks, so the gain spikes: a resonance, a peak. Slide near a zero and the numerator shrinks, so the gain dips: a notch. The geometry literally draws the filter's response curve for you.
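The distance rule is easy to test against a direct evaluation of |H(z)|. The sketch below assumes an illustrative filter H(z) = (1 + z⁻¹)/(1 − 0.9·z⁻¹), which has one zero at z = −1 and one pole at z = 0.9; the function names are hypothetical, and the constant scale factor happens to be 1 for this filter.

```python
import cmath
import math

def gain_from_geometry(omega, zeros, poles, scale=1.0):
    """Gain at angle omega (radians per sample) from distances on the z-plane."""
    z = cmath.exp(1j * omega)  # the point on the unit circle for this frequency
    to_zeros = math.prod(abs(z - q) for q in zeros)
    to_poles = math.prod(abs(z - p) for p in poles)
    return scale * to_zeros / to_poles

def gain_direct(omega):
    """|H(z)| for the illustrative H(z) = (1 + z^-1) / (1 - 0.9*z^-1)."""
    z = cmath.exp(1j * omega)
    return abs((1 + 1 / z) / (1 - 0.9 / z))

zeros, poles = [-1.0], [0.9]
for omega in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(round(omega, 3), gain_from_geometry(omega, zeros, poles), gain_direct(omega))
```

The two columns agree at every frequency. The gain is largest at ω = 0, right next to the pole, and falls to essentially zero at ω = π, where the walking point lands on the zero.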

A worked first-order example: the one-pole smoother

Let's make it real with the simplest useful IIR filter — a one-pole low-pass smoother, the kind hiding inside every 'exponential moving average' and every gentle volume fader. Its difference equation is y[n] = (1−α)·x[n] + α·y[n−1], where α is between 0 and 1. In words: the new output is a blend of the new input and a fraction α of the previous output. Crank α up and the filter leans heavily on its memory, so it smooths hard and reacts slowly; turn α down and it follows the input almost instantly.

Take the z-transform of   y[n] = (1-a)*x[n] + a*y[n-1]

   Y = (1-a)*X + a*z^-1 * Y
   Y - a*z^-1*Y = (1-a)*X
   Y*(1 - a*z^-1) = (1-a)*X

              (1 - a)
   H(z) = ---------------       <-- transfer function
            1 - a*z^-1

   Pole: 1 - a*z^-1 = 0  ->  z = a   (a real pole on the +Re axis)
   Zero: multiply top and bottom by z: H(z) = (1-a)*z / (z - a)  ->  a zero at z = 0

   DC gain (set z = 1):  H(1) = (1-a)/(1-a) = 1   (passes DC perfectly)
   Nyquist gain (z=-1):  H(-1)= (1-a)/(1+a) < 1   (cuts high freq)

   With a = 0.9 :  pole at 0.9 (just inside circle) -> stable,
                   heavy smoothing, slow: settles in tens of samples
   With a = 0.5 :  pole at 0.5 -> light smoothing, fast settle

Deriving H(z) for the one-pole smoother, then reading stability and filter shape straight off the pole at z = α.

Read what the z-plane just told us, for free. The pole sits at z = α on the real axis. As long as |α| < 1, which the range 0 < α < 1 guarantees, that pole is inside the unit circle, so the filter is stable. It passes DC (the slow, average part of the signal) at full gain and attenuates the fast wiggles near Nyquist: a genuine low-pass. And the closer α creeps toward 1, the closer the pole creeps to the circle, and the more sluggish and heavily-smoothing the filter becomes. One pole, one parameter, and the geometry explained stability, passband, and speed all at once.
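The smoother and its two headline gains fit in a few lines. This is a minimal sketch assuming the output register starts at zero; the helper names `smooth`, `settle_index` and `gain`, and the 1% settling tolerance, are choices made for the example rather than anything standard.

```python
def smooth(x, alpha):
    """One-pole smoother: y[n] = (1 - alpha)*x[n] + alpha*y[n-1], starting from y = 0."""
    y_prev = 0.0
    out = []
    for xn in x:
        y_prev = (1 - alpha) * xn + alpha * y_prev
        out.append(y_prev)
    return out

def settle_index(alpha, tolerance=0.01, n_samples=1000):
    """First sample index at which the unit-step response is within tolerance of 1."""
    y = smooth([1.0] * n_samples, alpha)
    return next(n for n, v in enumerate(y) if 1.0 - v < tolerance)

def gain(alpha, z):
    """|H(z)| for H(z) = (1 - alpha) / (1 - alpha*z^-1)."""
    return abs((1 - alpha) / (1 - alpha / z))

for alpha in (0.9, 0.5):
    print(alpha, settle_index(alpha), gain(alpha, 1), gain(alpha, -1))
```

With the 1% tolerance used here, the step response settles at index 43 for α = 0.9 and at index 6 for α = 0.5, while the DC gain stays at 1 in both cases and the Nyquist gain is (1 − α)/(1 + α).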

Why this is the keystone of DSP

Step back and see what one idea bought you. A difference equation — the literal C code that runs on a chip — became a polynomial ratio H(z). That ratio became a scatter of poles and zeros. That scatter, plotted against the unit circle, instantly revealed stability and the entire shape of the filter. Convolving signals in time, which is a fiddly sliding-and-summing operation, becomes simple *multiplication* of z-transforms — the same way convolution collapses to multiplication in any transform domain. Cascade two filters? Just multiply their H(z)'s. The z-transform is the lingua franca that makes all of this one fluent conversation.
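The cascade rule can be checked by multiplying coefficient lists. In the sketch below each filter is described by its numerator coefficients b and denominator coefficients a (polynomials in z⁻¹, with a[0] assumed to be 1); the helper names and the choice of two identical α = 0.5 smoothers are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def run_filter(b, a, x):
    """Direct difference equation for H(z) = B(z)/A(z); a[0] is assumed to be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y

# Two identical one-pole smoothers with alpha = 0.5.
b1, a1 = [0.5], [1.0, -0.5]
b2, a2 = [0.5], [1.0, -0.5]

# Cascading multiplies the transfer functions: numerators together, denominators together.
b, a = poly_mul(b1, b2), poly_mul(a1, a2)
print(b, a)  # [0.25] [1.0, -1.0, 0.25]

x = [1.0] + [0.0] * 9
in_series = run_filter(b2, a2, run_filter(b1, a1, x))  # one filter after the other
combined = run_filter(b, a, x)                         # single second-order filter
print(max(abs(u - v) for u, v in zip(in_series, combined)))
```

Note that `poly_mul` is itself a convolution of the coefficient lists, which is the same time-domain operation the paragraph describes, applied to the filter descriptions instead of the signals.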

There is one honest caveat to carry forward. The distance picture gives only the *magnitude* of the gain at each frequency. The phase (the delay each frequency suffers) is in H(z) too: it comes from the angles of those same pole and zero vectors, or simply from evaluating the complex number H(z) on the unit circle. And to compute an actual spectrum from finite data, you'll reach for the discrete Fourier transform, which is the z-transform of a finite block of samples evaluated right on the unit circle at evenly spaced points. Think of the DFT as the z-transform's measuring tape: same circle, now with tick marks. That's the next tool, and the one after that, the FFT, is just a blisteringly fast way to read it.
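The link between the DFT and the unit circle can be confirmed numerically. The sketch below assumes a four-sample block chosen for illustration and uses a plain O(N²) DFT written from the textbook formula; it compares each DFT bin with the z-transform of the same block evaluated at z = e^(j·2πk/N), then reads magnitude and phase off the complex result.

```python
import cmath

def z_transform(x, z):
    """X(z) = sum of x[n] * z**(-n) over a finite block of samples."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))

def dft(x):
    """Textbook DFT: X[k] = sum of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

x = [1.0, 2.0, 0.0, -1.0]  # illustrative block of samples
N = len(x)

# N evenly spaced points around the unit circle: the tick marks on the tape.
on_circle = [z_transform(x, cmath.exp(2j * cmath.pi * k / N)) for k in range(N)]
bins = dft(x)

for k in range(N):
    # Each value is complex, so magnitude and phase are both available.
    print(k, abs(bins[k]), cmath.phase(bins[k]), abs(bins[k] - on_circle[k]))
```

The last column stays at rounding-error level for every bin, which is the "same circle, now with tick marks" picture in numbers.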