Transmit Side: Power Amplifiers and the Full Transceiver

The shout at the end of the chain

For four rungs we have been listening. The LNA cupped its hand to its ear so as not to breathe on the microphone; the mixer slid the signal down in frequency; the PLL held the local-oscillator tone rock-steady. All of that is the receiver's craft of hearing a whisper. Now turn the radio around. On the transmit side the job is the opposite, and far less subtle: shout loud enough to be heard across the gap. Your phone has to fling enough energy into the air that a base station a kilometre away — past walls, rain and the bodies of a thousand other phones — can still pluck your signal out of the noise. That shouting is the job of the power amplifier, or PA, and it is almost always the very last active block before the antenna.

How loud is a shout? A smartphone PA delivers up to roughly +23 dBm at the antenna — about 200 milliwatts. A Wi-Fi radio pushes +18 to +23 dBm; a cellular base station runs tens of watts; a broadcast tower, kilowatts. But here is the cruelty the receiver never faced: at +23 dBm the PA might be drawing a watt or more from the battery to deliver those 200 milliwatts. The other 800 milliwatts becomes heat in your palm. That ratio — RF out divided by DC in — is efficiency, and on a phone it is not an academic nicety. The PA is usually the single hungriest block in the whole transceiver, and talk time lives or dies on its efficiency.

Efficiency versus linearity: the PA's defining tension

Picture the output transistor as a valve on a water main. To be linear, you set the valve half-open and let the signal swing it gently up and down around that midpoint — the flow tracks the signal faithfully. But a half-open valve is always passing current even when the signal is small, so it always burns power. This is the spirit of Class A: superbly linear, dismally efficient (theoretical ceiling 50%, in practice often 20–30%). The transistor conducts for the entire signal cycle, idling like an engine that never turns off.

Now bias the valve nearly shut, so it only opens when the signal pushes hard. The transistor conducts for less than a full cycle — half in Class B, a sliver in Class C — and idles near zero, so efficiency soars (Class B tops out at 78.5%). The price is distortion: each device handles only part of the waveform, and the kinks where they hand off must be smoothed out. Class AB is the everyday compromise that powers most phones — biased just above cut-off, conducting for slightly more than half a cycle, trading a touch of efficiency for enough linearity to be usable. It is the workhorse of integrated handset PAs.

Beyond these lie the switching classes — D, E, F — where the transistor is never a valve at all but a switch: hard on (low voltage, high current) or hard off (high voltage, no current), never lingering in the lossy middle where voltage and current overlap. In the ideal switch, power dissipation is zero and efficiency approaches 100%. The catch is brutal: a switch has no notion of 'half-loud'. It can carry amplitude information only by clever tricks (turning it on and off faster, or modulating its supply), so switching PAs are wonderful for constant-envelope signals like GSM or classic FM and treacherous for the amplitude-rich signals of modern data.

PA class vs conduction & efficiency
----------------------------------------------------------------
        conduction angle      peak eff.   linearity   used for
 Class A     360 deg            50%        excellent   labs, low-power
 Class AB  180-360 deg        50-78%        good        phone PAs
 Class B     180 deg          78.5%        fair        push-pull audio
 Class C    < 180 deg          ~80%+       poor        FM, constant-env
 Class E/F   switch (0/on)     ~90-100%    none*        GSM, ET, PWM-RF
                                          (*needs envelope tricks)

 Drain efficiency  eta = P_RF_out / P_DC_in
 PAE (power-added) = (P_out - P_in) / P_DC   <- the honest number
----------------------------------------------------------------

Conduction angle is the dial between linearity and efficiency. **PAE** (power-added efficiency) is the metric that counts, because it charges the PA for its own drive power.

Back-off, AM/AM, AM/PM — and why your phone gets hot

Every PA has a ceiling. Drive it harder and harder and the output rises... until it doesn't. The gain curve compresses and flattens as the transistor runs out of voltage swing or current to give. The standard yardstick is the 1-dB compression point (P1dB): the output level at which gain has sagged 1 dB below its small-signal value. Push past it toward saturation and the amplifier clips — and a clipped sinewave is a buffet of harmonics and, worse, intermodulation products that splatter energy into your neighbour's channel. The receiver's third-order intercept (IP3) measured this same nonlinearity from the input side; on the PA it shows up as adjacent-channel leakage that regulators will fine you for.

Engineers describe PA distortion with two paired curves. AM/AM captures how output *amplitude* deviates from a straight line as input amplitude grows — that's the gain compression. AM/PM captures something sneakier: how the output *phase* shifts as amplitude grows, because the transistor's internal capacitances change with drive level. AM/PM is a silent killer for phase-modulated signals — it rotates points in the constellation purely because they got louder, and no amount of gain calibration alone will fix it.

The escape valve is back-off: deliberately operate the PA *below* its peak power, in the well-behaved linear region, leaving headroom for the signal's peaks. But back-off is expensive — drop the average output and the efficiency collapses, because Class-AB efficiency falls roughly as the square root of the power ratio. How much back-off you need is dictated by the signal's peak-to-average power ratio (PAPR). And here is the modern tragedy: the very modulations that make data fast — high-order QAM, and especially OFDM — have *terrible* PAPR. OFDM is a sum of hundreds of subcarriers that occasionally line up into a towering peak 8–12 dB above the average. To pass those peaks undistorted, the PA must back off 6–8 dB, dragging efficiency from a healthy 50% down toward a miserable 15–20%.

Zooming out: superheterodyne vs zero-IF

We now have all the bricks — LNA, mixer, PLL, PA. How do they assemble into a radio? For nearly a century the canonical answer was the superheterodyne, Edwin Armstrong's 1918 invention. The idea is to never do the hard work at the radio frequency. Instead, mix the incoming signal down to a fixed, moderate intermediate frequency (IF) — say 70 MHz or 10.7 MHz — and do the heavy lifting of filtering, gain and demodulation there, where high-Q filters are cheap and amplifiers behave. Tune the radio by tuning only the local oscillator; the IF stays put. It is a beautiful architecture and it dominated discrete radios from car stereos to spectrum analyzers.

Superheterodyne has a famous wound: the image frequency. A mixer multiplying by the LO can't tell a signal *above* the LO from one *below* it — both fold to the same IF. So any energy at the image frequency (2×IF away from your wanted channel) lands right on top of your signal, and you must kill it with a sharp image-reject filter *before* the mixer. That filter is the problem. It is a high-Q, fixed-frequency, off-chip component — a SAW or ceramic brick — that you simply cannot integrate onto silicon. For a single-chip radio, the superhet is a non-starter: it demands expensive external filters the integration was meant to eliminate.

The architecture that *did* win on silicon is the [[ic-direct-conversion|direct-conversion]], or zero-IF, radio. Here the trick is audacious: set the IF to zero. Mix the RF directly down to baseband by choosing an LO at *exactly* the carrier frequency. Now the wanted channel sits at DC, and channel-select filtering is just a humble low-pass filter — a resistor and capacitor on-chip, tunable, integrable. The image problem vanishes too, because the image *is* the signal's own mirror across DC, handled by using quadrature (I and Q) downconversion. No external IF filter, no image filter, everything on one die. This is why essentially every smartphone, Wi-Fi, Bluetooth and 5G chip you own is a zero-IF radio.

ZERO-IF RECEIVER  (quadrature downconversion)

  ant                        cos(w_LO t)
   |    +-----+   +-------+      |     +-----+   +----+
   +--->| LNA |-->| split |--+-->(X)-->| LPF |-->|ADC |-->  I
        +-----+   +-------+  |   ^      +-----+   +----+
                             |   |
                          [ PLL / synth @ f_carrier ]
                             |   |
                             |   v  sin(w_LO t)  (90 deg shift)
                             +-->(X)-->| LPF |-->|ADC |-->  Q
                                       +-----+   +----+

  TX side mirrors this: DAC->LPF->(X) with I & Q, summed, then PA->ant

The integrated zero-IF chain: LNA, a quadrature [[ic-rf-mixer|mixer]] driven by the [[ic-frequency-synthesizer|synthesizer]] at the carrier, low-pass channel filters, and ADCs. The transmitter is the same picture in reverse, ending in the [[ic-power-amplifier|PA]].

Zero-IF's three gremlins: DC offset, LO leakage, IQ imbalance

Zero-IF wins on integration but pays for it with subtle pathologies — precisely because it parks the wanted signal at DC, where the analog world is full of slow-moving demons. The first is DC offset. Tiny transistor mismatches in the mixer and baseband amplifiers produce a static DC voltage that sits *right where your signal lives*. Worse, it drifts with temperature and gain settings. In a superhet at a non-zero IF this offset is harmlessly off to the side; at zero-IF it's a parked truck in your lane. Radios fight it with AC-coupling (a high-pass notch at DC — but that eats the centre of your spectrum) or with digital offset estimation and cancellation.

The second gremlin is LO leakage / self-mixing. The local oscillator runs at *exactly* the carrier frequency and is a strong, nearby tone. Through parasitic coupling it leaks back into the LNA and mixer input, then mixes with *itself* — and any signal mixed with itself produces DC. So the leaking LO manufactures its own DC offset, one that changes whenever the antenna environment changes. On the transmit side the same leakage shows up as LO feedthrough: a spurious tone right at the carrier (an unwanted carrier) that sits in the middle of your transmitted spectrum, degrading the signal and wasting power. It appears in the constellation as a constant vector offset of every symbol.

The third — and the most insidious — is [[ic-iq-imbalance|IQ imbalance]]. Zero-IF depends utterly on two LO copies that are *exactly* 90° apart and two baseband paths (I and Q) with *exactly* equal gain. Reality never obliges. The 90° phase split is off by a degree or two; the I and Q amplifiers differ by a fraction of a dB; the layout to one mixer is a few microns longer than to the other. These mismatches mean the I and Q channels are no longer perfectly orthogonal — and orthogonality is the whole load-bearing assumption that let you stuff two independent data streams onto one carrier.

What does broken orthogonality *do*? It leaks each channel into the other and, in spectral terms, folds the image of your signal back on top of itself with finite suppression. In the constellation it warps a clean square QAM grid into a sheared, lopsided rhombus — points that should sit on a crisp lattice smear into ovals. The single number that scores this damage is EVM (error vector magnitude): the RMS distance between where each symbol *should* land and where it *actually* lands, as a percentage of the ideal amplitude. Every nonideality in the whole chain — PA distortion, phase noise, DC offset, LO leakage, IQ imbalance — pours into the same EVM bucket.

How IQ imbalance shears the constellation

  IDEAL 16-QAM            WITH IQ IMBALANCE (gain g, phase err phi)
  . . . .                  .  .  .  .
  . . . .   --gain+phase->   .  .  .  .     <- grid is sheared,
  . . . .                   .  .  .  .         axes no longer 90 deg,
  . . . .                  .  .  .  .          symbols smear -> high EVM

  EVM (%) = 100 * sqrt( mean|S_rx - S_ideal|^2 ) / |S_ref|

  Spec examples:   QPSK    ~17.5% EVM ok
                   64-QAM  needs < ~8%
                   256-QAM needs < ~3.5%   <- IQ cal is mandatory here
                   1024-QAM (Wi-Fi 6/7) < ~2%

Higher-order constellations leave almost no error budget — which is why modern transceivers run digital **IQ calibration** loops at startup and in the background.

Stitching it together: the whole transceiver

Step back and look at the whole machine. A modern transceiver is two zero-IF chains, receive and transmit, sharing one synthesizer and one antenna, knitted together by a duplexer or switch. Trace a packet through it in each direction — every block we built across rungs 2 through 5 takes its place:

Receive, RF→bits. Antenna → switch/duplexer → LNA (lift the whisper above the noise floor without adding its own) → quadrature mixer driven by the synthesizer at the carrier → I/Q low-pass channel filters → ADCs → DSP demodulation, where IQ-imbalance and DC-offset are digitally scrubbed and the constellation is recovered.
Transmit, bits→RF. DSP shapes the I/Q symbols (often pre-distorted to cancel PA AM/AM and AM/PM) → DACs → reconstruction low-pass filters → quadrature upconverter on the same LO → driver stage → power amplifier (the loud last block, with envelope tracking for efficiency) → switch/duplexer → antenna.
The shared spine. One frequency synthesizer/PLL provides the clean LO for both directions; its phase noise sets a noise floor on the constellation that no calibration can remove. Switches or a duplexer let RX and TX share the antenna without the +23 dBm transmit shout deafening the LNA listening for a −90 dBm whisper just microns away.