Leakage and Multi-Vt: The Power That Never Sleeps

The bill that arrives even when nothing happens

Back in rung 1 we split a chip's power into two halves. Dynamic power is what you pay to *do* work: every time a transistor flips, it charges and discharges tiny capacitors, and that costs energy. Stop the clock, and dynamic power falls to zero. It is the power of motion — no motion, no bill. For thirty years that intuition was almost the whole story, and engineers obsessed over switching activity and clock frequency.

But there is a second half, and it does not care whether the clock is running. [[ic-leakage-power|Leakage power]] — also called static power — is current that flows through a transistor that is supposedly *off*. A transistor is meant to be a switch: closed when you want current, open when you don't. The cruel truth of modern silicon is that the open switch is leaky. A tiny but unceasing current seeps through, and because billions of transistors are leaking at once, the trickles add up to a river. This bill arrives even when the chip sits idle in your pocket, even when the screen is black.

Two leaks: under the gate and through the gate

Leakage is not one phenomenon but a family of them. Two dominate. The first is subthreshold leakage — current flowing from drain to source while the transistor is nominally off. The gate voltage is below the threshold voltage Vt, the level at which the transistor is supposed to turn on. You would expect zero current below threshold, but the channel does not slam shut at exactly Vt; it dims gradually. Just under threshold, a small population of charge carriers still has enough thermal energy to cross. The current does not stop — it falls off exponentially, and exponentially is not the same as zero.

How fast it falls off is captured by the subthreshold swing (often loosely called the subthreshold slope), typically around 70–100 mV per decade of current. Taking ~80 mV as a representative value, every 80 mV you raise Vt cuts subthreshold leakage roughly tenfold. Flip it around: every 80 mV you *lower* Vt to make the transistor faster multiplies its leakage by ten. That single sentence is the entire drama of this guide. Speed and leakage are locked together through Vt, and you cannot cheat the exponential.
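The tenfold-per-80-mV rule is easy to turn into a quick calculation. The sketch below assumes a pure exponential and an 80 mV/decade figure (the middle of the range quoted above); the function name `leakage_multiplier` is made up for illustration.

```python
def leakage_multiplier(delta_vt_mv: float, mv_per_decade: float = 80.0) -> float:
    """Factor by which subthreshold leakage changes when Vt shifts by delta_vt_mv.

    A positive shift raises Vt (less leakage); a negative shift lowers it
    (more leakage). Assumes leakage moves one decade per mv_per_decade.
    """
    return 10.0 ** (-delta_vt_mv / mv_per_decade)


if __name__ == "__main__":
    for delta in (-160, -80, 0, 80, 160):
        print(f"Vt shift {delta:+4d} mV -> leakage x{leakage_multiplier(delta):g}")
```

Two steps of 80 mV compound: lowering Vt by 160 mV costs a hundredfold in leakage under this model, not twentyfold.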

The second big leak is gate leakage — current tunnelling *straight through* the gate insulator. The gate sits over the channel separated by an oxide so thin it is measured in atoms. At older nodes that oxide was a comfortable wall. As transistors shrank, the oxide had to thin too, and below roughly 2 nm electrons stop respecting the wall altogether: quantum mechanics lets them tunnel through it. The fix was changing the recipe — high-k dielectrics (hafnium-based) replaced silicon dioxide to give a thicker physical barrier with the same electrical effect, and FinFET and gate-all-around structures wrap the gate around the channel for tighter control. Gate leakage is largely a materials and structure war; subthreshold leakage is the one designers fight cell by cell.

Why leakage exploded — and why it loves heat

For decades shrinking transistors was pure good news, and the bargain that made it work was Dennard scaling: as you halved the transistor's dimensions, you also lowered the supply voltage and the threshold voltage to match, keeping the electric fields constant and power density flat. Smaller, faster, cooler — all at once. But Vt could not keep falling forever. Push it too low and the off transistor leaks too much; the exponential we just met turns the savings into a penalty. Around the mid-2000s Vt scaling stalled while everything else kept shrinking, and Dennard scaling broke. Leakage, once a rounding error, climbed until at the most advanced nodes it can be 30–50% of total chip power.

Leakage also has a temperament: it is exquisitely sensitive to temperature. Subthreshold current depends on how many carriers have enough thermal energy to cross the barrier, and that population grows fast as the chip warms. As a rule of thumb leakage roughly doubles for every 10–15 °C rise. This creates a vicious loop — thermal runaway: leakage heats the chip, heat raises leakage, which heats the chip more. A part that looks fine on a cool bench can melt down inside a sealed phone in summer. This is why leakage is always specified at a hot junction temperature, not at room temperature.
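To see how the loop can either settle or run away, here is a minimal lumped model. It is a sketch under stated assumptions, not a thermal sign-off method: leakage doubles every 12.5 °C (the middle of the rule of thumb above), and junction temperature is ambient plus a single thermal resistance times total power. The thermal resistance, power figures and function names are all hypothetical.

```python
def leakage_at(temp_c, leak_ref_w, ref_temp_c=25.0, doubling_c=12.5):
    """Leakage power at temp_c, doubling for every doubling_c degrees of rise."""
    return leak_ref_w * 2.0 ** ((temp_c - ref_temp_c) / doubling_c)


def settle_temperature(ambient_c, dynamic_w, leak_ref_w, theta_c_per_w,
                       limit_c=150.0, max_iter=1000, tol_c=1e-3):
    """Iterate T = ambient + theta * (P_dyn + P_leak(T)) until it stops moving.

    Returns the settled junction temperature in Celsius, or None if the
    temperature passes limit_c or never settles (treated as runaway).
    """
    temp = ambient_c
    for _ in range(max_iter):
        power = dynamic_w + leakage_at(temp, leak_ref_w)
        new_temp = ambient_c + theta_c_per_w * power
        if new_temp > limit_c:
            return None
        if abs(new_temp - temp) < tol_c:
            return new_temp
        temp = new_temp
    return None


if __name__ == "__main__":
    # Same hypothetical chip: 1 W dynamic, 50 mW leakage at 25 C.
    print("cool bench  :", settle_temperature(25.0, 1.0, 0.05, 10.0))
    print("sealed, hot :", settle_temperature(45.0, 1.0, 0.05, 25.0))
```

In the first case the extra leakage per degree is small compared with what the heat path can remove, so the temperature settles a little above the leakage-free value. In the second, hotter ambient and a worse heat path leave no stable operating point in this model.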

Finally, leakage swings wildly with the [[process-corner|process corner]]. Manufacturing is never exact: Vt, oxide thickness and channel length scatter from wafer to wafer. The corner you worry about for leakage is FF (fast-fast): transistors that came out faster than nominal did so partly because their Vt landed low, so the fastest silicon is also the leakiest. Combine the worst cases (FF process, high voltage, hot junction) and you get the leakage corner that sign-off engineers must hold the budget against. The same chip at the SS (slow-slow), cold, low-voltage corner barely leaks at all but is sluggish. You design to survive both extremes.

Subthreshold leakage scales with three knobs:

  I_leak  ~  W/L * exp( -Vt / (n * kT/q) )
                    ^^^^^^^^^^^^^^^^^^^^^^
                    the exponential that rules everything

  Lower Vt  by 80 mV  ->  I_leak  x10   (faster cell, leakier)
  Raise Vt  by 80 mV  ->  I_leak  /10   (slower cell, miserly)
  +10..15 C temperature ->  I_leak  x2   (heat feeds the leak)
  FF corner vs SS corner ->  I_leak  x5..x10  (process scatter)

  kT/q at 25 C ~ 26 mV ;  at 110 C ~ 33 mV  (so slope worsens hot)

The three levers that move subthreshold leakage. Vt is the one the designer chooses cell by cell; temperature and corner are conditions you must survive.
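The kT/q values in the figure, and the link between the exponent and the mV-per-decade figure, can be checked numerically. In the sketch below the physical constants are the exact SI values; the factor n = 1.35 is an assumption, picked only because it gives roughly 80 mV/decade at room temperature, and the function names are illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23      # exact SI value
ELECTRON_CHARGE_C = 1.602176634e-19   # exact SI value


def thermal_voltage_mv(temp_c: float) -> float:
    """kT/q in millivolts for a temperature given in degrees Celsius."""
    kelvin = temp_c + 273.15
    return 1e3 * BOLTZMANN_J_PER_K * kelvin / ELECTRON_CHARGE_C


def mv_per_decade(temp_c: float, n: float = 1.35) -> float:
    """Vt change that moves exp(-Vt / (n * kT/q)) by a factor of ten."""
    return n * thermal_voltage_mv(temp_c) * math.log(10)


if __name__ == "__main__":
    for t in (25.0, 110.0):
        print(f"{t:5.1f} C: kT/q = {thermal_voltage_mv(t):.1f} mV, "
              f"swing = {mv_per_decade(t):.0f} mV/decade")
```

Because kT/q grows with absolute temperature, the same Vt buys fewer decades of suppression on a hot die, which is the sense in which the slope worsens when hot.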

Multi-Vt: a fast cell and a slow cell for the same logic

Here is the elegant trick the industry settled on. Recall from earlier rungs that designers don't draw transistors by hand — they assemble chips from a [[standard-cell|standard-cell]] library, a catalogue of pre-built gates (inverters, NANDs, flip-flops) all the same height so they snap into rows. A modern library doesn't ship just one inverter. It ships the *same* inverter built three or four times over, each version using transistors with a different threshold voltage. These are the [[ic-multi-vt|multi-Vt]] flavours.

LVT — low-Vt: low threshold, so the transistor turns on hard and switches fast. The price is heavy leakage — often 5–10× an HVT cell. Use sparingly, only where you need every picosecond.
SVT — standard-Vt: the balanced middle. Reasonable speed, reasonable leakage. The sensible default for most logic.
HVT — high-Vt: high threshold, so it switches slowly but leaks very little — often a fifth to a tenth of an LVT cell. Use it everywhere timing is comfortable.
(Some libraries add ULVT / eLVT for extreme speed, extending the range to several flavours.) The logic function is identical across all of them; only the leakage-vs-speed point differs.

Because all flavours of a cell are logically identical and pin-compatible, the synthesis and place-and-route tools can swap one for another freely, late in the flow, without redrawing anything. That swap-ability is the whole point: it turns leakage into a knob the tools can turn gate by gate.

Same inverter, three thresholds — same footprint, same pins:

   LVT inverter      SVT inverter      HVT inverter
   delay  : 8 ps     delay  : 11 ps    delay  : 16 ps
   leak   : 10 nA    leak   : 2.5 nA   leak   : 1 nA
      ^ fast, thirsty    ^ balanced         ^ slow, frugal

   in --|>o-- out    in --|>o-- out    in --|>o-- out
   (drops into the SAME row slot; tool picks the flavour)

Illustrative numbers only, but the shape is real: across Vt flavours, delay changes by ~2× while leakage changes by ~10×. That asymmetry is exactly why mixing pays.
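The asymmetry can be read straight off the illustrative numbers. This small sketch just tabulates them; the `CellFlavour` record and helper names are made up for the example, and the values are the ones in the comparison above, not data from any real library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellFlavour:
    name: str
    delay_ps: float
    leak_na: float


# Illustrative inverter numbers copied from the comparison above,
# ordered from lowest Vt to highest.
INVERTERS = [
    CellFlavour("LVT", 8.0, 10.0),
    CellFlavour("SVT", 11.0, 2.5),
    CellFlavour("HVT", 16.0, 1.0),
]


def spread(values):
    """Ratio of the largest to the smallest value."""
    values = list(values)
    return max(values) / min(values)


def swap_steps(flavours):
    """For each one-step move to a higher Vt: (from, to, extra delay, leakage saved)."""
    return [
        (lo.name, hi.name, hi.delay_ps - lo.delay_ps, lo.leak_na - hi.leak_na)
        for lo, hi in zip(flavours, flavours[1:])
    ]


if __name__ == "__main__":
    print("delay spread  :", spread(c.delay_ps for c in INVERTERS))
    print("leakage spread:", spread(c.leak_na for c in INVERTERS))
    for src, dst, extra_ps, saved_na in swap_steps(INVERTERS):
        print(f"{src} -> {dst}: +{extra_ps:g} ps, -{saved_na:g} nA")
```

With these numbers the LVT to SVT step recovers 7.5 nA for 3 ps of extra delay, while SVT to HVT recovers only 1.5 nA for 5 ps. Most of the leakage win comes from getting cells off the lowest threshold.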

The critical path gets the fast cells; everyone else pays less

Now the strategy writes itself. Recall the [[critical-path|critical path]] — the slowest logic path between two flip-flops, the one that sets how fast the whole clock can run. A path is only critical if it has *no slack*: no spare time before the next clock edge. Every other path finishes early and is sitting idle, waiting. Spending fast, leaky LVT cells on a path with slack buys you nothing — it makes an already-fast path faster, which the clock never notices, while paying the leakage penalty for the privilege.

So the tool runs a leakage-recovery pass. It begins from a design that already meets timing, usually because the cells are fast, then walks the design swapping cells to higher Vt wherever there is slack to spare, accepting a swap only if every path through that cell still meets timing. Fast LVT cells survive only on the genuinely critical paths; everywhere with comfortable slack ends up HVT. The result: the chip hits its frequency target on the few paths that need speed, and leaks like an HVT design across the vast majority of gates that didn't.
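The idea of a leakage-recovery pass can be sketched as a greedy loop on a toy design. This is a deliberately simplified model, not how a production tool works: a path's delay is just the sum of its cell delays, the per-flavour numbers are the illustrative inverter values from earlier in this guide, and every name here (`recover_leakage`, the cell names) is hypothetical. Real tools run full static timing analysis and order their swaps far more cleverly.

```python
# Illustrative per-flavour numbers, ordered fast/leaky -> slow/frugal.
FLAVOURS = ["LVT", "SVT", "HVT"]
DELAY_PS = {"LVT": 8.0, "SVT": 11.0, "HVT": 16.0}
LEAK_NA = {"LVT": 10.0, "SVT": 2.5, "HVT": 1.0}


def path_delay(path, assignment):
    """Toy timing model: sum of the delays of the cells on the path."""
    return sum(DELAY_PS[assignment[cell]] for cell in path)


def total_leakage(assignment):
    """Sum of per-cell leakage for the chosen flavours, in nA."""
    return sum(LEAK_NA[flavour] for flavour in assignment.values())


def recover_leakage(paths, period_ps):
    """Greedy leakage recovery.

    paths: list of lists of cell names; a cell may sit on several paths.
    Starts all-LVT, then keeps bumping cells one Vt step higher as long as
    every path through the bumped cell still fits in period_ps.
    """
    cells = sorted({cell for path in paths for cell in path})
    assignment = {cell: "LVT" for cell in cells}
    if any(path_delay(p, assignment) > period_ps for p in paths):
        raise ValueError("design misses timing even with all-LVT cells")

    changed = True
    while changed:
        changed = False
        for cell in cells:
            idx = FLAVOURS.index(assignment[cell])
            if idx == len(FLAVOURS) - 1:
                continue  # already at the highest Vt
            trial = dict(assignment)
            trial[cell] = FLAVOURS[idx + 1]
            affected = [p for p in paths if cell in p]
            if all(path_delay(p, trial) <= period_ps for p in affected):
                assignment = trial  # swap is timing-safe, keep it
                changed = True
    return assignment


if __name__ == "__main__":
    paths = [
        ["a1", "a2", "a3", "a4", "a5"],  # critical: 5 x 8 ps = 40 ps, no slack
        ["b1", "b2"],                    # plenty of slack
        ["a1", "c1"],                    # shares a1 with the critical path
    ]
    result = recover_leakage(paths, period_ps=40.0)
    print(result)
    print("leakage:", total_leakage(result), "nA")
```

In this example the five cells on the zero-slack path stay LVT, including `a1` even though its other path has slack, and the remaining three cells end up HVT.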

Let's make the trade-off concrete. Suppose a block has 100,000 cells. Done entirely in LVT to be safe, the critical path is 1.00 ns (1 GHz, target met) but block leakage is 100 µA. Done entirely in HVT to save power, leakage drops to 12 µA — but now the critical path swells to 1.45 ns and the block fails timing at 1 GHz. Neither extreme is acceptable. The mixed solution keeps LVT only on the ~6% of cells that sit on critical or near-critical paths and makes the rest HVT:

Block of 100,000 cells, target clock period = 1.00 ns (1 GHz)

  Strategy        Critical path   Block leakage   Meets 1 GHz?
  -----------     -------------   -------------   ------------
  All LVT         1.00 ns         100  uA         YES  (but worst power)
  All HVT         1.45 ns          12  uA         NO   (45% too slow)
  Multi-Vt mix    1.00 ns          ~22 uA         YES  <-- ship this
   (6% LVT on critical paths, 94% HVT elsewhere)

  Result of the mix vs all-LVT:
    timing       : identical (still 1.00 ns, still meets 1 GHz)
    leakage      : 100 uA -> 22 uA   = ~78% leakage saved
    cost         : zero area, zero logic change, one tool pass

The headline of multi-Vt: cut leakage roughly 3–5× while meeting the *exact same* frequency, just by choosing the right Vt for each cell based on its slack.
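The arithmetic behind the table is worth spelling out. The sketch below recomputes the saving from the table's own figures, and adds a first-order estimate of the mix that assumes every cell of a given flavour leaks the same amount. That uniformity is an assumption of this sketch, not something the example states, and the function names are illustrative.

```python
def leakage_saving(before_ua: float, after_ua: float):
    """Return (fraction saved, reduction factor) going from before to after."""
    return 1.0 - after_ua / before_ua, before_ua / after_ua


def uniform_mix_leakage(all_lvt_ua: float, all_hvt_ua: float, lvt_fraction: float) -> float:
    """Mix estimate assuming identical per-cell leakage within each flavour."""
    return lvt_fraction * all_lvt_ua + (1.0 - lvt_fraction) * all_hvt_ua


if __name__ == "__main__":
    saved, factor = leakage_saving(100.0, 22.0)
    print(f"saved {saved:.0%}, i.e. {factor:.1f}x lower leakage")
    estimate = uniform_mix_leakage(100.0, 12.0, 0.06)
    print(f"uniform-cell estimate of the mix: {estimate:.1f} uA")
```

The uniform-cell estimate comes out somewhat below the table's ~22 µA. One plausible reason, offered here as an assumption, is that the cells left on critical paths tend to be larger and leakier than the average cell; either way the saving lands in the 3–5× range quoted above.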

Where multi-Vt sits in the bigger leakage war

Multi-Vt is the *everyday* leakage defence — cheap, automatic, no area cost, applied on virtually every block in every modern SoC. But it only reshapes leakage; it never turns it off. A leaking transistor still leaks, just less. When a block is going to be *idle for a long time*, the heavier weapons come out — and those are the subject of later rungs. Power gating inserts a header/footer switch that cuts a sleeping block off from the supply entirely, dropping its leakage toward zero (at the cost of wake-up time). Body biasing electrically nudges Vt up while idle. Voltage scaling lowers the supply itself, since leakage falls steeply with voltage.

Think of it as a layered defence. Multi-Vt is the standing army you deploy everywhere by default — it makes the whole chip leak less, all the time, for free. Power gating and body biasing are the special forces you call in for blocks that can be put to sleep. A real low-power design uses all of them together: multi-Vt sets the floor, and the sleep techniques drive idle blocks below it.