Power is a first-class signoff
By now you've watched a design march through place and route, close timing against every corner, and pass DRC and LVS. It's tempting to think the hard part is over. But a chip that is logically perfect and timing-clean can still be dead on arrival — because nobody checked that enough clean current can actually reach all those transistors, or that the metal carrying it will survive the years it's supposed to run.
Here's the shift in thinking that marks a back-end engineer: power is not a side-effect you tidy up at the end — it is its own signoff, with its own pass/fail gates, run alongside timing and physical verification. Think of your chip as a city. Timing signoff asks whether the traffic can get everywhere on schedule. Power signoff asks the utility company's questions: is there a power station big enough, do the substations sag under load, will the cables overheat and fail, and is the whole grid wasting electricity it doesn't need to?
The power delivery network
Every transistor on the chip needs to be fed two things: a steady supply voltage (call it VDD) and a path back to ground (VSS). The plumbing that carries that current from the package pins, down through the metal layers, and into each standard cell is the power delivery network, or PDN. It is, quite literally, the electrical grid of your chip.
Picture it as a hierarchy of water mains, exactly like a city's. The fattest pipes — wide metal straps on the top, thickest metal layers — bring current in from the package. They feed a regular crisscross power grid (or power mesh) stitched together with vias. That mesh drops down through the stack until, on the lowest layers, thin power rails run right alongside the rows of standard cells, tapping each one's VDD and VSS pins. Wide trunks up top, fine capillaries at the bottom — the same shape as arteries branching into capillaries.
This is why the PDN is built first, woven into the floorplan before a single signal wire is routed. The grid claims its territory on the upper metal layers up front; signal routing then threads through what's left. Build the grid too thin and you starve the logic; build it too fat and you've burned routing tracks and worsened congestion for nothing. Sizing the PDN is the first real power trade-off you'll make.
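To put rough numbers on the too-thin versus too-fat trade-off, here is a minimal Python sketch. It assumes the usual sheet-resistance model for a rectangular wire (resistance = sheet resistance times length over width). The sheet resistance, strap length, and track pitch are made-up illustrative values, not data from any real process, and the function names are hypothetical.

```python
import math

def strap_resistance_ohm(sheet_res_ohm_per_sq, length_um, width_um):
    """Resistance of a rectangular metal strap: R = Rs * (L / W)."""
    return sheet_res_ohm_per_sq * length_um / width_um

def tracks_consumed(width_um, track_pitch_um):
    """Routing tracks blocked by one strap (rounded up; spacing rules ignored)."""
    return math.ceil(width_um / track_pitch_um)

# Illustrative numbers only (assumptions, not process data).
SHEET_RES = 0.02     # ohm per square, thick top metal
LENGTH_UM = 1000.0   # strap length
PITCH_UM = 0.5       # signal routing pitch on the same layer

for width in (1.0, 2.0, 4.0):
    r = strap_resistance_ohm(SHEET_RES, LENGTH_UM, width)
    t = tracks_consumed(width, PITCH_UM)
    print(f"width {width} um: {r:.1f} ohm, blocks {t} tracks")
```

Doubling the width halves the resistance but doubles the tracks taken away from signal routing, which is the trade-off in miniature.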
IR drop: voltage sag
The PDN's wires are excellent conductors, but they are not perfect — every strap, via, and rail has a tiny bit of resistance. And here's the unavoidable law of the land: the moment current I flows through a resistance R, the voltage across that resistance drops by I times R. So by the time VDD has traveled from the package pins, through the grid, and down to a cell buried in the middle of the chip, it arrives a little lower than it started. That sag is called IR drop.
V_drop = I * R               // voltage lost across PDN resistance
V_cell = VDD_pin - V_drop    // what the cell actually sees
// Example: 0.10 A through 50 milliohm of PDN = 5 mV sag
// on a 0.75 V supply, that's a 0.7% droop
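The formula treats the PDN as one lumped resistor. A slightly richer picture, sketched below in Python, is a single rail fed from one end with cells tapping current along it. This is a toy ladder model under stated assumptions (one feed point, equal segment resistance, steady currents); `rail_voltages` is a hypothetical helper, not a tool API.

```python
def rail_voltages(vdd_pin, seg_res_ohm, cell_currents_a):
    """Voltage seen by each cell along a rail fed from one end.

    Every rail segment carries the current of all cells downstream of it,
    so the drops accumulate and the farthest cell sees the lowest voltage.
    """
    voltages = []
    v = vdd_pin
    remaining = sum(cell_currents_a)   # current entering the first segment
    for i_cell in cell_currents_a:
        v -= remaining * seg_res_ohm   # I*R drop across the segment feeding this cell
        voltages.append(v)
        remaining -= i_cell            # this cell's current leaves the rail here
    return voltages

# One cell drawing 0.10 A through 50 milliohm reproduces the worked example above.
print(rail_voltages(0.75, 0.050, [0.10]))

# Four cells, 10 mA each, 100 milliohm per segment: the sag grows along the rail.
print(rail_voltages(0.75, 0.100, [0.01] * 4))
```

The cell at the far end of the rail is the one to watch, which is why real grids are fed from many points rather than one.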
Why do a few millivolts matter? Because transistors switch slower at lower voltage: give a gate less VDD and it takes longer to flip. So IR drop quietly stretches every gate delay in the sagging region, and timing you carefully closed at the nominal voltage can suddenly fail when the real, drooped voltage shows up. This is the deep link back to the previous guide: IR drop and timing are the same problem wearing two hats. Modern signoff is voltage-aware: the timing engine is told the actual drooped voltage at each cell, so the two checks agree.
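To get a feel for how much a droop stretches delay, the sketch below uses the alpha-power law, a common textbook approximation in which gate delay scales as V / (V - Vth)^alpha. This model is an assumption brought in for illustration; real signoff uses characterized library data, and the threshold voltage and exponent here are placeholder values.

```python
def relative_gate_delay(vdd, vth=0.30, alpha=1.3):
    """Gate delay up to a constant factor, per the alpha-power-law approximation."""
    if vdd <= vth:
        raise ValueError("supply must exceed the threshold voltage")
    return vdd / (vdd - vth) ** alpha

def delay_increase_percent(v_nominal, v_drooped, vth=0.30, alpha=1.3):
    """Percent slowdown when the supply sags from v_nominal to v_drooped."""
    base = relative_gate_delay(v_nominal, vth, alpha)
    drooped = relative_gate_delay(v_drooped, vth, alpha)
    return 100.0 * (drooped - base) / base

# 5 mV and 50 mV of sag on a 0.75 V supply (placeholder Vth and alpha).
for droop_mv in (5, 50):
    v = 0.75 - droop_mv / 1000.0
    print(f"{droop_mv} mV droop: +{delay_increase_percent(0.75, v):.1f}% delay")
```

Under this model the slowdown grows faster than the droop itself, because the supply gets closer to the threshold voltage as it sags.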
There are two flavors. Static IR drop is the steady-state sag from average current — fixed by a beefier grid or more package pins. Dynamic IR drop is the nasty one: a transient dip when a huge swath of logic switches on the same clock edge and yanks a current spike out of the grid all at once. The classic cure is decoupling capacitance (decap) — little reservoir capacitors sprinkled near hungry logic that dump charge locally during the spike, like water towers smoothing out a morning rush before the mains can catch up.
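A back-of-envelope way to size the water tower: if the local decap alone has to supply a current spike for a short time, the charge it gives up is I times dt, and the voltage dip is that charge divided by the capacitance. The sketch below assumes a rectangular current spike and ignores any help from the grid during the spike, so it is a worst-case simplification; the function names and numbers are illustrative.

```python
def droop_v(i_spike_a, duration_s, decap_f):
    """Voltage dip if decap alone supplies the spike: dV = I * dt / C."""
    return i_spike_a * duration_s / decap_f

def decap_needed_f(i_spike_a, duration_s, max_droop_v):
    """Capacitance needed to hold the dip to max_droop_v: C = I * dt / dV."""
    return i_spike_a * duration_s / max_droop_v

# Illustrative: a 1 A spike lasting 100 ps, with 20 mV of allowed dip.
c = decap_needed_f(1.0, 100e-12, 0.020)
print(f"decap needed: {c * 1e9:.1f} nF")
print(f"dip with that decap: {droop_v(1.0, 100e-12, c) * 1e3:.1f} mV")
```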
Electromigration: wires wear out
Here is a failure mode that has nothing to do with logic and everything to do with physics over time. Push enough current through a thin metal wire and the flowing electrons literally start knocking metal atoms downstream, like a river slowly dragging gravel along its bed. Over months and years the atoms thin out in some spots and pile up in others, until a void grows and opens the wire, or a hillock of displaced metal bridges to a neighboring wire and shorts. This slow erosion is electromigration, or EM, and it's why a chip can pass every test on day one and fail in the field two years later.
The key intuition: EM is driven by current density — amperes crammed into a tiny cross-section — not raw current. A given current through a hair-thin wire is far more dangerous than the same current through a wide strap, because the electrons are packed tighter. That's why EM bites hardest exactly where current concentrates: the PDN straps, the power vias, and the clock network, all of which carry heavy, relentless current.
EM signoff sets a current-density limit for every wire: the foundry's rule for how much current a given metal layer and width can safely carry for the chip's rated lifetime at its rated temperature (EM accelerates sharply as the chip runs hotter). The tool flags any wire or via that exceeds its limit, and you fix the violation by widening the wire, splitting current across parallel paths, or adding more vias so no single via is overloaded. It's bookkeeping at enormous scale, but skip it and you've shipped a chip with an expiration date.
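The bookkeeping itself is simple, as this minimal Python sketch shows: compute current density as current over cross-section, compare against a limit, and work out the width that would clear a violation. The limit of 10 mA per square micron and the wire dimensions are placeholders for illustration, not any foundry's rule, and real EM decks distinguish average, RMS, and peak currents per layer.

```python
def current_density(current_ma, width_um, thickness_um):
    """Current density in mA per square micron of wire cross-section."""
    return current_ma / (width_um * thickness_um)

def min_width_um(current_ma, thickness_um, j_limit):
    """Narrowest width that keeps the density at or under j_limit."""
    return current_ma / (j_limit * thickness_um)

def em_violations(wires, j_limit):
    """Names of wires whose density exceeds j_limit.

    wires: list of (name, current_ma, width_um, thickness_um) tuples.
    """
    return [name for name, i, w, t in wires if current_density(i, w, t) > j_limit]

J_LIMIT = 10.0  # mA/um^2, placeholder value (assumption)
wires = [
    ("thin_signal", 2.0, 0.05, 0.10),  # same 2 mA, hair-thin wire
    ("wide_strap", 2.0, 2.00, 1.00),   # same 2 mA, wide top-metal strap
]
for name, i, w, t in wires:
    print(name, round(current_density(i, w, t), 2), "mA/um^2")
print("violations:", em_violations(wires, J_LIMIT))
print("thin_signal needs width >=", min_width_um(2.0, 0.10, J_LIMIT), "um")
```

The same 2 mA is harmless in the strap and a violation in the thin wire, which is the current-density point in one example.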
Where the power goes (dynamic vs leakage)
Delivering power is only half the story. The other half is not wasting it — every watt the chip burns becomes heat you must remove and battery you must drain. To spend power wisely you first have to know where it goes, and in a CMOS chip it splits into two very different buckets.
The first is dynamic power — the cost of actually *doing work*, burned every time a signal flips and charges or discharges the tiny capacitance on a wire. It follows one of the most quoted formulas in chip design:
P_dynamic = alpha * C * V^2 * f
// alpha = activity factor (fraction of nodes switching per cycle)
// C = capacitance being switched
// V = supply voltage
// f = clock frequency
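Plugging numbers into the formula makes the sensitivities obvious. The Python sketch below is a direct transcription of it; the activity factor, capacitance, and frequency are illustrative values, not measurements of any real block.

```python
def dynamic_power_w(alpha, cap_f, vdd, freq_hz):
    """P_dynamic = alpha * C * V^2 * f, in watts."""
    return alpha * cap_f * vdd ** 2 * freq_hz

# Illustrative block: 10% activity, 1 nF switched capacitance, 0.75 V, 1 GHz.
base = dynamic_power_w(0.10, 1e-9, 0.75, 1e9)
print(f"baseline:        {base * 1e3:.2f} mW")
print(f"half frequency:  {dynamic_power_w(0.10, 1e-9, 0.75, 0.5e9) * 1e3:.2f} mW")
print(f"half voltage:    {dynamic_power_w(0.10, 1e-9, 0.375, 1e9) * 1e3:.2f} mW")
```

Frequency, capacitance, and activity each act linearly; voltage is the only squared term, which is why it is the most valuable knob.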
The second bucket is leakage (static) power — current that trickles through transistors even when nothing is switching at all. A modern transistor is an imperfect switch; "off" still leaks a little, and across billions of them that trickle adds up to a real, always-on tax you pay just for the circuit being powered, idle or not. Unlike dynamic power, leakage doesn't care about your clock — it bleeds 24/7 as long as the block has voltage.
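A quick way to see why the always-on tax matters is to add up energy over a day for a block that is busy only part of the time. The sketch below assumes leakage power is simply supply voltage times leakage current, and that dynamic power falls to zero whenever the block is idle (an idealization). All numbers are invented for illustration.

```python
SECONDS_PER_DAY = 24 * 3600

def leakage_power_w(vdd, leak_current_a):
    """Static power: supply voltage times total leakage current."""
    return vdd * leak_current_a

def daily_energy_j(p_dynamic_w, p_leak_w, active_fraction):
    """(dynamic, leakage) energy per day for a block active part of the time.

    Dynamic power is paid only while active; leakage is paid all day.
    """
    dynamic = p_dynamic_w * active_fraction * SECONDS_PER_DAY
    leakage = p_leak_w * SECONDS_PER_DAY
    return dynamic, leakage

# Illustrative: 50 mW dynamic when busy, 5 mW leakage, busy 10% of the day.
dyn, leak = daily_energy_j(0.050, 0.005, 0.10)
print(f"dynamic: {dyn:.0f} J/day, leakage: {leak:.0f} J/day")
```

With these made-up numbers, a leakage power ten times smaller than the dynamic power still costs the same energy per day, simply because it never stops.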
Clock & power gating
Start with the cheapest, most universal win: clock gating. The clock itself is the busiest, most power-hungry net on the chip — it toggles every single cycle, everywhere, and each toggle charges capacitance and burns dynamic power. But much of that switching is pointless: a register holding a value that won't change this cycle still gets clocked, flips its internal nodes, and burns power doing nothing useful.
Clock gating's idea is dead simple: if a block has no work to do this cycle, don't deliver it a clock edge. A small gate sits in front of the register's clock and shuts it off — like switching the lights off in an empty room instead of lighting the whole building all night. No clock edge means no flipping means no dynamic power, and because it strikes the busiest net, the savings are large and nearly free. Synthesis tools insert it automatically, but writing RTL that *enables* gating (clean enable signals on your registers) is a skill you build.
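The small gate in front of the register can be modeled in a few lines. The Python sketch below is a behavioral toy of a latch-based gating cell, a common structure: the enable is captured while the clock is low, then ANDed with the clock, so a late-changing enable cannot chop a clock pulse. It is an illustration of the idea, not a model of any particular library cell.

```python
def gated_clock(clk_samples, enable_samples):
    """Behavioral model of a latch-plus-AND clock gate.

    The latch is transparent only while clk is low, so the enable seen by
    the AND gate is frozen for the whole high phase of the clock.
    """
    latched = 0
    out = []
    for clk, en in zip(clk_samples, enable_samples):
        if clk == 0:
            latched = en
        out.append(clk & latched)
    return out

def rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)

clk = [0, 1] * 4                     # four clock cycles
enable = [1, 1, 0, 0, 0, 0, 1, 1]    # block has work only in cycles 1 and 4
gclk = gated_clock(clk, enable)
print("ungated edges:", rising_edges(clk))
print("gated edges:  ", rising_edges(gclk))
```

Every suppressed edge is a cycle in which the downstream registers and their slice of the clock tree burn no dynamic power.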
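Power gating is the heavier hammer, and it targets the *other* bucket: leakage. Instead of just pausing the clock, you cut the supply voltage to a whole block that's idle for a while, using big sleep transistors (power switches) in series with its VDD. No voltage means essentially no leakage in that block; you've unplugged that part of the city, not just dimmed it. The price is that the block forgets its state and its outputs float while it is off, which is why power-gated designs need retention registers to hold key state and isolation cells to clamp the outputs.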
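"Idle for a while" can be made quantitative. Switching a block off and on again is not free: recharging the domain and restoring state costs some energy, so gating only pays off if the block stays off long enough for the saved leakage to cover that cost. The sketch below works out the break-even idle time under that assumption; the transition energy and leakage figures are invented, and the function names are hypothetical.

```python
def break_even_idle_s(transition_energy_j, leakage_saved_w):
    """Idle time at which saved leakage energy equals the off/on overhead."""
    return transition_energy_j / leakage_saved_w

def should_power_gate(expected_idle_s, transition_energy_j, leakage_saved_w):
    """True when the expected idle period is longer than the break-even time."""
    return expected_idle_s > break_even_idle_s(transition_energy_j, leakage_saved_w)

# Illustrative: 1 uJ to power down and back up, 1 mW of leakage saved while off.
t_be = break_even_idle_s(1e-6, 1e-3)
print(f"break-even idle time: {t_be * 1e3:.1f} ms")
print("gate for a 10 ms nap? ", should_power_gate(10e-3, 1e-6, 1e-3))
print("gate for a 0.1 ms nap?", should_power_gate(0.1e-3, 1e-6, 1e-3))
```

This is why short pauses are handled with clock gating and only longer sleeps with power gating.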
Multi-Vt, DVFS & UPF (a glimpse)
Three more levers round out the low-power toolkit. The first lives inside the standard cells themselves: multi-Vt. The same logic gate is offered in several flavors with different threshold voltages — a low-Vt cell is fast but leaks a lot, a high-Vt cell is slow but sips almost no leakage. Tools exploit this beautifully: use leaky-but-fast cells only on the critical paths that actually need the speed, and swap in stingy high-Vt cells everywhere else, where there's slack to spare. Same function, a fraction of the leakage — just by picking the right flavor per gate.
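The per-gate flavor picking can be sketched as a greedy loop on a single timing path: start with every gate low-Vt, then swap gates to high-Vt one at a time for as long as the path still meets its required time. This is a toy under stated assumptions (one path, identical gates, invented delay and leakage numbers); real optimizers juggle many overlapping paths at once.

```python
# Invented cell data for illustration only.
LVT = {"delay_ps": 10.0, "leak_nw": 50.0}   # fast, leaky
HVT = {"delay_ps": 14.0, "leak_nw": 5.0}    # slow, stingy

def assign_vt(num_gates, required_ps):
    """Swap as many gates to high-Vt as the path's timing allows.

    Returns (hvt_count, path_delay_ps, path_leakage_nw).
    """
    penalty = HVT["delay_ps"] - LVT["delay_ps"]
    delay = num_gates * LVT["delay_ps"]
    hvt_count = 0
    while hvt_count < num_gates and delay + penalty <= required_ps:
        delay += penalty        # spend slack on one more high-Vt swap
        hvt_count += 1
    leakage = (hvt_count * HVT["leak_nw"]
               + (num_gates - hvt_count) * LVT["leak_nw"])
    return hvt_count, delay, leakage

for required in (100.0, 120.0, 200.0):
    print(required, "ps ->", assign_vt(10, required))
```

A path with no slack stays all low-Vt, a path with slack to spare goes all high-Vt, and everything in between gets a mix.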
The second is DVFS, dynamic voltage and frequency scaling. Remember the V-squared in the dynamic-power formula? DVFS leans on it hard: when a block doesn't need full speed, lower both its clock frequency *and* its voltage together. The two move as a pair because gates slow down at lower voltage, so the clock has to slow with them. Halving voltage alone quarters dynamic power, and the lower frequency cuts it further, an enormous win. That is exactly why your phone runs its cores slow and cool when you're reading, and ramps voltage and clock up only when you launch a game.
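The arithmetic behind DVFS follows directly from the dynamic-power formula, as this small Python sketch shows. It tracks dynamic power only (leakage is ignored) and expresses everything relative to the full-speed operating point; the function names are hypothetical.

```python
def relative_dynamic_power(v_scale, f_scale):
    """Dynamic power relative to nominal when V and f are scaled: v^2 * f."""
    return v_scale ** 2 * f_scale

def relative_energy_per_task(v_scale):
    """Dynamic energy for a fixed amount of work, relative to nominal.

    Running slower takes proportionally longer, so frequency cancels out
    and only the V^2 term remains.
    """
    return v_scale ** 2

print("half voltage only:      ", relative_dynamic_power(0.5, 1.0))
print("half voltage, half clock:", relative_dynamic_power(0.5, 0.5))
print("energy per task at 0.5 V:", relative_energy_per_task(0.5))
```

Lowering frequency alone cuts power but not the energy for a given job; it is the voltage reduction that actually saves battery.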
# UPF (Unified Power Format) — power intent, kept SEPARATE from the RTL
create_power_domain PD_CPU -elements {cpu_core}
create_supply_net VDD_CPU -domain PD_CPU
# a switchable supply: this domain is powered down while cpu_sleep is high
create_power_switch sw_cpu -domain PD_CPU \
    -input_supply_port {vin VDD} -output_supply_port {vout VDD_CPU} \
    -control_port {sleep cpu_sleep} \
    -on_state {on vin {!sleep}} -off_state {off {sleep}}
That last snippet hints at the third piece: with multiple voltage domains, switchable supplies, isolation, and retention, the power architecture becomes too complex to track by hand. So the intent lives in a separate file: UPF (Unified Power Format), standardized as IEEE 1801. UPF names the power domains, says which can be switched off, and where isolation and retention go. Every tool in the flow (synthesis, place and route, verification, and signoff) reads the *same* UPF, so the power intent stays consistent from RTL all the way to tapeout. It's the single source of truth for power, just as the netlist is for logic.