The factory floor: why four roles, not one big blob
Imagine you ran a workshop where one person took the order, machined the part, watched the machine, and inspected the result — all at once. The first time anything went wrong you'd have no idea which job that person was doing when it broke. Early Verilog testbenches were exactly that: one enormous `initial` block that generated stimulus, drove pins, sampled outputs, and checked results, all tangled together. Reuse was hopeless; the moment the protocol changed, the whole thing collapsed.
UVM breaks that blob into four crisp roles with a single job each. The sequencer decides *what* to send. The driver decides *how* to put it on the wires. The monitor watches the wires and reconstructs *what actually happened*. The scoreboard decides *whether that was correct*. Each component knows nothing about the others' internals — they speak only through well-defined handshakes and ports. That separation is the whole reason a UVM testbench can be reused across projects, mixed and matched, and scaled to a 200-engineer SoC.
Sequencer and driver: turning intent into pin wiggles
Stimulus in UVM is generated by sequences — scripted recipes like "write ten random addresses, then read them back." But a sequence doesn't touch the DUT itself. It hands its transactions to the sequencer, which acts as a traffic cop: it queues requests, arbitrates between competing sequences, and releases one transaction at a time when the driver says it's ready. Think of the sequencer as a dispatcher in a taxi depot — it holds the ride requests and pairs each with a free car.
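The "write ten random addresses, then read them back" recipe can be sketched as a sequence. This is a minimal illustration, not a definitive implementation: the text only tells us that a transaction carries `kind`, `addr`, and `data`, so the field widths, the `bus_kind_e` enum, and the class name `write_then_read_seq` are all assumptions.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

typedef enum bit { READ, WRITE } bus_kind_e;    // assumed encoding

// Assumed transaction shape: only kind, addr and data are named in the text.
class bus_txn extends uvm_sequence_item;
  rand bus_kind_e kind;
  rand bit [31:0] addr;                         // 32-bit width is an assumption
  rand bit [31:0] data;

  `uvm_object_utils(bus_txn)

  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

// Hypothetical sequence: write ten random addresses, then read them back.
class write_then_read_seq extends uvm_sequence #(bus_txn);
  `uvm_object_utils(write_then_read_seq)

  function new(string name = "write_then_read_seq");
    super.new(name);
  endfunction

  virtual task body();
    bit [31:0] addrs[$];                        // addresses written so far
    bus_txn    txn;

    repeat (10) begin
      txn = bus_txn::type_id::create("txn");
      start_item(txn);                          // wait for the sequencer to grant this sequence
      if (!txn.randomize() with { kind == WRITE; })
        `uvm_error("SEQ", "randomize failed")
      finish_item(txn);                         // returns once the driver calls item_done
      addrs.push_back(txn.addr);
    end

    foreach (addrs[i]) begin
      txn = bus_txn::type_id::create("txn");
      start_item(txn);
      if (!txn.randomize() with { kind == READ; addr == addrs[i]; })
        `uvm_error("SEQ", "randomize failed")
      finish_item(txn);
    end
  endtask
endclass
```

Note that the sequence never touches a pin or a virtual interface. It only creates items and hands them over, which is what lets the same sequence run against any driver that understands `bus_txn`.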
The driver is the only component that knows the pin-level protocol — setup times, the order of `valid`/`ready`, how many clocks a burst takes. It runs a forever loop: ask the sequencer for the next transaction (`get_next_item`), wiggle the bus exactly as the protocol demands, then tell the sequencer it's done (`item_done`). This `get_next_item`/`item_done` handshake is the heartbeat of the whole flow — it provides natural back-pressure, so the sequencer never floods the driver faster than real hardware can absorb.

    task run_phase(uvm_phase phase);
      forever begin
        seq_item_port.get_next_item(req);  // pull one transaction from sequencer
        drive_transfer(req);               // wiggle pins per protocol
        seq_item_port.item_done();         // release sequencer for the next
      end
    endtask

    task drive_transfer(bus_txn req);
      @(posedge vif.clk);
      vif.addr  <= req.addr;               // present address
      vif.wdata <= req.data;
      vif.valid <= 1'b1;
      vif.we    <= (req.kind == WRITE);
      do @(posedge vif.clk); while (!vif.ready);  // wait for slave handshake
      vif.valid <= 1'b0;                   // de-assert; transfer complete
    endtask

The monitor: a one-way mirror onto the bus
If the driver is the hand that writes on the bus, the monitor is the eye that reads it. It is strictly *passive*: it never drives a single signal, it only samples. Its job is the mirror image of the driver's — where the driver took a transaction and produced pin wiggles, the monitor takes pin wiggles and reconstructs the transaction. It watches `valid`, `ready`, `addr`, and `data`, recognises a complete transfer, packages it into a fresh transaction object, and publishes it.
The monitor reaches the outside world through an analysis port — written `uvm_analysis_port`. Unlike the sequencer↔driver handshake (which is point-to-point and blocking), an analysis port is a fire-and-forget broadcast: the monitor calls `write(txn)` and the transaction is delivered to *every* subscriber connected to that port, with no back-pressure. One monitor can feed a scoreboard, a coverage collector, and a logger simultaneously, none of them slowing the monitor down. It's the publish/subscribe pattern realised in silicon-verification form.
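
    class bus_monitor extends uvm_monitor;
      `uvm_component_utils(bus_monitor)

      virtual bus_if vif;                  // bus_if is an assumed interface name; handle set elsewhere
      uvm_analysis_port #(bus_txn) ap;     // broadcast outlet

      function new(string name, uvm_component parent);
        super.new(name, parent);
        ap = new("ap", this);              // the port must be constructed before write() is called
      endfunction

      task run_phase(uvm_phase phase);
        bus_txn t;                         // declared before any statement in the block
        forever begin
          @(posedge vif.clk iff (vif.valid && vif.ready));  // sample on a real beat
          t = bus_txn::type_id::create("t");                // fresh object per transfer
          t.addr = vif.addr;
          t.data = vif.we ? vif.wdata : vif.rdata;
          t.kind = vif.we ? WRITE : READ;
          ap.write(t);                     // fan out to ALL subscribers
        end
      endtask
    endclass

The scoreboard: the final inspector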
Everything so far has only *moved* data. The scoreboard is where pass and fail are actually decided. It subscribes to the monitor's analysis port (via a `uvm_analysis_imp`), so every reconstructed transaction lands in its `write()` method. Inside, it compares what the DUT did against what *should* have happened — and that expectation comes from a reference model (sometimes called a golden model or predictor): a small, independent piece of code that models the correct behaviour.
A common scoreboard for a memory-mapped block keeps a shadow model — a simple associative array mirroring what the DUT's registers should hold. On a WRITE, it updates its shadow. On a READ, it looks up the expected value and compares it to the data the monitor observed. A mismatch fires a `uvm_error`; agreement quietly increments a pass counter. The reference model is deliberately *not* a copy of the RTL — if it were, it would repeat the RTL's bugs. It is an independent description of intent.
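
    `uvm_analysis_imp_decl(_obs)           // declares uvm_analysis_imp_obs; placed at file/package scope

    class bus_scoreboard extends uvm_scoreboard;
      `uvm_component_utils(bus_scoreboard)

      uvm_analysis_imp_obs #(bus_txn, bus_scoreboard) obs_imp;
      bit [31:0] shadow [bit [31:0]];      // shadow memory: the reference model
      int unsigned pass_count;             // number of reads that matched

      function new(string name, uvm_component parent);
        super.new(name, parent);
        obs_imp = new("obs_imp", this);
      endfunction

      function void write_obs(bus_txn t);  // called per observed transaction
        bit [31:0] expected;
        if (t.kind == WRITE) begin
          shadow[t.addr] = t.data;         // predict
        end else begin                     // READ: check (unwritten addresses assumed to read 0)
          expected = shadow.exists(t.addr) ? shadow[t.addr] : 32'h0;
          if (t.data !== expected)
            `uvm_error("SCB", $sformatf("addr %0h: got %0h exp %0h",
                                        t.addr, t.data, expected))
          else begin
            pass_count++;
            `uvm_info("SCB", "read match", UVM_HIGH)
          end
        end
      endfunction
    endclass

The agent: packaging it all, and how the wires connect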
A sequencer, a driver, and a monitor almost always travel together for one interface — so UVM bundles them into a reusable container called an agent. Think of an agent as a fully equipped pit crew for *one* protocol port: hand it a virtual interface and it can both stimulate and observe that port. A chip with an AXI master, an APB slave, and a UART gets three agents, each self-contained.
Agents come in two flavours, selected by one setting: `is_active`, which takes the value `UVM_ACTIVE` or `UVM_PASSIVE`. An active agent builds all three children, so it both drives and watches. A passive agent builds *only the monitor*: it watches a port that something else is driving (the real DUT, or another block). This single switch is why a UVM environment scales. In a block-level test the agent is active and drives the DUT; reuse that same agent at the SoC level, where the port is now driven internally, flip it to passive, and it becomes a pure observer feeding the same scoreboard.
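A minimal sketch of how an agent might honour `is_active` follows. It assumes that `bus_txn`, a `bus_driver` (extending `uvm_driver #(bus_txn)`), and a `bus_monitor` with an analysis port named `ap` are defined elsewhere. The names `bus_agent`, `sqr`, `drv`, and `mon` are illustrative, not taken from the text.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Assumes bus_txn, bus_driver and bus_monitor are compiled alongside this class.
class bus_agent extends uvm_agent;
  `uvm_component_utils(bus_agent)

  uvm_sequencer #(bus_txn)     sqr;   // built only when active
  bus_driver                   drv;   // built only when active
  bus_monitor                  mon;   // always built
  uvm_analysis_port #(bus_txn) ap;    // re-exports the monitor's port to the environment

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);         // lets uvm_agent pick up a configured is_active
    mon = bus_monitor::type_id::create("mon", this);
    ap  = new("ap", this);
    if (get_is_active() == UVM_ACTIVE) begin
      sqr = uvm_sequencer #(bus_txn)::type_id::create("sqr", this);
      drv = bus_driver::type_id::create("drv", this);
    end
  endfunction

  virtual function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    mon.ap.connect(ap);               // child port to parent port
    if (get_is_active() == UVM_ACTIVE)
      drv.seq_item_port.connect(sqr.seq_item_export);
  endfunction
endclass
```

In passive mode `sqr` and `drv` stay null, so any code that starts sequences on this agent should check `get_is_active()` first.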
The wiring happens in the `connect_phase`. The driver's `seq_item_port` is connected to the sequencer's `seq_item_export` (the handshake channel for stimulus). The monitor's analysis port is connected to the scoreboard's analysis imp and, in parallel, to a coverage subscriber. Crucially, the *agent* exposes the monitor's analysis port outward, so the environment can route observed transactions anywhere, which keeps the scoreboard outside the agent, where it belongs.
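The environment-level half of that wiring might look like the sketch below. It assumes an agent class (here called `bus_agent`) that exposes the monitor's port as `ap`, the `bus_scoreboard` with its `obs_imp`, and a hypothetical `bus_coverage` class derived from `uvm_subscriber #(bus_txn)`. None of the environment names come from the text.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Assumes bus_agent, bus_scoreboard and bus_coverage are compiled alongside this class.
class bus_env extends uvm_env;
  `uvm_component_utils(bus_env)

  bus_agent      agt;
  bus_scoreboard scb;
  bus_coverage   cov;                 // hypothetical uvm_subscriber #(bus_txn)

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    agt = bus_agent::type_id::create("agt", this);
    scb = bus_scoreboard::type_id::create("scb", this);
    cov = bus_coverage::type_id::create("cov", this);
  endfunction

  virtual function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    agt.ap.connect(scb.obs_imp);          // one port ...
    agt.ap.connect(cov.analysis_export);  // ... many subscribers
  endfunction
endclass
```

Because both connections hang off the agent's re-exported port, adding another subscriber such as a logger is one extra `connect` call and needs no change inside the agent.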

           agent (active)                           env
    +-------------------------+         +------------------+
    | sequencer               |         |                  |
    |     | seq_item          |         |                  |
    |     v (stimulus)        |         |                  |
    |  driver ----------------|==pins==>| DUT              |
    |                         |         |                  |
    |  monitor ---------------|---ap--+-->  scoreboard     |
    +-------------------------+       +-->  coverage       |
                                        +------------------+

    seq_item_port <-> seq_item_export   (blocking handshake)
    analysis_port --> analysis_imp(s)   (non-blocking broadcast)

One transaction, end to end
Let's trace a single `READ addr=0x40` all the way through, and watch each component touch it in turn. This is the loop every UVM testbench runs millions of times a night, and seeing it once makes the whole architecture click.
- Sequence → sequencer. A sequence randomises a transaction (`kind=READ, addr=0x40`) and calls `start_item`/`finish_item`. The object is parked in the sequencer, waiting for a free driver.
- Sequencer → driver. The driver's loop calls `get_next_item`; the sequencer hands over our READ. The driver now owns the transaction.
- Driver → pins. The driver presents `addr=0x40`, asserts `valid`, clears `we` (it's a read), and waits for the DUT's `ready`. After the handshake it calls `item_done` — its job is finished. It never looks at the returned data; checking isn't its job.
- DUT → pins. The DUT decodes the address, fetches `0xDEAD` from its register file, and drives it onto `rdata` with `ready` high on the response beat.
- Pins → monitor. Independently, the monitor sees `valid && ready` on that beat, samples `addr=0x40` and `rdata=0xDEAD`, builds a fresh transaction `{READ, 0x40, 0xDEAD}`, and calls `ap.write(t)`.
- Analysis port → scoreboard + coverage. That one `write` fans out. The scoreboard's `write_obs` looks up its shadow model (which an earlier WRITE set to `0xDEAD`), compares, and finds a match — pass. In parallel, the coverage collector samples the address bin so the verification plan can later prove `0x40` was exercised.
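The coverage collector in the last step can be sketched as a `uvm_subscriber`, which already provides the `analysis_export` that an analysis port connects to. The class name `bus_coverage`, the covergroup layout, and the bin choices are assumptions for illustration. Only the idea of an address bin covering `0x40` comes from the trace above.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Assumes bus_txn (with kind and a 32-bit addr) is compiled alongside this class.
class bus_coverage extends uvm_subscriber #(bus_txn);
  `uvm_component_utils(bus_coverage)

  bus_txn txn;                        // most recent transaction, read by the covergroup

  covergroup bus_cg;
    cp_kind : coverpoint txn.kind;    // READ vs WRITE
    cp_addr : coverpoint txn.addr {
      bins addr_0x40 = {32'h40};      // the address from the walkthrough
      bins others    = default;       // everything else, excluded from the coverage score
    }
  endgroup

  function new(string name, uvm_component parent);
    super.new(name, parent);
    bus_cg = new();                   // embedded covergroups are constructed in new()
  endfunction

  // Called once for every ap.write(t) on a connected analysis port.
  virtual function void write(bus_txn t);
    txn = t;
    bus_cg.sample();
  endfunction
endclass
```

The collector never compares anything; it only records what was seen. That keeps checking in the scoreboard and measurement in the coverage component, in line with the one-job-per-role split the architecture is built on.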