Procedural Blocks
initial, always, always_comb, always_ff, always_latch — what hardware each one infers.
Module 5 · Page 5.1
What Is a Procedural Block?
A procedural block is a container for sequential behavioral code. Everything inside a procedural block executes line by line, from top to bottom, like a programming language. Outside a procedural block — in the module body — every statement executes concurrently in parallel hardware.
SystemVerilog has five procedural block types. Each one has a different execution model and infers different hardware (or no hardware at all). Picking the right one is the most important RTL decision you make every time you write a block.
- initial — Runs once at simulation time 0. Not synthesisable. Used in testbenches to drive stimulus and initialise memory.
- always — Verilog legacy. Runs forever with a manual sensitivity list. Avoid in new SystemVerilog code — too ambiguous.
- always_comb — SystemVerilog. Combinational logic. Automatic sensitivity list. Tools check that the block describes combinational logic and flag any inferred latch. Use for all combinational RTL.
- always_ff — SystemVerilog. Registered (flip-flop) logic. Requires an edge-triggered event control, normally a clock edge. Tools check that the block describes flip-flops. Use for all sequential RTL.
- always_latch — SystemVerilog. Latch inference. Level-sensitive. Tools check that the block describes a latch. Rarely needed; avoid unless an interface demands it.
Figure 1 — Execution model for each block, shown on a timeline from T=0 across successive clock edges. initial runs once at T=0 and then stops; always fires on sensitivity-list events; always_comb fires on any input change; always_ff fires only at clock edges (posedge/negedge clk); always_latch fires when the enable level is high.
initial — Runs Once at Time 0
An initial block begins executing at simulation time 0 and runs until it reaches the end — or a $finish. It never repeats. It is the primary tool of testbenches: apply reset, drive stimulus, check results, finish.
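initial is not synthesisable in ASIC flows. If you accidentally write it in RTL, synthesis tools either error or silently ignore it. Some FPGA tools do honour initial for register and memory power-up values, but that behaviour is vendor-specific. Keep initial in files that never go through synthesis.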
// ── Testbench: clock gen + reset + stimulus ─────────────────────
module tb;
logic clk = 0;
logic rst_n;
logic [7:0] data;
// ① Clock generator — runs forever via always
always #5 clk = ~clk; // 10-unit period
// ② Reset + stimulus — runs once from T=0
initial begin
rst_n = 0; data = 8'h00;
@(posedge clk); @(posedge clk); // hold reset 2 cycles
rst_n = 1;
@(posedge clk); data = 8'hAA;
@(posedge clk); data = 8'h55;
@(posedge clk);
$display("Simulation complete");
$finish;
end
// ③ Memory pre-load — another initial block
initial begin
$readmemh("program.hex", u_dut.mem); // backdoor load
end
// ④ Multiple initial blocks run CONCURRENTLY starting at T=0
initial begin
$dumpfile("sim.vcd");
$dumpvars(0, tb); // VCD waveform dump
end
endmodule
🧠 How the Simulator Actually Handles initial at T=0
All initial blocks are added to the event queue at simulation time 0. The simulator processes them in the Active region — but the order between multiple initial blocks is non-deterministic by the IEEE spec. In practice, most simulators (VCS, Questa, Xcelium) process them top-to-bottom in file order, but you must never depend on this. If two initial blocks write to the same signal at T=0, which value wins is tool-dependent. Design your testbench so initial blocks are independent — each owns distinct signals.
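The ownership rule above can be sketched in a few lines. This is a minimal illustration, not taken from the original testbench; the module name init_race_demo and the signals flag_racy and flag_owned are hypothetical.

```systemverilog
module init_race_demo;
  logic flag_racy;
  logic flag_owned;

  // Racy: two initial blocks write the same variable at time 0.
  // The standard does not define which assignment runs last.
  initial flag_racy = 1'b0;
  initial flag_racy = 1'b1;

  // Independent: a single initial block owns flag_owned, so the
  // order is fixed by statement order inside the block.
  initial begin
    flag_owned = 1'b0;
    flag_owned = 1'b1;
  end

  initial begin
    #1 $display("flag_racy=%b (tool-dependent)  flag_owned=%b",
                flag_racy, flag_owned);
  end
endmodule
```

flag_owned is always 1 after time 0. flag_racy may be 0 or 1 depending on the simulator, which is exactly why each testbench signal should have a single owning block.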
Waveform — initial block execution at T=0 with 2-cycle reset: clk toggles every 5 time units; rst_n starts at 0 and is released after two posedges; data then steps to AA and 55 on the following edges. Phases: initial fires, @posedge ×2, rst_n=1, stimulus, stimulus, finish.
⚠ Common Industry Mistake: Using initial to Reset RTL Flip-Flops
Junior engineers sometimes write initial q = 0; inside RTL modules to avoid wiring up a reset. This works in simulation, but ASIC synthesis ignores it and the flip-flop powers on to an unknown state in silicon. Always model reset in always_ff. This mistake appears constantly in code reviews and is routinely flagged during RTL sign-off.
always — Legacy Verilog (Avoid in New Code)
The plain always block is a general-purpose procedural loop that runs forever. It requires you to manually write a sensitivity list using @(*) or @(signal_list). The tool does not check what you intended — it will equally happily infer combinational logic, a flip-flop, or a latch from the same always block.
The core problem: an always block gives the tool zero information about your intent. If you forget a signal in the sensitivity list, you get a simulation–synthesis mismatch with no warning. SystemVerilog replaced it with three specialised blocks — always_comb, always_ff, always_latch — each of which tells the tool exactly what you mean.
// ❌ Legacy always — tool doesn't know your intent
always @(*) begin // @(*) = implicit sensitivity list (Verilog-2001)
y = a & b; // intended: combinational — but tool can't verify
end
always @(posedge clk) begin // intended: flip-flop — but tool can't verify
q <= d;
end
// ✅ Modern SystemVerilog: intent is explicit and checked by the tool
always_comb begin
y = a & b; // tool checks for combinational behaviour; sensitivity is automatic
end
always_ff @(posedge clk) begin // tool checks that this describes a flip-flop
q <= d;
end
// ── The sensitivity list bug that always hides ──────────────────
always @(a) begin // ❌ b missing from list!
y = a & b; // simulation: y doesn't update when b changes
end // synthesis: works (uses all inputs)
// → sim/synth mismatch — impossible to debug
// always_comb would catch this immediately — no sensitivity list to get wrong
🚀 RTL Design Insight: Why always Still Exists in SystemVerilog
always is in SystemVerilog purely for Verilog backward compatibility. Lint tools (for example SpyGlass or JasperGold Superlint) can be configured to flag plain always @(*) in new RTL, and many project rule sets enforce "No always @(*) in RTL, use always_comb" and "No always @(posedge clk), use always_ff". If your project does not have these lint rules, add them. The migration is mostly a find-and-replace, although always_comb and always_ff are stricter (for example, a variable may be written by only one block), so expect to fix the violations they expose. The quality improvement is permanent.
🧠 How the Simulator Sees always @(*) vs always_comb
always @(*) and always_comb both derive their sensitivity from the signals the block reads on every branch, not only the branch taken on a particular evaluation. The differences defined by IEEE 1800 are elsewhere. always @(*) is sensitive only to the arguments of a function it calls, so a signal read inside the function body does not wake the block; always_comb is also sensitive to signals read inside called functions. always @(*) waits for the first event before it evaluates, while always_comb evaluates once at time 0. Finally, a variable written in always_comb may not be written by any other process, and the block may not contain timing controls. These are the deeper reasons always_comb is safer, beyond the hand-written sensitivity list problem.
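One difference that IEEE 1800 does define is how the two blocks treat signals read inside a called function. The sketch below is a hypothetical demo (the module sens_demo, the function and_with_b, and the signals y_star and y_comb are illustrative names): the function reads the module-level signal b directly rather than through an argument.

```systemverilog
module sens_demo;
  logic a, b;
  logic y_star, y_comb;

  // Reads module-level signal b directly, not through an argument.
  function automatic logic and_with_b(input logic x);
    return x & b;
  endfunction

  // @(*) only sees the function argument a.
  always @(*) y_star = and_with_b(a);

  // always_comb also sees b, which is read inside the function body.
  always_comb y_comb = and_with_b(a);

  initial begin
    a = 1'b1;
    b = 1'b0;
    #1 b = 1'b1;   // only b changes here
    #1 $display("y_star=%b (stale)  y_comb=%b", y_star, y_comb);
    $finish;
  end
endmodule
```

After b rises, y_comb re-evaluates to 1, while y_star keeps whatever it held before the change because nothing in its sensitivity list moved.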
always_comb — Combinational Logic
always_comb is the correct block for any logic that has no memory — a mux, a decoder, an adder, an ALU, a priority encoder. The tool automatically infers the sensitivity list from every signal read inside the block. You never need to write @(*) — and you never get the sensitivity list wrong.
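The block comes with rules and checks: (1) every output should be assigned on every possible path through the block. If a path leaves an output unassigned, the code describes a latch, and tools report it (IEEE 1800 asks for a warning; many synthesis and lint flows are configured to treat it as an error). (2) No timing controls (#, @, wait) are allowed inside. (3) A variable written in an always_comb block may not be written by any other process.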
Figure 2 — always_comb Infers Pure Combinational Gates (No Memory). The block always_comb begin case (sel) 2'b00: y = a; 2'b01: y = b; 2'b10: y = c; default: y = d; endcase end synthesises to a 4-to-1 mux with inputs a, b, c, d, select sel[1:0], and output y. No clock, no flip-flop, no memory element. Output changes immediately when any input changes.
// ── Rule 1: No sensitivity list — it's automatic ─────────────────
always_comb begin
y = a & b; // auto-senses {a, b} — no @(*) needed or allowed
end
// ── Rule 2: Every output must be assigned on ALL paths ────────────
always_comb begin
out = 4'h0; // ← default assignment: prevents latch
if (enable)
out = data_in; // overrides default when enable=1
end
// ── Rule 3: Functions are fine inside always_comb ─────────────────
function automatic logic [2:0] priority_enc(input logic [7:0] req);
priority_enc = 3'b0;
for(int i=7; i>=0; i--) if(req[i]) priority_enc = i[2:0]; // last write wins, so the LOWEST set bit has priority
endfunction
always_comb begin
grant = 8'h0;
grant_id = priority_enc(request); // function call in always_comb ✅
if (|request)
grant[grant_id] = 1;
end
// ── What always_comb CANNOT contain ──────────────────────────────
always_comb begin
// @(posedge clk); ← ILLEGAL — timing control in always_comb
// #10; ← ILLEGAL — delay in always_comb
// wait(valid); ← ILLEGAL — wait in always_comb
y = a ^ b; // ✅ pure combinational
end
always_comb Pitfalls Engineers Hit in Real Projects
These are not textbook edge cases. They appear in peer code reviews every week on real RTL projects.
❌ Pitfall 1 — Task call with timing
// always_comb cannot call tasks
// that contain timing controls
task capture(output logic v);
  @(posedge clk); // ← timing!
  v = bus_data;
endtask
always_comb begin
  capture(result); // COMPILE ERROR
end
✅ Fix — Use functions only
// Functions cannot contain timing controls,
// so they are safe in always_comb
function automatic logic [7:0] decode(input logic [3:0] op);
  case (op)
    4'h0: decode = 8'hA5;
    default: decode = 8'h00;
  endcase
endfunction
always_comb out = decode(op);
❌ Pitfall 2 — Writing to same var in two always_comb
// Two blocks both drive 'grant'
always_comb begin
  grant = req_a ? 2'b01 : 2'b00;
end
always_comb begin
  grant = req_b ? 2'b10 : 2'b00;
end
// → Illegal: a variable written in always_comb
//   may not be written by any other process
// → Tool ERROR
✅ Fix — Single always_comb with priority
// One block owns one output
always_comb begin
  grant = 2'b00; // default
  if (req_a) grant = 2'b01;
  else if (req_b) grant = 2'b10;
end
// One driver, clear priority.
// No ambiguity in sim or synth.
❌ Pitfall 3 — Combinational feedback loop
// Output feeds back into input
always_comb begin
  a = b & c;
  b = a | d; // b depends on a!
end
// → Simulator: a is computed from the stale b (b is written in this
//   block, so it is excluded from the sensitivity list)
// → Synthesis: combinational loop warning or error
// → Real chip: may oscillate or hold state like an unintended latch
✅ Fix — Register to break loop
// Separate with register stage
always_comb begin
  a = b_reg & c; // uses registered b
  next_b = a | d;
end
always_ff @(posedge clk) begin
  b_reg <= next_b; // pipeline break
end
🏗 Synthesis Concern: always_comb Fires at T=0 (IEEE Spec)
Unlike always @(*), the always_comb block is guaranteed by the IEEE 1800 spec to evaluate once at simulation time 0, even before any events have been triggered. This ensures combinational outputs are valid from the very first simulation instant — outputs won't show as X if inputs are driven. This is one of the subtle reasons always_comb is superior to always @(*) even when they appear equivalent.
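The time-0 difference is easiest to see when the inputs never change after initialisation. The following is a minimal sketch with hypothetical names (t0_demo, y_star, y_comb); it assumes a simulator that follows the SystemVerilog rule that declaration initialisers do not generate a change event.

```systemverilog
module t0_demo;
  // Declaration initialisers take effect before any process starts
  // and, in SystemVerilog, do not produce a change event.
  logic a = 1'b1;
  logic b = 1'b1;
  logic y_star, y_comb;

  // Waits for the first event on a or b, which never comes here.
  always @(*) y_star = a & b;

  // Evaluates once at time 0 regardless of events.
  always_comb y_comb = a & b;

  initial begin
    #1 $display("y_star=%b  y_comb=%b", y_star, y_comb);
    $finish;
  end
endmodule
```

y_comb is 1 from time 0. On a simulator that behaves as assumed, y_star is still X at the $display because its block never woke up.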
always_ff — Registered (Flip-Flop) Logic
always_ff is the correct block for all sequential (clocked) logic: shift registers, counters, state machines, pipeline stages, anything with a flip-flop. It must contain exactly one event control, which for synthesis is an edge-sensitive list such as posedge clk or negedge clk, optionally with an asynchronous reset edge. Tools check that the block describes flip-flops and report it if it would infer a latch or plain combinational logic instead.
Figure 3 — always_ff Infers Flip-Flops (Registers with Memory). The block always_ff @(posedge clk or negedge rst_n) begin if (!rst_n) q <= 8'h00; else q <= d; end synthesises to a D flip-flop with asynchronous reset (inputs d, clk, rst_n; output q) that holds its value between edges. The output q changes only on the rising clock edge. The async reset sets q=0 immediately regardless of clock.
// ── Pattern 1: Asynchronous active-low reset (most common in ASIC) ─
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= 8'h00; // async reset: fires WITHOUT clock
else q <= d;
end
// ── Pattern 2: Synchronous reset (common in FPGA) ─────────────────
always_ff @(posedge clk) begin // only posedge in sensitivity list
if (!rst_n) q <= 8'h00; // sync reset: fires WITH clock
else q <= d;
end
// ── Pattern 3: 8-bit counter with enable ────────────────────────
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) count <= 8'h00;
else if (en) count <= count + 8'h01;
end
// ── Pattern 4: 3-state FSM ───────────────────────────────────────
typedef enum logic [1:0] {IDLE, BUSY, DONE} state_t;
state_t state, next_state;
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) state <= IDLE;
else state <= next_state; // registers the NEXT state
end
always_comb begin // next-state logic is COMBINATIONAL
next_state = state;
case (state)
IDLE: if (start) next_state = BUSY;
BUSY: if (done) next_state = DONE;
DONE: next_state = IDLE;
endcase
end
// ── Rule: always use <= (non-blocking) inside always_ff ──────────
always_ff @(posedge clk) begin
q1 <= d; // ✅ non-blocking: models a real register correctly
// q1 = d; ← ❌ blocking in always_ff: timing hazard (covered in 5.6)
end
Async vs Synchronous Reset — The Real Engineering Trade-off
This is one of the most discussed topics in ASIC interviews and code reviews. The choice is not about preference — it has real timing and DFT implications.
| Aspect | Asynchronous Reset | Synchronous Reset |
|---|---|---|
| Sensitivity list | @(posedge clk or negedge rst_n) | @(posedge clk) |
| Reset timing | Fires immediately — no clock needed | Fires only at next posedge clk |
| ASIC suitability | Preferred — can reset even with clock stopped | Acceptable but needs clock to be running |
| FPGA suitability | Supported but uses dedicated async reset path | Preferred — maps to synchronous reset in fabric |
| Timing closure | Harder — async reset path needs special STA rules | Easier — reset treated as data path |
| Reset glitch risk | Higher — a glitch on rst_n can accidentally reset | Lower — glitches filtered by clock edge |
| Safe deassertion | Requires reset synchronizer (2-FF synchronizer) | Built-in — deassertion is synchronous by nature |
// ── The Reset Synchronizer — industry standard for async reset ──
// Problem: Async reset deassertion can violate setup/hold on FF.
// Solution: Synchronize the deassertion to the clock domain.
module reset_sync #(parameter int STAGES = 2) (
input logic clk,
input logic rst_n_async, // async from power-on or pin
output logic rst_n_sync // synchronized, safe to use
);
logic [STAGES-1:0] sync_chain;
always_ff @(posedge clk or negedge rst_n_async) begin
if (!rst_n_async)
sync_chain <= '0; // async assert propagates immediately
else
sync_chain <= {sync_chain[STAGES-2:0], 1'b1};
end
assign rst_n_sync = sync_chain[STAGES-1];
// Deassertion travels through N flip-flops — synchronized to clk.
// All downstream FFs see clean synchronous deassertion.
endmodule
// ── 2-stage Pipeline using always_ff — waveform-correct model ───
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) begin
stage1 <= '0;
stage2 <= '0;
end else begin
stage1 <= data_in; // RHS captured at clock edge (old value)
stage2 <= stage1; // RHS = old stage1 — correct 2-cycle latency
end // LHS updates in NBA region after both evaluate
end
Waveform — Non-blocking correctly models 2-stage pipeline: with data_in = A, B, C, D, E on successive cycles, stage1 follows one cycle later (A, B, C, D) and stage2 two cycles later (A, B, C). Both registers update simultaneously in the NBA region.
🔍 Debugging Insight: Blocking = in always_ff Is the #1 RTL Bug
When you use blocking (=) in a clocked block, the RHS is evaluated and the LHS is written immediately in the Active region. So stage2 = stage1 placed after stage1 = data_in reads the already-updated stage1 value, and both registers end up holding the same value. This is not a 2-stage pipeline; it's a 1-stage pipeline with an alias. Within a single block this collapse is deterministic. If another block reads stage1 at the same clock edge, the result depends on which block the simulator runs first, so bugs appear non-deterministically and behavior can differ between simulators. Use <= for every register in always_ff.
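The collapse described above can be reproduced with a short side-by-side simulation. This is a sketch with hypothetical names (nb_vs_blocking, s1_nb, s2_nb, s1_b, s2_b); the blocking version uses a plain always block so that tools which reject blocking assignments in always_ff still accept the demo.

```systemverilog
module nb_vs_blocking;
  logic       clk = 0;
  logic [7:0] d;
  logic [7:0] s1_nb, s2_nb;  // non-blocking pipeline
  logic [7:0] s1_b,  s2_b;   // blocking "pipeline"

  always #5 clk = ~clk;      // posedges at T=5, 15, ...

  // Correct: both RHS values are read before either LHS updates.
  always_ff @(posedge clk) begin
    s1_nb <= d;
    s2_nb <= s1_nb;
  end

  // Broken: s1_b is updated first, then copied straight into s2_b.
  always @(posedge clk) begin
    s1_b = d;
    s2_b = s1_b;
  end

  initial begin
    d = 8'h11;               // captured at the first posedge (T=5)
    @(negedge clk);          // T=10: change d away from the sampling edge
    d = 8'h22;               // captured at the second posedge (T=15)
    @(negedge clk);          // T=20: both posedges have happened
    $display("non-blocking: s1=%h s2=%h", s1_nb, s2_nb);
    $display("blocking:     s1=%h s2=%h", s1_b,  s2_b);
    $finish;
  end
endmodule
```

After two clock edges the non-blocking pair holds s1=22 and s2=11, a true two-cycle delay, while the blocking pair holds 22 in both registers.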
always_latch — Level-Sensitive Latch
always_latch explicitly declares that you intend to infer a latch — a level-sensitive storage element that holds its value when the enable is low and is transparent when the enable is high.
Latches are almost never the right choice in synchronous digital design. They cause timing closure problems in ASIC flows, are harder to test, and often appear accidentally from an incomplete if or case statement in an always_comb block. Legitimate uses are rare: an intentional latch demanded by a specific interface protocol (e.g., some bus standards) or by a structure such as a clock-gating cell.
Figure 4 — Latch (Level-Sensitive) vs Flip-Flop (Edge-Triggered). A latch (left, always_latch) is transparent when enable=1: output follows input in real time, and q is held when enable=0. A flip-flop (right, always_ff, preferred) captures the input only at the clock edge and holds it until the next edge.
// ── Intentional latch: always_latch makes intent clear ─────────
always_latch begin
if (enable)
q = d; // transparent when enable=1
// holds q when enable=0 (latch behaviour)
end
// ── Accidental latch from always_comb (common mistake) ──────────
always_comb begin
if (enable)
out = data; // ❌ 'out' not assigned when enable=0
end // → tool flags it: "always_comb infers a latch" (warning or error, depending on the tool)
// Fix: add default assignment at top
always_comb begin
out = '0; // ✅ default: no latch possible
if (enable)
out = data;
end
// ── When you might legitimately use always_latch ────────────────
// Specific bus protocols (some older standards) require transparent latches
// for address/data hold. Outside of that: always prefer always_comb or always_ff.
Diagnosing Accidental Latch Inference
When you see a latch warning from the synthesis tool, here is the exact diagnostic process used in real RTL sign-off reviews:
1. Identify the signal name from the warning message: "Latch inferred for signal 'result' in module 'alu'"
2. Find every code path through the always_comb block. Draw them out if needed. Ask: "Is result assigned on ALL paths?"
3. Find the missing path. It is always an if branch without an else, or a case without a default, or a case item that misses one signal.
4. Add a default assignment at the very top of the block: result = '0;. This covers all paths. The latch warning disappears instantly.
5. Verify in simulation: confirm the output now drives 0 for the previously uncovered path, not the old held value. This is a functional change, so re-run regressions.
// ── Root Cause 1: if without else ────────────────────────────────
always_comb begin
if (valid) data_out = fifo_data; // ❌ what when valid=0? → latch
end
always_comb begin
data_out = '0; // ✅ default → no latch
if (valid) data_out = fifo_data;
end
// ── Root Cause 2: one case arm leaves a signal unassigned ────────
always_comb begin
case (opcode)
2'b00: begin result = a + b; carry = 1'b0; end
2'b01: begin result = a - b; end // ❌ carry missing here → latch on carry
default: begin result = '0; carry = 1'b0; end
endcase
end
// ✅ Fix: assign carry at top, OR assign in every case arm
always_comb begin
{carry, result} = '0; // ✅ default: both covered
case (opcode)
2'b00: {carry, result} = a + b;
2'b01: result = a - b; // carry stays 0 from default
endcase
end
// ── Root Cause 3: Nested if, inner branch misses assignment ──────
always_comb begin
out = '0; // default at the top of the block
if (mode) begin
if (enable) out = data; // ✅ covered
// mode=1, enable=0 → out is not assigned in this branch,
// but the default at the top has already assigned it.
// Without that default, this path would infer a latch.
// Lesson: one default at the TOP covers ALL nested paths.
end
end
🚀 RTL Design Insight: When a Latch IS the Right Answer
In specific protocol interfaces (some legacy bus standards, address latching in microprocessor designs) and in certain clock-gating cell structures, a transparent latch is required. In these cases, use always_latch explicitly and add a comment explaining the requirement. The tool then infers the latch deliberately, and the latch path can be constrained properly in STA. Your STA engineer will thank you for the clear intent signal instead of discovering an accidental latch during timing sign-off.
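As an illustration of the clock-gating case, here is a behavioural sketch of a latch-based clock gate. The module name clk_gate_model and its port names are hypothetical, and production designs normally instantiate the integrated clock-gating cell from the technology library instead of inferring one from RTL like this.

```systemverilog
module clk_gate_model (
  input  logic clk,
  input  logic en,
  output logic gclk
);
  logic en_lat;

  // Intentional latch: transparent while clk is low, holds while clk
  // is high, so en cannot change the gate in the middle of a high phase.
  always_latch begin
    if (!clk) en_lat = en;
  end

  // Gated clock: pulses pass only when the latched enable is set.
  assign gclk = clk & en_lat;
endmodule
```

The latch is the point of the structure: it stops a late-changing enable from chopping a clock pulse, which a plain AND gate would allow.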
Complete Comparison — All Five Blocks
| Block | Fires When | Hardware Inferred | Synthesisable? | Assignment Type | Use For |
|---|---|---|---|---|---|
| initial | Once at T=0 | None | No (testbench only) | = (blocking) | Testbench stimulus, memory init, VCD setup, watchdog |
| always | On sensitivity list events (loops forever) | Ambiguous, depends on code | Yes (avoid in new code) | = or <= depending on use | Legacy Verilog; clock generator in testbench only |
| always_comb | Any input change + once at T=0 | Combinational gates, no memory | Yes | = (blocking) | All combinational RTL: muxes, decoders, ALU, next-state logic |
| always_ff | Clock edge (and asynchronous reset edge, if listed) | Flip-flops (D-type registers) | Yes | <= (non-blocking, required by most coding guidelines) | All sequential RTL: state registers, counters, pipelines, FSM state |
| always_latch | Any input change; transparent while enable is HIGH | Transparent latch | Yes (use sparingly) | = (blocking) | Only when a specific protocol mandates a latch (rare) |
🏗 Synthesis Concern: What Each Block Tells the Synthesis Tool
Synthesis tools use procedural block types as intent declarations. always_ff tells the tool: "this block should describe flip-flops; report it if it doesn't." always_comb tells the tool: "this block should contain no storage; report any path that would infer a latch." always_latch tells the tool: "infer a latch; this is intentional." These declarations let synthesis and lint tools check your intent, not just blindly translate code. Tools such as Synopsys Design Compiler and Cadence Genus can report a mismatch between the keyword and the inferred hardware, typically as a warning or an error depending on the tool and its settings. The keywords do not change the gates produced for correct code; their value is that mistakes surface early.
Common Mistakes
// ════ MISTAKE 1: Putting combinational logic in always_ff ════════
always_ff @(posedge clk) begin
y = a & b; // ❌ combinational inside always_ff — adds unwanted register
end
// ✅ FIX: use always_comb for combinational logic
always_comb begin
y = a & b; // ✅ pure combinational — no register inferred
end
// ════ MISTAKE 2: Missing default in always_comb → latch ══════════
always_comb begin
if (sel) out = a; // ❌ what is 'out' when sel=0? → latch inferred → tool flags it (warning or error)
end
// ✅ FIX: default assignment at the top
always_comb begin
out = '0; // ✅ default covers sel=0 path
if (sel) out = a;
end
// ════ MISTAKE 3: Writing initial inside RTL (not testbench) ══════
module bad_rtl (...);
initial q = 0; // ❌ initial in RTL — synthesis ignores or errors
// ✅ FIX: use reset in always_ff instead
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= 0; // ✅ synthesisable reset
else q <= d;
end
endmodule
// ════ MISTAKE 4: Blocking assignment (=) inside always_ff ════════
always_ff @(posedge clk) begin
q1 = d; // ❌ blocking in always_ff — race condition (covered in 5.6)
q2 = q1; // q2 gets new q1 — behaviour differs from <= version!
end
// ✅ FIX: always use non-blocking (<=) in always_ff
always_ff @(posedge clk) begin
q1 <= d; // ✅ evaluates RHS at clock edge, updates after
q2 <= q1; // uses OLD q1 — models a 2-stage pipeline correctly
end
⚠ Common Industry Mistake: Mixing = and <= in the Same always_ff
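Some engineers use blocking for intermediate calculations inside always_ff, thinking it's harmless: tmp = a + b; result <= tmp;. Inside that one block the order is well defined: tmp is updated first and result receives the new sum. The hazard appears when tmp is declared at module scope and read anywhere else. Another clocked block that reads tmp at the same edge sees the old or the new value depending on which block the simulator runs first, and different simulators can give different answers. Synthesis may also infer a flip-flop for tmp if it is read outside the block. Keep such temporaries local to the block, move the calculation into always_comb, or restructure using functions. Many coding guidelines simply forbid mixing blocking and non-blocking assignments in the same always_ff block.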
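One way to restructure with a function, so that no shared temporary exists at all, is sketched below. The module add_reg, the function add9, and the carry output are hypothetical names chosen for the example.

```systemverilog
module add_reg (
  input  logic       clk,
  input  logic [7:0] a,
  input  logic [7:0] b,
  output logic [7:0] result,
  output logic       carry
);
  // The intermediate sum lives only inside the function call, so no
  // other process can ever observe a half-updated temporary.
  function automatic logic [8:0] add9(input logic [7:0] x,
                                      input logic [7:0] y);
    return {1'b0, x} + {1'b0, y};
  endfunction

  // Only non-blocking assignments appear in the clocked block.
  always_ff @(posedge clk) begin
    {carry, result} <= add9(a, b);
  end
endmodule
```

The clocked block stays purely non-blocking, which keeps it compatible with lint rules that forbid mixed assignment styles.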
Quick Reference — Procedural Blocks Cheat Sheet
// ── initial: runs once, testbench only ────────────────────────
initial begin
rst_n = 0; @(posedge clk); rst_n = 1;
$finish;
end
// ── always: legacy — only use for clock gen ────────────────────
always #5 clk = ~clk; // testbench clock generator
// ── always_comb: combinational RTL ────────────────────────────
always_comb begin
out = '0; // default first — prevents latches
if (en) out = data; // auto-sensitivity, no @(*) needed
end
// ── always_ff: sequential RTL ─────────────────────────────────
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= '0; // async reset
else q <= d; // use <= (non-blocking)
end
// ── always_latch: intentional latch (rarely needed) ────────────
always_latch begin
if (enable) q = d; // transparent when enable=1
end
// ── Key rules ─────────────────────────────────────────────────
// always_comb: no @, no #, always add a default assignment
// always_ff: always use <= (non-blocking), include reset
// initial: testbench only — never in RTL intended for synthesis
// always: avoid — use always_comb or always_ff instead
// always_latch: avoid — almost always a design error
🧠 How the Simulator Schedules Events — Delta Cycles Explained
Every SystemVerilog simulator is an event-driven engine. It does not execute statements continuously — it processes a list of events at each simulation time step. Understanding this model is what separates engineers who can debug race conditions from those who can't.
The Simulation Time Wheel and Event Regions
At each simulation time step, the simulator processes events in a defined order of regions. The IEEE 1800 standard defines these regions. For RTL engineers, the critical ones are:
| Region | What Happens | Relevant To |
|---|---|---|
| Active | Blocking assignments (=) evaluate and update. Continuous assignments. Current-time input port changes. $display executes. | All blocking logic, always_comb, input changes |
| Inactive | #0 delay events. Rarely used — avoid #0 in RTL. | Legacy workarounds |
| NBA | Non-blocking assignment (<=) updates apply. The RHS was evaluated in Active; now the LHS is updated. | always_ff, all non-blocking assignments |
| Observed | Concurrent assertions (SVA) are evaluated here, using values sampled in the Preponed region at the start of the time step. | Assertions (assert property) |
| Reactive | Program block code and assertion action blocks execute. | Testbench code in program blocks |
| Postponed | $strobe and $monitor display. Final stable values for the time step. | Debug displays showing post-NBA values |
Delta Cycles — Zero-Time Iterations
A delta cycle is an iteration of the Active→NBA→Active loop that occurs at the same simulation time. When a non-blocking assignment updates a signal (NBA region), that update may trigger an always_comb block to re-evaluate (Active region), which may cause another update, triggering another iteration — all at the same timestamp. This is a delta cycle.
module delta_demo;
logic clk = 0;
logic [7:0] d, q, doubled;
always #5 clk = ~clk;
// Sequential: captures d on posedge clk
always_ff @(posedge clk) q <= d;
// Combinational: computes doubled from q
always_comb doubled = q << 1;
// ── What happens at posedge clk? ──────────────────────────────
// T=5ns, Delta 0 (Active): always_ff evaluates: RHS d is read (old d)
// T=5ns, Delta 0 (NBA): q gets new value (= old d)
// T=5ns, Delta 1 (Active): always_comb sees q changed → re-evaluates
// doubled = q << 1 (uses new q)
// T=5ns, Delta 1 (Postponed):$strobe shows final stable doubled value
// ─────────────────────────────────────────────────────────────
// $display right after @(posedge clk) runs in the Active region, before the
// NBA update: it prints the PRE-edge values of q and doubled.
// $strobe runs in the Postponed region: it prints the final, settled values.
// This is why verification engineers use $strobe not $display for FF outputs!
initial begin
d = 8'h0A;
@(posedge clk);
$display("$display: q=%0h doubled=%0h", q, doubled); // pre-NBA values
$strobe ("$strobe: q=%0h doubled=%0h", q, doubled); // shows final
#1 $finish; // advance time first so the Postponed region (and $strobe) runs
end
endmodule
// ── Expected Output ───────────────────────────────────────────────
// $display: q and doubled are still unknown (X): this is the first clock edge, q has not been updated yet
// $strobe: q=a doubled=14 ← after Delta 1 settles (0x0A<<1=0x14; %0h drops the leading zero)
Delta Cycle Timeline at T=5ns (posedge clk): Active Δ0 (always_ff reads d; $display fires; q and doubled unchanged) → NBA Δ0 (q ← old d) → Active Δ1 (always_comb sees q change; doubled = new q << 1) → Postponed ($strobe fires; values stable).
💡 Senior Verification Engineer Tip: Use $strobe for Post-NBA Sampling
In testbench code, $display executes in the Active region when called from module code, so it may capture values before non-blocking updates have committed. $strobe executes in the Postponed region, after all NBA updates and combinational re-evaluations have settled. For monitoring flip-flop outputs, prefer $strobe. Clocking blocks solve the same problem for testbench sampling: with the default #1step input skew, a clocking block samples its inputs in the Preponed region, so the testbench sees the stable values from just before the clock edge.
🔬 Procedural Blocks in Real Verification Environments
Every component of a UVM or directed-test environment is built on procedural blocks. Understanding which block is right for which verification task is what makes the difference between a testbench that's easy to debug and one that has subtle ordering bugs.
| TB Component | Block Used | Why That Block | Pattern |
|---|---|---|---|
| Clock generator | always | Needs to loop forever, toggle every half-period. No sensitivity list needed. | always #5 clk = ~clk; |
| Reset driver | initial | Runs once, holds reset for N cycles then releases. | initial begin rst=1; repeat(5) @(posedge clk); rst=0; end |
| Stimulus driver | initial or task called from initial | Drives sequences of values, waits for handshakes, checks responses. | Sequence of @(posedge clk) and signal drives |
| Signal monitor | always_ff-style sampling | Needs to capture output at every clock edge — edge-triggered behavior. | always @(posedge clk) if (valid_out) capture(dout); |
| Combinational checker | always_comb | Checks combinational outputs continuously — fires whenever output changes. | always_comb assert(parity == ^data); |
| Protocol monitor | always @(posedge clk) | Samples interface signals at clock edges to detect protocol violations. | State machine checking handshake rules |
| Coverage sampler | always @(posedge clk) | Samples covergroup at clock edge for accurate cycle-accurate coverage. | always @(posedge clk) cg.sample(); |
| Watchdog | initial with timeout | Fires once, waits for maximum simulation time, then calls $fatal. | initial begin #MAX_TIME; $fatal(1, "Watchdog!"); end |
module tb_fifo;
// ── Signals ───────────────────────────────────────────────────
logic clk, rst_n;
logic wr_en, rd_en;
logic [7:0] wr_data, rd_data;
logic full, empty;
int error_count = 0;
// ── DUT instantiation ────────────────────────────────────────
fifo_8x16 u_dut (.clk,.rst_n,.wr_en,.rd_en,.wr_data,.rd_data,.full,.empty);
// ── ① Clock generator: always (only legitimate use) ─────────
always #5 clk = ~clk;
// ── ② Reset generator: initial (runs once at T=0) ────────────
initial begin
clk = 0; rst_n = 0; wr_en = 0; rd_en = 0;
repeat(4) @(posedge clk);
rst_n = 1;
end
// ── ③ Stimulus driver: initial with task calls ───────────────
initial begin
@(posedge rst_n); // wait for reset to deassert
repeat(2) @(posedge clk);
write_word(8'hA5);
write_word(8'h3C);
read_word();
read_word();
repeat(2) @(posedge clk);
$display("Errors: %0d", error_count);
$finish;
end
// ── ④ Monitor + scoreboard: always (edge-triggered sampling) ─
logic [7:0] ref_queue[$];
always @(posedge clk) begin
if (wr_en && !full) ref_queue.push_back(wr_data);
if (rd_en && !empty) begin
automatic logic [7:0] exp = ref_queue.pop_front();
if (rd_data !== exp) begin
$error("MISMATCH: got %0h exp %0h", rd_data, exp);
error_count++;
end
end
end
// ── ⑤ Combinational assertion: always_comb ───────────────────
always_comb begin
if (rst_n) begin // skip the check while reset is active and the flags may still be unknown
assert (!(full && empty)) // FIFO can't be both full AND empty
else $fatal(1, "FIFO state machine error!");
end
end
// ── ⑥ Watchdog: initial (fires once, kills runaway sim) ──────
initial begin
#100_000;
$fatal(1, "Watchdog timeout — simulation stuck!");
end
// ── Tasks ─────────────────────────────────────────────────────
task automatic write_word(input logic [7:0] data);
@(posedge clk); wr_en = 1; wr_data = data;
@(posedge clk); wr_en = 0;
endtask
task automatic read_word();
@(posedge clk); rd_en = 1;
@(posedge clk); rd_en = 0;
endtask
endmodule
⚡ Race Condition Analysis: Why They Happen and How to Find Them
A race condition in SystemVerilog simulation occurs when the outcome of a computation depends on the non-deterministic scheduling order of concurrent processes. IEEE 1800 allows a simulator to process events within the Active region in any order, so two always blocks triggered by the same posedge clk have no guaranteed execution order relative to each other.
Race Type 1: Blocking Assignment Across always_ff Blocks
❌ Race Condition: Non-deterministic Result
logic [7:0] tmp, a, b, c;
// Block 1: runs at posedge clk
always @(posedge clk) begin
tmp = a; // blocking: writes tmp
b = tmp; // reads tmp immediately
end
// Block 2: SAME posedge clk
always @(posedge clk) begin
tmp = c; // blocking: also writes tmp!
end
// Result: tmp ends the cycle as a or c, depending on
// which block the simulator runs last. Anything that
// reads tmp afterwards inherits that ambiguity.
// VCS may give a different answer than Questa.
// This is a real project bug.
✅ Fix: Non-blocking Assignments, One Driver per Register
logic [7:0] tmp, tmp_c, a, b, c; // tmp_c: separate register for Block 2
// Block 1: posedge clk
always_ff @(posedge clk) begin
tmp <= a; // NBA: RHS evaluated now
b <= tmp; // uses OLD tmp value
end // both update in NBA region
// Block 2: posedge clk
always_ff @(posedge clk) begin
tmp_c <= c; // DIFFERENT register: always_ff allows
end // only one driving block per variable
// If both sources must reach tmp, merge them into one
// always_ff with explicit priority instead of writing
// tmp from two blocks.
Race Type 2: Reading a Signal That Another always_comb Is Writing
// ── This is NOT a race — this is correct combinational chaining ──
always_comb mid = a & b; // Block 1
always_comb out = mid | c; // Block 2 reads mid
// When a changes:
// Delta 0: Block 1 fires, mid updates
// Delta 1: Block 2 sees mid changed, fires, out updates
// Delta 2: Nothing changes → converges. Correct behavior.
// always_comb chains settle through delta cycles automatically.
// ── THIS is a race / oscillation ─────────────────────────────────
always_comb a = b ^ c; // a depends on b
always_comb b = a & d; // b depends on a — FEEDBACK LOOP!
// When b changes → a changes → b changes → a changes → ... (for c=1, d=1)
// Simulation: the signals may stick at X, or the simulator hits its
// delta-cycle (iteration) limit and reports an error.
// The exact message varies by tool.
// Silicon: the loop may oscillate or settle unpredictably
// ── Identifying race conditions in waveform ───────────────────────
// Symptom 1: Signal toggles or goes X within the same timestamp
// Symptom 2: $display shows a different value than the waveform
// Symptom 3: Simulation gives a different result on rerun with a different seed or compile order
// Symptom 4: Two simulators give different RTL results on the same testbench
🔍 Debugging Insight: Race Conditions Show Differently in Each Simulator
If your RTL passes in VCS but fails in Questa (or vice versa), the first thing to check is blocking assignment usage in always_ff and multi-driver nets. The simulator that "passes" is not correct — it's just happening to schedule events in an order that produces the expected result. The other simulator reveals the true undefined behavior. Resolution: eliminate all blocking assignments from always_ff, ensure each net has exactly one driver.
⚙ Synthesis vs Simulation Mismatch — The Silent Killer
A simulation-synthesis mismatch means your RTL simulates correctly but the actual synthesized netlist behaves differently. This can lead to silicon bugs that only appear after tape-out — the most expensive bugs in semiconductor design. All five causes below stem from misusing procedural blocks.
| Root Cause | Simulation Behavior | Synthesis Behavior | Prevention |
|---|---|---|---|
| Incomplete sensitivity list in always @(a,b) with c also used | Output doesn't update when c changes (simulation misses the event) | Output correctly updates when c changes (synthesis uses all inputs) | Use always_comb, which removes the manual list entirely |
| Blocking = in always_ff | Race condition: result depends on simulator scheduling | Synthesis may or may not create the intended register chain (tool-dependent) | Use <= exclusively in always_ff |
| Latch inferred accidentally (missing default) | Output holds old value when condition is false | Latch is synthesized: timing analysis fails or latch behaves differently under scan | Use always_comb so the tool flags the latch, and add a default assignment |
| initial used to initialize RTL registers | Register starts at the initial value in simulation | ASIC flip-flop powers up in an unknown state because no reset exists in the netlist (some FPGA flows do honor initial values) | Use reset in always_ff |
| Async reset sensitivity missing from list | Always block only fires on clock, so asserting reset has no immediate effect in simulation | Synthesis may or may not infer async reset correctly | Always include "or negedge rst_n" in the sensitivity list for async reset |
// ── The Bug: incomplete sensitivity list in legacy always ─────────
module adder_bad (
input logic [7:0] a, b, c,
output logic [8:0] sum
);
always @(a, b) begin // ❌ c missing from sensitivity list!
sum = a + b + c; // sim: sum only updates when a or b changes
end // synth: sum correctly updates when c changes
endmodule
// ── Symptoms in simulation ────────────────────────────────────────
// At T=10: a=3, b=5, c=2 → sum=10 (a+b+c, a changed) ← correct
// At T=20: c=7 (only c changes) → sum stays 10! ← WRONG in sim
// At T=30: a=4 → sum=4+5+7=16 ← correct again (but c=7 now used)
// Synthesis netlist: sum always = a+b+c → different behavior!
// ── The Fix: always_comb — problem impossible ─────────────────────
module adder_good (
input logic [7:0] a, b, c,
output logic [8:0] sum
);
always_comb
sum = a + b + c; // auto-sensitivity: {a,b,c} — always correct
endmodule
🚀 RTL Design Insight: always_comb, always_ff, always_latch Eliminate the Entire Bug Class
The three SystemVerilog-specific blocks were designed precisely to eliminate simulation-synthesis mismatches. always_comb auto-sensitivity eliminates sensitivity list bugs and lets tools flag accidental latches. always_ff intent declaration eliminates ambiguous register inference. always_latch marks the latches you actually intend, so any other latch stands out as a mistake. A codebase written entirely with these three blocks and zero always @(list) or always @(*) has an entire category of mismatch bugs eliminated by construction, not by code review.
⏱ Reset Modeling Strategies — ASIC vs FPGA vs Simulation
Reset is the most safety-critical signal in a synchronous design. A wrong reset model means the chip starts up in an undefined state. Here are the patterns used in production RTL, with the reasoning behind each choice.
// ── Pattern 1: Async Active-Low Reset (ASIC standard) ────────────
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= '0; // fires immediately when rst_n goes low
else q <= d; // normal operation
end
// Synthesis: infers DFF with asynchronous active-low clear
// Timing: rst_n path constrained by async check (recovery, removal)
// ── Pattern 2: Sync Active-Low Reset (FPGA standard) ─────────────
always_ff @(posedge clk) begin
if (!rst_n) q <= '0; // fires at next posedge clk after rst_n low
else q <= d;
end
// Synthesis: infers DFF with synchronous reset — clean STA
// Concern: if clk stops, sync reset cannot clear the register
// ── Pattern 3: Async Assert, Sync Deassert (best of both) ────────
// Use a reset synchronizer (see always_ff section above) to produce
// rst_n_sync, then use it as async reset input — assertion is still
// immediate, but deassertion is synchronous to clock. This is the
// gold standard in ASIC methodology.
// ── Pattern 4: Reset with Set-able Default (not all 0) ───────────
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) state <= IDLE; // reset to IDLE, not 0
else state <= next_state;
end
// Synthesis: infers DFF with preset/clear depending on tool
// Concern: some cells don't support non-zero reset — check lib
// ── Pattern 5: Conditional Reset (only some FFs reset) ───────────
// Keep reset and non-reset flops in SEPARATE blocks. If data_pipe sat in
// the else branch of the async-reset block, rst_n would be inferred as an
// extra load-enable on data_pipe.
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) ctrl_reg <= '0; // control registers reset
else ctrl_reg <= ctrl_next;
end
always_ff @(posedge clk) begin
data_pipe <= data_in; // purposely NOT reset (saves area/power)
end
// DFT tool: data_pipe path marked "no reset" in DFT constraints
// Area/power saving: not every FF needs reset in datapath
// Constraint: initialize in SW, or gate output with a valid signal
Waveform: Async vs Sync Reset Timing Difference
[Figure: clk, rst_n, q_async, and q_sync traces. rst_n is asserted between clock edges. q_async clears from FF to 00 immediately (no clock needed); q_sync clears at the next posedge clk after rst_n is asserted.]
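Pattern 3 in the reset patterns above is described only in comments, so here is a minimal sketch of a two-flop reset synchronizer. The module and signal names (reset_sync, rst_n_async, meta) are illustrative assumptions, not taken from this page, and the two-stage depth is a common choice rather than a requirement.

```systemverilog
// Sketch of Pattern 3: asynchronous assert, synchronous deassert.
// Names are hypothetical; adapt depth and naming to your methodology.
module reset_sync (
  input  logic clk,
  input  logic rst_n_async,  // raw reset, may deassert at any time
  output logic rst_n_sync    // asserts immediately, deasserts on a clock edge
);
  logic meta;  // first stage: may go metastable when reset is released

  always_ff @(posedge clk or negedge rst_n_async) begin
    if (!rst_n_async) begin
      meta       <= 1'b0;  // async assert: both stages clear at once
      rst_n_sync <= 1'b0;
    end else begin
      meta       <= 1'b1;  // sync deassert: a 1 shifts through two stages
      rst_n_sync <= meta;
    end
  end
endmodule
```

Downstream flops would then use rst_n_sync in their "or negedge" sensitivity list, so assertion stays immediate while release is aligned to clk.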
🔬 Debugging Academy — 10 Real RTL Bugs, Step by Step
These are not invented examples. Every bug below has appeared in real RTL projects and verification environments. Each one has a waveform signature, a root cause, and a fix. Study these until you can recognize them in five seconds of waveform inspection.
1. Missing Sensitivity: Output Frozen in Simulation, Correct After Synthesis
Sim/Synth Mismatch
Buggy Code
// ❌ BUG: c is not in the sensitivity list
always @(a or b) begin // c is missing
out = a & b & c;
end
// Simulation: 'out' only re-evaluates when a or b change.
// If c changes alone, out stays at its old value. Looks like a hold violation.
// Synthesis: synthesis reads ALL signals — out = a&b&c always correct.
// Chip behavior: correct. Simulation: wrong. Bug is invisible at tape-out.
// ✅ FIX: Replace always @(a or b) with always_comb
always_comb out = a & b & c; // auto-sense: {a,b,c}, bug impossible
Bug 1: Waveform Symptom / Root Cause / Fix
Waveform Symptom
out does not change when only c transitions. In the waveform, c shows a clear rising edge at T=50, but out stays flat. Engineers mistake this for propagation delay or a hold violation.
Root Cause
The simulator's event scheduler only re-evaluates the always @(a or b) block when a or b change. A change on c alone does not trigger an event for this block. The simulator is doing exactly what the code says; it is the code that is wrong.
Debugging Process
1. See out frozen in the waveform when it is expected to change.
2. Check if the DUT uses always @(...) instead of always_comb.
3. Search for every signal that drives out and list them: {a, b, c}.
4. Compare against the sensitivity list: c is missing.
5. Fix by switching to always_comb.
Prevention
Use always_comb for all combinational RTL. This bug is impossible with always_comb. Never use always @(*) or always @(signal_list) in SystemVerilog RTL.
2. Blocking Assignment in always_ff: Pipeline Collapses to Single Stage
Functional Bug
Buggy Code
// ❌ INTENDED: 2-stage pipeline (data_in → stage1 → stage2)
always_ff @(posedge clk) begin
stage1 = data_in; // blocking: stage1 updates IMMEDIATELY
stage2 = stage1; // blocking: reads NEW stage1, not old stage1
end
// Result: stage2 = data_in at same clock edge → 1-stage pipeline, not 2
// ✅ FIX: Non-blocking — both RHS evaluated with OLD values first
always_ff @(posedge clk) begin
stage1 <= data_in; // RHS captured: data_in (old)
stage2 <= stage1; // RHS captured: old stage1
end // both update in NBA region simultaneously
// Result: stage2 = old stage1 = data_in from previous cycle → correct 2-stage
Bug 2: Waveform Symptom / Root Cause / Fix
Waveform Symptom
stage1 and stage2 have identical values at every clock edge. The expected 1-cycle difference between stage1 and stage2 never appears: stage1 shows the correct 1-cycle latency, but stage2 tracks stage1 in lock-step.
Root Cause
Blocking = evaluates the RHS AND writes the LHS immediately. By the time stage2 = stage1 executes, stage1 already contains the new value from the line above. Both registers capture data_in in the same clock cycle.
Real Project Impact
This bug caused a reported silicon issue in a DSP pipeline: the expected 2-cycle filter latency appeared as 1 cycle, causing off-by-one errors in the output frame. Synthesis tools generally follow the same blocking-assignment order within a block, so the netlist is likely to show the same collapsed pipeline rather than correct it. The regressions passed and the bug was missed until integration-level testing.
3. Infinite Loop: Simulation Hangs at Time 0, No Waveform
Simulation Hang
Buggy Code
// ❌ BUG: always block with no timing control — infinite loop at T=0
always begin
if (enable)
counter = counter + 1; // ← runs forever in zero time
end // ← simulator never advances time
// Symptom: simulation binary starts, prints nothing, hangs forever.
// CPU usage: 100% on one core. No VCD output.
// $finish is never reached. Kill with Ctrl+C.
// ❌ Also wrong: forever loop with no timing control
initial forever begin
data = $random; // infinite Active region loop
end
// ✅ FIX 1: Add timing control — yield simulation time
always @(posedge clk) begin
if (enable) counter <= counter + 1;
end
// ✅ FIX 2: Add delay — yields time between iterations
initial forever begin
@(posedge clk); data = $urandom_range(0, 255);
end
Bug 3: Diagnosis and Fix
Diagnosis
When simulation hangs at T=0 with no output, search immediately for any always block or forever loop that has no @(...), no #delay, and no wait statement. In large codebases, search for "always begin" without a following @ or #. The offending block is consuming all CPU in the Active region without advancing time.
Simulator Behavior
Simulators such as VCS and Questa have a configurable delta-cycle (iteration) limit. It catches zero-delay feedback between processes and aborts with an iteration-limit error. A loop inside a single process that never reaches a timing control never yields to the scheduler, so it typically just hangs instead. The simulator is not broken; the code is.
4. Latch from Incomplete Case: Holds Stale Value, Passes Directed Test but Fails Random
Latch Inference
Buggy Code
// ❌ BUG: case covers only opcodes 0,1,2 — what about 3?
always_comb begin
case (opcode[1:0])
2'b00: alu_out = a + b;
2'b01: alu_out = a - b;
2'b10: alu_out = a & b;
// 2'b11: NOT HANDLED → alu_out holds last value → LATCH
endcase
end
// always_comb: tools report "Latch inferred on alu_out" (warning or error, depending on the tool)
// Directed test: only tests 0,1,2 → passes
// Random test: eventually generates opcode=3 → stale alu_out
// ✅ FIX: Always add default assignment at top
always_comb begin
alu_out = '0; // ← default: covers ALL unhandled opcodes
case (opcode[1:0])
2'b00: alu_out = a + b;
2'b01: alu_out = a - b;
2'b10: alu_out = a & b;
2'b11: alu_out = a | b; // ← or: default handles it
endcase
end
Bug 4: Why Directed Tests Pass but Random Tests Catch This
Insight
This is exactly why constrained-random verification finds bugs that directed tests miss. A directed test that only checks the documented operations (00, 01, 10) will pass. A random test with full-range opcode randomization will eventually hit 11, and either see the wrong (stale) output or detect the latch as X in post-synthesis simulation. This is one of the strongest arguments for randomized testing over directed-only approaches.
5. Multiple Drivers on Same Net: Signal Becomes X
Multi-Driver / X Bug
Buggy Code
// ❌ BUG: Two always_ff blocks both drive 'q'
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= '0;
else q <= channel_a_data;
end
always_ff @(posedge clk) begin
if (override_en) q <= override_val; // ❌ second driver of q!
end
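// always_ff: a compliant tool rejects this, since only one process may drive q.
// With legacy always blocks it compiles, and q takes whichever write lands
// last (a race); on a multi-driven net, conflicting values resolve to X.
// Tool Warning: "Multiple drivers on net q"
// Synthesis: CRITICAL ERROR, cannot resolve multi-driven net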
// ✅ FIX: Merge into single always_ff with explicit priority
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) q <= '0;
else if (override_en) q <= override_val; // override takes priority
else q <= channel_a_data; // normal operation
end
6. Wrong Reset Polarity: Reset Behavior Inverted
Reset Bug
Buggy Code
// ❌ BUG: rst is active-HIGH (testbench drives rst=1 to reset)
// But the module checks !rst, which is false while rst=1!
always_ff @(posedge clk or posedge rst) begin
if (!rst) q <= '0; // ❌ !rst = !1 = 0 → reset branch is skipped while rst=1
else q <= d; // rst=1 → else branch → q gets d during reset!
end
// Simulation: q loads d (often X) during reset, then is forced to 0 on
// every posedge clk once rst returns to 0.
// Waveform: rst pulse visible, q does not clear; q sticks at 0 in normal operation.
// ✅ FIX: Match polarity — active-HIGH rst → if(rst), not if(!rst)
always_ff @(posedge clk or posedge rst) begin
if (rst) q <= '0; // ✅ active-high: fires when rst=1
else q <= d;
end
// ─── Naming convention prevents this bug ─────────────────────────
// rst → active high (fires when HIGH) → check: if(rst)
// rst_n → active low (fires when LOW) → check: if(!rst_n)
// The _n suffix is an industry convention for active-low signals.
// Sensitivity: posedge rst (active-high) vs negedge rst_n (active-low)
7. Task with Timing Control Inside always_comb: Compile Error or Deadlock
Compile Error
Buggy Code
// ❌ BUG: A task with timing control called from always_comb
task automatic sample_bus(output logic [7:0] val);
@(posedge clk); // ← timing control inside task
val = data_bus;
endtask
always_comb begin
sample_bus(captured); // ❌ ERROR: always_comb cannot block on @
end
// Compiler: "Timing controls not allowed in always_comb"
// ✅ FIX: Use a function (functions cannot contain timing controls)
function automatic logic [7:0] process_bus(input logic [7:0] raw);
return raw ^ 8'hFF; // ✅ pure combinational — no timing
endfunction
always_comb begin
processed = process_bus(data_bus); // ✅ function call: legal
end
// For the original sampling intent: use always_ff
always_ff @(posedge clk) captured <= data_bus; // correct
8. Glitch on Combinational Output: Clean in RTL Simulation, Glitches in Silicon
Glitch / Hazard
Buggy Code
// ── Scenario: Gray-code counter output feeds combinational decode
// Gray code changes only 1 bit per step, which avoids multi-bit skew at
// the decoder INPUT. The decode logic itself still has unequal path
// delays, so its OUTPUT bits can settle at slightly different times.
always_comb begin
case (gray_count) // gray_count[2:0], e.g. 001→011
3'b000: sector = 3'd0; // In sim: sector changes atomically (same delta)
3'b001: sector = 3'd1; // In silicon: sector goes 1→2 (01→10), and
3'b011: sector = 3'd2; // sector[1] and sector[0] may switch at
3'b010: sector = 3'd3; // slightly different times
default: sector = 3'd0; // → sector briefly reads 0 or 3
endcase
end
// Simulation: looks clean (atomic change)
// Silicon: glitches on sector during counter transitions
// Symptom in silicon: incorrect sector decode for 1-2 gate delays
// ✅ FIX: Register the output to filter glitches
always_ff @(posedge clk) sector_reg <= sector; // register filters glitch
9. Uninitialized always_comb Output: X Propagation at Startup
X-Propagation
Buggy Code
// ❌ BUG: sel starts as X in simulation (undriven at T=0)
always_comb begin
case (sel)
2'b00: mux_out = in_a;
2'b01: mux_out = in_b;
2'b10: mux_out = in_c;
2'b11: mux_out = in_d;
// No default: if sel=2'bXX, no item matches → mux_out keeps its old value (X at startup) → propagates everywhere
endcase
end
// At T=0: sel=XX (not yet driven) → mux_out=X
// X propagates into downstream always_ff → q=X after first clock
// Even after reset, q may stay X if reset doesn't reach this path
// ✅ FIX 1: Add default case (defensive X handling)
always_comb begin
mux_out = in_a; // ← default: covers X input
case (sel)
2'b00: mux_out = in_a;
2'b01: mux_out = in_b;
2'b10: mux_out = in_c;
2'b11: mux_out = in_d;
endcase
end
// ✅ FIX 2: Drive sel before simulation begins
initial sel = 2'b00; // testbench: initialize sel at T=0
10. Mixed Blocking/Non-Blocking in always_ff: Fragile and Race-Prone
Race / Style Bug
Buggy Code
// ❌ BUG: mixing blocking and non-blocking in same always_ff
always_ff @(posedge clk) begin
tmp = a + b; // blocking: tmp updates immediately (Active)
result <= tmp; // non-blocking: RHS is the NEW tmp
status <= (tmp > 8'hFF); // always 0 if tmp is only 8 bits wide
end
// Inside this one block the order is defined: statements run top to
// bottom, so result always sees the new tmp. The danger is elsewhere:
// - any other block that reads tmp at the same posedge clk races with
// this one and sees old or new tmp depending on simulator scheduling
// - reordering the statements silently changes the hardware
// - most lint rules reject mixed assignment styles in one block
// ✅ FIX: Use only non-blocking. For intermediate calc, use automatic var.
always_ff @(posedge clk) begin
automatic logic [8:0] tmp_local = a + b; // local: not a net
result <= tmp_local[7:0]; // ✅ clear and deterministic
status <= tmp_local[8]; // ✅ overflow bit
end
// Or: compute in always_comb, register the result in always_ff
always_comb tmp_comb = a + b;
always_ff @(posedge clk) begin
result <= tmp_comb[7:0];
status <= tmp_comb[8];
end
💡 Senior Verification Engineer Tip: Run Your RTL on Two Simulators
The single most effective way to catch blocking/non-blocking races and sensitivity list bugs is to simulate the same testbench on two different simulators (e.g., VCS and Questa or Xcelium). Any result that differs between the two is non-deterministic behavior — a bug by definition. Set this up as part of your CI flow. A 30-minute effort to add the second simulator run will catch bugs that months of single-simulator testing will miss.
🎯 Interview Q&A — From Fresher to Senior Engineer
These questions have been asked in ASIC/DV interviews at companies including semiconductor design houses, fabless startups, and system companies. Each answer goes deeper than the typical one-liner — because that is what interviewers are actually looking for.
Beginner Level
Beginner: What is the difference between initial and always?
initial executes once at simulation time 0 and terminates when it reaches the end. It is a simulation construct: ASIC synthesis does not turn it into hardware. Use it in testbenches to apply reset, drive stimulus, and call $finish.
always is an infinite loop that executes repeatedly based on a sensitivity list. It is synthesizable (depending on content). In SystemVerilog RTL, replace it with always_ff, always_comb, or always_latch, which carry explicit intent for the tool to verify.
Beginner: Can an always_comb block have a sensitivity list?
No. The entire point of always_comb is that the sensitivity list is automatically inferred from every signal read inside the block. You cannot write always_comb @(a, b); it is a compile error. This eliminates the class of bugs caused by incomplete manual lists such as always @(a, b). It also improves on always @(*), which does not see signals read only inside a called function.
Beginner: Why would you use always_latch, and when should you avoid it?
always_latch explicitly declares that you intend to infer a transparent latch, a level-sensitive storage element. Use it only when a specific interface protocol requires a latch (some legacy bus standards). Avoid it in virtually all other cases. Latches cause timing closure difficulties (the entire data-to-output path must meet timing while the enable is high), complicate scan-based DFT, and are harder to characterize. If you see always_latch in code review without a protocol-specific comment, question it immediately.
Intermediate Level
Intermediate: What is a delta cycle? Give a concrete example with always_ff and always_comb.
A delta cycle is a zero-time simulation iteration. At a single simulation timestamp, the simulator loops through Active→NBA regions until no new events are generated. Each loop is one delta cycle.
Example: At posedge clk, always_ff evaluates q <= d (Active region, captures old d). In the NBA region, q gets its new value. This change triggers always_comb that depends on q, which re-evaluates in the next Active region (delta 1). If always_comb's output doesn't trigger further changes, the simulation converges and time advances.
Debugging relevance: $display fires in the Active region, so it may show pre-NBA values. $strobe fires in the Postponed region, so it shows post-delta-convergence values. This difference causes mysterious "wrong value" reports in debug prints.
Intermediate: Why must you use <= (non-blocking) in always_ff? What breaks with =?
Non-blocking (<=) evaluates the RHS in the Active region using the current (pre-clock-edge) values, then updates the LHS in the NBA region. This means all non-blocking assignments in an always_ff block see the same snapshot of signals from before the clock edge, which is how real flip-flops work.
Blocking (=) evaluates AND immediately updates the LHS in the Active region. So if you write a = b; c = a;, c gets the new value of a, not the pre-edge value. This:
(1) creates a race condition when other blocks read a in the same Active region,
(2) collapses pipeline stages (what should be 2 registers becomes 1),
(3) causes simulator-dependent behavior: VCS and Questa may give different answers.
Intermediate: What is the standard 2-block FSM pattern and why is it the right approach?
The standard pattern separates state storage from state logic:
Block 1, always_ff: State register only. Captures next_state on clock edge, applies reset.
Block 2, always_comb: Next-state logic and output logic. Purely combinational, no clock.
Why this is correct:
(1) always_ff guarantees register inference, so there is no accidental latch.
(2) always_comb guarantees combinational logic, so there is no accidental register.
(3) The two block types make intent unambiguous to tools and humans.
(4) Changing the state encoding or adding states only affects the always_comb block; the sequential structure is unchanged.
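The 2-block FSM pattern described above can be sketched as follows. This is a minimal illustration, not a reference design: the module name fsm_2block, the states IDLE/RUN/DRAIN, and the start/done/busy ports are all assumptions made for the example.

```systemverilog
// Sketch of the 2-block FSM pattern. All names are hypothetical.
module fsm_2block (
  input  logic clk, rst_n,
  input  logic start, done,
  output logic busy
);
  typedef enum logic [1:0] {IDLE, RUN, DRAIN} state_t;
  state_t state, next_state;

  // Block 1: state register only
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) state <= IDLE;
    else        state <= next_state;
  end

  // Block 2: next-state and output logic, purely combinational
  always_comb begin
    next_state = state;  // default: hold state (every path assigns, no latch)
    busy       = 1'b0;   // default output
    case (state)
      IDLE:    if (start) next_state = RUN;
      RUN:     begin
                 busy = 1'b1;
                 if (done) next_state = DRAIN;
               end
      DRAIN:   next_state = IDLE;
      default: next_state = IDLE;  // recover from the unused encoding 2'b11
    endcase
  end
endmodule
```

The default assignments at the top of the always_comb block are the design choice that matters here: adding a state later cannot introduce a latch, because every output already has a value on every path.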
Debugging / Advanced Level
Advanced: You see a flip-flop output stuck at X throughout simulation. List your debugging steps.
Step 1: Check the reset. Is rst_n actually being asserted? View it in the waveform. Is the polarity correct? (if(!rst_n) for active-low, if(rst) for active-high).
Step 2: Check for multiple drivers. Search for every always block that drives this signal. Two drivers mean conflicting values, or a compile error if the blocks are always_ff.
Step 3: Check the input d. If d is X, and the flip-flop captures it, q will be X. The X is upstream, not in this flip-flop.
Step 4: Check for blocking assignment in always_ff. A blocking assignment combined with X inputs can propagate X in non-obvious ways.
Step 5: Enable X-propagation analysis in your simulator. Some tools, such as VCS with its xprop feature, make X handling more pessimistic, which helps expose where an X originates.
Advanced: Your simulation passes 100% of tests on VCS but 30% of tests fail on Questa. What is the first thing you investigate?
This is the classic signature of non-deterministic behavior: the RTL has a race condition whose outcome varies by simulator scheduling order.
First investigation: search the RTL for blocking assignments in always_ff or always @(posedge clk) blocks where the same signal is written in one block and read in another block at the same clock edge. This is the most common source.
Second investigation: search for multiple drivers on the same net (two always blocks writing to the same variable). This often produces X in one simulator but 0 or 1 in another depending on which block "wins".
Third investigation: check for legacy always @(*) or manual always @(a,b,c) sensitivity lists. If a signal is missing, one simulator may happen to re-evaluate correctly (e.g., if it evaluates more aggressively) while another does not.
Solution: convert all RTL to always_ff/always_comb with exclusively non-blocking in sequential blocks.
Advanced: Explain why always_comb is guaranteed by the IEEE spec to fire at time 0, and why this matters.
IEEE 1800 specifies that always_comb executes once automatically at simulation time 0, after all initial and always procedures have been started, so its outputs are consistent with its inputs. This is fundamentally different from always @(*), which only evaluates when a sensitivity event occurs.
Why it matters: At T=0, if inputs are already driven (e.g., by initial blocks running at T=0 or by declaration-time initialization like logic a = 1;), the combinational outputs are computed correctly immediately. With always @(*), if no event occurs at T=0, the outputs remain X until the first signal change. This means a testbench checking combinational outputs at T=0 will see X with always @(*) but correct values with always_comb. In verification environments that check reset values immediately, this subtle difference causes false failures with legacy always @(*) code.
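The time-0 difference described in the last answer is easy to see in a small experiment. The sketch below is illustrative only; the module and signal names (t0_demo, y_star, y_comb) are made up for this example. It relies on the inputs being set by declaration-time initialization, which generates no event for always @(*) to react to.

```systemverilog
// Sketch: always @(*) versus always_comb at time 0. Names are hypothetical.
module t0_demo;
  logic a = 1'b1;  // declaration-time init: set before processes start, no event
  logic b = 1'b1;
  logic y_star, y_comb;

  always @(*) y_star = a & b;  // waits for the first event on a or b
  always_comb y_comb = a & b;  // also executes once at time 0

  initial begin
    #1;
    $display("t=%0t y_star=%b y_comb=%b", $time, y_star, y_comb);
    #1 a = 1'b0;  // first real event on a
    #1;
    $display("t=%0t y_star=%b y_comb=%b", $time, y_star, y_comb);
    $finish;
  end
endmodule
```

On a standard-compliant simulator one would expect the first print to show y_star still at x while y_comb is already 1, and both to agree once a has changed.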
Synthesis / Waveform Level
Synthesis: Why is initial not synthesizable, and what is the correct synthesizable equivalent?
initial is a simulation construct. ASIC synthesis tools represent circuits as timeless netlists of gates and flip-flops, and there is no concept of "runs once at T=0" in a gate netlist. They either ignore initial blocks or error on them (some FPGA flows do use initial values to set power-up state). The synthesized flip-flop powers on to an unknown state in real silicon.
The synthesizable equivalent is reset in always_ff:
always_ff @(posedge clk or negedge rst_n) begin if (!rst_n) q <= '0; ... end
This infers a flip-flop with an actual reset pin in the netlist. The reset network is part of the synthesized circuit, not a simulation artifact. This is why ASIC projects require reset coverage: every flip-flop that must start in a known state needs a reset path.
Waveform: In a waveform, you see q1 and q2 update at the same clock edge but q2 has the value q1 had BEFORE the clock. What does this confirm?
This confirms that non-blocking assignments are being used correctly. It is the expected behavior for a 2-stage pipeline. At posedge clk, the simulator evaluates all <= RHS expressions using pre-edge values (Active region). Then in the NBA region, all LHS signals update simultaneously. So q1 <= data_in and q2 <= q1 both evaluate with the old values of data_in and q1 respectively, then both update at the same time. The 1-cycle offset between q1 and q2 is exactly what the non-blocking assignment model guarantees. You are seeing correct register pipeline behavior. If q1 and q2 were identical at every clock edge, that would indicate blocking assignments had collapsed the pipeline.
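The q1/q2 offset from the waveform question, and the $display versus $strobe difference from the delta-cycle answer, can both be observed in one small sketch. The module name pipe_demo and the data values are assumptions for illustration; the registers are given declaration-time initial values only to keep the printout free of X.

```systemverilog
// Sketch: 2-stage pipeline observed with $display and $strobe.
// Names and values are hypothetical.
module pipe_demo;
  logic clk = 0;
  logic [7:0] data_in = 8'h00, q1 = 8'h00, q2 = 8'h00;

  always #5 clk = ~clk;

  always_ff @(posedge clk) begin
    q1 <= data_in;  // stage 1
    q2 <= q1;       // stage 2: samples the pre-edge q1
  end

  // Same edge, two views: $display runs in the Active region (pre-NBA
  // values of q1/q2); $strobe runs in the Postponed region (settled values).
  always @(posedge clk) begin
    $display("%0t display: q1=%h q2=%h", $time, q1, q2);
    $strobe ("%0t strobe : q1=%h q2=%h", $time, q1, q2);
  end

  initial begin
    data_in = 8'hA1;                  // set before the first edge
    @(posedge clk) data_in <= 8'hB2;  // NBA: no race with the always_ff sample
    @(posedge clk) data_in <= 8'hC3;
    repeat (2) @(posedge clk);
    $finish;
  end
endmodule
```

At each edge the $strobe line should show q2 holding what q1 held one edge earlier, while the $display line lags by showing the values from before that edge's updates.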