Standard-cell timing characterization: a cell's SPICE netlist in, a Liberty
(.lib) timing model out.
Vyges open EDA tools. Commercial-grade silicon sign-off capability, built on open standards and plain file formats — and meant to be accessible to everyone, not only teams who can license a six-figure tool.
vyges-char opens up standard-cell characterization.
Docs: docs.vyges.com — this engine's chapter, the cross-engine integration guide (how the four Vyges engines work together and where each plugs into an OpenROAD / LibreLane flow), and the job-file formats. Integrating at the binary level and need help? → https://vyges.com/contact.
Timing sign-off needs a timing model for every standard cell — its delay and
output transition as a function of input slew and output load. Foundries ship
these .lib files, but you need to (re)generate one whenever you have a new
cell, a new PVT corner, a tweaked transistor, or simply want to verify a
vendor library against first-principles SPICE. vyges-char produces that model.
In production, characterization means the commercial characterizers —
CCS/ECSM, statistical LVF, full PVT
matrices — the tools foundries and IP teams use to produce the libraries that
ship in the PDK. Most design teams never run them; they consume the delivered
.lib. In the open world the space is thin (CharLib, LibreCell over
ngspice/Xyce), so users mostly reuse pre-characterized libraries and skip it.
vyges-char makes the generate-and-verify path open and scriptable, behind the
same Liberty file format everything downstream already speaks.
Describe the job, not the script. The incumbent flows are driven by
hand-written Tcl — a recurring source of silent typos, copy-paste drift across
cells and corners, and brittle maintenance. Every Vyges tool is instead configured
by a small declarative job file (.char here; .sta, .ext, .charlib,
.emir across the toolchain): readable, diffable, schema-checkable, and reusable,
with no control flow to get wrong. char goes further — point it at a reference
.lib (ref_lib:) and it derives a cell's arcs itself, so even the per-cell setup
is data, not script.
Validate fast, sign off with your tool. char emits standard Liberty, so a
char-generated library drops unchanged into OpenSTA, a commercial sign-off timer, or any other Liberty-consuming timer. Use it
for the fast inner loop — regenerate or verify a .lib in seconds while you iterate —
and keep the vendor-shipped library for final sign-off if you prefer. Nothing about
the flow has to change to adopt it; it sits alongside what you already run.
Given:
- a cell's SPICE netlist (.subckt …) and the PDK device models, and
- the slew × load grid, supply, and temperature to characterize at,
it emits a Liberty NLDM (.lib): for each timing arc it sweeps input slew
against output load, simulates every point in ngspice, measures propagation
delay and output transition, and fills the cell_rise / cell_fall /
rise_transition / fall_transition lookup tables.
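As a rough illustration of that sweep, here is a minimal std-only Rust sketch. It is not vyges-char's actual code: `Point`, `sweep`, and `liberty_values` are hypothetical names, and the `simulate` closure stands in for one ngspice run plus its .measure parse.

```rust
/// One measured grid point: propagation delay and output transition, in ns.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub delay_ns: f64,
    pub transition_ns: f64,
}

/// Call `simulate(slew_ns, load_pf)` for every grid point and collect a table
/// with one row per input slew and one column per output load.
/// (Assumption: in a real run `simulate` would drive the simulator.)
pub fn sweep<F>(slews_ns: &[f64], loads_pf: &[f64], mut simulate: F) -> Vec<Vec<Point>>
where
    F: FnMut(f64, f64) -> Point,
{
    slews_ns
        .iter()
        .map(|&s| {
            loads_pf
                .iter()
                .map(|&c| simulate(s, c))
                .collect::<Vec<Point>>()
        })
        .collect()
}

/// Render one metric of the table as a Liberty `values (...)` attribute:
/// one quoted, comma-separated string per slew row.
pub fn liberty_values<G>(table: &[Vec<Point>], metric: G) -> String
where
    G: Fn(&Point) -> f64,
{
    let rows: Vec<String> = table
        .iter()
        .map(|row| {
            let cells: Vec<String> = row.iter().map(|p| format!("{:.4}", metric(p))).collect();
            format!("\"{}\"", cells.join(", "))
        })
        .collect();
    format!("values ({});", rows.join(", "))
}
```

Calling `liberty_values(&table, |p| p.delay_ns)` would give the body of a cell_rise or cell_fall table, and `|p| p.transition_ns` the matching transition table. Which edge a table belongs to is decided by the caller.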
It reads the cell's real .subckt port order from the netlist and maps each
pin to the right node — input/output to the driven/measured nets, power/ground
pins to the supplies (sky130 VPWR/VGND/… handled by default, override with
power:/ground:). A port it can't place is an error, not a silent float.
PDK cells *.spice ──[ vyges-char + ngspice ]──► *.lib
*.v + *.spef + *.lib ──[ STA ]──► timing sign-off
Files in / files out; the simulator is driven as a subprocess. The pure pieces
(Liberty emit, SPICE deck gen, .measure parse) run offline and are unit-tested;
only the actual sweep needs ngspice + the PDK on the host.
PDK cell *.spice + device models ─[vyges-char + ngspice]─► *.lib ─► STA
Reach for it when you need a cell's timing model and don't already have a
trustworthy one — a custom or ECO cell, a new PVT corner, or to
verify a vendor .lib against first-principles SPICE. It runs after you
have the cell netlist + device models and before STA, which cannot run
without a .lib. The Liberty it emits is exactly what vyges-sta-si (or
any STA tool) consumes. Most flows consume the foundry's shipped .lib
directly and only reach for vyges-char to fill those gaps.
# build it yourself (std-only, no deps) -- or grab a binary from GitHub Releases:
cargo build --release # std-only, no external deps
vyges-char run cell.char -o cell.lib # characterize one cell (needs ngspice + models)
vyges-char run cell.char --json # machine-readable summary instead of Liberty
vyges-char library lib.charlib -o out -v # characterize many cells in parallel -> merged .lib
vyges-char check cell.char # validate the job, print a summary
vyges-char demo # print a sample .lib (no sim)
vyges-char dataset cell.char -o data.csv # export the characterization as a tidy table (CSV/JSONL)
vyges-char surrogate cell.char --log # experiment: predict the grid from a subset (see "Research" below)
# common flags: -o FILE · --json · -q/--quiet · -v/--verbose · -h/--help · -V/--version
A whole library is just many per-cell jobs run together. A .charlib manifest
names them (or a directory of them) and a thread count; cells are characterized
in parallel — each ngspice point is a subprocess, so the simulator is the
bottleneck and the pool scales the run across cores — then merged per corner into
a single .lib (one shared header, the union of lookup-table templates, every
cell group). A cell that fails to characterize is reported and dropped, never
sinking the whole library.
library: sky130_fd_sc_hd_subset
threads: 12 # default: available parallelism
jobs_dir: cells # every *.char in a dir (or `jobs: a.char, b.char`)
vyges-char library lib.charlib -o out/ writes out/<library>.lib (or one
out/<library>__<corner>.lib per corner). Mixed combinational + sequential cells
merge into one well-formed library.
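The report-and-drop behaviour of the parallel run can be sketched with a small std-only worker pool. This is an assumption-laden illustration, not the tool's implementation: `run_pool` and `PoolResult` are hypothetical names, and `work` stands in for characterizing one job file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Outcome of a library run: (job, output) for successes, (job, error) for failures.
pub struct PoolResult<T> {
    pub done: Vec<(String, T)>,
    pub failed: Vec<(String, String)>,
}

/// Run `work` over every job on `threads` workers. A failing job is recorded
/// and skipped; it never aborts the other jobs.
pub fn run_pool<T, F>(jobs: &[String], threads: usize, work: F) -> PoolResult<T>
where
    T: Send,
    F: Fn(&str) -> Result<T, String> + Sync,
{
    let next = AtomicUsize::new(0); // index of the next unclaimed job
    let done: Mutex<Vec<(usize, String, T)>> = Mutex::new(Vec::new());
    let failed: Mutex<Vec<(usize, String, String)>> = Mutex::new(Vec::new());
    thread::scope(|s| {
        for _ in 0..threads.max(1) {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                let job = match jobs.get(i) {
                    Some(j) => j,
                    None => break, // queue drained
                };
                match work(job.as_str()) {
                    Ok(out) => done.lock().unwrap().push((i, job.clone(), out)),
                    Err(e) => failed.lock().unwrap().push((i, job.clone(), e)),
                }
            });
        }
    });
    // Restore manifest order so the merged output is deterministic.
    let mut done = done.into_inner().unwrap();
    let mut failed = failed.into_inner().unwrap();
    done.sort_by_key(|t| t.0);
    failed.sort_by_key(|t| t.0);
    PoolResult {
        done: done.into_iter().map(|(_, j, o)| (j, o)).collect(),
        failed: failed.into_iter().map(|(_, j, e)| (j, e)).collect(),
    }
}
```

Sorting by the original index is a design choice of this sketch: it keeps the merged library independent of which worker finished first.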
A job (*.char) is a few key: value lines:
cell: sky130_fd_sc_hd__inv_1
netlist: sky130_fd_sc_hd.spice # contains .subckt for the cell
in_pin: A
out_pin: Y
sense: negative_unate
slews: 0.01, 0.04, 0.16, 0.64 # ns
loads: 0.0005, 0.002, 0.008 # pF
vdd: 1.8
temp: 25
models: params.spice, corners/tt.spice # device models, included in order
montecarlo: 8 # optional: LVF sigma (omit/0 = NLDM only)
ccs: true # optional: emit CCS output-current waveforms
recv: true # optional: emit CCS receiver capacitance (input pin)
power_char: true # optional: leakage_power + internal_power
Multi-arc cells (multi-input gates, multi-output cells) replace the single
in_pin/out_pin/sense with one arc: line per timing arc — <in> <out> <sense> [side=0|1 ...], where each other input is held at its non-controlling value while
this arc is exercised. A 2-input NAND:
cell: sky130_fd_sc_hd__nand2_1
netlist: sky130_fd_sc_hd.spice
slews: 0.05, 0.20
loads: 0.001, 0.005
vdd: 1.8
arc: A Y negative_unate B=1 # A->Y, side input B held high
arc: B Y negative_unate A=1 # B->Y, side input A held high
All arcs of one cell render into a single cell {} (one pin per input, one per
output, a timing () group per arc) — and the A->Y vs B->Y delays come out
distinct (the series-stack input nearer the output switches faster), which is
exactly why each arc must be characterized, not copied.
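To make the arc: line shape concrete, here is a minimal parsing sketch in Rust. `TimingArc`, `Sense`, and `parse_arc` are hypothetical names, and accepting `non_unate` as a sense keyword is an assumption borrowed from Liberty's timing_sense values. The real parser may differ.

```rust
#[derive(Debug, PartialEq)]
pub enum Sense {
    PositiveUnate,
    NegativeUnate,
    NonUnate,
}

#[derive(Debug, PartialEq)]
pub struct TimingArc {
    pub input: String,
    pub output: String,
    pub sense: Sense,
    /// Side inputs and the logic level each is held at while this arc switches.
    pub side: Vec<(String, bool)>,
}

/// Parse the value part of an `arc:` line, e.g. `A Y negative_unate B=1`.
pub fn parse_arc(value: &str) -> Result<TimingArc, String> {
    // Drop a trailing `# comment`, then split on whitespace.
    let value = value.split('#').next().unwrap_or("").trim();
    let mut tok = value.split_whitespace();
    let input = tok.next().ok_or("missing input pin")?.to_string();
    let output = tok.next().ok_or("missing output pin")?.to_string();
    let sense = match tok.next().ok_or("missing sense")? {
        "positive_unate" => Sense::PositiveUnate,
        "negative_unate" => Sense::NegativeUnate,
        "non_unate" => Sense::NonUnate, // assumed keyword
        other => return Err(format!("unknown sense '{}'", other)),
    };
    // Every remaining token must look like PIN=0 or PIN=1.
    let mut side = Vec::new();
    for t in tok {
        let (pin, level) = t
            .split_once('=')
            .ok_or(format!("expected PIN=0|1, got '{}'", t))?;
        let level = match level {
            "0" => false,
            "1" => true,
            _ => return Err(format!("side level must be 0 or 1, got '{}'", level)),
        };
        side.push((pin.to_string(), level));
    }
    Ok(TimingArc { input, output, sense, side })
}
```

Rejecting anything other than 0 or 1 for a side input mirrors the "error, not a silent float" stance the tool takes for unplaced ports.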
Arc auto-derivation (function:) — for cells whose arcs are tedious to hand-write
(XORs, AOI/OAI compounds), give the output pin's Boolean function instead of arc:
lines and char derives the arcs by cofactor analysis: which inputs the output responds
to, the sense (positive/negative-unate), and a sensitizing side-input state per arc.
The function uses Liberty operators ('/! NOT, */&/juxtaposition AND, ^ XOR,
+/| OR) — copy it straight from the PDK .lib:
cell: sky130_fd_sc_hd__a21oi_2
netlist: sky130_fd_sc_hd.spice
slews: 0.05, 0.20
loads: 0.002, 0.010
function: Y = (!A1&!B1) | (!A2&!B1) # 3 arcs (A1,A2,B1 -> Y), all derived
#function: X = A ^ B # XOR: non-unate, derived under a definite side
This makes char library over a real netlist's cell list practical without
hand-authoring every compound cell's arcs.
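The cofactor idea can be sketched by brute force over the truth table. This is an illustrative Rust sketch, not char's implementation: the function is passed as a closure over an input bitmask instead of a parsed Liberty expression, and `derive_arcs` / `DerivedArc` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Sense {
    PositiveUnate,
    NegativeUnate,
    NonUnate,
}

#[derive(Debug, PartialEq)]
pub struct DerivedArc {
    /// Index of the switching input (bit position in the state mask).
    pub input: usize,
    pub sense: Sense,
    /// First side-input assignment (bitmask, switching bit cleared) under
    /// which the output responds to this input.
    pub side_state: u32,
}

/// For each input, compare the two cofactors f(x=0) and f(x=1) under every
/// assignment of the other inputs.
pub fn derive_arcs<F>(n_inputs: usize, f: F) -> Vec<DerivedArc>
where
    F: Fn(u32) -> bool,
{
    let mut arcs = Vec::new();
    for input in 0..n_inputs {
        let bit = 1u32 << input;
        let (mut rises, mut falls) = (false, false);
        let mut side_state: Option<u32> = None;
        for state in 0..(1u32 << n_inputs) {
            if (state & bit) != 0 {
                continue; // visit each side assignment once, with the input low
            }
            let (lo, hi) = (f(state), f(state | bit));
            if lo == hi {
                continue; // output ignores the input in this state
            }
            if hi { rises = true; } else { falls = true; }
            if side_state.is_none() {
                side_state = Some(state);
            }
        }
        let sense = match (rises, falls) {
            (true, false) => Sense::PositiveUnate,
            (false, true) => Sense::NegativeUnate,
            (true, true) => Sense::NonUnate,
            (false, false) => continue, // no arc from this input
        };
        arcs.push(DerivedArc { input, sense, side_state: side_state.unwrap_or(0) });
    }
    arcs
}
```

For the a21oi function above, with A1, A2, B1 on bits 0, 1, 2, this yields three negative-unate arcs: A1 is sensitized with A2=1, B1=0, A2 with A1=1, B1=0, and B1 with both A inputs low. Taking the first sensitizing state is an assumption of this sketch. A real tool may prefer a different one.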
Fully netlist-driven (ref_lib:) — point at a reference .lib and char reads the
cell's output-pin functions straight from it and derives every arc, so no arc: or
function: lines are needed at all (char still measures the values in SPICE; it only
reads the cell's structure from the reference):
cell: sky130_fd_sc_hd__o2bb2a_2
netlist: sky130_fd_sc_hd.spice
slews: 0.05, 0.20
loads: 0.002, 0.010
vdd: 1.8
ref_lib: sky130_fd_sc_hd__tt_025C_1v80.lib # arcs derived from the cell's functions
So a .charlib over a netlist's cell list needs only cell + netlist + grid + ref_lib per cell — the arcs come from the PDK. (Sequential cells still use the
clock_pin/data_pin setup/hold form.)
Sequential cells (flip-flops) characterize the setup/hold constraints on the data pin and the CK->Q delay arc instead of combinational arcs:
cell: sky130_fd_sc_hd__dfxtp_1
netlist: sky130_fd_sc_hd.spice
clock_pin: CLK
data_pin: D
out_pin: Q
clock_edge: rising # rising | falling
slews: 0.05, 0.20 # clock & data transition axes
loads: 0.005 # Q load (CK->Q arc)
vdd: 1.8
Setup/hold are found per (clock slew, data slew) grid point by a push-out
bisection: the data-to-clock separation is squeezed until the CK->Q delay
degrades 10% past its stable value. The emitted .lib is a full sequential cell
(ff group, clock : true, setup_*/hold_* constraint groups, edge-triggered
CK->Q arc) — the exact shape vyges-sta-si reads for reg-to-reg timing.
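A push-out bisection of this kind can be sketched as follows. The sketch assumes a bracket where the tight end fails and the relaxed end passes. `setup_by_pushout` is a hypothetical name, and `ckq_delay` stands in for one transient simulation returning the CK->Q delay, or None when the flop fails to capture.

```rust
/// Smallest data-to-clock separation (ns) at which the CK->Q delay stays
/// within `pushout` (0.10 = 10%) of its stable value, found by bisection.
///
/// `lo_fail` must be a separation that fails and `hi_pass` one that is
/// comfortably relaxed; the loop keeps that invariant and stops once the
/// bracket is narrower than `tol_ns`.
pub fn setup_by_pushout<F>(
    mut ckq_delay: F,
    lo_fail: f64,
    hi_pass: f64,
    pushout: f64,
    tol_ns: f64,
) -> f64
where
    F: FnMut(f64) -> Option<f64>,
{
    let stable = ckq_delay(hi_pass).expect("flop must capture at the relaxed separation");
    let limit = stable * (1.0 + pushout);
    let (mut lo, mut hi) = (lo_fail, hi_pass);
    while hi - lo > tol_ns {
        let mid = 0.5 * (lo + hi);
        match ckq_delay(mid) {
            Some(d) if d <= limit => hi = mid, // still clean: squeeze further
            _ => lo = mid,                     // pushed out or lost: back off
        }
    }
    hi // conservative side of the bracket
}
```

Returning the passing end of the bracket is a conservative choice made by this sketch. With `tol_ns = 0.001` the result is within 1 ps of the true boundary.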
Per-corner sweeps characterize the cell across PVT corners — one corner: line
per (process models, supply, temperature) — emitting one .lib per corner, the
per-corner library set vyges-sta-si's MCMM consumes:
cell: sky130_fd_sc_hd__inv_1
netlist: sky130_fd_sc_hd.spice
in_pin: A
out_pin: Y
slews: 0.05, 0.20
loads: 0.001, 0.005
# corner: name | models (csv) | vdd [| temp]
corner: ss_n40C_1v60 | params_ss.spice, corners/ss.spice | 1.60 | -40
corner: tt_025C_1v80 | params_tt.spice, corners/tt.spice | 1.80 | 25
corner: ff_125C_1v95 | params_ff.spice, corners/ff.spice | 1.95 | 125
vyges-char run job.char -o <dir> then writes <cell>__<corner>.lib per corner
(nominal voltage/temperature in each header). Without corner: lines the job is a
single run to stdout / -o FILE as before — fully back-compatible.
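The corner: line layout (name | models | vdd, with an optional temp field) can be illustrated with a small parsing sketch. `Corner` and `parse_corner` are hypothetical names. How the tool treats a missing temperature is not stated here, so the sketch simply returns None for it.

```rust
#[derive(Debug, PartialEq)]
pub struct Corner {
    pub name: String,
    pub models: Vec<String>,
    pub vdd: f64,
    /// Temperature in degrees C; None when the fourth field is omitted.
    pub temp_c: Option<f64>,
}

/// Parse the value part of a `corner:` line:
/// `name | models (csv) | vdd [| temp]`.
pub fn parse_corner(value: &str) -> Result<Corner, String> {
    let fields: Vec<&str> = value.split('|').map(str::trim).collect();
    if fields.len() < 3 || fields.len() > 4 {
        return Err(format!("expected 3 or 4 '|'-separated fields, got {}", fields.len()));
    }
    if fields[0].is_empty() {
        return Err("empty corner name".to_string());
    }
    // Model files are comma-separated and included in the order given.
    let models: Vec<String> = fields[1]
        .split(',')
        .map(|m| m.trim().to_string())
        .filter(|m| !m.is_empty())
        .collect();
    let vdd: f64 = fields[2]
        .parse()
        .map_err(|_| format!("bad vdd '{}'", fields[2]))?;
    let temp_c = match fields.get(3) {
        Some(t) => Some(t.parse::<f64>().map_err(|_| format!("bad temp '{}'", t))?),
        None => None,
    };
    Ok(Corner { name: fields[0].to_string(), models, vdd, temp_c })
}
```

Each parsed corner would then map to one `<cell>__<corner>.lib` output, using the corner name verbatim in the file name.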
Async set/reset flops add a tie: list (pins held at their inactive level
during setup/hold/CK->Q) and an optional reset_pin:; the reset's active level is
inferred from the name (_B/_N → active-low) or set with reset_active::
cell: sky130_fd_sc_hd__dfrtp_1
netlist: sky130_fd_sc_hd.spice
clock_pin: CLK
data_pin: D
out_pin: Q
reset_pin: RESET_B # async reset; held inactive (high) for setup/hold/CK->Q
slews: 0.10
loads: 0.005
vdd: 1.8
The emitted .lib gains the ff clear : "!RESET_B" attribute, an async
reset->Q delay arc (timing_type : clear), and the recovery/removal
constraints (the async de-assert-vs-clock timing, found by bisecting the release
edge for the capture/hold boundary). An async set uses set_pin: symmetrically
— ff preset + set->Q arc + recovery/removal; a flop can carry both. Extra unused
inputs (e.g. scan controls) go in tie: as SCE=0, SCD=1.
The sky130 corner decks use relative .include paths and a Monte-Carlo switch
parameter, so:
- prepend a small params.spice to models: defining .param mc_mm_switch=0 / .param mc_pr_switch=0, and
- run from the PDK's libs.tech/ngspice/corners/ directory so the corner's relative includes resolve.
vyges-char run then sweeps the grid and writes the .lib. Comparing that
output table-by-table against the foundry reference .lib is the recommended
way to confirm a characterization is in tolerance.
Characterization's cost is the SPICE sweep: one simulation per (slew, load)
point, per arc, per corner. But a cell's delay and transition are smooth
functions of slew and load. So a natural question:
Do you have to simulate every grid point — or can you simulate a subset and have a cheap model predict the rest, accurately enough for the fast inner loop?
vyges-char ships two small, std-only tools to explore exactly this — in pure Rust
(experiment with GPUs too via rust-gpu):
- vyges-char dataset JOB flattens a characterization into a tidy, long-format table (one row per measured point: cell, arc, metric, corner, the two grid axes, value, unit), as CSV or JSONL. This is the raw material: a clean dataset you can analyze, plot, or train a model on. Non-physical artifacts (e.g. a near-zero/negative delay at an extreme slew/load corner) are flagged in a flag column rather than hidden; --clean drops them.
- vyges-char surrogate JOB is a baseline experiment: it fits a small polynomial model on half the grid, predicts the held-out half, and reports the error (max_abs, rms, and error as a % of the table's peak). --log fits in log-log space (characterization grids are log-spaced); --degree and --metric let you probe.
In our early experiments on open-PDK cells, a simple degree-2 model in log space tracks the held-out points well across cells and corners — encouraging evidence that much of the grid is predictable from a fraction of the simulations. We're intentionally not publishing headline accuracy numbers: the honest, useful thing is for you to measure it on your cells, PDK, and grid.
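For readers who want to reproduce the idea outside the tool, here is a self-contained sketch of a degree-2 fit in log-log space with a held-out check. It rests on assumptions the text above does not spell out: a checkerboard train/held-out split, an ordinary least-squares fit via normal equations, and strictly positive table values (flagged points would have to be dropped first). All function names are hypothetical.

```rust
/// Quadratic features in log space: [1, x, y, x^2, x*y, y^2], x = ln slew, y = ln load.
fn features(slew: f64, load: f64) -> [f64; 6] {
    let (x, y) = (slew.ln(), load.ln());
    [1.0, x, y, x * x, x * y, y * y]
}

/// Solve the 6x6 system a*c = b by Gaussian elimination with partial pivoting.
fn solve6(mut a: [[f64; 6]; 6], mut b: [f64; 6]) -> Option<[f64; 6]> {
    for col in 0..6 {
        let piv = (col..6).max_by(|&r, &s| a[r][col].abs().total_cmp(&a[s][col].abs()))?;
        if a[piv][col].abs() < 1e-12 {
            return None; // singular: too few or degenerate training points
        }
        a.swap(col, piv);
        b.swap(col, piv);
        for row in (col + 1)..6 {
            let k = a[row][col] / a[col][col];
            for j in col..6 {
                a[row][j] -= k * a[col][j];
            }
            b[row] -= k * b[col];
        }
    }
    let mut c = [0.0; 6];
    for row in (0..6).rev() {
        let tail: f64 = ((row + 1)..6).map(|j| a[row][j] * c[j]).sum();
        c[row] = (b[row] - tail) / a[row][row];
    }
    Some(c)
}

/// Least-squares fit of ln(value) on the features, from (slew, load, value) samples.
pub fn fit(train: &[(f64, f64, f64)]) -> Option<[f64; 6]> {
    let (mut ata, mut atb) = ([[0.0f64; 6]; 6], [0.0f64; 6]);
    for &(slew, load, value) in train {
        let f = features(slew, load);
        for r in 0..6 {
            for c in 0..6 {
                ata[r][c] += f[r] * f[c];
            }
            atb[r] += f[r] * value.ln();
        }
    }
    solve6(ata, atb)
}

pub fn predict(coef: &[f64; 6], slew: f64, load: f64) -> f64 {
    features(slew, load).iter().zip(coef).map(|(f, c)| f * c).sum::<f64>().exp()
}

/// Fit on the (i + j) even points, return (max_abs, rms) error on the rest.
pub fn holdout_error(slews: &[f64], loads: &[f64], table: &[Vec<f64>]) -> Option<(f64, f64)> {
    let (mut train, mut held) = (Vec::new(), Vec::new());
    for (i, &s) in slews.iter().enumerate() {
        for (j, &l) in loads.iter().enumerate() {
            let p = (s, l, table[i][j]);
            if (i + j) % 2 == 0 { train.push(p); } else { held.push(p); }
        }
    }
    if held.is_empty() {
        return None;
    }
    let coef = fit(&train)?;
    let errs: Vec<f64> = held.iter().map(|&(s, l, v)| (predict(&coef, s, l) - v).abs()).collect();
    let max_abs = errs.iter().cloned().fold(0.0, f64::max);
    let rms = (errs.iter().map(|e| e * e).sum::<f64>() / errs.len() as f64).sqrt();
    Some((max_abs, rms))
}
```

A degree-2 fit has six coefficients, so the training half needs at least six well-spread points. On a very small grid the fit is singular and `holdout_error` returns None. Exporting a table with `vyges-char dataset` and feeding it to something like this is one way to measure the error on your own cells.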
Some directions to explore:
- How few SPICE points can you get away with for a given accuracy (the sample-efficiency curve)? Does a coarse grid + surrogate beat a fine grid at equal cost?
- Better models than a polynomial — splines, RBFs, tiny neural nets — and active sampling (let the model choose the next point to simulate).
- Which metrics/corners/cell families predict well, and which don't (and why)?
- Transfer: does a model trained on one corner help another?
- Across process nodes: does it hold as the physics gets harder? Run it from mature open PDKs (gf180 180 nm, sky130 130 nm) to finer / FinFET ones (ASAP7 7 nm predictive) — and, on a commercial flow, on your own 28 nm / 12 nm / 3 nm PDK.
The engine is open-source and runs locally (std-only Rust + ngspice on a plain CPU — experiment with GPUs too via rust-gpu; nothing leaves your machine), which makes it useful to very different teams:
- University / OSS researchers — a study in sample-efficiency and surrogate modeling on open PDKs (no agreement needed). Chase the questions above; publish and share.
- Enterprise silicon teams — evaluate the payoff on your PDK and node (28 nm,
12 nm, 3 nm): point it at your licensed/NDA models, characterize a few representative
cells, and measure for yourself how much
--sparse/--auto would cut your characterization runtime at an accuracy your flow accepts, all on your confidential PDK, locally, with nothing sent anywhere. (Sign-off-grade libraries on a commercial node use a separate per-foundry calibration plugin; see Open core, certified fab plugins.)
Try it, then tell us what you found — results, surprises, "it worked / it didn't on node X", or a conversation about your team's flow. Start at https://vyges.com/contact. Findings from either cohort may shape where this goes.
vyges-char operates on the standard-cell digital abstraction — it produces Liberty
standard-cell timing models (NLDM/CCS): per-arc delay and transition tables swept over input
slew × output load, plus sequential setup/hold constraints. That makes it a digital sign-off
engine: it characterizes the discrete, characterizable cells a digital library is built from. It
does not apply to analog / mixed-signal blocks — their continuous behavior has no
standard-cell or Liberty-arc analogue, so there are no timing arcs to sweep. For analog /
mixed-signal physical and integrity coverage, reach for the analog-capable Vyges engines —
lvs, layout,
em-ir, thermal,
and extract.
vyges-char is open and contains no foundry-confidential data. It runs out
of the box on open PDKs (sky130, gf180) using their published device models.
vyges-char: OPEN engine (Apache-2.0, contains no fab data)
────────────────────────────────────────────────────────────────────
cell .subckt ─► job.rs ─► engine.rs ─(ngspice)─► liberty.rs ─► *.lib
                              ▲
                              └─ published plugin contract
                                 (device models · corner · slew×load grid)
                                  │
                  loads ONE characterization plugin
                                  │
   ┌──────────────────────────────┴──────────────────────────────┐
   │                                                             │
OPEN reference plugin                       CERTIFIED per-fab plugins
(in-repo · no NDA)                          (private · one per fab/node 🔒)
• sky130A models + tt corner                • vyges-char-tsmc28
✓ M0/M3 validated                           • vyges-char-sec28
                                            • vyges-char-micron…
open data, ships with the tool              correlated corner +
                                            reference .lib, under NDA
sky130A is the starter / reference plugin — open, no NDA, and already proven
by the M3 run (re-characterized inv_1 against the shipped sky130 .lib). Today
a "plugin" is just the models + corner setup you pass on the CLI; formal per-fab
plugin packaging (discovery, signing, repo-per-fab) is the remaining open item.
Getting sign-off-grade libraries on a commercial node takes two things beyond the tool running: the output must be correlated to that foundry's silicon, and the foundry must accept the flow under an agreement. Both live in a separate, per-foundry plugin — never in this repository:
- the open tool defines a published characterization contract (the job + models/corner setup and its calibration extensions);
- a certified per-foundry plugin supplies the silicon-correlated corner setup and reference for a specific node, delivered under that foundry's NDA;
- the open engine loads it through the contract and never embeds or references any foundry-confidential infrastructure. Each foundry has its own plugin.
So the engine and the contract are open for everyone, while the per-foundry
correlation is gated to those with the agreement — the same way a commercial
characterizer separates its engine from the foundry-delivered calibration, except
here the engine is open. Use vyges-char today on open PDKs and to
characterize/verify custom cells on any PDK you have; certified sign-off
libraries on a commercial node come with that node's plugin.
Emits an NLDM (delay + transition lookup tables) from a single-stage
transient deck, correlated cell-by-cell against the foundry .lib on the
exact reference grid. The correlation surfaced a real bug: index_1
(input_net_transition) is the input edge measured between the 20–80% slew
thresholds, but the deck drove a full-swing ramp over slew_ns — making every
input ~1.67× too steep and biasing delays/transitions low. Fixed (ramp spans
slew_ns / 0.6): the rise arcs now correlate to single digits (inv_2
cell_rise 7%, rise_transition 6%) and the weighted error dropped from ~25% to
~13–20%. The fall-arc residual was then traced to its root cause, which exonerates char:
re-deriving the worst and cleanest grid points with independent hand-written
ngspice decks shows char reproduces clean ngspice to 4 significant figures, so
the gap is not a char defect. It is (a) a clean ~15% ngspice-vs-shipped-vendor-.lib
floor at large load (a symmetric rise-slow/fall-fast P/N drive-strength skew that
raw ngspice also shows — a known sky130 re-characterization gap) and (b) an NLDM
small-load degeneracy (slow input + tiny load trips the gate before input-50%, so
the measured delay is near-zero/negative — physically real, and raw ngspice does the
same). We did not fudge the device model to chase the vendor number.
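The slew fix is simple arithmetic: a linear edge spends 0.8 - 0.2 = 0.6 of its full-swing time between the 20% and 80% thresholds, so the full ramp must last slew_ns / 0.6, and a ramp of only slew_ns is 1 / 0.6 ≈ 1.67× too steep. A minimal Rust sketch follows. The `Vin`/`in` names and the PWL layout are illustrative assumptions, not the deck char actually writes.

```rust
/// Full-swing (0-100%) ramp time that yields `slew_ns` between the
/// `lo` and `hi` threshold fractions of a linear edge.
pub fn full_swing_ramp_ns(slew_ns: f64, lo: f64, hi: f64) -> f64 {
    assert!(
        0.0 <= lo && lo < hi && hi <= 1.0,
        "thresholds must satisfy 0 <= lo < hi <= 1"
    );
    slew_ns / (hi - lo)
}

/// Hypothetical ngspice PWL source for a rising input edge whose 20-80%
/// transition equals `slew_ns`, starting at `start_ns` (times in ns).
pub fn rising_pwl(slew_ns: f64, start_ns: f64, vdd: f64) -> String {
    let ramp = full_swing_ramp_ns(slew_ns, 0.2, 0.8);
    format!(
        "Vin in 0 PWL(0 0 {:.4}n 0 {:.4}n {})",
        start_ns,
        start_ns + ramp,
        vdd
    )
}
```

For example, a 0.06 ns table slew gives a 0.1 ns full-swing ramp, so an edge starting at 1 ns reaches VDD at 1.1 ns.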
Adds LVF (statistical OCV): with montecarlo: N, each (slew,load) point runs
N seeded Monte-Carlo samples over device mismatch (mc_mm_switch) and emits
ocv_sigma_cell_rise/fall delay-sigma tables alongside the NLDM — exactly the
tables vyges-sta-si consumes for POCV, closing the loop char → .lib → sta-si.
Zero-cost when montecarlo is unset (NLDM-only).
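The per-point statistic behind a sigma table can be sketched as a mean and standard deviation over the N Monte-Carlo delays. Whether char uses the sample (N - 1) or population (N) estimator is not stated here. The sketch assumes the sample form, and `mean_sigma` is a hypothetical name.

```rust
/// Mean and sample standard deviation (N - 1 denominator, an assumption)
/// of the Monte-Carlo delays measured at one (slew, load) grid point.
/// Returns None for fewer than two samples, where sigma is undefined.
pub fn mean_sigma(delays_ns: &[f64]) -> Option<(f64, f64)> {
    let n = delays_ns.len();
    if n < 2 {
        return None;
    }
    let mean = delays_ns.iter().sum::<f64>() / n as f64;
    let var = delays_ns
        .iter()
        .map(|d| (d - mean).powi(2))
        .sum::<f64>()
        / (n - 1) as f64;
    Some((mean, var.sqrt()))
}
```

With a small N such as the montecarlo: 8 shown in the job example, the sigma estimate is itself noisy, which is worth keeping in mind when comparing tables.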
Adds CCS (composite current source): with ccs: true, each (slew,load) point
captures the driver's output-current waveform — a 0 V sense source in series
with the load lets the transient dump i(out) over a fine step tightened to the
switching window — and emits output_current_rise/fall vector groups (per-edge
reference_time + time/current sampled to a compact vector). These are the
current-source models vyges-sta-si drives into its effective-capacitance (Ceff)
and transient RC-tree solve, the other half of the char → .lib → sta-si loop
beyond LVF. Validated end-to-end on sky130_fd_sc_hd__inv_1: the captured charge
spike peaks ~0.12 mA for a few-fF load (physically sane), and sta-si consuming
the CCS .lib shifts WNS by a sensible CCS-vs-NLDM delta. Zero-cost when ccs is
unset.
Adds CCS receiver capacitance: with recv: true, each (slew,load) point drives
the input pin through a 0 V sense source and integrates the captured input current
Q = ∫i·dt over the two halves of the input ramp → the two-segment receiver model
receiver_capacitance1/2_rise/fall (C1 = static gate cap before the delay
threshold; C2 = after, inflated by Miller from the switching output). The input pin
also gains the conventional single-number capacitance (the C1 lanes). These are
the input-pin load vyges-sta-si charges its drivers with — completing the CCS
model (output current + receiver). Validated on sky130_fd_sc_hd__inv_1: C1/C2
land ~1.8–2.6 fF (matching the foundry input-cap), with C2 inflating over C1 (e.g.
1.44×) exactly when the output switches during the input's second half; sta-si
consuming the receiver load shifts WNS by a sensible Miller delta. Zero-cost when
recv is unset.
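The two-segment receiver model reduces to C = Q / ΔV on each half of the input edge, with Q obtained by integrating the sensed input current. A minimal sketch under stated assumptions: trapezoidal integration, sampled waveforms in SI units, and a caller-supplied `split` index where the input crosses the delay threshold. The function names are hypothetical.

```rust
/// Trapezoidal integral of `y` over `t`. With current in A and time in s
/// the result is charge in C.
pub fn trapz(t: &[f64], y: &[f64]) -> f64 {
    t.windows(2)
        .zip(y.windows(2))
        .map(|(tw, yw)| 0.5 * (yw[0] + yw[1]) * (tw[1] - tw[0]))
        .sum()
}

/// Two-segment receiver capacitance (C1, C2) in farads from sampled input
/// current `i_in` and input voltage `v_in`. `split` is the sample index at
/// which the input crosses the delay threshold.
pub fn receiver_caps(t: &[f64], i_in: &[f64], v_in: &[f64], split: usize) -> (f64, f64) {
    assert!(t.len() == i_in.len() && t.len() == v_in.len(), "waveforms must align");
    assert!(split > 0 && split + 1 < t.len(), "split must be an interior sample");
    let last = t.len() - 1;
    // C = Q / dV on each segment of the input ramp.
    let c1 = trapz(&t[..=split], &i_in[..=split]) / (v_in[split] - v_in[0]);
    let c2 = trapz(&t[split..], &i_in[split..]) / (v_in[last] - v_in[split]);
    (c1, c2)
}
```

An ideal 2 fF capacitor driven by a linear 1.8 V ramp over 1 ns draws a constant 3.6 µA, so both segments come out at 2 fF. Extra Miller current in the second half shows up as C2 > C1.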
Adds multi-arc cells: one arc: line per timing arc, each holding the other
inputs at their non-controlling level via a fixed source, so multi-input gates
(NAND/NOR/AND/OR/MUX) and multi-output cells characterize every arc and render into
one well-formed cell {}. Validated on sky130_fd_sc_hd__nand2_1: both A->Y and
B->Y arcs emit, with the expected series-stack asymmetry (~25% at the first grid
point), and the two-arc .lib round-trips through vyges-sta-si (worst path picks
the slower arc).
Adds sequential (flip-flop) characterization: clock_pin/data_pin switch the
job into setup/hold + CK->Q mode, with a push-out bisection (10% CK->Q degradation)
per (clock slew, data slew) point and a small series resistor on every source to
keep the flop's storage-node feedback converging in ngspice. Validated on
sky130_fd_sc_hd__dfxtp_1: CK->Q ~0.2 ns, setup ~0.04-0.08 ns, and the
characteristic negative hold — all physically sane — and the generated flop
.lib round-trips through vyges-sta-si, which times a reg-to-reg path from it
(setup WNS + hold WHS, the negative hold relaxing the hold check).
Adds per-corner sweeps: corner: lines characterize the cell across PVT
corners (process models + supply + temperature), one .lib per corner with the
corner's nominal V/T in the header. Validated on sky130_fd_sc_hd__inv_1 across
ss/tt/ff: the cell_rise delays order ff (0.019 ns) < tt (0.029) < ss (0.053) as
physics demands, and vyges-sta-si MCMM across the three generated libs binds the
worst setup at the slow (ss) corner — closing char → per-corner .libs → MCMM.
Adds async set/reset flops: a tie: list holds async/unused inputs at their
inactive level (through a series R, same de-stiffening) so setup/hold/CK->Q
characterize normally, and reset_pin:/set_pin: emit the ff clear/preset
attribute plus an async reset->Q (clear) / set->Q (preset) delay arc. Validated
on sky130_fd_sc_hd__dfrtp_1 (reset->Q ~0.149 ns, recovery/removal -0.15/+0.15 ns)
and dfstp_1 (set->Q ~0.221 ns): clocked timing matches the plain dfxtp_1, and both
flop .libs round-trip through vyges-sta-si (reg-to-reg setup+hold timed, the
async clear/preset and recovery/removal correctly skipped, not mistaken for data
paths). Recovery/removal bisect the async release edge for the single
capture/hold boundary t* relative to the clock (recovery = clock - t*, removal =
t* - clock, both signed — a flop that samples just after the clock 50% tolerates a
late release, giving a small negative recovery). The setup/hold push-out
bisection early-exits at 1 ps precision (~halving the ngspice runs per point).
Adds power characterization (power_char: true): per-arc internal_power
(rise/fall switching energy = supply energy minus the load-charging part) and
per-input-state leakage_power (DC quiescent current × VDD), with the
leakage_power_unit header and a cell_leakage_power average. Validated on
sky130_fd_sc_hd__inv_1: cell_leakage_power 0.0043 nW vs the foundry 0.0053 nW
(~19%) and internal energy ~0.007 pJ — right magnitudes and units. This is the
power data vyges-em-ir will drive its dynamic IR analysis with. (v1 caveat: the
per-state leakage spread is narrower than the foundry's N/P asymmetry — a true
DC .op settle would sharpen it; the average correlates well.)
The road to sign-off grade builds on the same emitter + job format: sequential
power (clock/data pin energy), sharper per-state leakage, multi-bit / latch cells,
and a two-sided recovery/removal window. Same run command, no license.