The mental model behind zod4-mock. Read this once and the API will feel obvious.
World
A World is a seeded generation session. It holds the PRNG, the registry, and all schema registrations.
const world = createWorld({ seed: 42 })
.withSchema(PersonSchema)
.withSchema(DocumentSchema, { relations: { author: PersonSchema } });
One world = one seed = one deterministic dataset. All schemas registered on a world share the same PRNG state and registry, which is what makes cross-schema consistency possible.
Options
- locale: a locale package such as @zod4-mock/locale-en / @zod4-mock/locale-nl. See Localization.
- optionalProbability: how often z.optional() / z.nullable() fields are omitted.
- … .min() / .max() is set.
Schemas
Every schema you register with withSchema is tracked by the world. There are three
registration modes:
Primary — identity anchor
world.withSchema(PersonSchema);
A primary schema generates independent instances. The world cycles through them deterministically as you call generate(). Instances are stored in the registry and can be referenced by other schemas.
Derived — projection of another schema
world.withSchema(PersonSummarySchema, {
from: PersonSchema,
matchers: {
id: (ctx) => ctx.source.personId,
name: (ctx) => `${ctx.source.firstName} ${ctx.source.lastName}`,
},
});
from: binds this schema to a primary schema. Each generated instance of PersonSummarySchema is a projection of the corresponding PersonSchema instance. ctx.source holds the source entity's data.
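To make the projection idea concrete, here is a minimal stand-alone sketch of what a derived registration conceptually does: each matcher receives a ctx whose source is an already-generated primary instance. The Person and PersonSummary interfaces and the projectSummary helper are hypothetical illustrations, not part of zod4-mock.

```typescript
// Hypothetical shapes standing in for PersonSchema / PersonSummarySchema output.
interface Person {
  personId: string;
  firstName: string;
  lastName: string;
}
interface PersonSummary {
  id: string;
  name: string;
}

// One matcher per target field, each reading from the source entity.
type SummaryMatchers = {
  [K in keyof PersonSummary]: (ctx: { source: Person }) => PersonSummary[K];
};

// The same matchers as in the registration above.
const summaryMatchers: SummaryMatchers = {
  id: (ctx) => ctx.source.personId,
  name: (ctx) => `${ctx.source.firstName} ${ctx.source.lastName}`,
};

// Illustrative helper: build one projection from one source instance.
function projectSummary(source: Person): PersonSummary {
  return {
    id: summaryMatchers.id({ source }),
    name: summaryMatchers.name({ source }),
  };
}

const person: Person = { personId: "p-1", firstName: "Ada", lastName: "Lovelace" };
const summary = projectSummary(person);
```

Because the summary is computed from the source instance rather than generated independently, the two records cannot drift apart.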
Relational — linked to other schemas
world.withSchema(DocumentSchema, {
relations: { author: PersonSchema },
matchers: {
authorId: (ctx) => ctx.related("author").personId,
},
});
relations declares which other schemas this one references. ctx.related("author") resolves to the data of a specific instance of PersonSchema.
All three modes can be combined — a schema can have both from and relations.
The generation pipeline
For every field in a schema, values are resolved in this priority order, following the seven named steps of the canonical PIPELINE list in src/pipeline.ts. The first step that produces a value wins:
- Eager overrides: options.overrides primitive/array entries land in ctx.current so sibling matchers can read them via ctx.current.<sibling>.
- Matchers: user functions from withSchema({ matchers }). Explicit per-field functions; first to win.
- Per-schema key map: entries from withKeyMap({ ... }) matched on the field name.
- Unwrap optional: strip optional/nullable/default and roll absent per layer; sets ctx.inner for downstream steps. Internal; does not produce a final value on its own.
- World-level custom generators: entries from withGenerators({ ... }) matched on the field name.
- Key-based heuristics: built-in DEFAULT_KEY_MAP exact-key + DEFAULT_KEY_PATTERNS regex matches. email → realistic email, firstName → first name, createdAt → date.
- Schema-based fallback: Zod type introspection. z.enum([...]) → random member, z.number().int().min(1).max(100) → integer in range, etc. Always resolves.
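The "first step that produces a value wins" rule can be sketched as an ordered list of resolvers. This is a simplified illustration with three of the seven steps; the Step interface, the lookup tables, and resolveField are hypothetical names, not the contents of src/pipeline.ts.

```typescript
// A step returns a value, or undefined to defer to the next step.
interface Step {
  name: string;
  resolve: (field: string) => unknown;
}

// Hypothetical stand-ins for user matchers and built-in key heuristics.
const userMatchers: Record<string, (() => unknown) | undefined> = {
  name: () => "from matcher",
};
const keyHeuristics: Record<string, (() => unknown) | undefined> = {
  name: () => "from heuristic",
  email: () => "user@example.com",
};

// Order matters: earlier steps take priority over later ones.
const steps: Step[] = [
  { name: "matchers", resolve: (field) => userMatchers[field]?.() },
  { name: "key-based heuristics", resolve: (field) => keyHeuristics[field]?.() },
  { name: "schema-based fallback", resolve: () => "fallback string" }, // always resolves
];

function resolveField(field: string): { value: unknown; step: string } {
  for (const step of steps) {
    const value = step.resolve(field);
    if (value !== undefined) return { value, step: step.name };
  }
  throw new Error(`No step resolved "${field}"; the fallback should prevent this`);
}

const byMatcher = resolveField("name");    // matcher shadows the heuristic
const byHeuristic = resolveField("email"); // no matcher, heuristic applies
const byFallback = resolveField("zzz");    // nothing matched, fallback applies
```

A field with both a matcher and a heuristic is resolved by the matcher, which is why you only write matchers for fields the later steps get wrong.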
After the pipeline
Once the pipeline returns a value for a field, two wrapping passes finish the record:
- Override deep-merge: options.overrides is deep-merged onto the pipeline's value (covers nested-object slices that the eager-overrides step, step 0, didn't consume; B12 contract).
- Transform: options.transform is called on the merged value.
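The two finishing passes can be sketched as follows. This assumes plain JSON-like data and that arrays and primitives in the override replace the generated value wholesale; the source does not spell out the merge rules, so deepMerge and finish here are hypothetical helpers rather than the library's implementation.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `patch` onto `base`. Nested objects are merged key by key;
// anything else (primitives, arrays) is replaced by the patch value.
function deepMerge(base: unknown, patch: unknown): unknown {
  if (!isPlainObject(base) || !isPlainObject(patch)) return patch;
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    out[key] = key in base ? deepMerge(base[key], value) : value;
  }
  return out;
}

// Pass 1: merge overrides onto the generated value. Pass 2: apply the transform.
function finish<T>(generated: T, overrides: unknown, transform: (value: T) => T): T {
  const merged = (overrides === undefined ? generated : deepMerge(generated, overrides)) as T;
  return transform(merged);
}

const generated = { name: "john", address: { street: "Main St", city: "Utrecht" } };
const finished = finish(
  generated,
  { address: { city: "Delft" } },            // overrides only a nested slice
  (v) => ({ ...v, name: v.name.toUpperCase() }),
);
```

The override touches only address.city, so the generated street survives, and the transform sees the merged record.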
You only need to provide matchers for fields the pipeline can't resolve correctly on its own.
The ctx object
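Every matcher receives a ctx with:
- ctx.gen: the generator library, e.g. ctx.gen.person.firstName(), ctx.gen.internet.email(), ctx.gen.finance.amount(10, 999).
- ctx.prng: the seeded PRNG, e.g. ctx.prng.int(min, max), ctx.prng.pick([...]), ctx.prng.random().
- ctx.source: the source entity's data (when from: is declared).
- the field path, e.g. "address.street".
ctx.gen — generator library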
The full generator namespace, with the PRNG already bound. You never pass prng manually:
matchers: {
name: (ctx) => ctx.gen.person.fullName(),
email: (ctx) => ctx.gen.internet.email(),
city: (ctx) => ctx.gen.location.city(),
iban: (ctx) => ctx.gen.finance.iban(),
sentence: (ctx) => ctx.gen.word.sentence(),
}
Generators that take arguments work the same way. The PRNG is their first argument and is applied automatically, so you pass only the rest:
(ctx) => ctx.gen.string.alphanumeric(8) // length = 8
(ctx) => ctx.gen.finance.amount(10, 999) // min, max
The registry
Every generated primary schema instance is stored in the registry. Other matchers can look it up to establish cross-schema consistency.
// Pick a random instance of a registered schema
const person = ctx.registry.pick(PersonSchema);
// Get all instances
const people = ctx.registry.all(PersonSchema);
// Filter all matching a predicate
const active = ctx.registry.filter(PersonSchema, (p) => p.active);
Registry lookups are typed from the schema, so no manual type casts are needed.
pick() throws if the registry has no instances of that schema yet. Generate the referenced schema before the one that references it.
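The three lookups and the "pick() throws on an empty registry" rule can be illustrated with a small stand-alone registry. This is a sketch under assumptions: SchemaKey is a hypothetical stand-in for a Zod schema object, and the Registry class is not the library's implementation.

```typescript
// Stand-in for a schema handle; the phantom type parameter carries the instance type.
interface SchemaKey<T> {
  readonly name: string;
  readonly __instance?: T;
}

class Registry {
  private readonly store = new Map<object, unknown[]>();
  private readonly random: () => number;

  constructor(random: () => number) {
    this.random = random;
  }

  add<T>(schema: SchemaKey<T>, instance: T): void {
    const items = this.store.get(schema) ?? [];
    items.push(instance);
    this.store.set(schema, items);
  }

  // Returns a copy so callers cannot mutate the stored list.
  all<T>(schema: SchemaKey<T>): T[] {
    return [...(this.store.get(schema) ?? [])] as T[];
  }

  filter<T>(schema: SchemaKey<T>, predicate: (item: T) => boolean): T[] {
    return this.all(schema).filter(predicate);
  }

  pick<T>(schema: SchemaKey<T>): T {
    const items = this.all(schema);
    if (items.length === 0) {
      throw new Error(`No instances of ${schema.name} in the registry yet`);
    }
    return items[Math.floor(this.random() * items.length)] as T;
  }
}

interface Person {
  personId: string;
  active: boolean;
}
const PersonKey: SchemaKey<Person> = { name: "Person" };

const registry = new Registry(() => 0.75); // fixed "random" value for the demo
registry.add(PersonKey, { personId: "p-1", active: true });
registry.add(PersonKey, { personId: "p-2", active: false });

const picked = registry.pick(PersonKey); // inferred as Person, no cast needed
const active = registry.filter(PersonKey, (p) => p.active);
```

Keying the store by the schema handle is what lets the return type follow from the argument.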
Composable nested schemas
Matchers registered for a schema apply automatically wherever that schema appears — including nested inside another schema's fields.
const world = createWorld({ seed: 42 })
.withSchema(AddressSchema, {
matchers: {
street: (ctx) => ctx.gen.location.street(),
city: (ctx) => ctx.gen.location.city(),
},
})
.withSchema(PersonSchema); // PersonSchema has address: AddressSchema
// PersonSchema's address field uses AddressSchema's matchers automatically
const person = world.generate(PersonSchema);
Determinism
Two guarantees make the generated output stable:
Same seed → same output. The PRNG is deterministic (SFC32). Rebuild the world with the same seed and the same builder chain; you get byte-identical data.
Per-field seeding. Each field gets an independent PRNG derived from hash(worldSeed + schemaId + fieldPath). Adding or removing a field from a schema
does not disturb the values of other fields. The lastName of
instance #1 has the same value before and after you add a middleName field.
This means you can add fields to schemas mid-project without invalidating existing test snapshots.
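Per-field seeding can be sketched in a few lines. SFC32 is the PRNG named above; the hash is an assumption (FNV-1a over a joined string is used here purely for illustration), and fieldPrng and generate are hypothetical helpers. A real implementation would also need to mix in the instance index.

```typescript
// 32-bit FNV-1a string hash (illustrative choice, not confirmed by the source).
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// SFC32: small fast counting PRNG with four 32-bit state words.
function sfc32(a: number, b: number, c: number, d: number): () => number {
  return () => {
    a |= 0; b |= 0; c |= 0; d |= 0;
    const t = (((a + b) | 0) + d) | 0;
    d = (d + 1) | 0;
    a = b ^ (b >>> 9);
    b = (c + (c << 3)) | 0;
    c = (c << 21) | (c >>> 11);
    c = (c + t) | 0;
    return (t >>> 0) / 4294967296;
  };
}

// Derive an independent PRNG from (worldSeed, schemaId, fieldPath).
function fieldPrng(worldSeed: number, schemaId: string, fieldPath: string): () => number {
  const key = `${worldSeed}:${schemaId}:${fieldPath}`;
  const next = sfc32(fnv1a(key + "#a"), fnv1a(key + "#b"), fnv1a(key + "#c"), fnv1a(key + "#d"));
  for (let i = 0; i < 12; i++) next(); // discard the first outputs to mix the state
  return next;
}

// Draw one number per field; each field uses only its own PRNG.
function generate(seed: number, schemaId: string, fields: string[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const field of fields) out[field] = fieldPrng(seed, schemaId, field)();
  return out;
}

const before = generate(42, "Person", ["firstName", "lastName"]);
const after = generate(42, "Person", ["firstName", "middleName", "lastName"]);
const lastNameUnchanged = before["lastName"] === after["lastName"];
```

Since no field draws from a shared stream, inserting middleName consumes nothing that lastName depends on.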
Localization
A locale decides what data the generators draw from — names, words, currencies, date formats, address shapes, phone formats, and so on. The world carries a single locale; all generators read from it.
zod4-mock ships a built-in minimal English locale that's used when
you don't pass locale. It has small curated word/name arrays — enough to be valid,
deliberately not realistic. Output looks like "John Smith", "Section", "$128.94".
For realistic output, install a locale package and pass it to createWorld:
import { createWorld } from "zod4-mock";
import { en } from "@zod4-mock/locale-en"; // Markov-trained English
import { nl } from "@zod4-mock/locale-nl"; // Markov-trained Dutch
createWorld({ seed: 42, locale: en });
createWorld({ seed: 42, locale: nl });
A locale is a plain LocaleData object with sections for person, address, commerce, company, word, finance, date, color, phone. Locales can supply either Markov models (firstNamesMale, nounModel) or plain arrays (simpleFirstNamesMale, nouns); generators prefer the model when present.
For variants, use extend() (re-exported from each locale package, e.g. @zod4-mock/locale-en):
import { createWorld } from "zod4-mock";
import { en, extend } from "@zod4-mock/locale-en";
const enGB = extend(en, {
address: { ...en.address, phonePrefix: "+44", countryCode: "GB", ibanPrefix: "GB" },
commerce: { ...en.commerce, formatPrice: (n) => `£${n.toFixed(2)}` },
});
See the API reference for the full LocaleData interface.
Zipf-default picks on open corpora
zod4-mock's open-corpus pickers (e.g. person.firstName, person.lastName) draw from frequency-sorted locale arrays via prng.pickZipf(items, s) — a single closed-form inverse-CDF Zipf draw — rather than
uniform prng.pick(items). The exponent s is resolved per call site as locale.frequencyExponentOverrides?.[corpus] ?? locale.frequencyExponent ?? 1.0, so
shipped locales bias the head of each list toward the real world's frequency curve: "john" shows up far more than "aaden", mirroring SSA / Census
distributions.
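For intuition, here is one way a closed-form Zipf-style draw can look. It inverts the CDF of a continuous power law (density proportional to x^(-s) on [1, n + 1)) and floors the result to an index. This is an approximation chosen for illustration; the exact formula behind prng.pickZipf is not shown in this document. LocaleLike, resolveExponent, and zipfIndex are hypothetical names, and u stands for one uniform draw in [0, 1).

```typescript
interface LocaleLike {
  frequencyExponent?: number;
  frequencyExponentOverrides?: Record<string, number>;
}

// Exponent resolution as described above: per-corpus override, then locale default, then 1.0.
function resolveExponent(locale: LocaleLike, corpus: string): number {
  return locale.frequencyExponentOverrides?.[corpus] ?? locale.frequencyExponent ?? 1.0;
}

// Map a uniform draw u in [0, 1) to a 0-based index in [0, n), skewed toward 0.
function zipfIndex(n: number, s: number, u: number): number {
  const x =
    s === 1
      ? Math.pow(n + 1, u) // limit case: CDF is ln(x) / ln(n + 1)
      : Math.pow((Math.pow(n + 1, 1 - s) - 1) * u + 1, 1 / (1 - s));
  // Clamp to guard against floating-point results at the range edges.
  return Math.max(0, Math.min(n - 1, Math.floor(x) - 1));
}

function pickZipf<T>(items: readonly T[], s: number, u: number): T {
  if (items.length === 0) throw new Error("Cannot pick from an empty list");
  return items[zipfIndex(items.length, s, u)] as T;
}

const names = ["john", "james", "mary", "aaden"]; // assumed sorted by frequency
const skewedMid = pickZipf(names, 1, 0.5);  // the median draw lands near the head
const uniformMid = pickZipf(names, 0, 0.5); // s = 0 reduces to floor(n * u)
```

With s = 0 the formula collapses to a uniform pick, which matches the frequencyExponent: 0 opt-out described in this section.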
This is a deliberate divergence from faker, whose default is uniform across each list. If you
prefer faker-style uniform output, set frequencyExponent: 0 (or override an
individual corpus) on your locale.
Unique contexts auto-flatten to uniform. When you request world.generate(schema, { unique: true }), the engine flattens s to 0 for every pickZipf call inside that loop — uniqueness wins over
realism. The flag has no opt-out; matchers that need head-skewed picks inside a unique loop
should call ctx.prng.pickZipf(arr, s) directly with an explicit s.
Closed / enumerable corpora (states, months, weekdays, currencies, etc.) ignore the Zipf surface
entirely and stay on prng.pick.
Realistic numeric distributions
The same realism axis applies on the numeric side. Money keys (amount, balance, total, revenue, cost, fee, salary, price, …) draw log-uniform — min * Math.pow(max / min, u) — so leading-digit-1 values appear ~30% of the time
(Benford's law), matching real-world ledgers instead of faker's flat uniform-over-range.
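The log-uniform formula and its leading-digit effect can be checked with a short stand-alone sketch. logUniform applies the formula quoted above; leadingDigit and shareOfLeadingOnes are hypothetical helpers that sweep an evenly spaced grid of u values instead of using a PRNG, so the result is reproducible.

```typescript
// min * (max / min)^u for a uniform draw u in [0, 1).
function logUniform(min: number, max: number, u: number): number {
  if (!(min > 0) || !(max >= min)) {
    throw new RangeError("log-uniform needs 0 < min <= max");
  }
  return min * Math.pow(max / min, u);
}

// First significant digit of a positive number, read from exponent notation.
function leadingDigit(x: number): number {
  return Number(x.toExponential().charAt(0));
}

// Fraction of values with leading digit 1 across an even grid of u values.
function shareOfLeadingOnes(min: number, max: number, samples: number): number {
  let ones = 0;
  for (let i = 0; i < samples; i++) {
    const u = (i + 0.5) / samples;
    if (leadingDigit(logUniform(min, max, u)) === 1) ones++;
  }
  return ones / samples;
}

const midpoint = logUniform(1, 1000, 0.5);        // geometric mean of the bounds
const share = shareOfLeadingOnes(1, 1000, 3000);  // close to log10(2), about 0.301
```

The midpoint of the draw is the geometric mean of the bounds rather than the arithmetic mean, which is why small amounts dominate the output.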
Scale-free measurement keys (fileSize, bytes, views, population, distance) follow the same log-uniform default with Math.round for the integer routes. age is a clipped log-normal centred
on μ = ln(36) (US Census median adult), year is an exponential skew toward the
present (λ = 0.05), and quantity / count are truncated geometrics with p = 0.5 (modal at the lower bound).
Three semantic-meaningful keys stay bounded-uniform on a pinned default range: rating ([0, 5]), score and percentage ([0, 100]).
Un-keyed auto-flip on z.number(). A plain (un-routed) numeric
field auto-flips to log-uniform when all four of these hold: min > 0, log10(max / min) ≥ 3 (≥ 3 orders of magnitude), !schema.isInt, and no .multipleOf. Anything else stays on today's
uniform draw. The threshold (3 orders) is deliberately wide enough to catch obvious file-size /
view-count cases without misfiring on probabilities (.min(0.01).max(1)) or
sub-percent ranges.
Cross-zero or non-positive ranges always fall back to uniform (the log-uniform formula is
undefined for min ≤ 0) — zod4-mock does not silently
shift your stated bounds with an epsilon. To opt out per-key, use withGenerators (see docs/recipes.md).
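The four auto-flip conditions and the uniform fallback can be written as a single predicate. NumberBounds and usesLogUniform are hypothetical names for this sketch; treating a missing bound as "stay uniform" is an assumption consistent with "anything else stays on uniform".

```typescript
// The facts about a z.number() field that the rule looks at.
interface NumberBounds {
  min?: number;
  max?: number;
  isInt: boolean;
  multipleOf?: number;
}

function usesLogUniform(bounds: NumberBounds): boolean {
  const { min, max, isInt, multipleOf } = bounds;
  if (min === undefined || max === undefined) return false; // assumed: unbounded stays uniform
  return (
    min > 0 &&                      // log-uniform is undefined for min <= 0
    Math.log10(max / min) >= 3 &&   // at least three orders of magnitude
    !isInt &&
    multipleOf === undefined
  );
}

const fileSizeLike = usesLogUniform({ min: 1, max: 1_000_000, isInt: false });  // true
const probabilityLike = usesLogUniform({ min: 0.01, max: 1, isInt: false });    // false: two orders
const crossZero = usesLogUniform({ min: -5, max: 5_000_000, isInt: false });    // false: min <= 0
```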
Populate
Use populate() to pre-create a fixed number of instances before generation starts.
This is useful when you need other schemas to reference a specific number of entities:
const world = createWorld({ seed: 42 })
.withSchema(PersonSchema)
.withSchema(DocumentSchema, { relations: { author: PersonSchema } })
.populate(PersonSchema, 5); // ensure exactly 5 persons exist
const documents = world.generate(z.array(DocumentSchema).min(20));
// All 20 documents reference one of the 5 persons
Optional and nullable fields
optionalProbability (default 0.2) controls how often z.optional() and z.nullable() fields are omitted.
createWorld({ seed: 42, optionalProbability: 0 }); // always present
createWorld({ seed: 42, optionalProbability: 1 }); // always absent
For test assertions on optional fields, either set optionalProbability: 0 or pin the field with overrides.
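The effect of optionalProbability can be sketched as a single comparison against a uniform draw. The rollAbsent helper and the strict less-than comparison are assumptions for illustration; the sketch sweeps an even grid of draws instead of a PRNG so the counts are exact.

```typescript
// A field is omitted when the draw falls below the configured probability.
// `random` is expected to return values in [0, 1).
function rollAbsent(optionalProbability: number, random: () => number): boolean {
  return random() < optionalProbability;
}

// Count omissions over an evenly spaced grid of draws in [0, 1).
function countAbsent(optionalProbability: number, samples: number): number {
  let absent = 0;
  for (let i = 0; i < samples; i++) {
    const draw = (i + 0.5) / samples;
    if (rollAbsent(optionalProbability, () => draw)) absent++;
  }
  return absent;
}

const neverAbsent = countAbsent(0, 1000);    // 0: always present
const alwaysAbsent = countAbsent(1, 1000);   // 1000: always absent
const defaultRate = countAbsent(0.2, 1000);  // about one field in five omitted
```

With the default of 0.2, an assertion on an optional field fails roughly one run in five unless the field is pinned, which is why the two options above are recommended for tests.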