Expressions

Query arguments are written as expressions compiled at startup using expr-lang/expr. Each expression has access to the built-in functions, globals, and any user-defined expressions.

Tip: Use edg repl to try any expression interactively without a database connection. See REPL for details.

Functions

These are edg’s built-in functions, available in any expression context (args:, expressions:, globals). They generate data, reference datasets, aggregate values, and control execution flow.

Function | Returns | Description
__sep__stringDriver-aware batch field separator. A query-text token that is replaced with the SQL function producing the ASCII unit separator character (char 31) used to delimit values within batch-expanded placeholders. Resolves to chr(31) for pgx, CHAR(31) for MySQL and MSSQL, codepoints-to-string(31) for Oracle, CODE_POINTS_TO_STRING([31]) for Spanner. Can be used in any argument position within SQL. Always use __sep__ instead of a literal comma. Generated values may contain commas, which would silently corrupt your data.

string_to_array('$1', __sep__)
abs(x)float64Absolute value of x.

abs(-5.0) -> 5
acos(x)float64Arc cosine of x (result in radians).

acos(1.0) -> 0
arg(index)anyReturns the value of a previously evaluated arg by its zero-based index or name. Enables dependent columns where later args reference earlier ones.

arg(0) -> "Alice"
arg('email') -> "alice@example.com" (with named args)
array(minN, maxN, pattern)stringPostgreSQL/CockroachDB array literal with a random number of elements.

array(2, 4, 'email') -> {a@b.com,c@d.com,d@e.com}
asin(x)float64Arc sine of x (result in radians).

asin(1.0) -> 1.5707...
atan(x)float64Arc tangent of x (result in radians).

atan(1.0) -> 0.7853...
atan2(y, x)float64Two-argument arc tangent of y/x (result in radians). Handles quadrant correctly.

atan2(1.0, 1.0) -> 0.7853...
avg(name, field)float64Average of a numeric field across all rows in a named dataset.

avg('fetch_products', 'price') -> 19.39
batch(n)[][]anyReturns sequential integers in [0, n) as batch arg sets, one integer per set.

batch(3) -> [[0], [1], [2]]
bit(n)stringRandom fixed-length bit string of exactly n bits.

bit(8) -> 10110011
blob(n)[]byteRandom n bytes as raw binary data. Works across all databases (PostgreSQL, MySQL, Oracle, MSSQL) via bind parameters. Use this for BLOB, BYTEA, VARBINARY, and RAW columns.

blob(1024) -> (1024 random bytes)
bool()boolRandom true or false. Useful as a coin flip with cond() and arg() for mutually exclusive columns.

bool() -> true
bytes(n)stringRandom n bytes as a hex-encoded string with \x prefix. PostgreSQL/CockroachDB only. For cross-database binary data, use blob(n) instead.

bytes(4) -> \x1a2b3c4d
ceil(x)float64Smallest integer greater than or equal to x.

ceil(3.2) -> 4
coalesce(v1, v2, ...)anyReturns the first non-nil value from arguments.

coalesce(nil, 'default') -> default
complete_array(tool, prompt, count)[]mapGenerates N structured items in a single LLM call. The tool schema is automatically wrapped in an array request. Returns []map for use with ref_each(). Memoized by (tool, prompt, count). Requires --complete-api-key or EDG_COMPLETE_API_KEY. See Complete.

ref_each(complete_array("review", "Generate 5 reviews", 5)).review_text -> "Great product!"
complete(tool, prompt)mapCalls an LLM with a named tool schema and returns structured data as a map. Access fields with dot notation. Per-row memoization ensures multiple field accesses with the same tool and prompt make only one API call. Requires --complete-api-key or EDG_COMPLETE_API_KEY. See Complete.

complete("review", "Review: Widget").review_text -> "Great product!"
complete("review", "Review: Widget").rating -> 4
cond(predicate, trueVal, falseVal)anyReturns trueVal if predicate is true, falseVal otherwise.

cond(true, 'yes', 'no') -> yes
const(value)anyReturns the value as-is. Useful for literal constants.

const(42) -> 42
cos(x)float64Cosine of x (x in radians).

cos(0.0) -> 1
count(name)intNumber of rows in a named dataset.

count('fetch_products') -> 5
date_offset(duration)stringReturns the current time offset by duration, formatted as RFC3339.

date_offset('-72h') -> 2026-04-08T10:00:00Z
date(format, min, max)stringRandom timestamp formatted using a Go time format string.

date('2006-01-02', '2020-01-01T00:00:00Z', '2025-01-01T00:00:00Z') -> 2023-07-15
distinct(name, field)intNumber of distinct values for a field in a named dataset.

distinct('fetch_products', 'category') -> 3
duration(min, max)stringRandom duration between min and max (Go duration strings).

duration('1h', '24h') -> 14h32m17s
embed(text...)stringCalls an external embedding API (OpenAI-compatible) and returns a vector literal. Variadic - multiple args are joined with a space. Requires --embed-api-key or EDG_EMBED_API_KEY. See Embed.

embed('hello world') -> [0.0123,-0.0456,...]
embed(field('name'), field('description')) -> [0.0789,...]
env_nil(name)anyReturns the value of an environment variable as a string, or nil if unset. Unlike env(), does not error on missing variables. Designed for use with coalesce() to provide defaults: int(coalesce(env_nil('PORT'), 8080)). Always returns a string when the variable exists, so wrap with int() or float() when arithmetic is needed.

env_nil('MISSING') -> nil
env_nil('HOST') -> localhost
env(name)stringReturns the value of a given environment variable (or an error if one doesn’t exist with that name). Missing variables are caught at config load time, before any queries run. Can be composed with other functions, e.g. upper(env('HOST')). For numeric values, use expr-lang conversion: int(env('PORT')), float(env('RATE')).

env('API_KEY') -> ca3864628a8f29d644e1...
exp_f(rate, min, max, precision)float64Exponentially-distributed random number in [min, max], rounded to precision decimal places.

exp_f(0.5, 0, 100, 2) -> 3.72
exp(rate, min, max)float64Exponentially-distributed random number in [min, max], rounded to 0 decimal places.

exp(0.5, 0, 100) -> 4
expr(expression)anyEvaluates an arithmetic expression. Alias for const; the expr engine handles the arithmetic.

expr(2 + 3) -> 5
fail(message)errorReturns an error that stops the current worker gracefully. Useful with ?? to catch unexpected values: {'a': 1}['x'] ?? fail('unknown key').

fail('unexpected region') -> (worker stops with error)
fatal(message)voidTerminates the entire process immediately. Use when an unexpected value should halt all workers, not just the current one.

fatal('missing required config') -> (process exits)
field(name)anyEvaluates a named field from the current query’s object: object. Requires object: to be set on the query. Use in args to cherry-pick fields or control ordering.

field('email') -> alice@example.com
floor(x)float64Largest integer less than or equal to x.

floor(3.7) -> 3
gen_batch(total, batchSize, pattern)[][]anyGenerates total values using gofakeit pattern, grouped into batches of batchSize. Each batch arg is a string of generated values delimited by the ASCII unit separator (char 31, \x1f).

gen_batch(4, 2, 'firstname') -> [["Alice\x1fBob"], ["Carol\x1fDave"]]
gen(pattern)stringGenerates a random value using gofakeit patterns (e.g. gen('number:1,100')).

gen('number:1,10') -> 7
global_iter()int64Monotonic iteration counter shared across all workers in a stage. Increments by 1 each time any worker calls RunIteration. Never resets. Use for time-series seasonality and data drift patterns.

20.0 + 5.0 * sin(2.0 * pi * global_iter() / 1000) -> 22.93...
global(name)anyLooks up a value from the globals section by name. Globals are also available directly as variables, so global('warehouses') and warehouses are equivalent.

global('warehouses') -> 10
inet(cidr)stringRandom IP address within the given CIDR block.

inet('192.168.1.0/24') -> 192.168.1.42
iter()int1-based row counter for exec_batch / query_batch queries. Returns 1 for the first row, 2 for the second, etc. Resets at the start of each batch query. Useful for generating sequential IDs without a global sequence.

iter() -> 1
json_arr(minN, maxN, pattern)stringBuilds a JSON array of N random values (N in [minN, maxN]) generated by a gofakeit pattern.

json_arr(1, 3, 'word') -> ["foo","bar"]
json_obj(k1, v1, k2, v2, ...)stringBuilds a JSON object string from key-value pair arguments.

json_obj('key', 'val') -> {"key":"val"}
local(name)anyReturns the value of a named local variable. Locals can be defined on individual queries or transactions. Query-level locals override transaction locals when both exist. Locals are re-evaluated per row in batch mode. Useful for calling complete() once and accessing multiple fields.

local("review").review_text -> "Great product!"
log(x)float64Natural logarithm of x.

log(1.0) -> 0
log10(x)float64Base-10 logarithm of x.

log10(100.0) -> 2
lognorm_f(mu, sigma, min, max, precision)float64Log-normally-distributed random number in [min, max], rounded to precision decimal places.

lognorm_f(1.0, 0.5, 1, 1000, 2) -> 3.42
lognorm(mu, sigma, min, max)float64Log-normally-distributed random number in [min, max], rounded to 0 decimal places.

lognorm(1.0, 0.5, 1, 1000) -> 3
max(name, field)float64Maximum value of a numeric field in a named dataset.

max('fetch_products', 'price') -> 49.99
min(name, field)float64Minimum value of a numeric field in a named dataset.

min('fetch_products', 'price') -> 1.99
mod(x, y)float64Floating-point remainder of x/y.

mod(10.0, 3.0) -> 1
norm_f(mean, stddev, min, max, precision)float64Normally-distributed random number in [min, max], rounded to precision decimal places.

norm_f(50.0, 15.0, 1.0, 100.0, 2) -> 52.37
norm_n(mean, stddev, min, max, minN, maxN)stringN unique normally-distributed values (N in [minN, maxN]) as a comma-separated string.

norm_n(50.0, 10.0, 1, 100, 2, 4) -> 47,53,61
norm(mean, stddev, min, max)float64Normally-distributed random number in [min, max], rounded to 0 decimal places.

norm(4, 1, 1, 5) -> 4
nullnilNull literal. Alias for nil, for users more familiar with SQL/JSON terminology. Not a function - use as a bare variable.

const(null) -> NULL
nullable(expr, probability)anyReturns NULL with probability (0.0–1.0), otherwise returns the expression result.

nullable(gen('email'), 0.3) -> NULL
nurand_n(A, x, y, min, max)stringGenerates N unique NURand values (N in [min, max]) as a comma-separated string.

nurand_n(255, 1, 100, 3, 5) -> 42,87,13,61
nurand(A, x, y)intTPC-C Non-Uniform Random: (((random(0,A) | random(x,y)) + C) % (y-x+1)) + x.

nurand(255, 1, 100) -> 42
obj(name, field)anyEvaluates only the named field from an object, avoiding the cost of evaluating all fields.

obj('order', 'product') -> Widget
obj(name)mapEvaluates all field expressions for a named object defined in the objects section and returns them as a map. Access individual fields with dot notation.

obj('order').product -> Widget
pifloat64The mathematical constant pi (3.14159…). Not a function - use as a bare variable.

2 * pi -> 6.28318...
point_wkt(lat, lon, radiusKM)stringGenerates a random geographic point as a WKT string: POINT(lon lat).

point_wkt(51.5, -0.1, 10.0) -> POINT(-0.082 51.513)
point(lat, lon, radiusKM)mapGenerates a random geographic point within radiusKM of (lat, lon). Access fields with .lat and .lon.

point(51.5, -0.1, 10.0).lat -> 51.513
polygon_wkt(lat, lon, minKM, maxKM, points)stringGenerates a jagged polygon with points vertices around (lat, lon), each at a random distance between minKM and maxKM. Returns a WKT POLYGON string. The ring is closed (first vertex repeated at end).

polygon_wkt(51.1, -0.4, 5, 15, 6) -> POLYGON((-0.33 51.18, ...))
polygon(lat, lon, minKM, maxKM, points)[]mapGenerates a jagged polygon with points vertices around (lat, lon), each at a random distance between minKM and maxKM. Returns a slice of maps with .lat and .lon fields. The ring is closed (first vertex repeated at end). Requires points >= 3.

polygon(51.1, -0.4, 5, 15, 6)[0].lat -> 51.18
pow(x, y)float64x raised to the power y.

pow(2.0, 10.0) -> 1024
ref_diff(name)mapReturns unique rows across multiple calls within the same query execution. Uses a swap-based index to avoid repeats.

ref_diff('products').name -> Widget
ref_each(query_or_dataset)[][]any or mapWhen given a SQL query string, executes it and returns all rows - each row becomes a separate arg set. When given a named reference dataset (unquoted), iterates sequentially through each row with same-row caching (like ref_same).

ref_each('SELECT id FROM t') -> [[1], [2], [3]]
ref_each(product_catalog).name -> Widget
ref_exp(name, rate)mapReturns a random row from a named dataset using exponential distribution. Lower indices are selected more frequently. rate controls decay speed.

ref_exp('products', 1.5).name -> Widget
ref_lognorm(name, mu, sigma)mapReturns a random row from a named dataset using log-normal distribution. Creates a right-skewed access pattern where early rows are favored.

ref_lognorm('products', 0.0, 0.5).name -> Widget
ref_n(name, field, min, max)stringPicks N unique random rows (N in [min, max]) from a named dataset, extracts field from each, and returns a comma-separated string.

ref_n('products', 'name', 2, 3) -> Widget,Gadget
ref_norm(name, mean, stddev)mapReturns a random row from a named dataset using normal distribution. mean and stddev are expressed as fractions of the dataset length (e.g. 0.5 = middle, 0.2 = narrow spread).

ref_norm('products', 0.5, 0.2).name -> Gadget
ref_perm(name)mapReturns a random row on first call, then the same row for the entire lifetime of the worker.

ref_perm('products').name -> Widget
ref_rand(name)mapReturns a random row from a named dataset (populated by an init query). Access fields with dot notation: ref_rand('fetch_warehouses').w_id.

ref_rand('products').name -> Gadget
ref_same(name)mapReturns a random row, but the same row is reused across all ref_same calls within a single query execution. Cleared between iterations.

ref_same('products').name -> Widget
ref_zipf(name, s, v)mapReturns a random row from a named dataset using Zipfian distribution. The first row is the “hottest”, with frequency dropping off according to s (skew, > 1) and v (>= 1).

ref_zipf('products', 2.0, 1.0).name -> Widget
regex(pattern)stringGenerates a random string matching the given regular expression.

regex('[A-Z]{3}-[0-9]{4}') -> ABK-7291
result()mapReturns the first row of the current query’s SELECT result as a map. Only available in post_print (after query execution). Access columns with dot notation.

result().total -> 10000
results()[]mapReturns all rows of the current query’s SELECT result as a slice of maps. Only available in post_print (after query execution). Use with expr-lang builtins like len(), map(), filter(), reduce() to aggregate across rows.

len(results()) -> 5
reduce(results(), #acc + #.balance, 0) -> 50000
seq_exp(name, rate)intExponentially-distributed value from a global sequence. Lower indices are selected more frequently.

seq_exp("order_id", 0.5) -> 7
seq_global(name)intShared auto-incrementing sequence across all workers. Returns the next value from a named sequence defined in the seq config section. Thread-safe via atomic counters.

seq_global("order_id") -> 1
seq_lognorm(name, mu, sigma)intLog-normally-distributed value from a global sequence.

seq_lognorm("order_id", 2, 0.5) -> 8
seq_norm(name, mean, stddev)intNormally-distributed value from a global sequence. mean and stddev are index positions (0-based).

seq_norm("order_id", 500, 100) -> 487
seq_rand(name)intUniform random value from the already-generated values of a global sequence. Computes valid values from the sequence’s start, step, and current counter (no values stored in memory).

seq_rand("order_id") -> 42
seq_zipf(name, s, v)intZipfian-distributed value from a global sequence. Lower indices (earlier values) are selected more frequently. s (> 1) and v (>= 1) control the distribution shape.

seq_zipf("order_id", 2.0, 1.0) -> 3
seq(start, step)intAuto-incrementing sequence per worker. Returns start + counter * step.

seq(1, 1) -> 1
seq_alpha(length)stringAuto-incrementing alpha sequence per worker. Generates base-26 strings of the given length (e.g. aaa, aab, aac, …).

seq_alpha(3) -> aaa
seq_alpha_global(name)stringShared auto-incrementing alpha sequence across all workers. Returns the next alpha value from a named sequence defined in the seq config section (requires length field).

seq_alpha_global("sku_code") -> aaa
set_exp(values, rate)anyPicks an item from a set using exponential distribution.

set_exp(['low', 'med', 'high'], 0.5) -> low
set_lognorm(values, mu, sigma)anyPicks an item from a set using log-normal distribution.

set_lognorm(['free', 'basic', 'pro'], 0.5, 0.5) -> free
set_norm(values, mean, stddev)anyPicks an item from a set using normal distribution.

set_norm([1, 2, 3, 4, 5], 2, 0.8) -> 3
set_rand(values, weights)anyPicks a random item from a set. If weights are provided, weighted random selection is used; otherwise uniform.

set_rand(['a', 'b', 'c'], []) -> b
set_zipf(values, s, v)anyPicks an item from a set using Zipfian distribution.

set_zipf(['a', 'b', 'c'], 2.0, 1.0) -> a
sin(x)float64Sine of x (x in radians).

sin(pi / 2) -> 1
sqrt(x)float64Square root of x.

sqrt(144.0) -> 12
sum(name, field)float64Sum of a numeric field across all rows in a named dataset.

sum('fetch_products', 'price') -> 96.95
tan(x)float64Tangent of x (x in radians).

tan(pi / 4) -> 1
template(format, args...)stringFormats a string using Go’s fmt.Sprintf syntax.

template('ORD-%05d', seq(1, 1)) -> ORD-00001
time(min, max)stringRandom time of day between min and max (HH:MM:SS format).

time('08:00:00', '18:00:00') -> 14:32:07
timestamp(min, max)stringRandom timestamp between min and max (RFC3339).

timestamp('2020-01-01T00:00:00Z', '2025-01-01T00:00:00Z') -> 2023-07-15T14:32:07Z
timez(min, max)stringRandom time of day with +00:00 timezone suffix.

timez('09:00:00', '17:00:00') -> 14:32:07+00:00
uniform_f(min, max, precision)float64Uniform random float in [min, max] rounded to precision decimal places.

uniform_f(0.01, 999.99, 2) -> 347.82
uniform(min, max)float64Uniform random float in [min, max].

uniform(1, 100) -> 73.12
uniq(expression [, expression...] [, maxRetries])anyEvaluates one or more string expressions repeatedly until a unique value (or composite tuple) is produced. Defaults to 100 retry attempts; pass an optional integer as the last argument to override.

Single expression - returns a single value: uniq("gen('airlineairportiata')") -> LAX

Composite - pass multiple expressions to enforce cross-column uniqueness. Returns []any; index to pick each column. Same-row calls with identical expressions return a cached tuple:
uniq("gen('first_name')", "gen('last_name')")[0] -> Alice
uniq("gen('first_name')", "gen('last_name')")[1] -> Smith

Seen values persist across rows within a query and reset between queries.
uuid_v1()stringGenerates a Version 1 UUID (timestamp + node ID).

uuid_v1() -> 6ba7b810-9dad-11d1-80b4-00c04fd430c8
uuid_v4()stringGenerates a Version 4 UUID (random).

uuid_v4() -> 550e8400-e29b-41d4-a716-446655440000
uuid_v6()stringGenerates a Version 6 UUID (reordered timestamp).

uuid_v6() -> 1ef21d2f-6ba7-6810-9dad-00c04fd430c8
uuid_v7()stringGenerates a Version 7 UUID (Unix timestamp + random, sortable).

uuid_v7() -> 018ef4c9-7f3a-7b3c-8d1a-2b4c5d6e7f8a
varbit(n)stringRandom variable-length bit string of 1 to n bits.

varbit(8) -> 10110
vector_norm(dims, clusters, spread, mean, stddev)stringLike vector but picks centroids using a normal distribution over cluster indices. mean is the center cluster index, stddev controls spread.

vector_norm(32, 5, 0.1, 2.0, 0.8)
vector_zipf(dims, clusters, spread, s, v)stringLike vector but picks centroids using a Zipfian distribution. Cluster 0 is the “hottest”, with frequency dropping off according to s (skew) and v (>= 1). Simulates real-world data where some categories have far more embeddings.

vector_zipf(32, 5, 0.1, 2.0, 1.0)
vector(dims, clusters, spread)stringVector literal with uniform centroid selection. Generates clustered, unit-length vectors for realistic similarity search. dims is the number of dimensions, clusters is the number of cluster centroids, and spread controls intra-cluster noise (Gaussian σ).

vector(4, 3, 0.1) -> [0.512340,-0.234567,0.678901,0.456789]
weighted_sample_n(name, field, weightField, minN, maxN)stringPicks N unique rows using weighted selection, returns a comma-separated string.

weighted_sample_n('products', 'name', 'stock', 2, 3) -> Widget,Pen
zipf(s, v, max)intZipfian-distributed random integer in [0, max].

zipf(2.0, 1.0, 999) -> 3
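
The nurand entry in the table follows the TPC-C non-uniform random definition. The Python sketch below illustrates that formula and is not edg's implementation. It uses the modulo form from the TPC-C specification. It assumes inclusive random draws and a constant C picked once per A value from [0, A]. The names make_nurand and c_by_a are hypothetical.

```python
import random


def make_nurand(rng: random.Random):
    """Return a NURand function whose C constant is fixed per A value."""
    c_by_a = {}  # assumption: one constant C per A, drawn once from [0, A]

    def nurand(a: int, x: int, y: int) -> int:
        if a not in c_by_a:
            c_by_a[a] = rng.randint(0, a)
        # Bitwise OR of two uniform draws makes some values far more likely.
        mixed = rng.randint(0, a) | rng.randint(x, y)
        # Modulo folds the sum back into the inclusive range [x, y].
        return ((mixed + c_by_a[a]) % (y - x + 1)) + x

    return nurand


nurand = make_nurand(random.Random(42))
samples = [nurand(255, 1, 100) for _ in range(1000)]
```

Every sample lands in [1, 100], but the values are not uniformly distributed. That skew is what models hot rows in TPC-C style workloads.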

Choosing a Sequence Generator

edg has several ways to generate sequential IDs: iter(), seq(), and seq_global(), plus alpha variants of the last two. Picking the wrong one silently produces incorrect data, so choose carefully.

| Function | Scope | Resets? | IDs Unique Across Workers? | Use When |
| --- | --- | --- | --- | --- |
| iter() | Per batch query | Yes - resets to 1 at the start of each exec_batch / query_batch | N/A (single-worker seed) | Seeding tables with fixed-size ID ranges (1..N). Always starts at 1, unaffected by other queries. |
| seq_global(name) | Global (all workers) | Never | Yes - atomic counter | Generating globally unique IDs across concurrent workers in run. Requires a seq config entry. |
| seq(start, step) | Per worker | Never | No - each worker has its own counter | Generating monotonic values within a single worker’s run loop (e.g. increasing timestamps, per-worker order numbers). |
| seq_alpha_global(name) | Global (all workers) | Never | Yes - atomic counter | Generating globally unique alpha codes (aaa, aab, …) across workers. Requires a seq config entry with length. |
| seq_alpha(length) | Per worker | Never | No - each worker has its own counter | Generating monotonic alpha codes within a single worker’s run loop. |
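
The alpha variants produce fixed-length base-26 codes (aaa, aab, aac, …). The Python sketch below shows one way a counter could map to such a code. It assumes a zero-based counter and the lowercase letters a-z. The function name alpha_code is hypothetical. What edg does once the counter exceeds 26^length is not stated here; the sketch simply wraps around.

```python
import string


def alpha_code(counter: int, length: int) -> str:
    """Encode a zero-based counter as a fixed-length base-26 string (a-z)."""
    letters = string.ascii_lowercase
    chars = []
    for _ in range(length):
        counter, digit = divmod(counter, 26)
        chars.append(letters[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


codes = [alpha_code(i, 3) for i in range(4)]
print(codes)  # ['aaa', 'aab', 'aac', 'aad']
```

A length of 3 gives 26^3 = 17,576 distinct codes, so pick a length large enough for the number of rows you intend to generate.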

Common mistakes

Don’t use seq() across multiple seed queries.

seq(1, 1) is a single counter that never resets. If populate_accounts uses seq(1, 1) with count: 10, the counter reaches 10. A later populate_counters query using the same seq(1, 1) continues from 11, not 1. Use iter() instead - it resets per batch query.

# WRONG - counter IDs will be 11..20 (not 1..10)
seed:
  - name: populate_accounts
    type: exec_batch
    count: 10
    args:
      - seq(1, 1)        # 1..10
    query: INSERT INTO account (id) VALUES ($1)

  - name: populate_counters
    type: exec_batch
    count: 10
    args:
      - seq(1, 1)        # 11..20
    query: INSERT INTO counter (id) VALUES ($1)

# CORRECT - iter() resets per query
seed:
  - name: populate_accounts
    type: exec_batch
    count: 10
    args:
      - iter()           # 1..10
    query: INSERT INTO account (id) VALUES ($1)

  - name: populate_counters
    type: exec_batch
    count: 10
    args:
      - iter()           # 1..10
    query: INSERT INTO counter (id) VALUES ($1)
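
The difference between the two configurations can be modeled outside edg. The Python sketch below simulates a per-worker counter that never resets (like seq) and a row counter that restarts for each batch query (like iter). WorkerSeq and the two run_batch helpers are hypothetical names. The sketch also assumes a single seed worker with one seq counter.

```python
class WorkerSeq:
    """Per-worker counter that never resets, modeled on seq(start, step)."""

    def __init__(self):
        self.counter = 0

    def next(self, start: int, step: int) -> int:
        value = start + self.counter * step
        self.counter += 1
        return value


def run_batch_with_seq(seq: WorkerSeq, count: int) -> list:
    # Each row evaluates seq(1, 1) against the same long-lived counter.
    return [seq.next(1, 1) for _ in range(count)]


def run_batch_with_iter(count: int) -> list:
    # iter() is modeled as a row counter that restarts for every batch query.
    return list(range(1, count + 1))


seq = WorkerSeq()
accounts_seq = run_batch_with_seq(seq, 10)  # populate_accounts: 1..10
counters_seq = run_batch_with_seq(seq, 10)  # populate_counters: 11..20
accounts_iter = run_batch_with_iter(10)     # populate_accounts: 1..10
counters_iter = run_batch_with_iter(10)     # populate_counters: 1..10
```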

Don’t use seq() when you need globally unique IDs.

With multiple workers, each worker’s seq(1, 1) produces 1, 2, 3, … independently - you’ll get duplicate IDs. Use seq_global instead.
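
The duplicate-ID problem is easy to reproduce in a small model. In the Python sketch below, each simulated worker first counts on its own, then all workers draw from one shared counter. The lock-guarded SharedSeq class is a hypothetical stand-in for edg's atomic counter, not its actual implementation.

```python
import threading


def per_worker_ids(workers: int, per_worker: int) -> list:
    """Each worker owns its counter, as with seq(1, 1)."""
    ids = []
    for _ in range(workers):
        ids.extend(1 + n for n in range(per_worker))
    return ids


class SharedSeq:
    """One counter shared by every worker, guarded by a lock."""

    def __init__(self, start: int = 1, step: int = 1):
        self._next = start
        self._step = step
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            value = self._next
            self._next += self._step
            return value


def shared_ids(workers: int, per_worker: int) -> list:
    seq = SharedSeq()
    ids = []
    ids_lock = threading.Lock()

    def work():
        for _ in range(per_worker):
            value = seq.next()
            with ids_lock:
                ids.append(value)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


local_ids = per_worker_ids(4, 100)   # 400 values, only 100 distinct
global_ids = shared_ids(4, 100)      # 400 values, all distinct
```

With four workers, the per-worker model yields every ID four times. The shared counter yields 1..400 exactly once each, although the order in which workers receive them varies between runs.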

Don’t use seq_global() for seed queries.

The counter never resets, so re-running deseed + seed produces new IDs each time. Use iter() for seeds and reserve seq_global for run workloads.

Function Lifecycle

Several functions maintain state. Understanding when that state resets is important for getting correct results:

| Function | Scope | Behavior and reset point |
| --- | --- | --- |
| arg(index) / arg('name') | Per-query | Returns the value of arg at index (or by name when using named args). Cleared before the next query. In batch queries, resets per row. |
| complete_array(tool, prompt, count) | Per-query | Makes one API call per unique (tool, prompt, count) tuple. The result ([]map) is memoized so multiple ref_each(local(...)).field accesses within a row share the same call. Not deferred - resolves immediately even in batch queries. |
| complete(tool, prompt) | Per-batch | In exec/query (non-batch) queries, each unique (tool, prompt) pair makes one API call; same-row field accesses are memoized. In exec_batch/query_batch queries, all complete() calls are deferred - placeholder maps are inserted during arg evaluation, then all pending requests are resolved concurrently (up to 8 parallel) after the batch is generated. |
| embed(text...) | Per-batch | In exec/query (non-batch) queries, each call makes a separate API request. In exec_batch/query_batch queries, all embed() calls within a batch are deferred - placeholders are inserted during arg evaluation, then all pending texts are resolved in a single API call (or multiple calls if --embed-max-batch is set). For example, a 100-row batch with --embed-max-batch 30 produces 4 API calls (30+30+30+10) instead of 100 individual calls. |
| global_iter() | Global | Monotonic counter incremented once per RunIteration call by any worker. Never resets. Shared across all workers via atomic int64. Use for time-series seasonality and data drift. |
| iter() | Per-query | Returns 1 for the first row, 2 for the second, etc. Resets at the start of each batch query, so the first row is always 1. |
| nurand(A, x, y) | Per-worker | The TPC-C constant C is generated once per worker per A value and stays fixed for the worker’s lifetime. |
| ref_diff(name) | Per-query | Returns a unique row on each call within a query (no repeats). Index resets before the next query. |
| ref_exp(name, rate) | None | Fresh random row on every call (exponential distribution). |
| ref_lognorm(name, mu, sigma) | None | Fresh random row on every call (log-normal distribution). |
| ref_norm(name, mean, stddev) | None | Fresh random row on every call (normal distribution). |
| ref_perm(name) | Per-worker | Picks a row on first call and returns that same row for the entire lifetime of the worker. Never resets. |
| ref_rand(name) | None | Fresh random row on every call. |
| ref_same(name) | Per-query | Picks a row on first call within a query; all subsequent ref_same calls for the same dataset within that query return the same row. Cleared before the next query. |
| ref_zipf(name, s, v) | None | Fresh random row on every call (Zipfian distribution). |
| result() / results() | Per-query | Returns the last query’s result rows. Only available in post_print expressions. Set after each type: query execution; cleared after each type: exec. |
| seq_global(name) | Global | Single counter shared across all workers via atomic increment. Values are globally unique. Configured in the seq config section. |
| seq_rand | Global | Picks uniformly from the already-generated values of a sequence. The valid value set grows as seq_global advances the counter. No values are stored in memory. |
| seq_zipf / seq_norm / seq_exp / seq_lognorm | Global | Same as seq_rand but with shaped distributions. |
| seq(start, step) | Per-worker | Counter starts at 0 for each worker and increments on every call. Two workers both calling seq(1, 1) will produce the same sequence independently - values are not globally unique. |
| uniq(expression [, ...]) | Per-query | Tracks seen values (or composite tuples) across all rows within a query. Composite calls are cached per-row so multiple arg positions share the same tuple. Resets between queries. |
| vector / vector_zipf / vector_norm | Per-worker | Cluster centroids are generated on first call (keyed by dims+clusters) and reused for the worker’s lifetime. Each call picks a centroid (uniform, Zipfian, or normal) and adds noise. |
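
The request count for deferred embed() calls follows from simple chunking. The Python sketch below reproduces the arithmetic from the embed() row, assuming one embed() call per row. The function name embed_request_sizes is hypothetical, and treating max_batch = 0 as "no limit" is an assumption of this sketch rather than documented flag behavior.

```python
def embed_request_sizes(rows: int, max_batch: int = 0) -> list:
    """Return the number of texts sent in each API call for one batch."""
    if rows <= 0:
        return []
    if max_batch <= 0:
        return [rows]  # assumption: no limit means a single call
    sizes = []
    remaining = rows
    while remaining > 0:
        size = min(max_batch, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


sizes = embed_request_sizes(100, 30)
print(sizes, len(sizes))  # [30, 30, 30, 10] 4
```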