CLI Reference#

Commands#

Command	Description
`up`	Create schema (tables, indexes)
`seed`	Populate tables with initial data
`run`	Execute the benchmark workload
`deseed`	Delete seeded data (truncate tables)
`down`	Tear down schema (drop tables)
`all`	Run up, seed, run, deseed, and down in sequence
`edg <expression>`	Evaluate a single expression and print the result
`init`	Generate a starter config from an existing database schema
`jobs serve`	Start an HTTP server that accepts workload configs via API
`jobs stream <id>`	Stream live progress from a running job
`jobs submit`	Submit a workload config to the job server
`functions [search]`	List available expression functions, optionally filtered by name
`repl`	Interactive expression evaluator with tab completion
`scaffold`	Interactive config generator wizard
`stage`	Generate data to files instead of a database
`sync down`	Tear down schema on both databases
`sync run`	Write identical data to multiple databases
`sync verify`	Verify data consistency between two databases
`template`	Print a template workload config to stdout
`validate config`	Validate a config file without connecting to a database
`validate license`	Validate a license key and print its details
`version`	Print the version
`workload <name> <command>`	Run a built-in workload without a config file

Running edg with an expression (no subcommand) evaluates it and prints the result. Bare words are treated as gofakeit patterns, so edg email is equivalent to edg "gen('email')". For expressions with parentheses or special characters, quote the argument.

A typical workflow runs the commands in order: up -> seed -> run -> deseed -> down. The all command runs this entire sequence in a single invocation.

Flags#

Flag / Env Var	Short	Default	Description
`--url` `EDG_URL`			Database connection URL. For Cassandra, comma-separated hosts are supported in the host portion (e.g. `cassandra://user:pass@host1,host2,host3:9042/keyspace`). Port, auth, and keyspace are parsed from the URL; the port applies to all hosts.
`--config` `EDG_CONFIG`			Path to the workload YAML config file (required for database commands, optional for `repl`)
`--driver` `EDG_DRIVER`		`pgx`	Database driver name (`pgx`, `mysql`, `mongodb`, `cassandra`, `mssql`, `oracle`, `dsql`, or `spanner`)
`--rng-seed` `EDG_RNG_SEED`			PRNG seed for deterministic output (useful for regression testing)
`--duration`	`-d`	`1m`	Benchmark duration (run and all commands)
`--workers`	`-w`	`1`	Number of concurrent workers (run and all commands)
`--license` `EDG_LICENSE`			License key for pro drivers (see Pricing)
`--retries` `EDG_RETRIES`		`0`	Number of transaction retry attempts on error (run and all commands). Uses exponential backoff (1ms, 2ms, 4ms, …). Only applies to transactions, not standalone queries. See Retries for details.
`--errors` `EDG_ERRORS`		`false`	Print worker errors to stderr (run and all commands). See Error Output for details.
`--print-interval`		`1s`	Progress reporting interval (run and all commands)
`--metrics-addr` `EDG_METRICS_ADDR`			Address for Prometheus metrics endpoint (e.g. `:9090`). See Observability for details.
`--pool-size` `EDG_POOL_SIZE`		`0`	Maximum number of open database connections. `0` uses the driver default (typically unlimited).
`--no-wait` `EDG_NO_WAIT`		`false`	Ignore wait durations configured in workload queries (run and all commands)
`--embed-api-key` `EDG_EMBED_API_KEY`			API key for the embedding provider. Required for `embed()` expressions.
`--embed-url` `EDG_EMBED_URL`			Embedding API URL. Any OpenAI-compatible endpoint works (Ollama, vLLM, Azure OpenAI, etc.).
`--embed-model` `EDG_EMBED_MODEL`			Embedding model name sent in the API request.
`--embed-dimensions` `EDG_EMBED_DIMENSIONS`		`1536`	Number of dimensions requested from the embedding model. Must match the `VECTOR(n)` column type.
`--embed-max-batch` `EDG_EMBED_MAX_BATCH`		`0`	Max texts per embedding API call in batch queries. `0` means unlimited (all texts in one call). E.g. `30` on a 100-row batch produces 4 API calls (30, 30, 30, 10).
`--warmup-duration`		`0`	Warmup period before collecting metrics (e.g. `10s`). Workers run during warmup but results are discarded. See Warmup for details.
`--no-color` `EDG_NO_COLOR`		`false`	Disable colored log output. The standard `NO_COLOR=1` environment variable is also respected per the no-color convention.
`--no-atomic-tx` `EDG_NO_ATOMIC_TX`		`false`	Skip `BEGIN`/`COMMIT` for transaction blocks. Queries still run sequentially with shared locals and `ref_same`, but each statement commits independently. `rollback_if` conditions are skipped.
`--overwrite` `EDG_OVERWRITE_STATS`		`false`	Overwrite printed stats in-place instead of appending new blocks. Each progress tick clears the previous output and reprints, keeping the terminal clean during long runs.
`--cassandra-default-consistency` `EDG_CASSANDRA_DEFAULT_CONSISTENCY`			Default consistency level for Cassandra queries. Accepts: `any`, `one`, `two`, `three`, `quorum`, `all`, `local_quorum`, `each_quorum`, `local_one`. When unset, the driver default (`quorum`) is used.
`--cassandra-idempotent` `EDG_CASSANDRA_IDEMPOTENT`		`false`	Mark all Cassandra queries as idempotent. Enables speculative execution and retry on other nodes when a query fails mid-flight. Safe for read-heavy or repeatable-write workloads.
`--cassandra-no-discovery` `EDG_CASSANDRA_NO_DISCOVERY`		`false`	Skip initial host discovery for Cassandra. Prevents the driver from querying `system.peers` to find other nodes, connecting only to the seed host in `--url`. Speeds up connection startup.
`--cassandra-serial-consistency` `EDG_CASSANDRA_SERIAL_CONSISTENCY`			Serial consistency level for Cassandra lightweight transactions (LWT). Accepts: `serial` (global Paxos across all DCs) or `local_serial` (Paxos within local DC only). When unset, the driver default (`local_serial`) is used. Use `serial` for multi-DC consistency testing.
`--csv-file`			CSV file to load as a reference table. The filename (minus extension) becomes the dataset name. Repeatable. See Configuration > Reference for details.
`--csv-directory`			Directory of CSV files to load as reference tables. Each `.csv` file becomes a dataset. Non-recursive. Repeatable.
`--drain-timeout` `EDG_DRAIN_TIMEOUT`		`5s`	Max time for in-flight operations to complete at shutdown. When the run timer fires, workers finish their current iteration using a separate context with this deadline instead of being cancelled immediately. Prevents silent result drops that cause expectation mismatches.
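
The --embed-max-batch example in the table (a limit of 30 on a 100-row batch producing calls of 30, 30, 30, and 10) can be reproduced with a small shell sketch. The helper name embed_call_sizes is hypothetical and only illustrates the arithmetic; it is not part of edg.

```shell
# Hypothetical helper (not part of edg): prints the number of texts sent in
# each embedding API call for a batch of $1 texts, with at most $2 per call.
# A limit of 0 means unlimited, so everything goes in one call.
embed_call_sizes() {
  total=$1
  max=$2
  if [ "$max" -eq 0 ]; then
    echo "$total"
    return
  fi
  sizes=""
  while [ "$total" -gt 0 ]; do
    if [ "$total" -gt "$max" ]; then n=$max; else n=$total; fi
    sizes="${sizes}${sizes:+ }${n}"   # append with a single space separator
    total=$((total - n))
  done
  echo "$sizes"
}

embed_call_sizes 100 30   # prints: 30 30 30 10
embed_call_sizes 100 0    # prints: 100
```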

Cassandra LWT / CAS workloads require --no-atomic-tx. Cassandra does not support lightweight transactions (IF conditions) inside batches, so transaction blocks must run each statement independently. For multi-DC consistency testing, set --cassandra-serial-consistency serial to use global Paxos consensus. A typical LWT consistency test invocation looks like:
edg run \
  --driver cassandra \
  --url "cassandra://user:pass@host1,host2,host3:9042/keyspace" \
  --cassandra-default-consistency local_quorum \
  --cassandra-serial-consistency serial \
  --no-atomic-tx \
  -w 10 -d 5m
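
To make the multi-host URL format concrete, the sketch below splits a Cassandra URL into host:port pairs, applying the single port to every host as described for --url. The function name cassandra_endpoints is hypothetical, and this is only an illustration of the documented format, not edg's actual parser (for example, it does not handle an @ inside the password).

```shell
# Hypothetical helper (not part of edg): list the host:port pairs implied by
# a multi-host Cassandra URL. The body runs in a subshell, so the IFS change
# does not leak into the caller.
cassandra_endpoints() (
  rest=${1#*://}         # drop the scheme
  rest=${rest#*@}        # drop user:pass@ if present
  hostport=${rest%%/*}   # drop /keyspace
  port=${hostport##*:}   # text after the last colon
  hosts=${hostport%:*}   # comma-separated host list
  IFS=,
  for h in $hosts; do
    echo "$h:$port"
  done
)

cassandra_endpoints "cassandra://user:pass@host1,host2,host3:9042/keyspace"
```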

MongoDB tuning is done via URI parameters in --url rather than dedicated flags. The MongoDB driver parses all options from the connection string, so append query parameters to control consistency, read routing, and connection behaviour. Common parameters:

Parameter	Values	Effect
`w`	`0`, `1`, `2`, …, `majority`, or a custom tag set name	Write concern. See write concern values below.
`readConcernLevel`	`local`, `available`, `majority`, `linearizable`, `snapshot`	Read isolation level. See read concern values below.
`readPreference`	`primary`, `primaryPreferred`, `secondary`, `secondaryPreferred`, `nearest`	Which replica serves reads. `nearest` gives lowest latency; `secondary` offloads the primary.
`retryWrites`	`true`, `false`	Retry failed writes once (default: `true`).
`directConnection`	`true`, `false`	Connect to a single node without topology discovery.
`loadBalanced`	`true`, `false`	Required when connecting through an L4 load balancer (e.g. Atlas Serverless).
`connectTimeoutMS`	milliseconds	Connection timeout (edg default: `10000`).
`serverSelectionTimeoutMS`	milliseconds	How long the driver waits for a suitable server (edg default: `10000`).

Example with majority write concern and linearizable reads:
edg run \
  --driver mongodb \
  --url "mongodb://localhost:27017/mydb?w=majority&readConcernLevel=linearizable" \
  --config workload.yaml \
  -w 10 -d 5m

MongoDB write concern values#

`w` value	Meaning
`0`	Fire-and-forget. No acknowledgement from the server.
`1`	Acknowledged by primary only (default).
`2`, `3`, …	Acknowledged by that many replica set members.
`majority`	Acknowledged by a majority of replica set members. Won’t be rolled back.
custom tag	Acknowledged by members matching a custom `getLastErrorModes` tag set.

MongoDB read concern values#

`readConcernLevel` value	Meaning
`local`	Most recent data from the node (default for primary reads). May be rolled back.
`available`	Like `local` but for sharded clusters. It doesn’t wait for orphaned docs to be cleaned. Lowest latency.
`majority`	Only data acknowledged by a majority. Won’t be rolled back.
`linearizable`	Reflects all successful majority writes before the read. Single-document only, primary only. Highest consistency.
`snapshot`	Point-in-time snapshot across a transaction. Requires `w=majority`. Transactions only.

For consistency testing, w=majority + readConcernLevel=majority is the common combination. snapshot is stronger but only works inside transactions. Use --retries 3 to handle transient WriteConflict errors under contention.

Every flag with an env var listed above can be set via the environment instead of the command line. Flags take precedence over environment variables, which take precedence over defaults.

export EDG_URL="postgres://root@localhost:26257?sslmode=disable"
export EDG_DRIVER="pgx"
export EDG_CONFIG="workload.yaml"

# No need to pass --url, --driver, or --config:
edg run -w 10 -d 5m

# Flags override env vars when both are set:
edg run -w 10 -d 5m --driver mysql --url "user:pass@tcp(localhost:3306)/db"
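
The precedence rule above (flag, then environment variable, then default) can be sketched as a small shell function. The name resolve is hypothetical, and treating an empty value as unset is an assumption of this sketch rather than documented edg behaviour.

```shell
# resolve FLAG_VALUE ENV_VALUE DEFAULT
# Hypothetical helper: prints the first non-empty argument, mirroring
# flag > environment variable > default.
resolve() {
  if [ -n "$1" ]; then
    echo "$1"
  elif [ -n "$2" ]; then
    echo "$2"
  else
    echo "$3"
  fi
}

resolve "mysql" "pgx" "pgx"   # flag wins: prints mysql
resolve "" "mssql" "pgx"      # env var wins: prints mssql
resolve "" "" "pgx"           # default: prints pgx
```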

Example#

Database#

Run each lifecycle command individually against a database, or use all to run the entire sequence in one invocation.

edg up \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg seed \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg run \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable" \
-w 100 \
-d 1m

edg deseed \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg down \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

Or use all to run the entire workflow in one command:

edg all \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable" \
-w 100 \
-d 5m

Aurora DSQL#

The dsql driver uses AWS IAM authentication instead of a username and password. Pass the cluster endpoint as the --url value:

edg all \
--driver dsql \
--config workload.yaml \
--url "clusterid.dsql.us-east-1.on.aws" \
-w 10 \
-d 5m

AWS credentials are resolved from the standard chain (environment variables, ~/.aws/credentials, IAM role, etc.). The region is parsed from the cluster endpoint automatically. Auth tokens are refreshed on every new connection, so long-running workloads work without interruption.
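
As an illustration of how a region can be read from the endpoint, the sketch below extracts it with shell parameter expansion. It assumes the endpoint has the shape shown in the example above (an identifier, then .dsql., then the region); the function name dsql_region is hypothetical and this is not edg's own parsing code.

```shell
# Hypothetical helper (not part of edg): print the region embedded in a
# DSQL cluster endpoint such as clusterid.dsql.us-east-1.on.aws.
dsql_region() {
  rest=${1#*.dsql.}     # us-east-1.on.aws
  echo "${rest%%.*}"    # us-east-1
}

dsql_region "clusterid.dsql.us-east-1.on.aws"   # prints: us-east-1
```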

DSQL uses PostgreSQL-compatible SQL, so use $1, $2 placeholders in your queries.

Licensing#

The pgx, mysql, mongodb, and cassandra drivers are free to use. Pro drivers (oracle, mssql, dsql, spanner) require a license key passed via --license or EDG_LICENSE. The license is validated before connecting to the database. See the Pricing page for full details.

Validating Config#

The validate config command parses a config file and checks it for errors without connecting to a database. It catches YAML syntax errors, invalid expressions, unknown function calls, duplicate query names, shadowed built-ins, and invalid query types. Errors include YAML line numbers when available.

edg validate config --config _examples/tpcc/crdb.yaml

config is valid

Add --explain for enhanced error messages with explanations and correct syntax examples:

edg validate config --config workload.yaml --explain

line 42: duplicate query name "seed_users"

  Query names must be unique across all sections. They serve as dataset keys
  for ref_* functions and as metric labels. Rename one of the duplicates.

This is useful for catching mistakes before deploying a workload or as a CI check.
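
A CI check along these lines might loop over several config files, as in the sketch below. It assumes that edg validate config exits with a non-zero status when a config is invalid, which is not stated above. The function name validate_configs and the EDG_BIN override are hypothetical; the override exists only so the loop can be tried without edg installed.

```shell
# validate_configs FILE...
# Hypothetical CI helper: runs `edg validate config` on each file and
# returns non-zero if any file fails.
# Assumption: edg exits non-zero when validation fails.
validate_configs() {
  failed=0
  for f in "$@"; do
    if ! "${EDG_BIN:-edg}" validate config --config "$f"; then
      echo "invalid: $f" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   validate_configs _examples/tpcc/crdb.yaml workload.yaml
```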

Validating a License#

The validate license command checks whether a license key is valid for a given driver and prints the license details.

edg validate license --driver oracle --license "your-license-key"

License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
License is valid for driver "oracle".

If the driver doesn’t require a license, the output tells you:

edg validate license --driver pgx --license "your-license-key"

License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
Driver "pgx" does not require a license.

If the license is expired or doesn’t cover the requested driver, you’ll see an error:

edg validate license --driver dsql --license "your-license-key"

License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
Error: license does not include driver "dsql" (licensed: [oracle mssql])

The EDG_LICENSE environment variable is also accepted:

export EDG_LICENSE="your-license-key"
edg validate license --driver oracle

Retries#

The --retries flag controls how many times a failed transaction is retried before the error is recorded. The default is 0 (no retries). Retries only apply to transactions (queries wrapped in a transaction: block), not standalone queries.

When a transaction fails, edg waits with exponential backoff before retrying:

Attempt	Backoff
1st retry	2ms
2nd retry	4ms
3rd retry	8ms
4th retry	16ms
nth retry	2^n ms

If all retry attempts fail, the last error is recorded in the stats and the worker continues to the next iteration. Context cancellation (e.g. Ctrl+C or duration expiry) stops retries immediately.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --retries 3 \
  -w 10 \
  -d 5m
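
Taking the backoff table in this section at face value (2^n ms before the nth retry), the schedule can be sketched in shell. The function name backoff_ms is hypothetical and the snippet only illustrates the arithmetic, not edg's retry loop.

```shell
# backoff_ms N
# Hypothetical helper: wait before the Nth retry in milliseconds, using the
# 2^N formula from the table in this section.
backoff_ms() {
  echo $((1 << $1))
}

# Schedule for --retries 4
for n in 1 2 3 4; do
  echo "retry $n: $(backoff_ms "$n")ms"
done
```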

Error Output#

By default, individual query errors during the run phase are counted but not printed. The --errors flag prints each error to stderr as it occurs, which is useful for debugging:

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --errors \
  -w 10 \
  -d 5m

2025/04/23 14:32:07 ERROR run error worker=3 error="running run query debit_source: pq: insufficient funds"
2025/04/23 14:32:07 ERROR run error worker=7 error="running run query debit_source: pq: insufficient funds"

Without --errors, the same failures still appear in the summary table’s ERRORS column and count toward error_rate in expectations.

Connection Pool#

The --pool-size flag sets the maximum number of open database connections (SetMaxOpenConns and SetMaxIdleConns). The default 0 uses the driver’s default, which is typically unlimited.

Setting pool size is useful for:

Simulating constrained environments where the application has a fixed connection budget.
Preventing connection exhaustion when running with many workers against a database with connection limits.
Isolating connection overhead from query performance in benchmarks.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --pool-size 20 \
  -w 50 \
  -d 5m

In this example, 50 workers share 20 connections. Workers that can’t acquire a connection will block until one becomes available.

Warmup#

The --warmup-duration flag runs workers for a specified period before collecting metrics. During warmup, query results are discarded. They don’t appear in progress output, the summary, Prometheus metrics, or expectations.

This produces cleaner benchmark results by allowing the database to warm its caches, JIT-compile query plans, and reach a steady state before measurement begins.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --warmup-duration 30s \
  -w 10 \
  -d 5m

In this example, workers run for 30 seconds of warmup (discarded), then 5 minutes of measured execution. The total wall-clock time is 5m30s.

When using stages, warmup applies before the first stage begins collecting metrics.

Run Behaviour#

Workers and Initialisation#

Each worker gets its own isolated environment. The init section runs once, and its results are cloned to each worker so that functions like ref_rand and ref_diff don’t interfere across workers. Per-worker state includes sequence counters (seq), permanent row picks (ref_perm), and NURand constants.

Stages#

When a config file includes a stages section, each stage defines its own worker count and duration, and stages run sequentially. Explicitly passing -w or -d overrides the stages section and falls back to single-stage mode. See Configuration > Stages for details.

edg run \
--driver pgx \
--config _examples/stages/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

Error Handling#

Query errors during run are non-fatal. The worker increments an error counter (and prints the error to stderr when --errors is set), then continues to the next iteration. This lets you observe error rates without aborting the benchmark. Errors in other sections (up, seed, deseed, down, init) are fatal and stop execution immediately.

Interrupting with Ctrl+C#

Pressing Ctrl+C during run or all cancels the workload gracefully. Workers finish their current iteration and stop. When using all, the cleanup phases (deseed and down) still run after interruption, using a fresh context.

Output#

During the run, progress is printed at the --print-interval (default: every second):

59s / 1m0s
QUERY          COUNT  ERRORS  AVG      p50      p95      p99      QPS
check_balance  3674   0       2.631ms  2.367ms  4.154ms  6.252ms  62.3
credit_target  3769   0       1.68ms   1.495ms  2.624ms  3.911ms  63.9
debit_source   3769   0       2.376ms  2.13ms   3.722ms  5.288ms  63.9
read_source    3770   0       2.047ms  1.803ms  3.254ms  5.052ms  63.9
read_target    3769   0       2.839ms  2.579ms  4.486ms  6.446ms  63.9

TRANSACTION    COMMITS  ROLLBACKS  ERRORS  AVG       p50       p95       p99       TPS
make_transfer  3769     0          0       13.053ms  12.424ms  18.498ms  26.074ms  63.9

After all workers complete, a final summary is printed:

summary
Duration:  1m0.004s
Workers:   1

QUERY          COUNT  ERRORS  AVG      p50      p95      p99      QPS
check_balance  3749   0       2.628ms  2.362ms  4.14ms   6.249ms  62.5
credit_target  3828   0       1.681ms  1.497ms  2.624ms  3.911ms  63.8
debit_source   3828   1       2.381ms  2.13ms   3.724ms  5.338ms  63.8
read_source    3829   0       2.046ms  1.802ms  3.25ms   5.052ms  63.8
read_target    3829   0       2.843ms  2.583ms  4.485ms  6.446ms  63.8

TRANSACTION    COMMITS  ROLLBACKS  ERRORS  AVG       p50       p95       p99       TPS
make_transfer  3828     0          1       13.063ms  12.438ms  18.498ms  26.652ms  63.8

Transactions:  19063
Errors:        1
tpm:           19061.6

Metric	Description
COUNT	Total successful query executions
ERRORS	Total failed query executions
AVG	Mean execution time per query
p50	Median latency (50th percentile)
p95	95th percentile latency
p99	99th percentile latency
QPS	Queries per second (count / elapsed seconds)
tpm	Transactions per minute across all queries
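
As a worked check of the QPS definition (count / elapsed seconds), the sketch below recomputes two QPS values from the summary above using the reported duration of 1m0.004s (60.004 seconds). The function name qps is hypothetical, and rounding to one decimal place is inferred from the sample output.

```shell
# qps COUNT ELAPSED_SECONDS
# Hypothetical helper: queries per second, rounded to one decimal place.
qps() {
  awk -v c="$1" -v s="$2" 'BEGIN { printf "%.1f\n", c / s }'
}

qps 3749 60.004   # check_balance: prints 62.5
qps 3828 60.004   # credit_target: prints 63.8
```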

Expectations#

When the config file includes an expectations section, results are printed after the summary and the exit code reflects whether all expectations passed:

expectations
  PASS  error_rate < 1
  PASS  check_balance.p99 < 100
  FAIL  tpm > 5000

1 expectation(s) failed

If any expectation fails, edg exits with status code 1. When using all, teardown (deseed and down) still runs before the non-zero exit.

See Configuration > Expectations for the full list of available metrics and expression syntax.
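
Because a failed expectation produces exit status 1, a CI script can gate on the result of an edg invocation. The wrapper below is a minimal sketch; the function name run_gate is hypothetical, and it simply runs whatever command it is given and passes the status through.

```shell
# run_gate COMMAND [ARGS...]
# Hypothetical CI wrapper: runs the command, reports the outcome, and
# returns the command's exit status unchanged.
run_gate() {
  "$@"
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "benchmark succeeded"
  else
    echo "benchmark failed with exit status $status" >&2
  fi
  return "$status"
}

# Usage (flags as documented above):
#   run_gate edg all --driver pgx --config workload.yaml \
#     --url "${DATABASE_URL}" -w 10 -d 5m
```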