CLI Reference#

Commands#

Command                    Description
up                         Create schema (tables, indexes)
seed                       Populate tables with initial data
run                        Execute the benchmark workload
deseed                     Delete seeded data (truncate tables)
down                       Tear down schema (drop tables)
all                        Run up, seed, run, deseed, and down in sequence
edg <expression>           Evaluate a single expression and print the result
init                       Generate a starter config from an existing database schema
jobs serve                 Start an HTTP server that accepts workload configs via API
jobs stream <id>           Stream live progress from a running job
jobs submit                Submit a workload config to the job server
functions [search]         List available expression functions, optionally filtered by name
repl                       Interactive expression evaluator with tab completion
scaffold                   Interactive config generator wizard
stage                      Generate data to files instead of a database
sync down                  Tear down schema on both databases
sync run                   Write identical data to multiple databases
sync verify                Verify data consistency between two databases
template                   Print a template workload config to stdout
validate config            Validate a config file without connecting to a database
validate license           Validate a license key and print its details
version                    Print the version
workload <name> <command>  Run a built-in workload without a config file

Running edg with an expression (no subcommand) evaluates it and prints the result. Bare words are treated as gofakeit patterns, so edg email is equivalent to edg "gen('email')". For expressions with parentheses or special characters, quote the argument.

A typical workflow runs the commands in order: up -> seed -> run -> deseed -> down. The all command runs this entire sequence in a single invocation.

Flags#

--url (env: EDG_URL)
    Database connection URL. For Cassandra, comma-separated hosts are supported in the host portion (e.g. cassandra://user:pass@host1,host2,host3:9042/keyspace). Port, auth, and keyspace are parsed from the URL; the port applies to all hosts.

--config (env: EDG_CONFIG)
    Path to the workload YAML config file (required for database commands, optional for repl)

--driver (env: EDG_DRIVER, default: pgx)
    Database driver name (pgx, mysql, mongodb, cassandra, mssql, oracle, dsql, or spanner)

--rng-seed (env: EDG_RNG_SEED)
    PRNG seed for deterministic output (useful for regression testing)

--duration, -d (default: 1m)
    Benchmark duration (run and all commands)

--workers, -w (default: 1)
    Number of concurrent workers (run and all commands)

--license (env: EDG_LICENSE)
    License key for pro drivers (see Pricing)

--retries (env: EDG_RETRIES, default: 0)
    Number of transaction retry attempts on error (run and all commands). Uses exponential backoff (1ms, 2ms, 4ms, …). Only applies to transactions, not standalone queries. See Retries for details.

--errors (env: EDG_ERRORS, default: false)
    Print worker errors to stderr (run and all commands). See Error Output for details.

--print-interval (default: 1s)
    Progress reporting interval (run and all commands)

--metrics-addr (env: EDG_METRICS_ADDR)
    Address for Prometheus metrics endpoint (e.g. :9090). See Observability for details.

--pool-size (env: EDG_POOL_SIZE, default: 0)
    Maximum number of open database connections. 0 uses the driver default (typically unlimited).

--no-wait (env: EDG_NO_WAIT, default: false)
    Ignore wait durations configured in workload queries (run and all commands)

--embed-api-key (env: EDG_EMBED_API_KEY)
    API key for the embedding provider. Required for embed() expressions.

--embed-url (env: EDG_EMBED_URL)
    Embedding API URL. Any OpenAI-compatible endpoint works (Ollama, vLLM, Azure OpenAI, etc.).

--embed-model (env: EDG_EMBED_MODEL)
    Embedding model name sent in the API request.

--embed-dimensions (env: EDG_EMBED_DIMENSIONS, default: 1536)
    Number of dimensions requested from the embedding model. Must match the VECTOR(n) column type.

--embed-max-batch (env: EDG_EMBED_MAX_BATCH, default: 0)
    Max texts per embedding API call in batch queries. 0 means unlimited (all texts in one call). E.g. 30 on a 100-row batch produces 4 API calls (30, 30, 30, 10).

--warmup-duration (default: 0)
    Warmup period before collecting metrics (e.g. 10s). Workers run during warmup but results are discarded. See Warmup for details.

--no-color (env: EDG_NO_COLOR, default: false)
    Disable colored log output. The standard NO_COLOR=1 environment variable is also respected per the no-color convention.

--no-atomic-tx (env: EDG_NO_ATOMIC_TX, default: false)
    Skip BEGIN/COMMIT for transaction blocks. Queries still run sequentially with shared locals and ref_same, but each statement commits independently. rollback_if conditions are skipped.

--overwrite (env: EDG_OVERWRITE_STATS, default: false)
    Overwrite printed stats in-place instead of appending new blocks. Each progress tick clears the previous output and reprints, keeping the terminal clean during long runs.

--cassandra-default-consistency (env: EDG_CASSANDRA_DEFAULT_CONSISTENCY)
    Default consistency level for Cassandra queries. Accepts: any, one, two, three, quorum, all, local_quorum, each_quorum, local_one. When unset, the driver default (quorum) is used.

--cassandra-idempotent (env: EDG_CASSANDRA_IDEMPOTENT, default: false)
    Mark all Cassandra queries as idempotent. Enables speculative execution and retry on other nodes when a query fails mid-flight. Safe for read-heavy or repeatable-write workloads.

--cassandra-no-discovery (env: EDG_CASSANDRA_NO_DISCOVERY, default: false)
    Skip initial host discovery for Cassandra. Prevents the driver from querying system.peers to find other nodes, connecting only to the seed host in --url. Speeds up connection startup.

--cassandra-serial-consistency (env: EDG_CASSANDRA_SERIAL_CONSISTENCY)
    Serial consistency level for Cassandra lightweight transactions (LWT). Accepts: serial (global Paxos across all DCs) or local_serial (Paxos within local DC only). When unset, the driver default (local_serial) is used. Use serial for multi-DC consistency testing.

--csv-file
    CSV file to load as a reference table. The filename (minus extension) becomes the dataset name. Repeatable. See Configuration > Reference for details.

--csv-directory
    Directory of CSV files to load as reference tables. Each .csv file becomes a dataset. Non-recursive. Repeatable.

--drain-timeout (env: EDG_DRAIN_TIMEOUT, default: 5s)
    Max time for in-flight operations to complete at shutdown. When the run timer fires, workers finish their current iteration using a separate context with this deadline instead of being cancelled immediately. Prevents silent result drops that cause expectation mismatches.
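The batching arithmetic in the --embed-max-batch description can be sketched in shell. This only illustrates how a row count splits into API calls under a batch limit; the function name batch_sizes is hypothetical and not part of edg.

```shell
# Hypothetical helper (not part of edg): print the size of each embedding
# API call for a batch of ROWS texts when at most MAX texts go in one call.
# MAX=0 means unlimited, so everything goes in a single call.
batch_sizes() {
  rows=$1
  max=$2
  if [ "$max" -le 0 ]; then
    echo "$rows"
    return
  fi
  out=""
  while [ "$rows" -gt 0 ]; do
    # Take a full batch while more than MAX rows remain, else the remainder.
    if [ "$rows" -gt "$max" ]; then n=$max; else n=$rows; fi
    out="${out:+$out }$n"
    rows=$((rows - n))
  done
  echo "$out"
}

batch_sizes 100 30   # prints "30 30 30 10", i.e. 4 API calls
batch_sizes 100 0    # prints "100", i.e. 1 API call
```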

Cassandra LWT / CAS workloads require --no-atomic-tx. Cassandra does not support lightweight transactions (IF conditions) inside batches, so transaction blocks must run each statement independently. For multi-DC consistency testing, set --cassandra-serial-consistency serial to use global Paxos consensus. A typical LWT consistency test invocation looks like:

edg run \
  --driver cassandra \
  --url "cassandra://user:pass@host1,host2,host3:9042/keyspace" \
  --cassandra-default-consistency local_quorum \
  --cassandra-serial-consistency serial \
  --no-atomic-tx \
  -w 10 -d 5m

MongoDB tuning is done via URI parameters in --url rather than dedicated flags. The MongoDB driver parses all options from the connection string, so append query parameters to control consistency, read routing, and connection behaviour. Common parameters:

Parameter                 Values                                                             Effect
w                         0, 1, 2, …, majority, or a custom tag set name                     Write concern. See write concern values below.
readConcernLevel          local, available, majority, linearizable, snapshot                 Read isolation level. See read concern values below.
readPreference            primary, primaryPreferred, secondary, secondaryPreferred, nearest  Which replica serves reads. nearest gives lowest latency; secondary offloads the primary.
retryWrites               true, false                                                        Retry failed writes once (default: true).
directConnection          true, false                                                        Connect to a single node without topology discovery.
loadBalanced              true, false                                                        Required when connecting through an L4 load balancer (e.g. Atlas Serverless).
connectTimeoutMS          milliseconds                                                       Connection timeout (edg default: 10000).
serverSelectionTimeoutMS  milliseconds                                                       How long the driver waits for a suitable server (edg default: 10000).

Example with majority write concern and linearizable reads:

edg run \
  --driver mongodb \
  --url "mongodb://localhost:27017/mydb?w=majority&readConcernLevel=linearizable" \
  --config workload.yaml \
  -w 10 -d 5m

MongoDB write concern values#

w value     Meaning
0           Fire-and-forget. No acknowledgement from the server.
1           Acknowledged by primary only (default).
2, 3, …     Acknowledged by that many replica set members.
majority    Acknowledged by a majority of replica set members. Won’t be rolled back.
custom tag  Acknowledged by members matching a custom getLastErrorModes tag set.

MongoDB read concern values#

readConcernLevel value  Meaning
local                   Most recent data from the node (default for primary reads). May be rolled back.
available               Like local but for sharded clusters. It doesn’t wait for orphaned docs to be cleaned. Lowest latency.
majority                Only data acknowledged by a majority. Won’t be rolled back.
linearizable            Reflects all successful majority writes before the read. Single-document only, primary only. Highest consistency.
snapshot                Point-in-time snapshot across a transaction. Requires w=majority. Transactions only.

For consistency testing, w=majority + readConcernLevel=majority is the common combination. snapshot is stronger but only works inside transactions. Use --retries 3 to handle transient WriteConflict errors under contention.
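Because all MongoDB tuning goes through the connection string, a small helper can keep the parameters readable. The sketch below assembles a URL from the parameters named in the tables above and pairs it with --retries 3. The helper names mongo_url and run_consistency_test are hypothetical, and the host, database name, and workload.yaml are placeholders.

```shell
# Hypothetical helper: append key=value parameters to a MongoDB base URL.
# Assumes the base URL does not already carry a query string.
mongo_url() {
  base=$1
  shift
  query=""
  for param in "$@"; do
    query="${query:+$query&}$param"
  done
  if [ -n "$query" ]; then
    echo "$base?$query"
  else
    echo "$base"
  fi
}

# Majority writes, majority reads, and retries for transient WriteConflict
# errors. Host, database, and config path are placeholders.
run_consistency_test() {
  edg run \
    --driver mongodb \
    --url "$(mongo_url "mongodb://localhost:27017/mydb" w=majority readConcernLevel=majority)" \
    --config workload.yaml \
    --retries 3 \
    -w 10 -d 5m
}
```

Calling run_consistency_test starts the run; mongo_url on its own just prints the assembled connection string.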

Every flag with an env var listed above can be set via the environment instead of the command line. Flags take precedence over environment variables, which take precedence over defaults.

export EDG_URL="postgres://root@localhost:26257?sslmode=disable"
export EDG_DRIVER="pgx"
export EDG_CONFIG="workload.yaml"

# No need to pass --url, --driver, or --config:
edg run -w 10 -d 5m

# Flags override env vars when both are set:
edg run -w 10 -d 5m --driver mysql --url "user:pass@tcp(localhost:3306)/db"
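The precedence order (flag, then environment variable, then default) maps onto nested shell parameter expansion. The sketch below illustrates the rule only, not how edg resolves settings internally; resolve_setting is a hypothetical name, and an empty string stands in for "not set".

```shell
# Hypothetical illustration of the precedence rule:
# flag value, else environment variable, else built-in default.
resolve_setting() {
  flag_value=$1
  env_value=$2
  default_value=$3
  echo "${flag_value:-${env_value:-$default_value}}"
}

resolve_setting ""      ""        "pgx"   # prints "pgx"     (default)
resolve_setting ""      "mongodb" "pgx"   # prints "mongodb" (env var beats default)
resolve_setting "mysql" "mongodb" "pgx"   # prints "mysql"   (flag beats env var)
```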

Example#

Database#

Run each lifecycle command individually against a database, or use all to run the entire sequence in one invocation.

edg up \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg seed \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg run \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable" \
-w 100 \
-d 1m

edg deseed \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

edg down \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

Or use all to run the entire workflow in one command:

edg all \
--driver pgx \
--config _examples/tpcc/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable" \
-w 100 \
-d 5m

Aurora DSQL#

The dsql driver uses AWS IAM authentication instead of a username and password. Pass the cluster endpoint as the --url value:

edg all \
--driver dsql \
--config workload.yaml \
--url "clusterid.dsql.us-east-1.on.aws" \
-w 10 \
-d 5m

AWS credentials are resolved from the standard chain (environment variables, ~/.aws/credentials, IAM role, etc.). The region is parsed from the cluster endpoint automatically. Auth tokens are refreshed on every new connection, so long-running workloads work without interruption.
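The endpoint in the example has the shape <cluster>.dsql.<region>.on.aws, which is why the region can be read from it. A rough shell illustration, assuming that shape holds for the endpoint passed in --url (this is not edg's implementation, and dsql_region is a hypothetical name):

```shell
# Hypothetical illustration: the region is the third dot-separated field
# of an endpoint shaped like <cluster>.dsql.<region>.on.aws.
dsql_region() {
  echo "$1" | cut -d. -f3
}

dsql_region "clusterid.dsql.us-east-1.on.aws"   # prints "us-east-1"
```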

DSQL uses PostgreSQL-compatible SQL, so use $1, $2 placeholders in your queries.

Licensing#

The pgx, mysql, mongodb, and cassandra drivers are free to use. Pro drivers (oracle, mssql, dsql, spanner) require a license key passed via --license or EDG_LICENSE. The license is validated before connecting to the database. See the Pricing page for full details.

Validating Config#

The validate config command parses a config file and checks it for errors without connecting to a database. It catches YAML syntax errors, invalid expressions, unknown function calls, duplicate query names, shadowed built-ins, and invalid query types. Errors include YAML line numbers when available.

edg validate config --config _examples/tpcc/crdb.yaml
config is valid

Add --explain for enhanced error messages with explanations and correct syntax examples:

edg validate config --config workload.yaml --explain
line 42: duplicate query name "seed_users"

  Query names must be unique across all sections. They serve as dataset keys
  for ref_* functions and as metric labels. Rename one of the duplicates.

This is useful for catching mistakes before deploying a workload or as a CI check.
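As a CI check, the command can be run over every config in a repository. The sketch below assumes edg validate config exits with a non-zero status when a config is invalid, which the text above implies but does not state; validate_configs is a hypothetical name.

```shell
# Hypothetical CI helper: validate each config file given as an argument.
# Assumes `edg validate config` exits non-zero for an invalid config.
validate_configs() {
  failed=0
  for cfg in "$@"; do
    if edg validate config --config "$cfg"; then
      echo "ok:      $cfg"
    else
      echo "invalid: $cfg"
      failed=1
    fi
  done
  return "$failed"
}
```

A pipeline step could then call, for example, validate_configs _examples/tpcc/crdb.yaml workload.yaml and fail the build when the function returns non-zero.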

Validating a License#

The validate license command checks whether a license key is valid for a given driver and prints the license details.

edg validate license --driver oracle --license "your-license-key"
License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
License is valid for driver "oracle".

If the driver doesn’t require a license, the output tells you:

edg validate license --driver pgx --license "your-license-key"
License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
Driver "pgx" does not require a license.

If the license is expired or doesn’t cover the requested driver, you’ll see an error:

edg validate license --driver dsql --license "your-license-key"
License info:
  ID:         acme-corp
  Email:      admin@acme.com
  Drivers:    [oracle mssql]
  Issued at:  2025-01-15
  Expires at: 2026-01-15
Error: license does not include driver "dsql" (licensed: [oracle mssql])

The EDG_LICENSE environment variable is also accepted:

export EDG_LICENSE="your-license-key"
edg validate license --driver oracle

Retries#

The --retries flag controls how many times a failed transaction is retried before the error is recorded. The default is 0 (no retries). Retries only apply to transactions (queries wrapped in a transaction: block), not standalone queries.

When a transaction fails, edg waits with exponential backoff before retrying:

Attempt    Backoff
1st retry  1ms
2nd retry  2ms
3rd retry  4ms
4th retry  8ms
nth retry  2^(n-1) ms

If all retry attempts fail, the last error is recorded in the stats and the worker continues to the next iteration. Context cancellation (e.g. Ctrl+C or duration expiry) stops retries immediately.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --retries 3 \
  -w 10 \
  -d 5m
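The doubling schedule can be written as a one-line shell calculation. This sketch assumes the first retry waits 1ms, following the 1ms, 2ms, 4ms, … sequence in the --retries flag description. The name backoff_ms is hypothetical, the optional second argument (the first-retry delay) is added for illustration, and the code is not taken from edg.

```shell
# Hypothetical illustration: backoff in milliseconds before the nth retry.
# The delay starts at BASE ms (assumed to be 1) and doubles on every retry.
backoff_ms() {
  base=${2:-1}
  echo $((base << ($1 - 1)))
}

for n in 1 2 3; do
  echo "retry $n: $(backoff_ms "$n")ms"
done
# prints:
# retry 1: 1ms
# retry 2: 2ms
# retry 3: 4ms
```

Under that assumption, a transaction that fails every attempt with --retries 3 spends 1 + 2 + 4 = 7ms in backoff before its error is recorded.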

Error Output#

By default, individual query errors during the run phase are counted but not printed. The --errors flag prints each error to stderr as it occurs, which is useful for debugging:

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --errors \
  -w 10 \
  -d 5m
2025/04/23 14:32:07 ERROR run error worker=3 error="running run query debit_source: pq: insufficient funds"
2025/04/23 14:32:07 ERROR run error worker=7 error="running run query debit_source: pq: insufficient funds"

Without --errors, the same failures still appear in the summary table’s ERRORS column and count toward error_rate in expectations.

Connection Pool#

The --pool-size flag sets the maximum number of open database connections (SetMaxOpenConns and SetMaxIdleConns). The default 0 uses the driver’s default, which is typically unlimited.

Setting pool size is useful for:

  • Simulating constrained environments where the application has a fixed connection budget.
  • Preventing connection exhaustion when running with many workers against a database with connection limits.
  • Isolating connection overhead from query performance in benchmarks.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --pool-size 20 \
  -w 50 \
  -d 5m

In this example, 50 workers share 20 connections. Workers that can’t acquire a connection will block until one becomes available.

Warmup#

The --warmup-duration flag runs workers for a specified period before collecting metrics. During warmup, query results are discarded. They don’t appear in progress output, the summary, Prometheus metrics, or expectations.

This produces cleaner benchmark results by allowing the database to warm its caches, JIT-compile query plans, and reach a steady state before measurement begins.

edg run \
  --driver pgx \
  --config workload.yaml \
  --url "${DATABASE_URL}" \
  --warmup-duration 30s \
  -w 10 \
  -d 5m

In this example, workers run for 30 seconds of warmup (discarded), then 5 minutes of measured execution. The total wall-clock time is 5m30s.

When using stages, warmup applies before the first stage begins collecting metrics.

Run Behaviour#

Workers and Initialisation#

Each worker gets its own isolated environment. The init section runs once, and its results are cloned to each worker so that functions like ref_rand and ref_diff don’t interfere across workers. Per-worker state includes sequence counters (seq), permanent row picks (ref_perm), and NURand constants.

Stages#

When a config file includes a stages section, each stage defines its own worker count and duration, and stages run sequentially. Explicitly passing -w or -d overrides the stages section and falls back to single-stage mode. See Configuration > Stages for details.

edg run \
--driver pgx \
--config _examples/stages/crdb.yaml \
--url "postgres://root@localhost:26257?sslmode=disable"

Error Handling#

Query errors during run are non-fatal. The worker counts the error (and prints it to stderr when --errors is set), then continues to the next iteration. This lets you observe error rates without aborting the benchmark. Errors in other sections (up, seed, deseed, down, init) are fatal and stop execution immediately.

Interrupting with Ctrl+C#

Pressing Ctrl+C during run or all cancels the workload gracefully. Workers finish their current iteration and stop. When using all, the cleanup phases (deseed and down) still run after interruption, using a fresh context.

Output#

During the run, progress is printed at the --print-interval (default: every second):

59s / 1m0s
QUERY          COUNT  ERRORS  AVG      p50      p95      p99      QPS
check_balance  3674   0       2.631ms  2.367ms  4.154ms  6.252ms  62.3
credit_target  3769   0       1.68ms   1.495ms  2.624ms  3.911ms  63.9
debit_source   3769   0       2.376ms  2.13ms   3.722ms  5.288ms  63.9
read_source    3770   0       2.047ms  1.803ms  3.254ms  5.052ms  63.9
read_target    3769   0       2.839ms  2.579ms  4.486ms  6.446ms  63.9

TRANSACTION    COMMITS  ROLLBACKS  ERRORS  AVG       p50       p95       p99       TPS
make_transfer  3769     0          0       13.053ms  12.424ms  18.498ms  26.074ms  63.9

After all workers complete, a final summary is printed:

summary
Duration:  1m0.004s
Workers:   1

QUERY          COUNT  ERRORS  AVG      p50      p95      p99      QPS
check_balance  3749   0       2.628ms  2.362ms  4.14ms   6.249ms  62.5
credit_target  3828   0       1.681ms  1.497ms  2.624ms  3.911ms  63.8
debit_source   3828   1       2.381ms  2.13ms   3.724ms  5.338ms  63.8
read_source    3829   0       2.046ms  1.802ms  3.25ms   5.052ms  63.8
read_target    3829   0       2.843ms  2.583ms  4.485ms  6.446ms  63.8

TRANSACTION    COMMITS  ROLLBACKS  ERRORS  AVG       p50       p95       p99       TPS
make_transfer  3828     0          1       13.063ms  12.438ms  18.498ms  26.652ms  63.8

Transactions:  19063
Errors:        1
tpm:           19061.6

Metric  Description
COUNT   Total successful query executions
ERRORS  Total failed query executions
AVG     Mean execution time per query
p50     Median latency (50th percentile)
p95     95th percentile latency
p99     99th percentile latency
QPS     Queries per second (count / elapsed seconds)
tpm     Transactions per minute across all queries
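The QPS column can be reproduced from the sample summary with the formula given for QPS (count / elapsed seconds). A small check using awk, with the counts and the 1m0.004s duration taken from the sample output; the qps function name is hypothetical.

```shell
# QPS = count / elapsed seconds, rounded to one decimal place.
qps() {
  awk -v count="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", count / secs }'
}

qps 3749 60.004   # check_balance: prints 62.5
qps 3828 60.004   # credit_target: prints 63.8
```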

Expectations#

When the config file includes an expectations section, results are printed after the summary and the exit code reflects whether all expectations passed:

expectations
  PASS  error_rate < 1
  PASS  check_balance.p99 < 100
  FAIL  tpm > 5000

1 expectation(s) failed

If any expectation fails, edg exits with status code 1. When using all, teardown (deseed and down) still runs before the non-zero exit.
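In a CI pipeline the exit status is usually all that is needed to gate a build. A minimal wrapper, assuming edg is on the PATH and the config defines an expectations section; run_gate is a hypothetical name.

```shell
# Hypothetical CI gate: run the full lifecycle and propagate edg's exit status.
run_gate() {
  if edg all "$@"; then
    echo "all expectations passed"
  else
    status=$?
    echo "edg exited with status $status" >&2
    return "$status"
  fi
}
```

A pipeline step might call run_gate --driver pgx --config workload.yaml --url "$DATABASE_URL" -w 10 -d 5m and fail the job on a non-zero return. A non-zero status can also come from a fatal error in another phase, so the wrapper reports the status rather than assuming which expectation failed.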

See Configuration > Expectations for the full list of available metrics and expression syntax.