Scaffold PRO

The edg scaffold command is an interactive wizard that generates a complete workload config. It walks you through driver selection, table names, seed row counts, and column definitions, then writes the config to stdout. Redirect the output to save it to a file:

edg scaffold > workload.edg

Walkthrough

The wizard prompts for four things:

Database driver - select from pgx, mysql, mssql, oracle, mongodb, cassandra, spanner, or dsql.
Table names - comma-separated list of tables to generate (e.g. users, orders, products).
Seed row counts - comma-separated list of counts, one per table, in the same order as the table names (e.g. 10000, 5000, 1000). Tables without a count default to 10000.
Columns per table - for each table, enter comma-separated name:type pairs (e.g. id:uuid, email:text, age:int). Leave blank to use the defaults (id:uuid, name:text).
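
To make the defaulting rules above concrete, here is a minimal Python sketch of how answers in this shape could be parsed. It is not edg's implementation: the function names (split_csv, parse_columns, parse_answers) are hypothetical, and it assumes row counts are matched to tables by position.

```python
DEFAULT_ROWS = 10000
DEFAULT_COLUMNS = [("id", "uuid"), ("name", "text")]


def split_csv(answer):
    """Split a comma-separated answer, trimming whitespace and dropping blanks."""
    return [part.strip() for part in answer.split(",") if part.strip()]


def parse_columns(answer):
    """Turn 'id:uuid, email:text' into [('id', 'uuid'), ('email', 'text')]."""
    pairs = []
    for item in split_csv(answer):
        name, _, col_type = item.partition(":")
        pairs.append((name.strip(), col_type.strip()))
    # A blank answer falls back to the documented defaults.
    return pairs or list(DEFAULT_COLUMNS)


def parse_answers(tables, row_counts, columns_by_table):
    """Combine the three free-text answers into one spec per table."""
    names = split_csv(tables)
    counts = [int(c) for c in split_csv(row_counts)]
    spec = {}
    for i, name in enumerate(names):
        spec[name] = {
            # Assumption: counts pair with tables by position.
            "rows": counts[i] if i < len(counts) else DEFAULT_ROWS,
            "columns": parse_columns(columns_by_table.get(name, "")),
        }
    return spec


spec = parse_answers(
    "users, orders, products",
    "5000, 10000",
    {"users": "id:uuid, email:text", "orders": "id:uuid, total:float"},
)
```

Here products has neither a row count nor a column list, so it ends up with 10000 rows and the default id:uuid, name:text columns.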

Flags

Flag	Default	Description
`--format`	`edg-lang`	Output format (`edg-lang` or `yaml`)

Supported column types

Type	SQL Type (pgx)	Expression
`uuid`	`UUID`	`uuid_v4()`
`text`	`TEXT`	`gen('name')`
`int`	`INT`	`uniform(1, 10000)`
`float`	`DOUBLE PRECISION`	`uniform(0.0, 100.0)`
`bool`	`BOOLEAN`	`bool()`
`timestamp`	`TIMESTAMPTZ`	`timestamp('2024-01-01T00:00:00Z', '2025-01-01T00:00:00Z')`

SQL types are driver-aware. For example, uuid becomes CHAR(36) on MySQL and UNIQUEIDENTIFIER on MSSQL.
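
One way to picture the driver-aware lookup is a base table plus per-driver overrides. The Python sketch below encodes only the mappings stated on this page (the pgx column and the two uuid overrides). The function name sql_type is hypothetical, and the sketch raises for any combination the page does not document rather than guessing.

```python
# SQL types for the pgx driver, copied from the table above.
PGX_TYPES = {
    "uuid": "UUID",
    "text": "TEXT",
    "int": "INT",
    "float": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
    "timestamp": "TIMESTAMPTZ",
}

# The only per-driver overrides mentioned on this page.
OVERRIDES = {
    ("mysql", "uuid"): "CHAR(36)",
    ("mssql", "uuid"): "UNIQUEIDENTIFIER",
}


def sql_type(driver, col_type):
    """Return the documented SQL type for a scaffold column type."""
    if (driver, col_type) in OVERRIDES:
        return OVERRIDES[(driver, col_type)]
    if driver == "pgx":
        return PGX_TYPES[col_type]
    # Other driver/type pairs are not listed here, so refuse to guess.
    raise LookupError(f"no documented mapping for {col_type!r} on {driver!r}")
```

For example, sql_type("mysql", "uuid") returns CHAR(36), while sql_type("mysql", "text") raises because this page does not say what MySQL uses for text.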

Generated Config

The output includes all standard sections:

Section	Purpose
`globals`	Per-table row count and batch size
`objects`	Column expressions for each table (SQL drivers only)
`up`	`CREATE TABLE` statements (or MongoDB `create` commands)
`seed`	Batch insert queries using `__columns__` and `__values__`
`init`	`SELECT` queries to fetch seeded data for `ref_*` access
`run`	Point read queries using `ref_rand`
`deseed`	`TRUNCATE` statements
`down`	`DROP TABLE` statements
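
As a rough illustration of how one table fans out across these sections, the Python sketch below builds the per-section statements for a single table on pgx. The statement shapes are copied from the Example section. The function name statements_for is hypothetical, and the run query assumes the table has an id column, as the example tables do.

```python
# SQL types for the pgx driver, as listed under "Supported column types".
PGX_TYPES = {
    "uuid": "UUID",
    "text": "TEXT",
    "int": "INT",
    "float": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
    "timestamp": "TIMESTAMPTZ",
}


def statements_for(table, columns):
    """Return one statement per generated section for a single table."""
    column_lines = ",\n".join(
        f"  {name} {PGX_TYPES[col_type]}" for name, col_type in columns
    )
    return {
        "up": f"CREATE TABLE IF NOT EXISTS {table} (\n{column_lines}\n)",
        "seed": f"INSERT INTO {table} __columns__ __values__",
        "init": f"SELECT * FROM {table} LIMIT 1000",
        # Assumption: every table has an id column to read by.
        "run": f"SELECT * FROM {table} WHERE id = $1",
        "deseed": f"TRUNCATE TABLE {table}",
        "down": f"DROP TABLE IF EXISTS {table}",
    }


users = statements_for("users", [("id", "uuid"), ("email", "text")])
print(users["up"])
```

The globals and objects sections are left out of the sketch because they hold values and expressions rather than statements.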

Example

edg scaffold > workload.edg
# Select: pgx
# Tables: users, orders
# Row counts: 5000, 10000
# users columns: id:uuid, email:text
# orders columns: id:uuid, total:float

Output defaults to edg-lang. Use --format yaml for YAML:

edg scaffold --format yaml > workload.yaml

The answers above produce the following config, shown first as edg-lang and then as YAML:

edg-lang

let users_rows = 5000
let orders_rows = 10000
let batch_size = 1000

object users {
  id = uuid_v4()
  email = gen('name')
}

object orders {
  id = uuid_v4()
  total = uniform(0.0, 100.0)
}

up {
  create_users `CREATE TABLE IF NOT EXISTS users (
    id UUID,
    email TEXT
  )`
  create_orders `CREATE TABLE IF NOT EXISTS orders (
    id UUID,
    total DOUBLE PRECISION
  )`
}

seed {
  seed_users(type: exec_batch, count: users_rows, size: batch_size, object: users)
    `INSERT INTO users __columns__ __values__`
  seed_orders(type: exec_batch, count: orders_rows, size: batch_size, object: orders)
    `INSERT INTO orders __columns__ __values__`
}

init {
  fetch_users `SELECT * FROM users LIMIT 1000`
  fetch_orders `SELECT * FROM orders LIMIT 1000`
}

run {
  read_users
    `SELECT * FROM users WHERE id = $1` (ref_rand('fetch_users').id)
  read_orders
    `SELECT * FROM orders WHERE id = $1` (ref_rand('fetch_orders').id)
}

deseed {
  clean_users `TRUNCATE TABLE users`
  clean_orders `TRUNCATE TABLE orders`
}

down {
  drop_users `DROP TABLE IF EXISTS users`
  drop_orders `DROP TABLE IF EXISTS orders`
}

YAML

globals:
  users_rows: 5000
  orders_rows: 10000
  batch_size: 1000

objects:
  users:
    id: uuid_v4()
    email: gen('name')
  orders:
    id: uuid_v4()
    total: uniform(0.0, 100.0)

up:
  - name: create_users
    query: |-
      CREATE TABLE IF NOT EXISTS users (
        id UUID,
        email TEXT
      )
  - name: create_orders
    query: |-
      CREATE TABLE IF NOT EXISTS orders (
        id UUID,
        total DOUBLE PRECISION
      )

seed:
  - name: seed_users
    type: exec_batch
    count: users_rows
    size: batch_size
    object: users
    query: |-
      INSERT INTO users __columns__ __values__
  - name: seed_orders
    type: exec_batch
    count: orders_rows
    size: batch_size
    object: orders
    query: |-
      INSERT INTO orders __columns__ __values__

init:
  - name: fetch_users
    query: |-
      SELECT * FROM users LIMIT 1000
  - name: fetch_orders
    query: |-
      SELECT * FROM orders LIMIT 1000

run:
  - name: read_users
    query: |-
      SELECT * FROM users WHERE id = ${ref_rand('fetch_users').id}
  - name: read_orders
    query: |-
      SELECT * FROM orders WHERE id = ${ref_rand('fetch_orders').id}

deseed:
  - name: clean_users
    query: |-
      TRUNCATE TABLE users
  - name: clean_orders
    query: |-
      TRUNCATE TABLE orders

down:
  - name: drop_users
    query: |-
      DROP TABLE IF EXISTS users
  - name: drop_orders
    query: |-
      DROP TABLE IF EXISTS orders
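
The init and run sections work as a pair: init fetches rows once, and each run query picks one of them via ref_rand. The Python sketch below is only a mental model of that relationship, assuming ref_rand picks uniformly from the rows returned by the named init query; edg's actual semantics may differ. The sample rows are invented for illustration.

```python
import random

# Rows as an init query such as fetch_users might return them.
# The values are made up for this illustration.
init_results = {
    "fetch_users": [
        {"id": "user-1", "email": "Ada"},
        {"id": "user-2", "email": "Grace"},
        {"id": "user-3", "email": "Linus"},
    ],
}


def ref_rand(query_name, rng=random):
    """Pick one previously fetched row at random (assumed behaviour)."""
    return rng.choice(init_results[query_name])


# Mirrors ref_rand('fetch_users').id in the generated run section:
# the chosen row's id becomes the argument of the point read.
row = ref_rand("fetch_users")
query, args = "SELECT * FROM users WHERE id = $1", (row["id"],)
```

Under this model the LIMIT 1000 in the init queries bounds the pool of ids that point reads can hit.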

MongoDB

When the mongodb driver is selected, the scaffold generates JSON commands instead of SQL and omits the objects section (MongoDB doesn’t use __columns__/__values__):

edg-lang

up {
  create_orders `{"create": "orders"}`
}

seed {
  seed_orders(type: exec_batch, count: orders_rows, size: batch_size, object: orders)
    `{"insert": "orders", "documents": [{"id": $1, "total": $2}]}`
}

YAML

up:
  - name: create_orders
    query: |-
      {"create": "orders"}

seed:
  - name: seed_orders
    type: exec_batch
    count: orders_rows
    size: batch_size
    object: orders
    query: |-
      {"insert": "orders", "documents": [{"id": $1, "total": $2}]}

Always validate generated configs before running them: edg validate config --config workload.edg

Next Steps

After generating a config:

Review and customise - add foreign key relationships, transactions, run weights, or more complex expressions.
Validate - run edg validate config --config workload.edg to check for errors.
Test - use edg repl --config workload.edg to try expressions interactively.
Run - execute edg up && edg seed && edg run against your database.
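
If you want to script the validate and run steps, a small wrapper can stop at the first failure. The Python sketch below uses only the commands shown above, copied as written; any connection or config flags your setup needs for edg up, edg seed, and edg run are not shown on this page and are left out. The helper names (next_step_commands, run_pipeline) are hypothetical.

```python
import subprocess


def next_step_commands(config):
    """Return the commands from the steps above, in order."""
    return [
        ["edg", "validate", "config", "--config", config],
        ["edg", "up"],
        ["edg", "seed"],
        ["edg", "run"],
    ]


def run_pipeline(config, runner=subprocess.run):
    """Run each command in turn; check=True aborts on the first failure."""
    for cmd in next_step_commands(config):
        runner(cmd, check=True)
```

Calling run_pipeline("workload.edg") validates first, so an invalid config never reaches the database. The runner parameter exists so the sequence can be exercised without edg installed.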