Regtrace

CLI Reference

Complete command-line reference for regtrace

Synopsis

regtrace <command> [options]

Global options

Commands

init

regtrace init

Scaffold a new Regtrace project:

  • Creates regtrace.config.yaml with defaults
  • Creates golden-sets/qa.yaml with sample test cases
  • Creates .gitignore with .regtrace/ and .env*
  • Creates .env.example with commented API key placeholders
  • Creates .regtrace/runs/ directory

The metrics.tone config block is required and must always be present; regtrace init populates it. To skip tone evaluation, disable all of its sub-dimensions instead of removing the block.

Use --force to overwrite existing files:

regtrace init --force

run

regtrace run [options]

Run evaluation on all enabled golden sets.

--format and --trigger values are validated at startup. Invalid values produce an error and exit code 2. Human-readable output goes to stderr; JSON output (--format json) goes to stdout for piping.
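Because the JSON report and the human-readable output go to different streams, the two can be captured separately. The sketch below is a small shell helper; the name split_streams and the file arguments are illustrative and not part of regtrace. It saves stdout and stderr to separate files and preserves the exit code of the wrapped command.

```shell
# Hypothetical helper, not part of regtrace: run a command, save its stdout
# to one file and its stderr to another, and return the command's exit code.
split_streams() {
  out_file=$1
  err_file=$2
  shift 2
  # The function's status is the status of this last command.
  "$@" >"$out_file" 2>"$err_file"
}

# Intended use (assumes regtrace is on PATH):
#   split_streams report.json run.log regtrace run --format json
```

With --format json, report.json would then hold only the JSON document, so it can be passed to other tools without filtering out log lines.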

Dry-run mode

regtrace run --dry-run validates the config, golden sets, and schema without executing any evaluations or spending tokens. It completes in under two seconds and is the recommended way to verify setup before a full run.
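One way to apply this is to gate the full run on the dry run. The wrapper below is a minimal sketch: checked_run and the REGTRACE_BIN override are hypothetical names introduced here for illustration, and only the --dry-run flag comes from this reference.

```shell
# Hypothetical wrapper, not part of regtrace: validate first, then evaluate
# only if validation succeeded. REGTRACE_BIN is an assumed override that
# makes the wrapper testable; it defaults to the regtrace binary on PATH.
checked_run() {
  bin=${REGTRACE_BIN:-regtrace}
  # Stop early, passing through whatever non-zero code the dry run returned.
  "$bin" run --dry-run || return $?
  # Forward any extra arguments to the real run.
  "$bin" run "$@"
}

# Intended use:
#   checked_run --verbose
```

This way a broken config is caught before any evaluation starts or any tokens are spent.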

CI mode

--ci suppresses color output and exits with code 1 when quality gates fail. CI auto-detection checks CI, GITHUB_ACTIONS, GITLAB_CI, and CIRCLECI environment variables. Use --no-ci to override auto-detection.
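To check what auto-detection is likely to see in a given environment, the same four variables can be inspected from the shell. This is a sketch, not regtrace's actual implementation: the function name looks_like_ci is hypothetical, and it assumes a variable counts as set when it is non-empty, which this reference does not specify.

```shell
# Sketch of CI auto-detection; not regtrace's actual implementation.
# Assumption: a variable counts as "set" when it is non-empty.
looks_like_ci() {
  [ -n "${CI:-}" ] || [ -n "${GITHUB_ACTIONS:-}" ] ||
    [ -n "${GITLAB_CI:-}" ] || [ -n "${CIRCLECI:-}" ]
}

# Intended use:
#   if looks_like_ci; then echo "CI environment detected"; fi
```

If this reports a CI environment on a developer machine (for example because CI is exported in a shell profile), --no-ci is the documented way to override detection.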

Verbose mode

By default, regtrace run only shows failed test cases. Pass --verbose to list all test cases including passing ones.

Generate mode

regtrace run --generate calls an LLM provider to produce actual_output for test cases where it is null in the golden set, then evaluates normally.

The generator defaults to the judge.primary provider. Override with an optional generator block in the config file (see config file reference). Generated output is stored in the run record only; the golden set YAML is never modified.

scaffold

regtrace scaffold [options]

Scaffold golden sets from existing run records or output files. Generates a complete golden set YAML with auto-populated expected_output and auto-detected metrics (JSON output → format sub-checks, prose → tone).

Examples:

# Scaffold from a run record, print to stdout
regtrace scaffold --from-run run_20260603_a1b2c3

# Scaffold from a JSONL file and save
regtrace scaffold --from-file outputs.jsonl --write

# Scaffold from JSON with custom name
regtrace scaffold --from-file outputs.json --name my-set --write

list

regtrace list [options]

List recent evaluation runs.

history

regtrace history [options]

Show detailed run information.

watch

regtrace watch [options]

Watch golden set files for changes and re-run evaluation.

baseline

regtrace baseline <subcommand>

Manage regression baselines.

Subcommand     Args            Description
pin <run-id>   Run ID to pin   Pin baseline to a specific run
unpin          (none)          Revert to last_passing strategy
show           (none)          Display current baseline info

db

regtrace db rebuild [options]

Manage the SQLite run record database.

Subcommand   Description
rebuild      Rebuild database from .regtrace/runs/ JSON files

Rebuilding imports all existing JSON run records into the SQLite database. Existing records are replaced. Corrupt files are skipped.

upgrade

regtrace upgrade [options]

Upgrade the regtrace binary to the latest GitHub release. Checks your current version against the latest stable release, downloads the matching binary for your platform (linux-x64, darwin-arm64, or windows-x64), verifies its SHA256 checksum, and replaces the running binary in-place.

The old binary is backed up to .backup before the swap. If the new binary fails verification, the backup is restored automatically. If regtrace was installed with sudo, run sudo regtrace upgrade.

uninstall

regtrace uninstall [options]

Remove the regtrace binary from your system. Only the binary is deleted — project files (configs, golden sets, run history) are left untouched.

On Linux and macOS the binary is removed immediately. On Windows a background script deletes it after the process exits.

Quality gates

Quality gates are checked after every run and determine whether the run passes or fails.

See config file reference for quality gate options and defaults.

Exit codes

Code   Meaning
0      All quality gates passed
1      One or more quality gates failed
2      Config or schema error; evaluation did not run

Config and schema errors are distinguished from evaluation failures so that a pipeline can treat a quality regression (exit code 1) differently from a broken setup (exit code 2).
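A pipeline script might branch on these codes roughly as follows. The helper name describe_exit and its message strings are illustrative; only the three code meanings come from the table above.

```shell
# Hypothetical helper, not part of regtrace: translate an exit code from
# regtrace run into a short status line a pipeline can log or match on.
describe_exit() {
  case "$1" in
    0) echo "pass: all quality gates passed" ;;
    1) echo "fail: one or more quality gates failed" ;;
    2) echo "error: config or schema problem, evaluation did not run" ;;
    *) echo "unknown: unexpected exit code $1" ;;
  esac
}

# Intended use (assumes regtrace is on PATH):
#   regtrace run --ci; describe_exit $?
```

Treating any code outside 0 to 2 as unknown keeps the script from silently reporting success if the process is killed or crashes.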
