Getting Started
Install Regtrace and run your first evaluation in 5 minutes
This tutorial walks you through installing Regtrace, creating a project, and running your first evaluation.
Prerequisites
- Linux, macOS, or Windows
- No runtime dependencies — the binary is self-contained
Install
Linux (x64):
curl -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-linux-x64 -o regtrace && chmod +x regtrace && sudo mv regtrace /usr/local/bin/regtrace && regtrace --version
macOS (Apple Silicon):
curl -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-darwin-arm64 -o regtrace && chmod +x regtrace && sudo mv regtrace /usr/local/bin/regtrace && regtrace --version
Windows (PowerShell):
cd $env:USERPROFILE
curl.exe -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-windows-x64.exe -o "$env:USERPROFILE\Downloads\regtrace.exe"
Move-Item "$env:USERPROFILE\Downloads\regtrace.exe" "$env:USERPROFILE\AppData\Local\Microsoft\WindowsApps\regtrace.exe"
regtrace --version
Note: WindowsApps is in PATH by default on most systems. If regtrace --version fails, move the binary to another directory in your %PATH%.
You can also build from source with Bun: clone the repo and
run bun run build. The binary is output to the repo root.
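The Linux and macOS install commands differ only in the release asset name, so a script can pick it from the machine it runs on. The following is a minimal sketch, not part of Regtrace: regtrace_asset is a hypothetical helper, and it only knows the two Unix asset names that appear on this page. Whether assets exist for other platforms is not stated here, so every other combination is reported as an error.

```shell
# Hypothetical helper: map an OS name and CPU architecture (as printed by
# `uname -s` and `uname -m`) to one of the release asset names used above.
regtrace_asset() {
  os="$1"
  arch="$2"
  case "$os/$arch" in
    Linux/x86_64) echo "regtrace-linux-x64" ;;
    Darwin/arm64) echo "regtrace-darwin-arm64" ;;
    *)
      # Anything else is not covered by this page, so fail loudly.
      echo "no known regtrace asset for $os/$arch" >&2
      return 1
      ;;
  esac
}

# Usage (left commented out because it performs a download):
# asset="$(regtrace_asset "$(uname -s)" "$(uname -m)")" &&
#   curl -L "https://github.com/decimozs/regtrace/releases/latest/download/$asset" -o regtrace
```

Passing the uname values as arguments, rather than calling uname inside the function, keeps the mapping easy to check on any machine.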
Create a project
mkdir my-eval-project
cd my-eval-project
regtrace init
This creates:
- regtrace.config.yaml: project configuration
- golden-sets/qa.yaml: a sample golden set with two test cases
- .gitignore: with .regtrace/runs/ and .env entries
- .env.example: template with commented API key placeholders
- .regtrace/runs/: local run storage
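To confirm that the scaffold was created as listed, a quick check along these lines may help. It is a sketch under the assumption that the paths are exactly those named above. regtrace_missing_files is a hypothetical helper, not a Regtrace command.

```shell
# Hypothetical helper: print each expected scaffold path that is missing
# under the given directory (default: current directory).
# Returns non-zero if anything is missing.
regtrace_missing_files() {
  dir="${1:-.}"
  missing=0
  for path in regtrace.config.yaml golden-sets/qa.yaml .gitignore .env.example .regtrace/runs; do
    if [ ! -e "$dir/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, from inside the project directory:
# regtrace_missing_files . && echo "scaffold looks complete"
```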
Inspect the golden set
Open golden-sets/qa.yaml. It contains two test cases:
test_cases:
  - id: qa-001
    description: Basic question about the capital of France
    input: "What is the capital of France?"
    expected_output: "The capital of France is Paris."
    actual_output: null
    metrics: [factuality, format, tone]
    weight: 1
The actual_output field is null because it will be populated when you run an evaluation with an LLM provider.
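Before running an evaluation it can be useful to see how many cases still have a null actual_output. The sketch below counts them with a plain-text match on the field as it appears in the sample file. It is not a YAML parser and would miss other spellings of null such as ~. count_null_outputs is a hypothetical helper name.

```shell
# Hypothetical helper: count lines of the form "actual_output: null"
# in a golden set file. Prints the count on stdout.
# Note: grep exits non-zero when the count is 0, but still prints "0".
count_null_outputs() {
  grep -c '^[[:space:]]*actual_output:[[:space:]]*null[[:space:]]*$' "$1"
}

# Usage:
# count_null_outputs golden-sets/qa.yaml
```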
Set up a judge provider
Regtrace needs an LLM to judge factuality and tone metrics. Set your API key
in a .env file in the project directory:
echo "GROQ_API_KEY=gsk_..." > .env
Or export it:
export GROQ_API_KEY=gsk_...
For other providers, see switching providers.
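A missing key is an easy mistake to make, so a pre-flight check can save a failed run. This sketch covers only the two forms shown above: an exported variable, or a GROQ_API_KEY=value line in a .env file. How Regtrace itself parses .env (quoting, an export prefix, and so on) is not described here. has_groq_key is a hypothetical helper.

```shell
# Hypothetical helper: succeed if GROQ_API_KEY is non-empty in the environment,
# or is assigned a non-empty value in the given .env file (default: ./.env).
has_groq_key() {
  env_file="${1:-.env}"
  if [ -n "${GROQ_API_KEY:-}" ]; then
    return 0
  fi
  # The trailing "." requires at least one character after the "=".
  [ -f "$env_file" ] && grep -q '^GROQ_API_KEY=.' "$env_file"
}

# Usage:
# has_groq_key || echo "GROQ_API_KEY is not set" >&2
```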
Populate the golden set with outputs
There are two ways to fill actual_output:
Option A (manual): Edit golden-sets/qa.yaml and fill in actual_output for both cases:
actual_output: "The capital of France is Paris."
Option B (auto-generate): Run with the --generate flag. Regtrace calls the configured LLM provider to produce actual_output for every case where it is null:
regtrace run --generate
Either way, Regtrace scores each output against the expected value.
Run evaluation
regtrace run
Terminal output shows:
- Each test case and its per-metric scores
- Suite-level summary with pass/fail status
- Quality gate results
View run history
regtrace list
Show details of a specific run:
regtrace history --run-id run_20260101_abc123
Next steps
- Create a golden set — write your own test cases
- Set up regression — track quality over time
- CI integration — add Regtrace to your pipeline