Getting Started

Install Regtrace and run your first evaluation in 5 minutes

This tutorial walks you through installing Regtrace, creating a project, and running your first evaluation.

Prerequisites

  • Linux, macOS, or Windows
  • No runtime dependencies — the binary is self-contained

Install

Linux (x64):

curl -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-linux-x64 -o regtrace && chmod +x regtrace && sudo mv regtrace /usr/local/bin/regtrace && regtrace --version

macOS (Apple Silicon):

curl -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-darwin-arm64 -o regtrace && chmod +x regtrace && sudo mv regtrace /usr/local/bin/regtrace && regtrace --version

Windows (PowerShell; curl.exe is used because plain curl is an alias for Invoke-WebRequest in Windows PowerShell 5.1):

cd $env:USERPROFILE
curl.exe -L https://github.com/decimozs/regtrace/releases/latest/download/regtrace-windows-x64.exe -o "$env:USERPROFILE\Downloads\regtrace.exe"
Move-Item "$env:USERPROFILE\Downloads\regtrace.exe" "$env:USERPROFILE\AppData\Local\Microsoft\WindowsApps\regtrace.exe"
regtrace --version

Note: WindowsApps is in PATH by default on most systems. If regtrace --version fails, move the binary to another directory in your %PATH%.

You can also build from source with Bun: clone the repo and run bun run build. The binary is output to the repo root.
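
The Linux and macOS install commands differ only in the release asset name, so a script can choose it from the output of uname. The following is a minimal sketch, not part of Regtrace: the function name regtrace_asset is hypothetical, and it maps only the two Unix assets named in this tutorial. Other OS and CPU combinations are reported as unsupported rather than guessed.

```shell
#!/bin/sh
# Map an OS name and CPU architecture (as printed by `uname -s` and
# `uname -m`) to a release asset name. Only the assets named in this
# tutorial are mapped; anything else fails instead of guessing.
regtrace_asset() {
  os=$1
  arch=$2
  case "$os/$arch" in
    Linux/x86_64) echo "regtrace-linux-x64" ;;
    Darwin/arm64) echo "regtrace-darwin-arm64" ;;
    *)
      echo "no known asset for $os/$arch" >&2
      return 1
      ;;
  esac
}

# Usage (not executed here):
#   asset=$(regtrace_asset "$(uname -s)" "$(uname -m)") &&
#     curl -L "https://github.com/decimozs/regtrace/releases/latest/download/$asset" -o regtrace
```

Failing on an unknown platform keeps the script from downloading a file that may not exist in the release.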

Create a project

mkdir my-eval-project
cd my-eval-project
regtrace init

This creates:

  • regtrace.config.yaml — project configuration
  • golden-sets/qa.yaml — a sample golden set with two test cases
  • .gitignore — with .regtrace/runs/ and .env entries
  • .env.example — template with commented API key placeholders
  • .regtrace/runs/ — local run storage
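
To confirm that initialization produced everything in the list above, a small check can be run from the shell. This is a sketch under the assumption that the paths are exactly as listed; check_scaffold is a hypothetical helper, not a Regtrace command.

```shell
#!/bin/sh
# Check that each path listed above exists under the given project
# directory (default: current directory). Prints every missing path
# and returns non-zero if any is absent.
check_scaffold() {
  dir=${1:-.}
  missing=0
  for path in regtrace.config.yaml golden-sets/qa.yaml .gitignore .env.example .regtrace/runs; do
    if [ ! -e "$dir/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not executed here):
#   check_scaffold my-eval-project
```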

Inspect the golden set

Open golden-sets/qa.yaml. It contains two test cases, for example:

test_cases:
  - id: qa-001
    description: Basic question about the capital of France
    input: "What is the capital of France?"
    expected_output: "The capital of France is Paris."
    actual_output: null
    metrics: [factuality, format, tone]
    weight: 1

The actual_output field is null because it is populated when you run an evaluation with an LLM provider.
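
To see how many test cases still have a null actual_output, a plain text match is enough for a file laid out like the sample. The sketch below assumes each pending case has a line of the form actual_output: null. It does not parse YAML, and count_pending is a hypothetical helper name.

```shell
#!/bin/sh
# Count the lines of the form "actual_output: null" in a golden set file.
# This is a text match, not a YAML parse, so it assumes the layout of the
# sample file. `|| true` keeps the exit status zero when the count is 0.
count_pending() {
  grep -c -E '^[[:space:]]*actual_output:[[:space:]]*null[[:space:]]*$' "$1" || true
}

# Usage (not executed here):
#   count_pending golden-sets/qa.yaml
```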

Set up a judge provider

Regtrace needs an LLM to judge the factuality and tone metrics. Set your API key in a .env file in the project directory:

echo "GROQ_API_KEY=gsk_..." > .env

Or export it:

export GROQ_API_KEY=gsk_...

For other providers, see Switching providers.
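
Before running an evaluation, it can help to confirm that the key is visible through one of the two routes above. The following sketch only checks for presence: it does not validate the key, and it makes no claim about which source Regtrace prefers when both are set. has_groq_key is a hypothetical helper name.

```shell
#!/bin/sh
# Report where GROQ_API_KEY is defined: exported in the current shell,
# or present as a non-empty entry in a .env file in the given directory
# (default: current directory). Returns non-zero if neither is found.
has_groq_key() {
  dir=${1:-.}
  if [ -n "${GROQ_API_KEY:-}" ]; then
    echo "environment"
    return 0
  fi
  if [ -f "$dir/.env" ] && grep -q -E '^GROQ_API_KEY=.+' "$dir/.env"; then
    echo ".env"
    return 0
  fi
  echo "not set" >&2
  return 1
}

# Usage (not executed here):
#   has_groq_key . || echo "set GROQ_API_KEY first"
```

Note that a command of the form echo "..." > .env replaces any existing .env file; use >> to append instead.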

Populate the golden set with outputs

There are two ways to fill actual_output:

Option A — manually: Edit golden-sets/qa.yaml and fill in actual_output for both cases:

actual_output: "The capital of France is Paris."

Option B — auto-generate: Run with the --generate flag. Regtrace calls the configured LLM provider to produce actual_output for every null case:

regtrace run --generate

Either way, Regtrace scores each output against the expected value.

Run evaluation

regtrace run

Terminal output shows:

  • Each test case and its per-metric scores
  • Suite-level summary with pass/fail status
  • Quality gate results
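
To keep a copy of the terminal output for later inspection, the command can be wrapped in a small logging function. This is a generic shell sketch, not a Regtrace feature: run_and_log and the log file name are hypothetical, and this tutorial does not state what exit status Regtrace returns when a quality gate fails, so the wrapper simply passes the status through unchanged.

```shell
#!/bin/sh
# Run a command, save its combined stdout and stderr to a log file,
# print the log to the terminal afterwards, and return the command's
# own exit status. Output is shown after the command finishes, not live.
run_and_log() {
  log=$1
  shift
  status=0
  "$@" > "$log" 2>&1 || status=$?
  cat "$log"
  return "$status"
}

# Usage (not executed here):
#   run_and_log regtrace-run.log regtrace run
```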

View run history

regtrace list

Show details of a specific run:

regtrace history --run-id run_20260101_abc123

Next steps
