Setting Up Regression Detection
Track quality over time with automatic baseline comparison
Regression detection is Regtrace's core feature. Every run is automatically compared against a baseline to detect quality degradation.
How it works
Each time you run regtrace run, the result is stored in .regtrace/runs/.
The next run is compared against the most recent passing run, and Regtrace reports the score deltas.
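The comparison described above can be sketched in a few lines of Python. This is only an illustration: the `Run` record, its fields, and the relative-delta formula are assumptions made for the example, not Regtrace's actual storage format or internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    # Hypothetical run record; the real files in .regtrace/runs/ may differ.
    run_id: str
    suite_score: float
    passed: bool


def last_passing(history: list[Run]) -> Optional[Run]:
    """Return the most recent passing run, or None if there is none."""
    for run in reversed(history):
        if run.passed:
            return run
    return None


def suite_delta(current: Run, baseline: Optional[Run]) -> float:
    """Relative change in suite score versus the baseline (assumed formula).

    Returns 0.0 when there is no usable baseline, as on a first run.
    """
    if baseline is None or baseline.suite_score == 0:
        return 0.0
    return (current.suite_score - baseline.suite_score) / baseline.suite_score


history = [
    Run("run_a", 0.80, passed=True),
    Run("run_b", 0.60, passed=False),  # a failed run is skipped as a baseline
]
current = Run("run_c", 0.76, passed=True)

baseline = last_passing(history)
delta = suite_delta(current, baseline)
print(f"Baseline: {baseline.run_id if baseline else 'none'}")
print(f"Suite delta: {delta:+.1%}")
```

With an empty history, `last_passing` returns `None` and the delta is 0.0, which mirrors the first-run behavior where no baseline exists yet.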
First run
Run evaluation once to establish a baseline:
regtrace run
Output includes a regression block:
Regression: clean
Suite delta: +0.0%
Baseline: none
The first run has no baseline; it establishes the initial scores.
Second run
After making changes (prompt tweak, model swap, pipeline update), run again:
regtrace run
Now the regression check compares against the first run:
Regression: warning
Suite delta: -7.2%
Baseline: run_20260101_a3f9
Configuring thresholds
In regtrace.config.yaml:
metrics:
  regression:
    enabled: true
    baseline_strategy: last_passing
    tolerance: 0.05            # -5% triggers warning
    critical_threshold: 0.15   # -15% triggers critical
    exclude_new_test_cases: true
Quality gates
The quality_gates block enforces pass/fail criteria:
quality_gates:
  suite_score_minimum: 0.7
  max_failed_test_cases: 0
  max_low_confidence_ratio: 0.1
  regression_gate: true
With regression_gate: true, a critical regression fails the suite regardless of the absolute score.
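To make the interaction between thresholds and gates concrete, here is a minimal Python sketch. The function names and signatures are hypothetical, and the boundary handling (a drop exactly equal to a threshold counts as crossing it) is an assumption; only the config keys and their example values come from this page.

```python
def regression_status(delta: float, tolerance: float = 0.05,
                      critical_threshold: float = 0.15) -> str:
    """Map a suite delta (negative means worse) to clean/warning/critical.

    Assumption: a drop equal to a threshold already triggers that level.
    """
    drop = -delta  # positive when the score fell
    if drop >= critical_threshold:
        return "critical"
    if drop >= tolerance:
        return "warning"
    return "clean"


def suite_passes(suite_score: float, failed_cases: int,
                 low_confidence_ratio: float, status: str, gates: dict) -> bool:
    """Apply the quality gates to one run (hypothetical helper)."""
    if gates["regression_gate"] and status == "critical":
        # A critical regression fails the suite regardless of absolute score.
        return False
    return (
        suite_score >= gates["suite_score_minimum"]
        and failed_cases <= gates["max_failed_test_cases"]
        and low_confidence_ratio <= gates["max_low_confidence_ratio"]
    )


gates = {
    "suite_score_minimum": 0.7,
    "max_failed_test_cases": 0,
    "max_low_confidence_ratio": 0.1,
    "regression_gate": True,
}

status = regression_status(-0.072)  # the -7.2% delta from the second run
print(status, suite_passes(0.82, 0, 0.05, status, gates))
```

In this sketch a warning does not fail the suite by itself; only a critical regression does, and only when regression_gate is enabled.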
Viewing regression in a report
regtrace history --run-id run_20260101_a3f9
This shows the regression status, suite delta, and per-metric deltas compared to the baseline.
Diff two runs
regtrace history --diff run_20260101_def run_20260101_abc
This shows the suite score delta, per-metric deltas, and which test cases changed status.
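As a rough illustration of what such a diff contains, the sketch below compares two run records held as plain dictionaries. The record layout (`suite_score`, `metrics`, `cases`) and the metric names are hypothetical, and the deltas are raw score differences; whether Regtrace reports relative or absolute deltas, and in which argument order, is not specified here.

```python
def diff_runs(base: dict, other: dict) -> dict:
    """Compare two hypothetical run records and summarize the changes."""
    # Per-metric deltas for metrics present in both runs.
    metric_deltas = {
        name: other["metrics"][name] - base["metrics"][name]
        for name in base["metrics"].keys() & other["metrics"].keys()
    }
    # Test cases whose status differs between the two runs.
    status_changes = {
        case: (base["cases"][case], other["cases"][case])
        for case in base["cases"].keys() & other["cases"].keys()
        if base["cases"][case] != other["cases"][case]
    }
    return {
        "suite_delta": other["suite_score"] - base["suite_score"],
        "metric_deltas": metric_deltas,
        "status_changes": status_changes,
    }


run_old = {
    "suite_score": 0.75,
    "metrics": {"accuracy": 0.75, "faithfulness": 0.75},
    "cases": {"tc_1": "pass", "tc_2": "pass"},
}
run_new = {
    "suite_score": 0.5,
    "metrics": {"accuracy": 0.5, "faithfulness": 0.75},
    "cases": {"tc_1": "pass", "tc_2": "fail"},
}

result = diff_runs(run_old, run_new)
print(result)
```

Metrics or test cases that exist in only one of the two runs are ignored in this sketch.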
Pinning a baseline
By default, Regtrace uses the last passing run as the baseline. To pin a specific run instead:
regtrace baseline pin run_20260101_a3f9
This sets baseline_strategy: pinned and stores the run ID in config.
Revert to automatic:
regtrace baseline unpin
Check current baseline:
regtrace baseline show
Next steps
- Configure metrics — tune thresholds and weights
- CI integration — automate regression checks