Trace. Evaluate. Enforce.
LLM Regression and Quality Gate Framework
Regtrace is a CLI tool for evaluating and benchmarking LLM outputs. It helps you detect regressions, enforce quality gates, and understand how your model performs across multiple dimensions.
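To make the idea of a regression check feeding a quality gate concrete, here is a minimal conceptual sketch in Python. It is not Regtrace's API or implementation: the names `GateResult`, `check_regressions`, and `gate_passes`, the score dictionaries, and the `tolerance` parameter are all hypothetical and chosen only for illustration. It assumes each dimension has already been scored between 0 and 1 for a baseline run and a current run.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    dimension: str
    baseline: float
    current: float
    passed: bool


def check_regressions(baseline, current, tolerance=0.02):
    """Compare per-dimension scores and flag drops larger than `tolerance`.

    Hypothetical helper: a dimension missing from `current` is treated
    as a score of 0.0, so it fails unless the baseline was near zero.
    """
    results = []
    for dimension, base_score in baseline.items():
        new_score = current.get(dimension, 0.0)
        passed = (base_score - new_score) <= tolerance
        results.append(GateResult(dimension, base_score, new_score, passed))
    return results


def gate_passes(results):
    """The gate passes only if no dimension regressed."""
    return all(r.passed for r in results)


if __name__ == "__main__":
    # Illustrative scores only, not real evaluation output.
    baseline = {"factuality": 0.9, "format": 1.0, "tone": 0.8}
    current = {"factuality": 0.7, "format": 1.0, "tone": 0.8}
    results = check_regressions(baseline, current)
    for r in results:
        status = "ok" if r.passed else "REGRESSION"
        print(f"{r.dimension}: {r.baseline} -> {r.current} [{status}]")
    # A CI job would typically exit non-zero when the gate fails.
    raise SystemExit(0 if gate_passes(results) else 1)
```

In this sketch the gate is a simple all-or-nothing check. A real setup might weight dimensions differently or use per-dimension thresholds.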
Getting started
Quick Start
Install Regtrace and run your first evaluation in under 5 minutes. No API key needed for format checks.
What is Regtrace?
Understand the four pillars (Factuality, Format, Tone, and Regression) and the mental model of Regtrace as a linter for LLM outputs.
Installation
Download a standalone binary for Linux, macOS, or Windows. No runtime dependencies needed.
Comparisons
How Regtrace differs from DeepEval and other LLM evaluation frameworks: a human-label-first philosophy built on four deep pillars.
Guides
Practical how-to recipes for CI/CD integration, metric configuration, report generation, and provider setup.
How this documentation is organized
Tutorials
Start here if you are new to Regtrace. Follow step-by-step lessons to install the tool, create golden sets, and run your first evaluation.
How-to Guides
Practical recipes for common tasks: CI integration, metric configuration, report generation, and provider setup.
Reference
Technical descriptions of the CLI commands, config schema, golden set format, and run record structure.
Examples
Ready-to-run production examples: customer support, content moderation, RAG documentation, code generation, translation, and more.
Explanation
Deep dives into how Regtrace evaluates outputs, why regression detection is the core feature, and the trade-offs between deterministic and LLM-judged metrics.