# Docs


- **Tutorials**
- [Getting Started](/docs/tutorials/getting-started): Install Regtrace and run your first evaluation in 5 minutes
- [Creating a Golden Set](/docs/tutorials/create-golden-set): Write your own evaluation test cases
- [Setting Up Regression Detection](/docs/tutorials/setup-regression): Track quality over time with automatic baseline comparison
- [Using Watch Mode](/docs/tutorials/watch-mode): Automatically re-run evaluations when golden set files change

- **How-to Guides**
- [CI/CD Integration](/docs/how-to/ci-integration): Add Regtrace to your pipeline, covering PR gates, nightly generation, and trend monitoring
- [Configuring Metrics](/docs/how-to/configure-metrics): Tune thresholds, weights, and sub-checks per metric
- [Pinning a Baseline](/docs/how-to/pin-baseline): Lock regression comparison to a specific run
- [Debugging Failures](/docs/how-to/debug-failures): Diagnose why a test case failed and what to fix
- [Generating Reports](/docs/how-to/generate-reports): JSON and Markdown report output
- [Switching Judge Providers](/docs/how-to/switch-provider): Change the LLM provider used for evaluation
- [RAG Evaluation](/docs/how-to/rag-evaluation): Set up and evaluate RAG-based test cases
- [Troubleshooting](/docs/how-to/troubleshooting): Common issues and solutions when using Regtrace
- [Contributing](/docs/how-to/contributing): How to report bugs, request features, and submit pull requests to Regtrace

- **Examples**
- [Examples Overview](/docs/examples/overview): Ready-to-run Regtrace examples organized by evaluation pattern

- **Reference**
- [CLI Reference](/docs/reference/cli): Complete command-line reference for regtrace
- [Config File Reference](/docs/reference/config-file): Complete schema for regtrace.config.yaml
- [Golden Set Reference](/docs/reference/golden-set): Complete schema for golden set YAML files
- [Run Record Reference](/docs/reference/run-record): Structure of persisted run records
- [Database Reference](/docs/reference/database): SQLite storage schema and configuration for run record persistence
- [Metrics Reference](/docs/reference/metrics): Complete reference for all evaluation metrics
- [Judge Provider Reference](/docs/reference/judge-providers): Complete reference for LLM judge providers
- [Agent Skill](/docs/reference/agent-skills): Agent skill for teaching AI agents about the regtrace CLI

- **Explanation**
- [Why Regtrace](/docs/explanation/why-regtrace): Problem statement, comparison to alternatives, and design philosophy
- [How Regtrace Works](/docs/explanation/how-regtrace-works): Architectural overview of the evaluation pipeline
- [The Four Pillars of Regtrace](/docs/explanation/four-pillars): Why golden sets, metrics, baselines, and quality gates are the core concepts
- [Architecture & Design Decisions](/docs/explanation/architecture-decisions): Why certain design choices were made and what they mean for users
- [Quality Gates Deep-Dive](/docs/explanation/quality-gates): How quality gates translate evaluation scores into pass/fail decisions
- [Regression Detection](/docs/explanation/regression): Why regression detection is the core feature and how it works
- [Limitations & Caveats](/docs/explanation/limitations): Known limitations of LLM-as-judge evaluation and how Regtrace mitigates them
- [Deterministic vs LLM-Judged Metrics](/docs/explanation/deterministic-vs-llm): Trade-offs between rule-based and LLM-based evaluation
- [Glossary](/docs/explanation/glossary): Common terms used throughout the Regtrace documentation