Mar 19, 2026 · 10 min read · audio
Deterministic LLM gating (Part 2): a harness for contract-bound outputs
How the initial kestrel-evals implementation works: YAML suites, provider calls, deterministic checks, baseline reports, and CI gating.
Writing
Posts tagged “tooling”.