qmd/bench at runtime-6aa78cd469ab - qmd


John R Milinovich b7a5a86a9b	2026-04-05 17:17:59 -04:00

feat(cli): add `qmd bench` command for search quality benchmarks

Adds a benchmark harness that measures search quality across backends.
Given a fixture file with queries and expected results, it runs each
query through BM25, vector, hybrid (no rerank), and full pipeline, then
reports precision@k, recall, MRR, F1, and latency.

This is primarily a regression testing tool: users create fixtures for
their own vaults to catch quality regressions after config or index
changes. Ships with an example fixture against the eval-docs test
collection to demonstrate the format.

New files:
  src/bench/bench.ts: main runner
  src/bench/score.ts: precision, recall, MRR, F1, path matching
  src/bench/types.ts: fixture and result types
  src/bench/fixtures/: example fixture
  test/bench-score.test.ts: unit tests for scoring (16 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fixtures	feat(cli): add `qmd bench` command for search quality benchmarks	2026-04-05 17:17:59 -04:00
bench.ts	feat(cli): add `qmd bench` command for search quality benchmarks	2026-04-05 17:17:59 -04:00
score.ts	feat(cli): add `qmd bench` command for search quality benchmarks	2026-04-05 17:17:59 -04:00
types.ts	feat(cli): add `qmd bench` command for search quality benchmarks	2026-04-05 17:17:59 -04:00
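
The commit message lists precision@k, recall, MRR, and F1 as the reported quality metrics. The sketch below shows one common way to compute them for ranked result paths. It is not taken from `src/bench/score.ts`: every function name here is hypothetical, path matching is assumed to be exact string equality, result paths are assumed to be unique, and precision@k is assumed to divide by `k` rather than by the number of results returned.

```typescript
// Hypothetical scoring helpers; names and matching rules are assumptions,
// not the actual contents of src/bench/score.ts.

/** Count how many of the top-k results appear in the expected set. */
function hitsAtK(results: string[], expected: Set<string>, k: number): number {
  return results.slice(0, k).filter((path) => expected.has(path)).length;
}

/** Precision@k: fraction of the top-k slots filled by relevant results. */
function precisionAtK(results: string[], expected: Set<string>, k: number): number {
  if (k <= 0) return 0;
  return hitsAtK(results, expected, k) / k;
}

/** Recall@k: fraction of expected results found within the top k. */
function recallAtK(results: string[], expected: Set<string>, k: number): number {
  if (expected.size === 0) return 0;
  return hitsAtK(results, expected, k) / expected.size;
}

/** Reciprocal rank for one query: 1 / (1-based rank of first relevant hit), or 0. */
function reciprocalRank(results: string[], expected: Set<string>): number {
  const index = results.findIndex((path) => expected.has(path));
  return index === -1 ? 0 : 1 / (index + 1);
}

/** MRR: mean of the per-query reciprocal ranks. */
function meanReciprocalRank(ranks: number[]): number {
  if (ranks.length === 0) return 0;
  return ranks.reduce((sum, r) => sum + r, 0) / ranks.length;
}

/** F1: harmonic mean of precision and recall, 0 when both are 0. */
function f1Score(precision: number, recall: number): number {
  const total = precision + recall;
  return total === 0 ? 0 : (2 * precision * recall) / total;
}
```

For a query whose results are `["a.md", "x.md"]` and whose expected set is `{"a.md", "b.md"}`, these helpers give a precision@2 of 0.5 and a recall@2 of 0.5, so F1 is also 0.5. The real harness may normalize paths or handle duplicates differently.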