qmd/src/bench
John R Milinovich b7a5a86a9b feat(cli): add qmd bench command for search quality benchmarks
Adds a benchmark harness that measures search quality across backends.
Given a fixture file with queries and expected results, it runs each
query through BM25, vector, hybrid (no rerank), and the full pipeline,
then reports precision@k, recall, MRR, F1, and latency per backend.
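
As a rough illustration of the metrics listed above, the sketch below
computes them for one query from a ranked list of result paths and a
list of expected paths. The function names, the exact-string path
comparison, and the choice to divide precision@k by k are assumptions
made for this example. They are not taken from src/bench/score.ts,
which may match paths and handle edge cases differently.

```typescript
// Hypothetical helpers for illustration; not the actual score.ts API.

/** Fraction of the top-k results that appear in the expected set. */
export function precisionAtK(results: string[], expected: string[], k: number): number {
  if (k <= 0) return 0;
  const relevant = new Set(expected);
  const hits = results.slice(0, k).filter((r) => relevant.has(r)).length;
  return hits / k; // assumption: divide by k even if fewer than k results came back
}

/** Fraction of the distinct expected paths found anywhere in the results. */
export function recall(results: string[], expected: string[]): number {
  const relevant = new Set(expected);
  if (relevant.size === 0) return 0;
  const returned = new Set(results);
  let found = 0;
  for (const path of relevant) {
    if (returned.has(path)) found++;
  }
  return found / relevant.size;
}

/** 1 / rank of the first relevant result, or 0 if none is relevant. */
export function reciprocalRank(results: string[], expected: string[]): number {
  const relevant = new Set(expected);
  const index = results.findIndex((r) => relevant.has(r));
  return index === -1 ? 0 : 1 / (index + 1);
}

/** MRR: mean of the per-query reciprocal ranks. */
export function meanReciprocalRank(
  queries: { results: string[]; expected: string[] }[],
): number {
  if (queries.length === 0) return 0;
  const total = queries.reduce(
    (sum, q) => sum + reciprocalRank(q.results, q.expected),
    0,
  );
  return total / queries.length;
}

/** Harmonic mean of precision and recall; 0 when both are 0. */
export function f1(precision: number, recallValue: number): number {
  const denom = precision + recallValue;
  return denom === 0 ? 0 : (2 * precision * recallValue) / denom;
}
```

Latency is left out of the sketch because it is a timing measurement
around each backend call, not a function of the ranked results.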

This is primarily a regression-testing tool: users create fixtures
for their own vaults to catch quality regressions after config or
index changes. It ships with an example fixture against the eval-docs
test collection to demonstrate the format.

New files:
  src/bench/bench.ts       — main runner
  src/bench/score.ts       — precision, recall, MRR, F1, path matching
  src/bench/types.ts       — fixture and result types
  src/bench/fixtures/      — example fixture
  test/bench-score.test.ts — unit tests for scoring (16 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:17:59 -04:00
fixtures feat(cli): add qmd bench command for search quality benchmarks 2026-04-05 17:17:59 -04:00
bench.ts feat(cli): add qmd bench command for search quality benchmarks 2026-04-05 17:17:59 -04:00
score.ts feat(cli): add qmd bench command for search quality benchmarks 2026-04-05 17:17:59 -04:00
types.ts feat(cli): add qmd bench command for search quality benchmarks 2026-04-05 17:17:59 -04:00