qmd/.gitignore at main - qmd - gitea.svc.plus

ai-workspace-services/qmd

Tobi Lutke f6a6716c44

Refactor evals into separate run and score scripts

New structure:
- evals/run.py: Generate model outputs to JSONL
- evals/score.py: Score outputs with detailed breakdown
- evals/queries.txt: Test queries (26 total)

Features:
- Supports both HF Hub and local model paths
- Named entity preservation scoring
- Chat template leakage detection
- Strict format validation (every line must be lex:/vec:/hyde:)
- Generic phrase detection
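
The strict format check listed above can be sketched as follows. This is a minimal illustration, not the code in evals/score.py: the names ALLOWED_PREFIXES, invalid_lines, and is_strictly_formatted are hypothetical, and the choices to skip blank lines and to reject empty output are assumptions.

```python
# Hypothetical sketch of strict format validation; not taken from evals/score.py.
ALLOWED_PREFIXES = ("lex:", "vec:", "hyde:")


def invalid_lines(output: str) -> list[str]:
    """Return the non-empty lines that do not start with an allowed prefix."""
    bad = []
    for line in output.splitlines():
        stripped = line.strip()
        # Assumption: blank lines are ignored rather than counted as violations.
        if stripped and not stripped.startswith(ALLOWED_PREFIXES):
            bad.append(stripped)
    return bad


def is_strictly_formatted(output: str) -> bool:
    """True when there is at least one non-empty line and all of them are valid."""
    has_content = any(line.strip() for line in output.splitlines())
    # Assumption: an empty model output fails the check.
    return has_content and not invalid_lines(output)
```

A check of this shape would also flag some chat template leakage as a side effect, since a leaked template token on its own line carries none of the three prefixes.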

Usage:
  uv run evals/run.py --model tobil/qmd-query-expansion-0.6B-v4
  uv run evals/score.py evals/results_*.jsonl

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-25 00:40:33 -05:00

4 lines

73 B

Plaintext


# Generated results (re-run evals locally)
results_*.jsonl
scores_*.json
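
To see which file names these two patterns would cover, here is a rough sketch using Python's fnmatch module. It only approximates gitignore matching for simple patterns without slashes like these; the helper name is_ignored and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

# The two patterns from this .gitignore.
IGNORE_PATTERNS = ["results_*.jsonl", "scores_*.json"]


def is_ignored(filename: str) -> bool:
    """Approximate check: does the bare file name match any ignore pattern?"""
    return any(fnmatchcase(filename, pattern) for pattern in IGNORE_PATTERNS)


# Hypothetical file names, for illustration only.
for name in ["results_v4.jsonl", "scores_v4.json", "queries.txt", "results_v4.json"]:
    print(name, is_ignored(name))
```

For an authoritative answer inside a working tree, `git check-ignore -v <path>` reports which pattern, if any, matches a given path.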