qmd/test
Tobi Lütke 3b87e3e224
feat: query document format, lex phrase/negation syntax, training data
The 'query document' is now a first-class concept in QMD: a structured
document with typed sub-queries that combine for best recall.

## Query types
- lex:    BM25 keyword search with phrase and negation syntax
- vec:    Semantic vector search (natural language questions)
- hyde:   Hypothetical document (write the expected answer)
- expand: Auto-expand via local LLM (max 1, default for plain queries)
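
A minimal sketch of how a query document with these four types might be parsed. It assumes one sub-query per line written as `type: text`; the commit message does not spell out the layout. `SubQuery` and `parseQueryDocument` are hypothetical names, not QMD's actual API.

```typescript
// Hypothetical shape of a typed sub-query; not taken from the QMD source.
type QueryType = "lex" | "vec" | "hyde" | "expand";

interface SubQuery {
  type: QueryType;
  text: string;
}

const QUERY_TYPES: readonly string[] = ["lex", "vec", "hyde", "expand"];

// Assumption: one sub-query per line, written as "type: text".
function parseQueryDocument(input: string): SubQuery[] {
  const subQueries: SubQuery[] = [];
  for (const rawLine of input.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines

    const match = /^([a-z]+):\s*(.*)$/.exec(line);
    const prefix = match ? match[1] ?? "" : "";
    if (match && QUERY_TYPES.indexOf(prefix) !== -1) {
      const text = (match[2] ?? "").trim();
      if (text !== "") subQueries.push({ type: prefix as QueryType, text });
    } else {
      // A plain line with no known prefix falls back to expand.
      subQueries.push({ type: "expand", text: line });
    }
  }

  // The commit message allows at most one expand sub-query.
  const expandCount = subQueries.filter((q) => q.type === "expand").length;
  if (expandCount > 1) {
    throw new Error("at most one expand: sub-query is allowed");
  }
  return subQueries;
}
```

Under these assumptions a plain query such as `git merge conflicts` becomes a single `expand` sub-query, while a document mixing `lex:` and `vec:` lines yields one entry per line.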

## Lex syntax
Supported lex operators:
  "exact phrase"     verbatim match, no prefix matching
  -term              exclude documents containing term
  -"exact phrase"    exclude documents containing phrase

Examples:
  "C++ performance" optimization -sports -athlete
  "connection pool" timeout -redis
  "machine learning" -sports -athlete

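The lex syntax above can be sketched as a small tokenizer plus a builder for an SQLite FTS5 `MATCH` string. This is an illustration under stated assumptions, not QMD's parser: `LexClause`, `parseLex`, and `toFts5` are hypothetical names, and prefix matching for plain positive terms is inferred from the "no prefix" note on phrases. Returning `null` for negation-only input follows the FTS5 constraint mentioned in this commit, since FTS5's `NOT` is a binary operator and needs a positive left-hand side.

```typescript
// Hypothetical clause type; not taken from the QMD source.
interface LexClause {
  text: string;
  phrase: boolean;  // written in double quotes
  negated: boolean; // written with a leading '-'
}

function parseLex(input: string): LexClause[] {
  const clauses: LexClause[] = [];
  // Either an optionally negated "quoted phrase" or an optionally negated bare term.
  const tokenPattern = /(-?)"([^"]*)"|(-?)(\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = tokenPattern.exec(input)) !== null) {
    const phrase = m[2] !== undefined;
    const negated = (phrase ? m[1] : m[3]) === "-";
    // Drop stray quotes so the text can be embedded in an FTS5 string safely.
    const text = ((phrase ? m[2] : m[4]) ?? "").replace(/"/g, "").trim();
    if (text === "" || text === "-") continue;
    clauses.push({ text, phrase, negated });
  }
  return clauses;
}

function toFts5(clauses: LexClause[]): string | null {
  // Assumption: only plain positive terms get a trailing '*' (prefix match).
  const render = (c: LexClause): string =>
    `"${c.text}"` + (c.phrase || c.negated ? "" : "*");
  const include = clauses.filter((c) => !c.negated).map(render);
  const exclude = clauses.filter((c) => c.negated).map(render);

  // Nothing positive to match against: no valid FTS5 query can be built.
  if (include.length === 0) return null;

  const positive = include.join(" AND ");
  return exclude.length === 0
    ? positive
    : `(${positive}) NOT (${exclude.join(" OR ")})`;
}

// toFts5(parseLex('"C++ performance" optimization -sports -athlete'))
// → ("C++ performance" AND "optimization"*) NOT ("sports" OR "athlete")
```

Grouping the exclusions inside one parenthesized `OR` keeps the result independent of FTS5 operator precedence.
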
## MCP tool description rewritten
The 'query' tool description now fully teaches AI agents the query
document format, lex syntax, and strategy for combining types.
It includes worked examples, among them intent-aware lex (C++ performance,
not sports), which is critical for disambiguation in dense corpora.

## Unit tests
11 new lex parser tests covering:
- plain terms, quoted phrases, negation, combined
- intent-aware disambiguation (performance -sports -athlete)
- only-negation returns null (FTS5 constraint)
- empty/whitespace handling

## Training data
12 new intent-aware examples for the next model training round:
- Real technical topics with lex phrase+negation combinations
- Covers: C++ perf, Python memory, DB connections, rate limiting,
  SQL optimization, ML overfitting, Docker, JWT, async/await,
  git conflicts, Kubernetes, React state
- Each shows how context/intent shapes lex query construction
  (e.g. performance with C++ context → -sports -athlete exclusions)
2026-02-19 06:52:58 -05:00
eval-docs Add 6 synthetic evaluation documents 2025-12-21 13:10:35 -04:00
cli.test.ts fix: correct test paths after moving to test/ directory 2026-02-15 21:46:45 -04:00
collections-config.test.ts fix(test): reset currentIndexName between test files 2026-02-18 15:53:58 -04:00
eval-bm25.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
eval-harness.ts Fix qmd embed crash and resolve all TypeScript errors 2025-12-31 13:32:30 -04:00
eval.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
formatter.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
llm.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
mcp.test.ts feat: add expand: type, rename to query, document syntax 2026-02-18 22:22:50 -05:00
multi-collection-filter.test.ts fix: support multiple -c collection filters in search commands 2026-02-16 14:03:53 -04:00
store-paths.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
store.helpers.unit.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
store.test.ts test: move all tests to flat test/ directory 2026-02-15 21:37:47 -04:00
structured-search.test.ts feat: query document format, lex phrase/negation syntax, training data 2026-02-19 06:52:58 -05:00