Separate hardcoded default from env var in DEFAULT_EMBED_MODEL so the
constructor can resolve: config param > env var > hardcoded default.
Also add env var support for QMD_GENERATE_MODEL and QMD_RERANK_MODEL.
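
A minimal sketch of the resolution order described above. The function name `resolveModel`, the placeholder default, and the choice to treat blank strings as unset are assumptions for illustration, not the actual constructor code.

```typescript
// Hypothetical helper: picks the first value that is actually set,
// in the order config param > env var > hardcoded default.
function resolveModel(
  configValue: string | undefined,
  envValue: string | undefined,
  hardcodedDefault: string,
): string {
  // Assumption: empty or whitespace-only strings count as "not set".
  if (configValue !== undefined && configValue.trim() !== "") return configValue;
  if (envValue !== undefined && envValue.trim() !== "") return envValue;
  return hardcodedDefault;
}

// Example wiring; the env map is passed in so the helper stays testable.
const exampleEnv: Record<string, string | undefined> = {
  QMD_GENERATE_MODEL: "env-generate-model", // placeholder value
};
const generateModel = resolveModel(
  undefined,
  exampleEnv["QMD_GENERATE_MODEL"],
  "placeholder-default-model",
);
console.log(generateModel);
```

Keeping the hardcoded default separate from the env lookup is what lets an explicit config value win without the env var having been baked into the default beforehand.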
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a benchmark harness that measures search quality across backends.
Given a fixture file with queries and expected results, it runs each
query through BM25, vector, hybrid (no rerank), and full pipeline,
then reports precision@k, recall, MRR, F1, and latency.
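
For reference, the metrics named above can be sketched as follows. The function names and signatures are illustrative assumptions, not the actual exports of src/bench/score.ts, and the sketch assumes each result list contains no duplicate paths.

```typescript
// Fraction of the top-k results that are in the expected set.
function precisionAtK(results: string[], expected: Set<string>, k: number): number {
  if (k <= 0) return 0;
  const hits = results.slice(0, k).filter((r) => expected.has(r)).length;
  return hits / k;
}

// Fraction of the expected items that appear in the top-k results.
function recallAtK(results: string[], expected: Set<string>, k: number): number {
  if (expected.size === 0) return 0;
  const hits = results.slice(0, k).filter((r) => expected.has(r)).length;
  return hits / expected.size;
}

// 1 / rank of the first expected result, or 0 if none was returned.
function reciprocalRank(results: string[], expected: Set<string>): number {
  const index = results.findIndex((r) => expected.has(r));
  return index === -1 ? 0 : 1 / (index + 1);
}

// MRR is the mean of the per-query reciprocal ranks.
function meanReciprocalRank(ranks: number[]): number {
  if (ranks.length === 0) return 0;
  return ranks.reduce((sum, r) => sum + r, 0) / ranks.length;
}

// Harmonic mean of precision and recall.
function f1(precision: number, recall: number): number {
  const denom = precision + recall;
  return denom === 0 ? 0 : (2 * precision * recall) / denom;
}
```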
This is primarily a regression testing tool — users create fixtures
for their own vaults to catch quality regressions after config or
index changes. Ships with an example fixture against the eval-docs
test collection to demonstrate the format.
New files:
src/bench/bench.ts — main runner
src/bench/score.ts — precision, recall, MRR, F1, path matching
src/bench/types.ts — fixture and result types
src/bench/fixtures/ — example fixture
test/bench-score.test.ts — unit tests for scoring (16 tests)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds QMD_LLAMA_GPU env var (set to false/off/none to force CPU) and
wraps getLlama() in try/catch so Vulkan/CUDA init failures on headless
or driverless machines fall back gracefully instead of crashing the
node process with an uncatchable C++ terminate().
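
A rough sketch of the two pieces described above: parsing the env var and falling back to CPU when GPU init throws. `loadBackend` is a hypothetical stand-in for the real getLlama() call, and case-insensitive matching of the values is an assumption.

```typescript
// True when the QMD_LLAMA_GPU value asks for CPU-only mode.
function gpuDisabled(value: string | undefined): boolean {
  if (value === undefined) return false;
  // Assumption: values are matched case-insensitively after trimming.
  return ["false", "off", "none"].includes(value.trim().toLowerCase());
}

// Tries the GPU backend first unless disabled, then retries on CPU.
async function initWithFallback<T>(
  envValue: string | undefined,
  loadBackend: (useGpu: boolean) => Promise<T>, // hypothetical loader
): Promise<T> {
  if (gpuDisabled(envValue)) return loadBackend(false);
  try {
    return await loadBackend(true);
  } catch {
    // GPU init threw (for example, no driver present): retry on CPU.
    return loadBackend(false);
  }
}
```

The env var matters because a try/catch can only handle failures that surface as JavaScript exceptions; forcing CPU up front avoids touching the GPU backend at all.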
- Restore .toLowerCase() in handelize (it was dropped, and the two test
files disagreed on whether to expect it)
- Convert dots to dashes in filename body (e.g. v2.0 -> v2-0), keeping
only the extension dot. Tobi confirmed this is the intended behavior.
- Align both test/store.test.ts and test/store.helpers.unit.test.ts to
match (they had diverged, one expected case-preserved, one lowercase)
- Adjust 'ensureVecTable recreates' test to expect throw behavior
(matches #501 dimension-mismatch fix)
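
The lowercase and dot-to-dash rules from the list above can be illustrated with a simplified sketch. `handelizeSketch` is hypothetical and covers only those two rules; the real handelize() handles more cases.

```typescript
// Simplified illustration: lowercase the name, turn every run of
// non-letter/non-number characters in the stem (dots included) into a
// single dash, and keep only the final extension dot.
function handelizeSketch(filename: string): string {
  const lastDot = filename.lastIndexOf(".");
  // Assumption: text after the last dot is the extension when the dot is
  // neither the first nor the last character.
  const hasExt = lastDot > 0 && lastDot < filename.length - 1;
  const stem = hasExt ? filename.slice(0, lastDot) : filename;
  const ext = hasExt ? filename.slice(lastDot).toLowerCase() : "";
  const body = stem
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, "-") // dots in the stem become dashes
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return body + ext;
}

console.log(handelizeSketch("Release Notes v2.0.md")); // release-notes-v2-0.md
```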
* Test nix flake builds in CI
* Update outdated bun.lock file
* fix: restore toLowerCase() in handelize and update tests
* Fix flake to use proper FODs
---------
Co-authored-by: Tobias Lütke <tobi@shopify.com>
- Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it)
- Update dimension-mismatch test to expect throw instead of silent rebuild
(matches new behavior from #501)
- Fix one stale test expectation for preserved dots in filenames
Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md`
were incorrectly parsed as comma-separated file lists instead of being
passed to the glob matcher (picomatch). This happened because the
comma-detection heuristic only checked for `*` and `?` but not `{`.
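
The corrected heuristic can be sketched as below. The names `looksLikeGlob` and `parseFileArgument` are illustrative, not the functions in the codebase.

```typescript
type FileArgument =
  | { kind: "glob"; pattern: string }
  | { kind: "list"; files: string[] };

// A pattern containing *, ? or { is treated as a glob, never as a list.
function looksLikeGlob(pattern: string): boolean {
  return /[*?{]/.test(pattern);
}

function parseFileArgument(arg: string): FileArgument {
  if (looksLikeGlob(arg)) {
    // Passed through untouched so the glob matcher can expand braces.
    return { kind: "glob", pattern: arg };
  }
  const files = arg
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  return { kind: "list", files };
}
```

With the `{` check in place, `{doc1,doc2}.md` reaches the glob matcher intact, while `a.md,b.md` is still split into two files.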
Also adds `collection/path` matching in `matchFilesByGlob` so patterns
like `my-collection/{file1,file2}.md` work — previously the glob only
matched against `qmd://collection/path` (virtual) and `path` (relative
to collection root), missing the `collection/path` form.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When switching to an embedding model with different dimensions,
ensureVecTableInternal() silently drops the vector table and all
embeddings are lost. Users only discover this when semantic search
returns empty results.
Throw an error instead, telling users to run 'qmd embed -f' to
explicitly re-embed. This is safe because embed -f calls
clearAllEmbeddings() which drops the table before ensureVecTable
is reached.
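
A minimal sketch of the new check, assuming the existing table's dimension count is known (or null when no table exists yet). The function name and message wording are illustrative.

```typescript
// Throws instead of silently dropping the vector table when the
// embedding dimensions no longer match.
function assertVectorDimensions(existing: number | null, requested: number): void {
  if (existing !== null && existing !== requested) {
    throw new Error(
      `Embedding dimensions changed (${existing} -> ${requested}). ` +
        "Run 'qmd embed -f' to re-embed.",
    );
  }
}
```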
Related to #497
Co-authored-by: JohnRichardEnders <john@telli.com>
Without an explicit contextSize, node-llama-cpp defaults to "auto", which
allocates the model's full training context (often 32k). For embedding
chunks that are typically ~900 tokens, this wastes ~3.5 GB of KV cache
per context on Apple Silicon unified memory.
Default to 2048 (matching the rerank context pattern) and allow override
via QMD_EMBED_CONTEXT_SIZE for users with larger chunks.
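
The default-plus-override behavior might look roughly like this. The helper name is hypothetical, and falling back to the default on invalid input is an assumption.

```typescript
const DEFAULT_EMBED_CONTEXT_SIZE = 2048;

// Takes the raw QMD_EMBED_CONTEXT_SIZE value and returns the size to use.
function resolveEmbedContextSize(raw: string | undefined): number {
  if (raw === undefined) return DEFAULT_EMBED_CONTEXT_SIZE;
  const parsed = Number.parseInt(raw, 10);
  // Assumption: non-numeric or non-positive values fall back to the default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_EMBED_CONTEXT_SIZE;
}
```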
Addresses #329, related to #297
Co-authored-by: JohnRichardEnders <john@telli.com>
sanitizeFTS5Term stripped all non-letter/non-number characters including
underscores, causing snake_case identifiers like `my_variable` to become
`myvariable` and silently fail BM25 matches.
Add underscore to the preserved character set in the Unicode regex.
Export the function and add unit tests covering snake_case, contractions,
punctuation stripping, and unicode.
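
A simplified sketch of the character-set change. `sanitizeTermSketch` is hypothetical and only shows the underscore fix; it does not reproduce how the real sanitizeFTS5Term treats contractions.

```typescript
// Strips everything except Unicode letters, Unicode numbers, and underscore.
function sanitizeTermSketch(term: string): string {
  return term.replace(/[^\p{L}\p{N}_]/gu, "");
}

console.log(sanitizeTermSketch("my_variable")); // my_variable
```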
Fixes #305
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The MCP status tool shows collection paths but not names,
making it impossible for agents to discover valid collection
filter values. The CLI 'qmd status' already shows names.
Add col.name prefix to each collection line in the status
tool response.
Resolve conflicts: combine AST chunking args (filepath, chunkStrategy)
with abort signal parameter from #458.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflict: use CTE approach from #455 with updated BM25
weights (1.5, 4.0, 1.0) from #462.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR #475 changed handelize() to preserve original case and dots,
but the tests still expected lowercase output. Update assertions
to match the new behavior.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The MCP query tool always ran LLM reranking, even for lex-only queries.
On CPU-only infrastructure (e.g. Railway), the reranker adds 60-120s
per query. The SDK and CLI already support skipping reranking, but the
MCP server did not expose this option.
Add a `rerank` boolean parameter (default: true) to the MCP query
tool's input schema, forwarded to store.search() as the existing
`rerank` option.
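
In outline, the forwarding amounts to the following. The interface and function names are hypothetical; only the `rerank` parameter and its default of true come from the description above.

```typescript
// Hypothetical shape of the query tool input.
interface QueryToolInput {
  query: string;
  rerank?: boolean;
}

// Builds the options forwarded to the search call; rerank defaults to true.
function toSearchOptions(input: QueryToolInput): { rerank: boolean } {
  return { rerank: input.rerank ?? true };
}
```

Callers on CPU-only hosts can pass `rerank: false` to skip the reranker, and existing callers keep the previous behavior.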
Fixes #477
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The handelize() regex replaced all non-letter/non-number chars with
dashes, including dots in the filename stem. This mangled session
filenames like "topic-1773595309.753009.md" to "topic-1773595309-753009.md",
breaking memory_get path resolution (file not found on disk).
Fix: add dot to the preserved character class in the filename regex.
After deploying, run qmd-reindex.sh to rebuild indexes with correct paths.
Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped
of hyphens and concatenated (e.g., "multiagent") which missed matches.
Now they're split into FTS5 phrase queries ("multi agent") so the porter
tokenizer matches them correctly.
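
The rewrite of hyphenated terms can be sketched like this. The helper name is illustrative, and the real query builder may handle escaping and other term types differently.

```typescript
// Turns a hyphenated term into a quoted FTS5 phrase; other terms pass through.
function hyphenatedToPhrase(term: string): string {
  const parts = term.split("-").filter((part) => part.length > 0);
  if (parts.length < 2) return parts[0] ?? "";
  return `"${parts.join(" ")}"`;
}

console.log(hyphenatedToPhrase("multi-agent")); // "multi agent"
```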
The bm25() call only had 2 weights for 3 columns (filepath, title, body),
giving body an implicit weight of 0. Add proper weights: filepath=1.5,
title=4.0, body=1.0 so title matches are boosted and body content is scored.
After the session's max duration timer fires (30 min), the embedding loop
continued iterating over all remaining chunks. Each embed call threw
SessionReleasedError, was caught, incremented errors, and the loop moved
to the next chunk — burning 100% CPU for days with zero useful output.
Three targeted fixes:
1. Check session.isValid before each batch iteration in the embedding loop,
breaking early when the session has been aborted.
2. Pass the session's AbortSignal to chunkDocumentByTokens so tokenization
also respects session expiry instead of running unbounded.
3. Add an error-rate circuit breaker: if >80% of processed chunks fail,
abort early rather than grinding through the remaining work.
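
The session check and the circuit breaker from the list above can be sketched together. All names here are hypothetical, the `minSample` guard is an added assumption so that one early failure does not trip the breaker, and the AbortSignal plumbing for tokenization is omitted.

```typescript
interface SessionLike {
  isValid: boolean;
}

interface LoopResult {
  embedded: number;
  errors: number;
  stoppedEarly: boolean;
}

// True when enough chunks were processed and too many of them failed.
function errorRateExceeded(
  processed: number,
  errors: number,
  maxErrorRate: number,
  minSample: number,
): boolean {
  return processed >= minSample && errors / processed > maxErrorRate;
}

async function embedAll(
  chunks: string[],
  session: SessionLike,
  embed: (chunk: string) => Promise<void>,
  batchSize = 8,
  maxErrorRate = 0.8,
  minSample = 10,
): Promise<LoopResult> {
  let embedded = 0;
  let errors = 0;
  for (let i = 0; i < chunks.length; i += batchSize) {
    // Check the session before each batch and stop once it is released.
    if (!session.isValid) return { embedded, errors, stoppedEarly: true };
    for (const chunk of chunks.slice(i, i + batchSize)) {
      try {
        await embed(chunk);
        embedded++;
      } catch {
        errors++;
      }
    }
    // Circuit breaker: abort when most processed chunks have failed.
    if (errorRateExceeded(embedded + errors, errors, maxErrorRate, minSample)) {
      return { embedded, errors, stoppedEarly: true };
    }
  }
  return { embedded, errors, stoppedEarly: false };
}
```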
Fixes #440
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sqlite-vec's vec0 virtual tables silently ignore the OR REPLACE conflict
clause. When a crash interrupts embedding mid-way, chunks that were
inserted into vectors_vec but not content_vectors get re-selected by
getHashesForEmbedding, causing a UNIQUE constraint error on re-embed.
Two changes:
1. Insert content_vectors first so getHashesForEmbedding won't re-select
the hash if a crash occurs between the two inserts.
2. Use DELETE + INSERT for vectors_vec instead of INSERT OR REPLACE.
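
The statement ordering from the two changes above, sketched against an injected `run` callback rather than a real database handle. The column names (`hash_seq`, `seq`, `model`) and the JSON encoding of the vector are assumptions for illustration.

```typescript
type RunStatement = (sql: string, params: unknown[]) => void;

function storeEmbedding(
  run: RunStatement,
  hash: string,
  seq: number,
  model: string,
  embedding: number[],
): void {
  const hashSeq = `${hash}_${seq}`; // hypothetical key format
  // 1. Record the chunk in content_vectors first, so a crash after this
  //    point does not cause the hash to be selected for embedding again.
  run("INSERT OR REPLACE INTO content_vectors (hash, seq, model) VALUES (?, ?, ?)", [
    hash,
    seq,
    model,
  ]);
  // 2. The vec0 table ignores OR REPLACE, so delete any stale row
  //    explicitly before inserting.
  run("DELETE FROM vectors_vec WHERE hash_seq = ?", [hashSeq]);
  run("INSERT INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)", [
    hashSeq,
    JSON.stringify(embedding),
  ]);
}
```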
Fixes #445
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MODEL_CACHE_DIR was hardcoded to ~/.cache/qmd/models/, ignoring the
XDG_CACHE_HOME environment variable. This was inconsistent with the rest
of the codebase (store.ts, cli/qmd.ts) which already respects XDG paths.
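
A sketch of XDG-aware resolution, assuming POSIX-style paths. The helper name is hypothetical, and the values are passed in rather than read from the environment so the function stays self-contained.

```typescript
// Uses XDG_CACHE_HOME when set, otherwise falls back to ~/.cache.
function resolveModelCacheDir(xdgCacheHome: string | undefined, homeDir: string): string {
  const base =
    xdgCacheHome !== undefined && xdgCacheHome.length > 0
      ? xdgCacheHome.replace(/\/+$/, "") // drop trailing slashes
      : `${homeDir}/.cache`;
  return `${base}/qmd/models`;
}

console.log(resolveModelCacheDir(undefined, "/home/user")); // /home/user/.cache/qmd/models
```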
Fixes #425
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>