Parallel test files each cold-load their own LLM model, competing for
CPU and causing timeouts even at 120s. Sequential execution eliminates
contention — tests that timed out at 30s now complete in 1-15s.
Made-with: Cursor
When qmd update runs against an index created before case-preservation,
documents may exist under lowercase paths (e.g. "skill.md" for a file
actually named "SKILL.md"). Add findOrMigrateLegacyDocument() that:
- Falls back to a lowercase lookup when the canonical path is not found
- Renames the document path in-place via UPDATE OR IGNORE
- Manually rebuilds the FTS entry (FTS5 INSERT OR REPLACE does not
reliably update existing rows via triggers)
- Handles UNIQUE conflicts gracefully (returns null on conflict)
Embeddings are keyed by content hash, so the rename preserves all
existing vectors — no re-embedding required.
Both the CLI indexer and the library reindexer share the same helper,
eliminating the duplication that a previous review flagged.
Includes integration tests for: successful migration, already-lowercase
no-op, and UNIQUE conflict handling.
The blanket .toLowerCase() in handelize() drops filename casing,
which breaks path resolution on case-sensitive filesystems (Linux).
Files like README.md, CHANGELOG.md, and SKILL.md become unreachable
when the index stores them as readme.md, changelog.md, skill.md.
Since FTS5 already performs case-insensitive matching via the
unicode61 tokenizer, lowercasing the stored path provides no search
benefit — it only corrupts the metadata used to locate files on disk.
Remove .toLowerCase() and update all affected test expectations.
Bun runs all test files in a single process, so module-level state
leaks between files. The getDefaultDbPath test now resets the
_productionMode flag before asserting it throws, fixing the flaky
failure on Bun (ubuntu-latest) in CI.
These tests are already in store.helpers.unit.test.ts. The duplicates
in store.test.ts failed in CI because _productionMode module state
leaked from earlier tests in the same bun process, causing
getDefaultDbPath to return a path instead of throwing.
Separate hardcoded default from env var in DEFAULT_EMBED_MODEL so the
constructor can resolve: config param > env var > hardcoded default.
Also add env var support for QMD_GENERATE_MODEL and QMD_RERANK_MODEL.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a benchmark harness that measures search quality across backends.
Given a fixture file with queries and expected results, it runs each
query through BM25, vector, hybrid (no rerank), and full pipeline,
then reports precision@k, recall, MRR, F1, and latency.
This is primarily a regression testing tool — users create fixtures
for their own vaults to catch quality regressions after config or
index changes. Ships with an example fixture against the eval-docs
test collection to demonstrate the format.
New files:
src/bench/bench.ts — main runner
src/bench/score.ts — precision, recall, MRR, F1, path matching
src/bench/types.ts — fixture and result types
src/bench/fixtures/ — example fixture
test/bench-score.test.ts — unit tests for scoring (16 tests)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore .toLowerCase() in handelize (was dropped, both test files
expected it inconsistently)
- Convert dots to dashes in filename body (e.g. v2.0 -> v2-0), keeping
only the extension dot. Tobi confirmed this is the intended behavior.
- Align both test/store.test.ts and test/store.helpers.unit.test.ts to
match (they had diverged, one expected case-preserved, one lowercase)
- Adjust 'ensureVecTable recreates' test to expect throw behavior
(matches #501 dimension-mismatch fix)
* Test nix flake builds in CI
* Update outdated bun.lock file
* fix: restore toLowerCase() in handelize and update tests
* Fix flake to use proper FODs
---------
Co-authored-by: Tobias Lütke <tobi@shopify.com>
- Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it)
- Update dimension-mismatch test to expect throw instead of silent rebuild
(matches new behavior from #501)
- Fix one stale test expectation for preserved dots in filenames
Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md`
were incorrectly parsed as comma-separated file lists instead of being
passed to the glob matcher (picomatch). This happened because the
comma-detection heuristic only checked for `*` and `?` but not `{`.
Also adds `collection/path` matching in `matchFilesByGlob` so patterns
like `my-collection/{file1,file2}.md` work — previously the glob only
matched against `qmd://collection/path` (virtual) and `path` (relative
to collection root), missing the `collection/path` form.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sanitizeFTS5Term stripped all non-letter/non-number characters including
underscores, causing snake_case identifiers like `my_variable` to become
`myvariable` and silently fail BM25 matches.
Add underscore to the preserved character set in the Unicode regex.
Export the function and add unit tests covering snake_case, contractions,
punctuation stripping, and unicode.
Fixes#305
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts: combine AST chunking args (filepath, chunkStrategy)
with abort signal parameter from #458.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR #475 changed handelize() to preserve original case and dots,
but the tests still expected lowercase output. Update assertions
to match the new behavior.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped
of hyphens and concatenated (e.g., "multiagent") which missed matches.
Now they're split into FTS5 phrase queries ("multi agent") so the porter
tokenizer matches them correctly.
The bm25() call only had 2 weights for 3 columns (filepath, title, body),
giving body an implicit weight of 0. Add proper weights: filepath=1.5,
title=4.0, body=1.0 so title matches are boosted and body content is scored.
Add opt-in AST-aware chunk boundary detection for code files using
web-tree-sitter. When enabled with `--chunk-strategy auto`, code files
(.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class,
and import boundaries instead of arbitrary text positions. Default
behavior (`regex`) is unchanged — no surprises on upgrade.
In testing on QMD's own codebase, AST mode split 42% fewer function
bodies across chunk boundaries compared to regex-only chunking.
Usage:
qmd embed --chunk-strategy auto
qmd query "search terms" --chunk-strategy auto
What's included:
- Language detection from file extension with support for TypeScript,
JavaScript (including arrow functions and function expressions),
Python, Go, and Rust
- Per-language tree-sitter queries with scored break points aligned to
the existing markdown scale (class=100, function=90, type=80, import=60)
- AST break points merged with regex break points — highest score wins
at each position, so embedded markdown (comments, docstrings) still
benefits from regex patterns
- Refactored chunking core: chunkDocumentWithBreakPoints() extracted,
mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST
- ChunkStrategy type ("auto" | "regex") threaded through
generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK
- getASTStatus() health check wired into `qmd status`
- Parse failures log a warning and fall back to regex — never crash
Hardening:
- Grammar packages are optionalDependencies with pinned versions to
prevent ABI breaks from semver drift
- web-tree-sitter is a direct dependency (pinned)
- Errors are logged (not silently swallowed) for debuggability
- Tested on both Node.js and Bun (Bun is actually faster)
Testing:
- 26 unit tests (test/ast.test.ts) — all 4 languages, error handling
- 7 integration tests (test/store.test.ts) — merge, equivalence, bypass
- Standalone test-ast-chunking.mjs with 63 synthetic tests and a
real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code)
- Validated end-to-end with qmd embed + qmd query on QMD's own codebase
- Zero markdown regressions across all test paths
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bin/qmd wrapper checks for bun.lock to select the runtime, but since
bun.lock is committed to the repo, source builds using npm install are
incorrectly routed to Bun — causing native module ABI mismatches (#381)
and sqlite-vec crashes (#380).
Add package-lock.json as a higher-priority signal: if it exists, npm
installed the dependencies and Node should be used. Also fix
cleanupOrphanedVectors() to use the existing isSqliteVecAvailable()
guard instead of checking sqlite_master, which can report the virtual
table even when the vec0 module isn't loaded.
Fixes#381, fixes#380
Continuation of #362 (runtime detection false positives)
Add _ciMode flag to LlamaCpp that throws immediately on embedBatch,
generate, expandQuery, and rerank when CI=true — prevents silent 30s
timeouts. Skip MCP HTTP Transport tests in CI (they instantiate a real
LlamaCpp). Bump vitest/bun test timeouts to 60s for slower CI runners.