Compare commits

...

91 Commits

Author SHA1 Message Date
b19f486d50
Merge branch 'main' into feat-nvidia-embedding-remote-sync 2026-06-12 07:38:27 +08:00
Tobias Lütke
636602409c
Merge pull request #715 from rymalia/docs/qmd-reference-phase-1
docs: CLI reference, collection-flag semantics, MCP params, bench, new commands
2026-06-08 18:50:52 +02:00
Ryan Malia
7488fe8094 docs: CLI reference, collection-flag semantics, MCP params, bench, new commands
Documents previously-undocumented surface area surfaced by onboarding feedback
and the bench discoverability report:

- README: collection filtering (-c semantics), collection show/include/exclude/
  update-cmd, --intent/--no-rerank/-C/--full-path, --format <kind> (legacy
  output booleans noted as aliases), vector-search/deep-search aliases, embed
  memory flags, a sample --explain trace, MCP tool parameter reference, qmd
  doctor/init, get :from:count + --no-line-numbers, and a Benchmarking section
  for qmd bench.
- README: removed the misleading `qmd update --pull` example; --pull is parsed
  but never consumed, so it points to `qmd collection update-cmd` (the real
  per-collection pre-reindex mechanism) instead.
- docs/SYNTAX.md: drop the non-existent `q` MCP parameter (the query tool/REST
  endpoint accept only `searches`); add a Scoping section.
- server.ts: buildInstructions now advertises the plural `collections` parameter
  to match the schema (singular was silently stripped, yielding unscoped
  results), and the `get` instruction documents the full file.md:from:count
  range suffix instead of only file.md:100.

Refs #25, #181, #217, #372, #520, #576
2026-06-07 13:37:41 -07:00
Tobias Lütke
3f751cd0f0
Merge pull request #698 from tobi/feat/literal-path-storage
fix: store literal filesystem paths, drop handelize() at index time
2026-06-01 18:49:02 -04:00
Tobias Lütke
5528b14abe test: normalize collectionDir with realpathSync for macOS compat
On macOS /tmp is a symlink to /private/tmp. mkdtemp returns /tmp/...
but getRealPath(resolve(pwd)) in collectionAdd resolves symlinks and
stores /private/tmp/... in the DB. toVirtualPath and --full-path
resolution then fail because the test-side collectionDir path doesn't
match the DB-side path. Fix by calling realpathSync on the test
collectionDir before passing it to CLI commands and assertions.
2026-06-01 20:25:28 +00:00
Tobias Lütke
070147d8ab fix: store literal filesystem paths, drop handelize() at index time
Filenames with special characters (#, &, spaces, [], (), etc.) now
round-trip correctly through index → search → get → full-path.

Root cause: reindexCollection() called handelize() on the relative path
before storing it in documents.path, turning
  '# Meeting - 234232 3432 __ 5.md' → 'Meeting-234232-3432-5.md'
This broke all downstream operations that needed to reconstruct the
real filesystem path from the DB record.

Changes:
- Remove handelize() from reindexCollection() in store.ts (index time)
- Remove handelize() from update command path in cli/qmd.ts
- findOrMigrateLegacyDocument now tries both raw path and handalized
  variant so existing indexes auto-migrate on next qmd update
- resolveVirtualPath, toVirtualPath, detectCollectionFromPath all work
  correctly once the DB stores literal paths

Tests (test/path-fidelity.test.ts — 10/10):
- Store level: DB contains literal paths, not handalized slugs
- toVirtualPath returns non-null for crazy-named files
- (1) search --json file field shows literal path
- (2) get --full-path resolves to a real on-disk path
- (3) get <actual-fs-path> finds the document
- (3b) subdir file with crazy name also works
- (4) ls shows literal paths
- (5) search docid can be fetched back
- Normal filenames still work (regression)
- Migration: qmd update on handalized index rewrites paths to literal
2026-06-01 20:20:50 +00:00
Tobias Lütke
f9d414c931 fix(search): split dotted tokens in FTS5 so version strings like 2026.4.10 match (#563)
fix(http): return qmd:// URIs from REST /query endpoint to match CLI output (#576)
2026-05-31 23:15:37 +00:00
Tobi Lutke
5323277086
release: v2.5.3 2026-05-28 19:53:26 -07:00
Tobi Lutke
c5f4217a6f
fix(cli): exit naturally so node-llama-cpp's beforeExit fires
The libggml-metal static destructor asserts on a non-empty residency-set
collection during __cxa_finalize_ranges, dumping a multi-kB GGML backtrace
after successful output (ggml-org/llama.cpp#22593, one-line fix open as
PR #22595). The assertion only trips when process.exit() skips Node's
beforeExit hook — which is exactly the hook node-llama-cpp registers to
auto-dispose its native handles.

Primary fix: finishSuccessfulCliCommand now sets process.exitCode = 0
and returns instead of calling process.exit(0). The event loop drains,
beforeExit fires, native Metal resources tear down in order, and the
process exits cleanly even without the workaround env var.

Defense-in-depth retained: bin/qmd and scripts/test-all.mjs still export
GGML_METAL_NO_RESIDENCY=1 on darwin for error paths and tests that
terminate via process.exit(). Opt back in with QMD_METAL_KEEP_RESIDENCY=1.

Also: correct upstream issue refs (was #17869 → now #22593/#22595).
Add scripts/repro-metal-rsets-crash.mjs minimal reproduction.
2026-05-28 17:23:31 -07:00
Tobi Lutke
c162ed1319
fix: disable libggml-metal residency sets on darwin
The libggml-metal static device destructor asserts on a non-empty
residency set during libc `exit()` → `__cxa_finalize_ranges`
(ggml-org/llama.cpp#17869). The residency set's 180 s keep_alive timer
hasn't expired by exit, so `GGML_ASSERT([rsets->data count] == 0)`
fails and `ggml_abort` dumps a multi-kB backtrace to stderr after the
user-visible output. Every llama-using CLI command (`query`,
`vsearch`, `embed`) was affected, plus the `bun test` runner.

No JS-side dispose path can prevent it: the static destructor runs
after every JS-reachable cleanup, and Node's `reallyExit` calls libc
`exit()` not `_exit()` (verified in node/src/api/environment.cc),
so it does NOT skip C++ static destructors as we'd assumed.

The actual fix is to disable residency sets via
`GGML_METAL_NO_RESIDENCY=1` before the native binding loads. For
QMD's short-lived CLI workflow there's no measurable cost
(benchmarked: identical wall time with and without on M3 Pro).

Three propagation points are needed:
- `bin/qmd` exports the env var before spawning node/bun. This
  covers all production CLI invocations.
- `src/test-preload.ts` mirrors the launcher for `bun test` runs.
  Bun does NOT sync `process.env` mutations to libc `setenv()`
  (verified empirically — Node does, via uv_os_setenv), so on Bun we
  reach for `bun:ffi` to call `setenv()` directly. vitest forks
  per-test-file so its parent never loads the binding.
- `qmd doctor` reports the mitigation state via the new
  `isDarwinMetalMitigationActive()` predicate so users can verify it
  in their environment.

Opt back in with `QMD_METAL_KEEP_RESIDENCY=1` (long-lived qmd
processes, MCP daemon hot reload, upstream fix triage). The old
`QMD_DISABLE_DARWIN_QUERY_JSON_SAFE_EXIT` is removed — its per-command
bypass mechanism didn't actually work on Node (it called
`process.reallyExit` which goes through libc exit) and is fully
replaced by the launcher env var.

Removed the old broken `installDarwinExitGuard()` mechanism from
LlamaCpp; kept the function name as a no-op shim for back-compat.
2026-05-28 13:40:14 -07:00
Tobi Lutke
0d7fdb7589
fix(launcher): prefer Node+tsx over Bun in source mode when both lockfiles exist
Source-mode runner selection now mirrors the dist-mode 'npm priority' rule:
if both package-lock.json and bun.lock are present in the package root,
use Node + tsx instead of Bun. pnpm/npm installs ship Node-ABI native
modules (better-sqlite3, sqlite-vec), and routing through Bun produces
ABI mismatches.

This also fixes pnpm-global installs, which copy the entire working tree
(including .git and bun.lock) into <prefix>/node_modules/@tobilu/qmd/.
The old logic saw .git + bun.lock + bun-on-PATH and routed to Bun
against the Node-installed native modules.

Adds a regression test covering the both-lockfiles source-checkout case.
2026-05-28 11:57:30 -07:00
Tobi Lutke
3de3162e1a
feat(cli): ./-prefix $PWD-relative --full-path; add --format <kind>
--full-path now ./-prefixes any path that resolves under $PWD, both for
search/query results and for get/multi-get headers. This makes the
output unambiguously a filesystem path — a bare 'notes/foo.md' could be
misread as a collection-relative qmd:// fragment, but './notes/foo.md'
cannot. Absolute realpaths (when the file is outside $PWD) are
unchanged. Extracted as renderFullPath() and reused across the three
call sites so the policy stays consistent.

New --format <kind> flag selects output format for search/query and
multi-get (cli|json|csv|md|xml|files). The legacy boolean aliases
(--json/--csv/--md/--xml/--files) still work for back-compat but are
removed from --help; the skill is updated to use --format.

ANSI colors and OSC 8 hyperlinks are already gated on process.stdout
.isTTY, so piped/agentic invocations get clean plain-text output with
no escape sequences. Verified via od -c on a piped 'qmd search' run.
2026-05-28 11:35:21 -07:00
Tobi Lutke
436420e927
feat(search,query): --full-path swaps qmd:// for on-disk paths
`qmd://` URIs remain the default identifier in search and query output
(across all formats: cli, --json, --md, --csv, --xml, --files). The
default CLI view now consistently prints the full qmd:// URI as the
visible label so it can be piped straight into `qmd get`, and --md
output gains a **file:** line for the same reason.

--full-path (already on get/multi-get) now also applies to search and
query: the per-result label becomes the file's on-disk path — relative
to $PWD when the file is in a subfolder of the current directory,
absolute realpath otherwise — and the per-result #docid is dropped
because the path is the identifier. Falls back to qmd:// when the file
is no longer resolvable on disk.

Also locks in @@ -line,count @@ header arithmetic with a regression test
that mirrors the user-reported 77-line / '1 before, 72 after' scenario.
2026-05-28 11:18:03 -07:00
Tobi Lutke
fa8f904a9d
docs(qmd-skill): structured-query-first; cite docid + lines; no sed
- Make structured `qmd query` with intent:/lex:/vec:/hyde: the default
  search mode, and emphasize that the caller authors the expansion
  rather than leaning on the built-in query-expansion model.
- Tell the caller to cite the #docid and exact line numbers now
  printed by get/multi-get, and to slice files with the :from:count
  suffix or --from/-l instead of piping through sed/head/tail.
- Document --full-path for handing the on-disk path to editor tools.
- Bump skill version to 2.2.0 and record the behavior changes under
  ## [Unreleased] in CHANGELOG.md.
- Update the package smoke test that pinned the old 'structured
  queries' wording to match the new, more specific intro phrasing.
2026-05-28 10:56:13 -07:00
Tobi Lutke
41bc3a27d8
feat(get,multi-get): line-numbered + docid output, line ranges, --full-path
Redesign the get/multi-get retrieval surface so callers can cite what
they retrieved and request follow-up slices without piping through sed:

- Output is line-numbered by default; opt out with --no-line-numbers.
- Header always identifies the document by qmd:// path + #docid. The
  MCP get/multi_get tools default lineNumbers=true to match.
- qmd get and the MCP get tool accept a :from:count suffix on a path
  or docid (e.g. '#abc123:120:40' reads 40 lines from line 120).
  Explicit --from/-l flags still override the suffix.
- qmd multi-get now includes #docid in every output format (--md,
  --json, --csv, --xml, --files, default CLI), matching qmd search.
- New --full-path flag swaps the qmd:// + docid header for the
  document's on-disk path (handy for piping into Read/Edit/editors);
  falls back to the canonical header when the file no longer exists.
2026-05-28 10:55:55 -07:00
Tobi Lütke
443760f4d5
release: v2.5.2 2026-05-22 20:24:08 +00:00
Tobi Lütke
f72f6dcc6c
Update CHANGELOG.md and remove autoresearch.jsonl 2026-05-22 20:24:01 +00:00
Tobi Lütke
65e517f783
Enhance launcher shebang polyglot to fall back to Bun if Node is missing on the system
Result: {"status":"keep","test_status":0}
2026-05-22 20:21:12 +00:00
Tobi Lütke
7a5d8f5574
Make bin/qmd launcher a shebang polyglot to support both Windows cmd/ps1 native wrappers and sh-invoked smoke tests
Result: {"status":"keep","test_status":0}
2026-05-22 20:08:49 +00:00
Tobi Lütke
65b813d737
Rewrite launcher in Node.js to fix Windows execution and keep tests passing
Result: {"status":"keep","test_status":0}
2026-05-22 20:03:44 +00:00
Tobi Lütke
ba6538090f
release: v2.5.1 2026-05-20 00:24:59 +00:00
Tobi Lütke
96ab7787f8
Use npm trusted publishing 2026-05-20 00:19:46 +00:00
Tobi Lütke
a8a314b802
Stabilize doctor CLI tests 2026-05-19 23:34:06 +00:00
Tobi Lütke
1926412f29
release: v2.5.0 2026-05-19 23:08:39 +00:00
Tobi Lütke
e3a9c5d02b
Make release script portable 2026-05-19 23:08:30 +00:00
Tobi Lütke
a3f0b9423f
Expand install smoke harness 2026-05-19 23:05:49 +00:00
Tobi Lütke
b5f156c313
Improve qmd diagnostics and embed resilience 2026-05-19 21:39:48 +00:00
Tobi Lutke
105c577b3b
docs: improve qmd skill guidance 2026-05-19 15:22:14 -04:00
Tobi Lutke
2b250f3dca
fix: run qmd from unbuilt checkouts 2026-05-19 15:22:13 -04:00
Tobi Lutke
632c34d120
chore: strengthen package test task 2026-05-19 14:27:38 -04:00
Tobi Lutke
d9348f43a0
feat: add local init and doctor diagnostics 2026-05-19 14:27:33 -04:00
Tobi Lutke
5cda3cf54c
Improve qmd doctor diagnostics 2026-05-19 12:48:16 -04:00
Tobi Lütke
596198c2ba
fix: update nix module hashes 2026-05-18 02:59:20 +00:00
Tobi Lütke
6ea0544e76
chore: update core runtime dependencies 2026-05-18 02:54:03 +00:00
Tobi Lütke
ac6b154f0c
feat: add qmd doctor vector diagnostics 2026-05-18 01:52:05 +00:00
Tobias Lütke
ddbd6bd8be
Merge pull request #656 from tobi/fix/gpu-status-warning
Fix GPU status guidance and benchmark warnings
2026-05-16 19:55:39 -04:00
Tobi Lütke
ad8a371be2
Fix QMD CI test runtime assumptions 2026-05-16 23:52:53 +00:00
Tobi Lütke
028e8fc86f
Improve packaged QMD skill 2026-05-16 23:46:40 +00:00
Tobi Lütke
15a711d729
Run local tests on Node and Bun 2026-05-16 23:46:22 +00:00
Tobi Lütke
da184e58e9
Unify QMD model resolution 2026-05-16 23:46:22 +00:00
Tobi Lütke
1f757379e2
Fix GPU status guidance and benchmark warnings 2026-05-16 23:45:58 +00:00
Tobias Lütke
cdf3bc0712
Merge pull request #657 from tobi/feat/cli-served-skills
Serve QMD skill instructions from the CLI
2026-05-16 19:40:10 -04:00
Tobi Lütke
c18c74a134
Serve QMD skill instructions from CLI 2026-05-16 22:43:33 +00:00
Tobias Lütke
17aa34d753
Merge pull request #655 from tobi/feat/local-qmd-index-bench
feat: support project-local QMD indexes
2026-05-16 14:31:05 -04:00
Tobi Lütke
b2550d273a
Merge remote-tracking branch 'origin/main' into feat/local-qmd-index-bench
# Conflicts:
#	src/cli/qmd.ts
2026-05-16 18:27:49 +00:00
Tobi Lütke
2e0c74310c
feat: support project-local indexes in bench 2026-05-16 18:22:04 +00:00
Tobias Lütke
87520252a5
Merge pull request #654 from tobi/workoff/t_657414dc-dev-review
fix: keep partial embeddings pending
2026-05-16 13:52:49 -04:00
Tobi Lütke
910ca07fd9
fix: keep partial embeddings pending 2026-05-16 17:48:01 +00:00
Tobias Lütke
a35a487af0
Merge pull request #653 from tobi/workoff/t_09301bf8-dev-review
test: cover qmd bin wrapper install layouts
2026-05-16 13:47:44 -04:00
Tobi Lütke
dc49ccff1e
test: cover qmd bin wrapper install layouts 2026-05-16 17:41:43 +00:00
Tobias Lütke
474ac7863d
Merge pull request #652 from tobi/workoff/t_23fce575-dev-review
fix: avoid macOS Metal cleanup abort after JSON query
2026-05-16 13:41:02 -04:00
Tobi Lütke
b59ba6ab1e
test: keep cleanup lifecycle regression portable 2026-05-16 17:35:08 +00:00
Tobias Lütke
9bc316e545
Merge pull request #651 from tobi/workoff/t_0d576ae5-dev-review
fix: keep llama GPU fallback noise off JSON stdout
2026-05-16 13:31:16 -04:00
Tobi Lütke
e4505607f9
Merge remote-tracking branch 'origin/main' into workoff/t_0d576ae5-dev-review
# Conflicts:
#	CHANGELOG.md
2026-05-16 17:26:27 +00:00
Tobi Lütke
60c75cb332
fix: avoid macOS Metal cleanup abort after JSON query 2026-05-16 17:22:46 +00:00
Tobias Lütke
bad20f5565
Merge pull request #644 from Ginja/fix/snippet-absolute-line-numbers
fix: return absolute line numbers from qmd_query
2026-05-16 13:18:35 -04:00
Tobi Lütke
dd5d82d523
fix: keep llama GPU fallback noise off JSON stdout 2026-05-16 17:18:06 +00:00
Tobias Lütke
d0bcdf0cfb
Merge pull request #635 from erlebach/fix/ls-absolute-path-collections
fix(ls): handle collections whose names are absolute paths
2026-05-16 13:13:15 -04:00
Tobi Lütke
2dc8634ac7
fix(ls): preserve qmd:/// collection aliases 2026-05-16 17:12:38 +00:00
Tobias Lütke
7a7c6ed9e2
Merge pull request #643 from woodenriver05/fix-candidate-limit-forwarding
Forward candidateLimit through search APIs
2026-05-16 13:10:44 -04:00
Tobias Lütke
bbd09ff84d
Merge pull request #648 from rubycerebra/mcp/terse-collection-catalogue
perf(mcp): emit terse collection summary in serverInstructions
2026-05-16 13:09:23 -04:00
James Cherry
9cecdc8703 perf(mcp): emit terse collection summary in serverInstructions
Previously buildInstructions emitted one line per collection with name,
doc count, and description, which can run to ~1.5 KB for a dozen
collections and is injected into every session's system prompt at
MCP initialize. Most agents never query qmd in a given session, so the
catalogue lines are a recurring token cost for static info that the
existing 'status' tool already exposes on demand.

Now emit a single comma-joined names line plus a hint that 'status'
returns the rest. listContexts() lookup is dropped since the per-
collection descriptions are no longer rendered.

Refs #647
2026-05-15 23:44:16 +01:00
Riley Shott
aa1818e181
fix: clamp negative fromLine in get to avoid silent tail content
The query tool description tells agents to compute fromLine = line - 20
for context around a hit. For hits in lines 1 through 20 that yields a
negative fromLine, which propagated unchanged through:

  MCP get handler -> store.getDocumentBody -> Array.prototype.slice

A negative slice start offsets from the end of the array rather than
clamping to the beginning, so a top-of-file hit on a long document
returned an empty string and on a short document returned content from
the wrong region (e.g. lines 11-30 of a 30-line file in response to a
request for the head of the document). The lineNumbers branch was the
same shape: addLineNumbers(text, -19) emitted "-19:", "-18:" prefixes.

Same buggy slice lived in the CLI getDocument path independently.

Fix in three layers, plus the docstring:

- src/mcp/server.ts: clamp parsedFromLine to >= 1 after parsing input
  args and the :line suffix, before it reaches getDocumentBody and
  addLineNumbers. Also tighten the query tool's recommendation to
  `fromLine = max(1, line - 20)` so following the docstring literally
  produces a valid value.
- src/cli/qmd.ts: same clamp on the CLI getDocument fromLine after
  the colon-suffix parse.
- src/store.ts: defensive Math.max(0, ...) on the slice start in
  getDocumentBody so SDK callers and any future entry points are
  protected without relying on every caller remembering to clamp.
- test/store.test.ts: regression test on getDocumentBody with
  fromLine = -19 returns the head of the document, not the tail.
- test/cli.test.ts: regression test on `qmd get --from -19` matches
  the no-flag baseline (head of document).
2026-05-13 23:59:33 -07:00
Riley Shott
1f522cffe2
fix: return absolute line numbers from qmd_query
The MCP `query` tool, HTTP `/query` endpoint, and CLI `qmd query`
all returned chunk-local line numbers in their snippet output, so
the line could not be passed back to `qmd_get` as `fromLine`
without an out-of-band lookup. Pass the full document body plus
`bestChunkPos` to `extractSnippet` instead of the chunk text alone
so it can compute absolute line offsets while still scoping the
keyword scan to the reranker-chosen chunk window (preserves #149).

Also restores documented behavior of `qmd query --full`, which was
emitting the best chunk (~3.6KB max) instead of the full document.

extractSnippet now also falls back to a full-body scan when given a
chunkPos but the chunk window contains no positive matches. The
upstream chunk selector leaves bestIdx=0 as its initialization
default whenever scoring fails to find a winner (e.g. queryTerms
filtered to empty by the length>2 guard, or semantic-only matches
with no lex overlap), so an unconditional chunk-scoped scan would
land on chunk 0 instead of where the actual match lives.

- src/mcp/server.ts: SearchResultItem gains `line: number`; both MCP
  and HTTP `/query` handlers populate it
- src/cli/qmd.ts: OutputRow.body now sources from r.body
- src/store.ts: extractSnippet falls back to full-body scan when
  chunk-scoped pass finds no positive match
- test/mcp.test.ts: new fixture asserts absolute line 301 for a
  marker placed past the first chunk boundary
- test/store.test.ts: regression test for the bestScore<=0 fallback
2026-05-13 23:59:32 -07:00
woodenriver05
3b7e065024 fix: forward candidateLimit through search APIs 2026-05-11 20:25:34 -10:00
Tobias Lütke
746beedb48
Merge pull request #636 from tobi/stack/qmd-kanban-fixes-2026-05-09
Integrate QMD fix stack
2026-05-09 16:22:07 -04:00
Tobi Lütke
e36ab96567
fix: allow HTTP query rerank control 2026-05-09 19:03:17 +00:00
Tobi Lütke
669e234d1e
test: index MCP HTTP fixture before query 2026-05-09 18:56:06 +00:00
Tobi Lütke
b32ee4e660
test: make CI fixture invocations portable 2026-05-09 18:45:56 +00:00
Tobi Lütke
4505d8132e
ci(nix): update node module hashes
Refresh fixed-output hashes after moving AST grammar packages into runtime dependencies so Nix CI builds the current locked dependency graph.
2026-05-09 18:31:26 +00:00
Tobi Lütke
e627ca7de6
test: allow slow CPU rerank fixture 2026-05-09 18:20:26 +00:00
Tobi Lütke
ddc969a5f4
fix embed model and qmd home resolution 2026-05-09 18:17:10 +00:00
Tobi Lütke
b775592230
fix mcp --index store selection 2026-05-09 18:16:02 +00:00
Tobi Lütke
e8229d8bfb
Fix Windows CUDA context parallelism 2026-05-09 18:15:39 +00:00
Tobi Lütke
dff6513693
Preserve document IDs across case-only renames 2026-05-09 18:15:15 +00:00
Tobi Lütke
656707c6b4
Fix Node ESM index path normalization 2026-05-09 18:14:33 +00:00
Tobi Lütke
3653f6015c
Fix MCP stdio native log pollution 2026-05-09 18:14:15 +00:00
Tobi Lütke
3f055e705d
Fix AST grammar packaging for Bun installs 2026-05-09 18:13:58 +00:00
Tobi Lütke
004714af48
Fix hybrid RRF weighting by query type 2026-05-09 18:13:38 +00:00
Tobi Lütke
92aaded36e
fix(store): preserve inactive docs during orphan cleanup 2026-05-09 18:13:01 +00:00
Tobi Lütke
5b9f472849
fix(embed): honor collection filter 2026-05-09 18:12:45 +00:00
Tobi Lütke
d045a8bab6
fix(search): support CJK FTS queries 2026-05-09 18:12:45 +00:00
Tobi Lütke
3d991b2a47
fix(cli): keep status from importing llama 2026-05-09 18:12:45 +00:00
erlebach
8fdc4815c5 auto: update src/cli/qmd.ts 2026-05-09 13:35:34 -04:00
Tobias Lütke
d58fedf4b5
Merge pull request #579 from lukeboyett/fix/db-transaction-type
fix(db): declare transaction() on local Database interface
2026-05-02 20:16:33 -04:00
Tobias Lütke
06218b1cc5
Merge pull request #384 from rymalia/fix/semantic-query-hyphen-validation
fix: allow hyphenated words in vec/hyde queries (#383)
2026-05-02 20:16:30 -04:00
Tobias Lütke
a2fcac8225
Merge pull request #602 from fxstein/fix/mcp-production-mode-test-leak
fix(mcp): do not enable production mode at module import time
2026-05-02 20:16:27 -04:00
Pi
a0c460333b fix(cli): do not enable production mode at module import time
src/cli/qmd.ts has the same module-scope enableProductionMode() call that
src/mcp/server.ts had — and the same test-isolation leak. test/cli.test.ts
imports buildEditorUri and termLink from this module, which executes the
top-level enableProductionMode() as a side effect of import, flipping the
global _productionMode flag for every later test file in the Bun process.

This is the actual driver of the Store Creation > createStore throws
without explicit path in test mode failure — test/cli.test.ts runs
alphabetically before test/store.test.ts, so the flag is already true by
the time store.test.ts checks it.

Mirror the fix applied to src/mcp/server.ts in the previous commit: move
enableProductionMode() from module scope into the if (isMain) guard so
the flag is only flipped when qmd is actually invoked as the CLI
entrypoint, not when the module is imported for its exports.
2026-04-23 17:41:10 +00:00
Pi
54262f566c fix(mcp): do not enable production mode at module import time
The top-level enableProductionMode() call added in #537 fixed a real issue
(MCP server resolving the wrong database path at startup) but introduced
test-isolation breakage as a side effect: merely importing src/mcp/server.ts
flipped the global _productionMode flag, which then broke unrelated tests
that depend on the default (development) database path resolution.

This shows up concretely as "Store Creation > createStore throws without
explicit path in test mode" failing on Bun (ubuntu-latest) in CI, because
test/mcp.test.ts imports startMcpHttpServer from this module.

Move the enableProductionMode() call from module scope into the two server
entry points (startMcpServer and startMcpHttpServer). The fix originally
intended by #537 — ensuring production mode is active before getDefaultDbPath
runs — is preserved because both call sites still flip the flag before
createStore / getDefaultDbPath. Importing the module for its exports no longer
mutates global state.

Verified locally against current main: CI failure on Bun (ubuntu-latest)
reproduces on unmodified upstream main (14+ consecutive failed runs since
#537 merged on 2026-04-09) and is resolved by this change.

Refs: #537
2026-04-23 15:57:02 +00:00
Luke Boyett
d231b64617 fix(db): declare transaction() on local Database interface
The narrow cross-runtime Database interface in src/db.ts defines the
subset of better-sqlite3 / bun:sqlite methods used throughout QMD.
Commit fee576b ("fix: migrate legacy lowercase paths on reindex")
introduced a db.transaction(...) call in src/store.ts but did not
extend the interface, breaking `tsc -p tsconfig.build.json`:

  src/store.ts(2142,22): error TS2339: Property 'transaction' does
  not exist on type 'Database'.

Both underlying engines expose transaction(fn), so this just makes
the type reflect reality.
2026-04-14 08:34:20 -04:00
Ryan Malia
d531211030 fix: allow hyphenated words in vec/hyde queries (#383)
The validateSemanticQuery regex rejected any hyphen followed by a word
character, blocking common compound words (real-time, multi-client,
kebab-case identifiers like better-sqlite3). Tighten the check to only
match negation syntax at token boundaries (start of string or after
whitespace).

See https://github.com/tobi/qmd/issues/383

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 00:14:21 -07:00
52 changed files with 7602 additions and 889 deletions

View File

@ -32,13 +32,12 @@ jobs:
- uses: actions/setup-node@v4
with:
node-version: 22
node-version: 24
registry-url: https://registry.npmjs.org
package-manager-cache: false
- run: npm run build
- run: npm publish --provenance --access public
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
- name: Extract release notes
id: notes

View File

@ -19,6 +19,198 @@
and query expansion models.
- Embedding: use approximate token counts in external embedding mode so
chunking does not load a local GGUF tokenizer.
### Documentation
- README: documented collection filtering (`-c` semantics), the `collection
show`/`include`/`exclude`/`update-cmd` subcommands, the `--intent`/`--no-rerank`/
`-C`/`--full-path` search flags, the `--format <kind>` output selector (with the
legacy `--json`/`--csv`/`--md`/`--xml`/`--files` booleans noted as aliases),
`vector-search`/`deep-search` aliases, embed
memory flags (`--max-docs-per-batch`/`--max-batch-mb`), a sample `--explain`
score trace, the `qmd doctor`/`qmd init` commands, the `get` `:from:count`
suffix and `--no-line-numbers`, an MCP tool parameter reference, and a
Benchmarking section for `qmd bench`.
- docs/SYNTAX.md: removed the non-existent `q` MCP parameter example (the `query`
tool and REST endpoint accept only the `searches` array) and added a Scoping
section.
- README: removed the misleading `qmd update --pull` example. The `--pull` flag is
parsed but never consumed (`updateCollections()` ignores it); the real mechanism
for running `git pull` before re-indexing is a per-collection `update` command,
set via `qmd collection update-cmd`.
### Fixed
- MCP server instructions now tell agents to scope with the plural `collections`
parameter (matching the schema). The previous singular `collection` hint led
agents to pass a parameter that Zod silently strips, producing unscoped results.
The `get` instruction line also now documents the full `file.md:from:count`
range suffix instead of only the single-line `file.md:100` offset.
- Filesystem paths with special characters (`#`, `&`, spaces, `[]`, `()`, etc.)
now round-trip correctly through index → search → get. Previously
`reindexCollection` called `handelize()` on relative paths before storing
them, turning `# Meeting - 234232 3432 __ 5.md` into
`Meeting-234232-3432-5.md` and making `qmd get <actual-path>`,
`qmd get --full-path`, and `qmd ls` return dead or garbled paths. Paths are
now stored verbatim. Existing indexes auto-migrate on the next `qmd update`.
- FTS5 search now correctly matches dotted version strings like `2026.4.10`. The
`porter unicode61` tokenizer splits on dots (storing `2026`, `4`, `10` as
separate tokens), but the query sanitizer was stripping dots and producing
`2026410` which never matched. Dotted terms are now split and ANDed together
so version-string searches work as expected (#563).
- HTTP REST endpoints `/query` and `/search` now return `qmd://collection/path`
URIs in the `file` field, matching the output format used by the CLI and MCP
resource URIs. Previously the raw `displayPath` (`collection/path`) was
returned without the scheme prefix (#576).
- The embed session `maxDuration` is now env-configurable via
`QMD_EMBED_MAX_DURATION_MS` (default: 30 min). This prevents large-corpus
embeddings from being aborted by the hardcoded 30-minute ceiling (#673).
## [2.5.3] - 2026-05-28
### Features
- `qmd get` now accepts a `:from:count` suffix on a path or docid (e.g.
`qmd get "#abc123:120:40"` reads 40 lines starting at line 120). Explicit
`--from`/`-l` flags still override the suffix. The MCP `get` tool accepts the
same suffix.
- `qmd get` and `qmd multi-get` are now **line-numbered by default** and print
the document's `#docid` and `qmd://` path in the output header. Disable line
numbers with `--no-line-numbers`. The MCP `get`/`multi_get` tools default
`lineNumbers` to `true` to match.
- `qmd multi-get` now includes the `#docid` in every output format
(`--md`, `--json`, `--csv`, `--xml`, `--files`, and the default CLI view),
consistent with `qmd search`.
- `qmd get` and `qmd multi-get` accept `--full-path`, which replaces the
`qmd://` path + `#docid` with the document's on-disk filesystem path (handy for
piping into `Read`/`Edit`/an editor). Falls back to the canonical `qmd://` +
docid header when the file no longer exists on disk.
- `qmd search` / `qmd query` now show a clearer hit identifier: the default CLI
view (and the new `**file:**` line in `--md` output) always prints the full
`qmd://collection/path` URI so you can pipe it straight back into `qmd get`.
- `qmd search` / `qmd query` accept `--full-path` with the same semantics as
`qmd get`: the result label becomes the file's on-disk path — `./`-prefixed
relative path when the file lives in a subfolder of `$PWD`, absolute realpath
otherwise — and the per-result `#docid` is dropped because the path is the
identifier. The leading `./` is intentional so the output is unambiguously a
filesystem path. Applies to all output formats.
- `qmd get` and `qmd multi-get` now also use the `./`-prefixed convention when
`--full-path` renders a path under `$PWD`, matching `search`/`query`.
- New `--format <kind>` flag selects the output format (`cli` | `json` | `csv` |
`md` | `xml` | `files`) for `search`, `query`, and `multi-get`. The legacy
boolean aliases (`--json`/`--csv`/`--md`/`--xml`/`--files`) still work but are
no longer in `--help`; prefer `--format`.
### Fixes
- Launcher: source-mode runner selection now prefers Node + tsx over Bun when
both `package-lock.json` and `bun.lock` are present in the package root,
mirroring the dist-mode "npm priority" rule. Fixes pnpm-global installs that
copy the entire working tree (including `.git` and `bun.lock`) into the
install dir and previously routed through Bun, causing ABI mismatches with
the Node-built `better-sqlite3` / `sqlite-vec` native modules.
- Darwin Metal: llama-using commands (`query`, `vsearch`, `embed`) no longer
dump a multi-kB GGML/Metal backtrace at process exit even when output
succeeded. The libggml-metal static `ggml_metal_device` destructor asserts
`[rsets->data count] == 0` during `__cxa_finalize_ranges`, but the
buffer-free path never calls the symmetric `ggml_metal_device_rsets_rm`
to remove released rsets from the device collection (upstream
ggml-org/llama.cpp#22593, one-line fix open as PR #22595). The assertion
only fires when `process.exit()` skips Node's `beforeExit` hook, which is
what node-llama-cpp uses to auto-dispose Metal contexts. Primary fix:
`finishSuccessfulCliCommand` now sets `process.exitCode = 0` and returns
instead of calling `process.exit(0)`, so `beforeExit` fires and the native
binding cleans up before libc's static destructor runs. Defense-in-depth:
the launcher (`bin/qmd`) and the npm test driver (`scripts/test-all.mjs`
+ the `test:bun` / `test:unit` package.json scripts) also set
`GGML_METAL_NO_RESIDENCY=1` on darwin before spawning node/bun, covering
error paths and tests that still terminate via `process.exit()`. The env
var must be set before node/bun start — libggml-metal reads it via libc
`getenv` at module-load time, and Bun does not propagate `process.env`
mutations to libc `setenv` — so it lives in the launcher rather than in
test-preload. Residency sets give no measurable speedup for QMD's
short-lived CLI workflow (benchmarked on M3 Pro). Opt back in with
`QMD_METAL_KEEP_RESIDENCY=1` for long-lived qmd processes (e.g. the MCP
daemon may benefit on hot reload) or to triage the upstream fix.
`qmd doctor` reports the mitigation state. Minimal reproduction:
`scripts/repro-metal-rsets-crash.mjs`.
### Docs
- qmd skill: emphasize reading line ranges with `get`'s built-in
`:from:count` suffix / `--from`/`-l` flags instead of piping through
`sed`/`head`/`tail`; cite the docid and line numbers now present in retrieval
output; and author structured `intent:`/`lex:`/`vec:`/`hyde:` queries yourself
rather than relying on built-in query expansion.
## [2.5.2] - 2026-05-22
### Fixes
- Launcher: Rewrite `bin/qmd` as a Node-based shebang polyglot to fix global npm installation execution failures on Windows (#668 / #452), while supporting seamless fallback to Bun in Node-less environments.
## [2.5.1] - 2026-05-20
### Changes
- Release: publish from GitHub Actions via npm Trusted Publishing/OIDC instead of a long-lived `NPM_TOKEN` secret.
## [2.5.0] - 2026-05-19
### Changes
- Dependencies: update core SQLite/config/chunking packages (`better-sqlite3`, `yaml`, `web-tree-sitter`, `tree-sitter-go`, and `tree-sitter-python`) while keeping incompatible `zod`, `tsx`, and `vitest` majors pinned.
- Agent skills: add `qmd skills list|get|path` to serve version-matched runtime skill instructions from the installed CLI, and make `qmd skill install` write a stable discovery stub so installed agent skills do not go stale after QMD upgrades.
- CLI: add `qmd doctor` for index/runtime diagnostics, including SQLite/sqlite-vec versions, embedding fingerprint freshness, mixed-fingerprint detection, safe legacy fingerprint adoption, and content-hash sampling.
### Fixes
- Launcher: prefer runnable TypeScript source in git checkouts even when ignored `dist/` artifacts exist, while packaged installs continue to run `dist/`.
- GPU: keep node-llama-cpp's documented `gpu: "auto"` initialization as the primary path, then perform no-build packaged CUDA/Vulkan/Metal probes only if auto falls back to CPU.
- CLI: move GPU/CPU runtime diagnostics out of `qmd status`; use `qmd doctor` for device probing and related environment guidance.
- CLI: point unexpected command/setup failures toward `qmd doctor` so diagnostics are the default next step when QMD behaves incorrectly.
- Doctor: explicitly warn when `content_vectors` contains multiple non-empty embedding fingerprint names, with the per-fingerprint document/chunk breakdown.
- Embed: make the TTY progress line label byte-based input progress explicitly, show embedded chunks as a count, and shorten the displayed model name.
- Embed: retain per-chunk failure details, retry failed chunks after later successful embeds and again when no other chunks remain, clear recovered errors, and cap retries to avoid endless loops.
- Tests: expand the container smoke harness to cover npm-global, npx-style, and Bun-global install scenarios, always checking auto and `QMD_FORCE_CPU=1` doctor modes, with opt-in tiny `qmd embed` and GPU probe runs for supported container runtimes.
- Embedding: fingerprint vector metadata using the active embedding model and formatting/chunking parameters so stale vectors are treated as pending after search semantics change. Legacy `content_vectors` columns are migrated lazily on first vector-health/write use to preserve fast QMD startup.
- Skill: expand the packaged QMD skill with retrieval-first workflows, structured query examples, wiki/source collection guidance, and safe fallbacks when model-backed search is unavailable.
- Tests: make `bun run test` execute the local unit suite under both Node/Vitest and Bun (`test:node` + `test:bun`) so runtime-specific regressions are caught before CI.
- Model config: centralize embedding/rerank/generation model resolution so `qmd embed`, `status`, `query`, `vsearch`, `pull`, SDK vector search, and `bench` use the same active `.qmd/index.yaml` model hints and environment fallbacks.
- GPU/status: `qmd status` now uses the same embedding model identity as `qmd embed` when computing pending embeddings, so URI-backed embeddings are not incorrectly reported as pending under the legacy `embeddinggemma` alias.
- GPU status: `qmd status` now always shows GPU mode/configuration without unsafe native probing, and CPU-fallback warnings point to `QMD_STATUS_DEVICE_PROBE=1 qmd status` for an actual backend probe. The no-GPU warning is emitted once per process instead of once per LLM instance during benchmarks.
- GPU: add `QMD_FORCE_CPU=1` / `--no-gpu` to bypass CUDA/Vulkan/Metal probing entirely, and route native llama.cpp stdout noise to stderr so JSON output stays parseable during search/query commands.
- Snippet line numbers: `qmd_query` (MCP), HTTP `/query`, and `qmd query`
(CLI JSON output and snippet headers) now return absolute source-file
line numbers instead of chunk-local ones, so the `line` field can be
passed back to `qmd_get` as `fromLine` without a separate lookup.
Snippet selection remains scoped to the best matching chunk
(preserves #149).
- CLI: `qmd query --full` now emits the full document body in all output
formats (json, csv, md, xml), restoring the documented behavior of the
flag. Previously it returned only the best matching chunk (~3.6KB max
per result). Output payload for `--full` queries is now proportional
to total document size.
- macOS Metal: `qmd query --json` now flushes successful JSON output and uses a safe immediate-exit path on Darwin to avoid ggml Metal finalizer aborts; other commands still dispose LLM contexts/models before the llama runtime. #368
- Embedding: require complete chunk coverage before treating a document as
embedded, remove partial vectors when chunk/session failures leave a
document incomplete, and keep `qmd status` pending counts honest after
interrupted long embed runs. #637 #378
- Embedding: `qmd embed -c <collection>` now scopes pending-doc selection
to the requested collection instead of embedding global pending work.
Scoped `--force` clears only collection-owned vectors, preserves shared
hashes referenced by sibling collections, and drops `vectors_vec` only
when the scoped clear empties all vectors.
- Hybrid search: weight RRF lists by query type so original FTS and original vector evidence get the intended 2x boost, instead of accidentally boosting the first lexical expansion. #591
- MCP: seed llama.cpp/GGML quiet env vars before launching `qmd mcp` so native logs cannot pollute stdio JSON-RPC framing. #593
- CLI: remove CommonJS `require()` calls from ESM index path normalization so `qmd --index <path>` no longer crashes with `ERR_AMBIGUOUS_MODULE_SYNTAX` on Node 22+. #634
- Windows CUDA: serialize llama.cpp embedding/reranking contexts by default to avoid intermittent `ggml-cuda.cu:98` crashes in `qmd query`; set `QMD_EMBED_PARALLELISM` to opt back into parallel contexts if your driver is stable. #519
- MCP: make `qmd mcp --index <name>` use the selected index for both foreground and daemon HTTP servers instead of falling back to the default store. #343
- Embedding: respect `QMD_EMBED_MODEL` consistently for vector indexing and vector-backed search, with default-model fallback when unset.
- Config: use one home-directory resolver for YAML config and the default SQLite cache path, avoiding Windows CLI/MCP split-brain when `HOME` is unset.
- GPU: respect explicit `QMD_LLAMA_GPU=metal|vulkan|cuda` backend overrides instead of always using auto GPU selection. #529
- Fix: preserve original filename case in `handelize()`. The previous
`.toLowerCase()` call made indexed paths unreachable on case-sensitive
@ -27,6 +219,18 @@
- CLI: make `qmd status` skip native `node-llama-cpp` device probing by
default so status stays safe on machines with broken or unsupported GPU
drivers. Set `QMD_STATUS_DEVICE_PROBE=1` to opt in.
- CLI: lazy-load `node-llama-cpp` so lightweight commands such as
`qmd status` do not import native ML dependencies or trigger llama.cpp
builds on ARM/no-GPU machines. #491
- Store: keep content rows referenced by inactive documents during orphan
cleanup so `qmd update` preserves soft-deleted tombstones for removed
files. #585
- Packaging: install AST grammar WASM packages as required dependencies so
Bun global installs include TypeScript/TSX/JavaScript grammars, and add a
`smoke:package-grammars` verification command. #595
- Launcher: add wrapper smoke coverage for scoped package, npm/npx,
Homebrew/Linuxbrew, Bun global symlink layouts, and `$BUN_INSTALL`
false-positive runtime selection regressions. #351 #353 #354 #356 #358 #359
## [2.1.0] - 2026-04-05

195
README.md
View File

@ -135,6 +135,30 @@ LLM models stay loaded in VRAM across requests. Embedding/reranking contexts are
Point any MCP client at `http://localhost:8181/mcp` to connect.
#### MCP Tool Parameters
| Tool | Parameter | Type | Notes |
|------|-----------|------|-------|
| `query` | `searches` | array | Typed sub-queries (`lex`/`vec`/`hyde`), 110. **Required.** First gets 2x weight. |
| `query` | `collections` | string[] | Filter by collection names (OR). **Array only** — singular `collection` is silently ignored. |
| `query` | `intent` | string | Disambiguation context (does not search on its own) |
| `query` | `limit` | number | Max results (default 10) |
| `query` | `minScore` | number | Minimum relevance 01 (default 0) |
| `query` | `candidateLimit` | number | Max candidates to rerank (default 40) |
| `query` | `rerank` | boolean | Run LLM reranking (default **true**); set false for RRF-only |
| `get` | `file` | string | Path, docid (`#abc123`), or `path:from:count` (e.g. `#abc123:120:40`) |
| `get` | `fromLine` | number | Start line (1-indexed); overrides the `:from` suffix |
| `get` | `maxLines` | number | Limit returned lines |
| `get` | `lineNumbers` | boolean | Prefix lines with numbers (default **true**) |
| `multi_get` | `pattern` | string | Glob pattern or comma-separated list |
| `multi_get` | `maxBytes` | number | Skip files larger than N (default 10240) |
| `multi_get` | `maxLines` | number | Limit lines per file |
| `multi_get` | `lineNumbers` | boolean | Prefix lines with numbers (default **true**) |
Unknown parameters are silently ignored (not rejected) — double-check names if
results seem unscoped. The HTTP `/query` and `/search` endpoints return
`qmd://collection/path` URIs in the `file` field, matching the CLI and MCP output.
### SDK / Library Usage
Use QMD as a library in your own Node.js or Bun applications.
@ -575,6 +599,17 @@ qmd collection rename myproject my-project
# List files in a collection
qmd ls notes
qmd ls notes/subfolder
# Show collection details (path, glob mask, include status, context count)
qmd collection show notes
# Include or exclude a collection from default (unscoped) queries
qmd collection include notes
qmd collection exclude notes
# Run a command before every `qmd update` (e.g. git pull); empty arg clears it
qmd collection update-cmd notes 'git pull --rebase'
qmd collection update-cmd notes
```
### Generate Vector Embeddings
@ -591,6 +626,10 @@ qmd embed --chunk-strategy auto
# Also works with query for consistent chunk selection
qmd query "auth flow" --chunk-strategy auto
# Memory control for large corpora / constrained systems
qmd embed --max-docs-per-batch 50 # cap docs per embedding batch
qmd embed --max-batch-mb 64 # cap batch size in MB
```
**AST-aware chunking** (`--chunk-strategy auto`) uses tree-sitter to chunk code
@ -652,6 +691,9 @@ qmd vsearch "how to login"
qmd query "user authentication"
```
Two aliases exist for the semantic/hybrid modes: `vector-search` (→ `vsearch`)
and `deep-search` (→ `query`).
### Options
```sh
@ -664,24 +706,45 @@ qmd query "user authentication"
--line-numbers # Add line numbers to output
--explain # Include retrieval score traces (query, JSON/CLI output)
--index <name> # Use named index
--intent "<text>" # Disambiguation context (e.g. "web page load times")
--no-rerank # Skip LLM reranking (RRF scores only; faster on CPU)
-C, --candidate-limit <n> # Max candidates to rerank (default: 40)
--full-path # Emit on-disk filesystem paths instead of qmd:// URIs
# Output formats (for search and multi-get)
--files # Output: docid,score,filepath,context
--json # JSON output with snippets
--csv # CSV output
--md # Markdown output
--xml # XML output
--format <kind> # cli (default) | json | csv | md | xml | files
# (--json, --csv, --md, --xml, --files are legacy aliases)
# Get options
qmd get <file>[:line] # Get document, optionally starting at line
-l <num> # Maximum lines to return
--from <num> # Start from line number
qmd get <file>[:from[:count]] # Get document; optional start line and count
-l <num> # Maximum lines to return
--from <num> # Start line (overrides the :from suffix)
--no-line-numbers # Disable line numbering (on by default)
# Multi-get options
-l <num> # Maximum lines per file
--max-bytes <num> # Skip files larger than N bytes (default: 10KB)
```
### Collection Filtering
The `-c`/`--collection` flag filters results by collection **name** (as shown by
`qmd collection list`). Collections are a global registry — you can search any
collection from any directory:
```sh
qmd search "auth" -c notes # single collection
qmd search "auth" -c notes -c docs # multiple collections (OR)
```
With no `-c` flag, all default-included collections are searched. Collections
marked excluded (`qmd collection exclude <name>`) are skipped unless named
explicitly with `-c`.
> **Note:** With multiple `-c` flags, results come from a global top-K pool and are
> then filtered. If one collection dominates the rankings, matches from smaller
> collections may not appear at the default limit — raise `-n` or use `--all`.
### Output Format
Default output is colorized CLI format (respects `NO_COLOR` env).
@ -759,17 +822,48 @@ qmd query --json --explain "quarterly reports"
qmd --index work search "quarterly reports"
```
The `--explain` flag attaches a score breakdown to each result: the FTS/vector
backend scores plus the RRF fusion math (rank, weight, top-rank bonus) and every
sub-query's contribution. Abbreviated:
```json
{
"docid": "#6c90f0",
"score": 0.89,
"file": "qmd://qmd/README.md",
"explain": {
"ftsScores": [0.892, 0.907],
"vectorScores": [0.540, 0.484],
"rrf": {
"rank": 1,
"weight": 0.75,
"baseScore": 0.123,
"topRankBonus": 0.05,
"totalScore": 0.173,
"contributions": [
{ "source": "fts", "queryType": "original", "query": "reranking",
"rank": 1, "weight": 2, "backendScore": 0.892, "rrfContribution": 0.0328 }
]
}
}
}
```
### Index Maintenance
```sh
# Show index status and collections with contexts
qmd status
# Re-index all collections
# Re-index all collections. If a collection has a configured update command
# (e.g. `git pull`), it runs first — set one with `qmd collection update-cmd`.
qmd update
# Re-index with git pull first (for remote repos)
qmd update --pull
# Diagnose the install (runtime, sqlite-vec, embedding fingerprints, GPU probe)
qmd doctor
# Initialize a project-local index in the current directory
qmd init
# Get document by filepath (with fuzzy matching suggestions)
qmd get notes/meeting.md
@ -780,6 +874,13 @@ qmd get "#abc123"
# Get document starting at line 50, max 100 lines
qmd get notes/meeting.md:50 -l 100
# Read 40 lines starting at line 120 via the :from:count suffix (works with docids)
qmd get notes/meeting.md:120:40
qmd get "#abc123:120:40"
# get / multi-get are line-numbered by default; disable with --no-line-numbers
qmd get notes/meeting.md --no-line-numbers
# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"
@ -796,6 +897,75 @@ qmd multi-get "docs/*.md" --json
qmd cleanup
```
### Benchmarking
Measure search quality across all four backends with `qmd bench` and a fixture file
of queries with known-relevant documents.
**From a git checkout**, an example fixture and its test corpus ship in the repo:
```sh
# One-time setup (indexes the repo's test corpus into its own collection)
qmd collection add test/eval-docs --name eval-docs
qmd embed -c eval-docs
# Run the benchmark (table output)
qmd bench src/bench/fixtures/example.json
# JSON output for programmatic analysis
qmd bench src/bench/fixtures/example.json --json
```
> The example fixture (`src/bench/fixtures/example.json`) and its test corpus
> (`test/eval-docs/`) exist only in a git checkout — they are **not** part of the
> published npm package. If you installed via `npm`/`npx`, write your own fixture
> (see below) against a collection you have already indexed:
>
> ```sh
> qmd bench my-fixture.json -c my-collection
> ```
Each query runs against four backends, reporting precision@k, recall, MRR, and F1:
| Backend | What it tests | LLM required |
|---------|---------------|--------------|
| `bm25` | Keyword search only (FTS5) | No |
| `vector` | Semantic similarity only | Embedding model |
| `hybrid` | BM25 + vector fusion (no reranking) | Embedding model |
| `full` | Full pipeline with LLM reranking | All three models |
**Score interpretation:** `1.00` = perfect (all expected docs in top results),
`0.00` = complete miss. The example fixture typically shows bm25 ~0.50, vector
~0.70, and hybrid/full ~1.00 — a concrete demonstration of why hybrid search beats
either backend alone.
**Custom fixtures** are JSON:
```json
{
"description": "My benchmark",
"version": 1,
"collection": "my-collection",
"queries": [
{
"id": "find-auth",
"query": "authentication flow",
"type": "semantic",
"expected_files": ["docs/auth-design.md"],
"expected_in_top_k": 3
}
]
}
```
`expected_files` are collection-relative paths as shown by `qmd ls`. The `type`
field (`exact`, `semantic`, `topical`, `cross-domain`, `alias`) labels queries for
grouping — it does not change search behavior.
> **Heads-up:** if the fixture's collection isn't indexed, bench currently runs to
> completion and reports all zeros with no warning. Verify setup with
> `qmd ls <collection>` first.
## Data Storage
Index stored in: `~/.cache/qmd/index.sqlite`
@ -817,6 +987,9 @@ llm_cache -- Cached LLM responses (query expansion, rerank scores)
| Variable | Default | Description |
|----------|---------|-------------|
| `XDG_CACHE_HOME` | `~/.cache` | Cache directory location |
| `QMD_LLAMA_GPU` | `auto` | Force llama.cpp GPU backend (`metal`, `vulkan`, `cuda`) or disable GPU with `false` |
| `QMD_FORCE_CPU` | unset | Set to `1`/`true` to force CPU mode before any CUDA/Vulkan/Metal probing. Equivalent CLI flag: `--no-gpu`. |
| `QMD_EMBED_PARALLELISM` | automatic | Override embedding/reranking context parallelism (1-8). Windows CUDA defaults to `1` because parallel CUDA contexts can crash with `ggml-cuda.cu:98`; use Vulkan or raise this only if your driver is stable. |
## How It Works

162
bin/qmd
View File

@ -31,3 +31,165 @@ elif [ -x "$HOME/.bun/bin/bun" ]; then
else
exec node "$DIR/dist/cli/qmd.js" "$@"
fi
#!/usr/bin/env node
// 2>/dev/null; if command -v node >/dev/null 2>&1; then exec node "$0" "$@"; else exec bun "$0" "$@"; fi
// Cross-platform launcher for qmd.
//
// Previously this was a POSIX shell script with `#!/bin/sh`, which meant npm
// on Windows generated shims that tried to route through `/bin/sh` — a path
// that doesn't exist on Windows, so `qmd` failed immediately after a global
// install. Rewriting the launcher in Node.js lets npm generate native
// cmd/ps1/sh shims that invoke `node` directly on every platform.
import { spawn, spawnSync } from "node:child_process";
import { existsSync, realpathSync } from "node:fs";
import { dirname, resolve } from "node:path";
import { fileURLToPath } from "node:url";
// Resolve symlinks so global installs (npm link / npm install -g) can find
// the actual package directory instead of the global bin directory.
const self = realpathSync(fileURLToPath(import.meta.url));
const pkgDir = resolve(dirname(self), "..");
const jsEntry = resolve(pkgDir, "dist/cli/qmd.js");
const tsEntry = resolve(pkgDir, "src/cli/qmd.ts");
// MCP stdio reserves stdout exclusively for JSON-RPC frames. node-llama-cpp
// / llama.cpp / ggml can write native logs directly to stdout before JS-level
// log handlers are attached, so seed the native quiet env before Node/Bun imports
// the CLI and its LLM modules. Preserve explicit user values when provided.
if (process.argv[2] === "mcp") {
process.env.LLAMA_LOG_LEVEL = process.env.LLAMA_LOG_LEVEL || "error";
process.env.GGML_LOG_LEVEL = process.env.GGML_LOG_LEVEL || "error";
process.env.GGML_BACKEND_SILENT = process.env.GGML_BACKEND_SILENT || "1";
}
// libggml-metal on macOS uses "residency sets" to keep allocated model memory
// resident across inference requests (180-second keep_alive timer). The
// process-static device destructor that runs during libc exit() asserts the
// residency set is empty (ggml-org/llama.cpp#22593); the keep_alive hasn't
// expired by exit, so the assertion fails and ggml_abort dumps a multi-kB
// stack trace to stderr even when the user-visible results were already
// emitted correctly. No JS-side dispose can prevent it because the static
// destructor runs in __cxa_finalize_ranges, after every JS-reachable cleanup.
//
// For QMD's short-lived CLI workflow, residency sets provide no observable
// performance benefit (subsequent requests don't reuse the warm mapping —
// measured: identical wall time with and without on M3 Pro), so disable them
// by default on darwin. The env var must be set BEFORE the native llama.cpp
// binding loads, which is why it lives here in the launcher rather than in
// the JS entry point. Opt back in with QMD_METAL_KEEP_RESIDENCY=1 if you
// run long-lived qmd processes (the MCP daemon may benefit on hot reload)
// or are triaging an upstream Metal teardown fix.
if (process.platform === "darwin" && process.env.QMD_METAL_KEEP_RESIDENCY !== "1") {
process.env.GGML_METAL_NO_RESIDENCY = process.env.GGML_METAL_NO_RESIDENCY || "1";
}
function hasBun() {
try {
const res = spawnSync("bun", ["--version"], { stdio: "ignore", shell: process.platform === "win32" });
return res.status === 0;
} catch {
return false;
}
}
// In published packages, bin/qmd must run dist/. In a git checkout, however,
// dist/ is often ignored and can be stale after git reset or branch switches.
// Prefer source mode only for checkouts so ./bin/qmd reflects the checked-out
// source without changing packaged/runtime behavior.
//
// Critical: source-mode detection must NOT trigger when a package manager
// installed us. `pnpm install -g .` (and `npm install -g .`) copy the entire
// working tree — including .git/, bun.lock, package-lock.json, src/, and even
// node_modules/ — into <prefix>/node_modules/@tobilu/qmd/, so .git and a
// lockfile being present is not a reliable "this is a working tree" signal.
// What IS reliable: a package-manager install always lands the package
// directory inside a `node_modules/` segment; a bare working-tree checkout
// (with `bun link` or a direct path invocation) does not. Gate source mode
// on that. Allow QMD_SOURCE_MODE=1 / =0 as an explicit override for the
// rare case where the heuristic disagrees with the user.
const sourceOverride = process.env.QMD_SOURCE_MODE;
const looksInstalled = pkgDir.split("/").includes("node_modules");
const sourceAllowed = sourceOverride === "1"
|| (sourceOverride !== "0" && !looksInstalled);
let useSourceMode = false;
let sourceRunner = null;
let sourceArgs = [];
if (sourceAllowed && existsSync(resolve(pkgDir, ".git")) && existsSync(tsEntry)) {
// Lockfile-driven runner selection — mirror the dist-mode logic below so
// source mode picks the same runtime the user's deps were installed for.
// package-lock.json wins over bun.lock when both are present: pnpm/npm
// installs ship the Node-ABI native modules (better-sqlite3, sqlite-vec),
// and running Bun against them produces ABI mismatches. This also fixes
// pnpm-global installs, which copy the whole working tree — including .git
// and bun.lock — into the install dir and used to route through Bun even
// when the user installed via npm/pnpm.
const hasNpmLock = existsSync(resolve(pkgDir, "package-lock.json"));
const hasBunLock = existsSync(resolve(pkgDir, "bun.lock")) || existsSync(resolve(pkgDir, "bun.lockb"));
const tsxEntry = resolve(pkgDir, "node_modules/tsx/dist/cli.mjs");
const tsxAvailable = existsSync(tsxEntry);
if (hasNpmLock && tsxAvailable) {
useSourceMode = true;
sourceRunner = "node";
sourceArgs = [tsxEntry, tsEntry, ...process.argv.slice(2)];
} else if (hasBunLock && hasBun()) {
useSourceMode = true;
sourceRunner = "bun";
sourceArgs = [tsEntry, ...process.argv.slice(2)];
} else if (tsxAvailable) {
useSourceMode = true;
sourceRunner = "node";
sourceArgs = [tsxEntry, tsEntry, ...process.argv.slice(2)];
}
}
if (!useSourceMode && !existsSync(jsEntry)) {
console.error(`qmd is not built: missing ${jsEntry}`);
console.error("Run: bun install && bun run build");
console.error("Or: npm install && npm run build");
console.error("After building, run: qmd doctor");
process.exit(1);
}
// Detect the package manager that installed dependencies by checking lockfiles.
// $BUN_INSTALL is intentionally NOT checked — it only indicates that bun exists
// on the system, not that it was used to install this package (see #361).
//
// package-lock.json takes priority: if it exists, npm installed the native
// modules for Node. The repo ships bun.lock, so without this check, source
// builds that use npm would be incorrectly routed to bun, causing ABI
// mismatches with better-sqlite3 / sqlite-vec (see #381).
let runnerName = "node";
if (existsSync(resolve(pkgDir, "package-lock.json"))) {
runnerName = "node";
} else if (existsSync(resolve(pkgDir, "bun.lock")) || existsSync(resolve(pkgDir, "bun.lockb"))) {
runnerName = "bun";
} else {
runnerName = "node";
}
const runner = useSourceMode ? sourceRunner : (runnerName === "node" ? "node" : "bun");
const args = useSourceMode ? sourceArgs : [jsEntry, ...process.argv.slice(2)];
const needsShell = (runner === "bun") && process.platform === "win32";
const child = spawn(runner, args, {
stdio: "inherit",
shell: needsShell,
});
child.on("exit", (code, signal) => {
if (signal) {
process.kill(process.pid, signal);
} else {
process.exit(code ?? 0);
}
});
child.on("error", (err) => {
const name = useSourceMode ? sourceRunner : runnerName;
console.error(`qmd: failed to launch ${name}: ${err.message}`);
process.exit(1);
});

View File

@ -6,13 +6,17 @@
"name": "2025-12-07-bm25-q",
"dependencies": {
"@modelcontextprotocol/sdk": "1.29.0",
"better-sqlite3": "12.8.0",
"better-sqlite3": "12.10.0",
"fast-glob": "3.3.3",
"node-llama-cpp": "3.18.1",
"picomatch": "4.0.4",
"sqlite-vec": "0.1.9",
"web-tree-sitter": "0.26.7",
"yaml": "2.8.3",
"tree-sitter-go": "0.25.0",
"tree-sitter-python": "0.25.0",
"tree-sitter-rust": "0.24.0",
"tree-sitter-typescript": "0.23.2",
"web-tree-sitter": "0.26.8",
"yaml": "2.9.0",
"zod": "4.2.1",
},
"devDependencies": {
@ -26,10 +30,6 @@
"sqlite-vec-linux-arm64": "0.1.9",
"sqlite-vec-linux-x64": "0.1.9",
"sqlite-vec-windows-x64": "0.1.9",
"tree-sitter-go": "0.23.4",
"tree-sitter-python": "0.23.4",
"tree-sitter-rust": "0.24.0",
"tree-sitter-typescript": "0.23.2",
},
"peerDependencies": {
"typescript": "^5.9.3",
@ -247,7 +247,7 @@
"base64-js": ["base64-js@1.5.1", "", {}, "sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA=="],
"better-sqlite3": ["better-sqlite3@12.8.0", "", { "dependencies": { "bindings": "^1.5.0", "prebuild-install": "^7.1.1" } }, "sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ=="],
"better-sqlite3": ["better-sqlite3@12.10.0", "", { "dependencies": { "bindings": "^1.5.0", "prebuild-install": "^7.1.1" } }, "sha512-CyzaZRQKyHkB2ZInfTTl2nvT33EbDpjkLEbE8/Zck3Ll6O0qqvuGdrJ45HgtH+HykRg88ITY3AdreBGN70aBSQ=="],
"bindings": ["bindings@1.5.0", "", { "dependencies": { "file-uri-to-path": "1.0.0" } }, "sha512-p2q/t/mhvuOj/UeLlV6566GD/guowlr0hHxClI0W9m7MWYkL1F0hLo+0Aexs9HSPCtR1SXQ0TD3MMKrXZajbiQ=="],
@ -401,7 +401,7 @@
"get-proto": ["get-proto@1.0.1", "", { "dependencies": { "dunder-proto": "^1.0.1", "es-object-atoms": "^1.0.0" } }, "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g=="],
"get-tsconfig": ["get-tsconfig@4.13.6", "", { "dependencies": { "resolve-pkg-maps": "^1.0.0" } }, "sha512-shZT/QMiSHc/YBLxxOkMtgSid5HFoauqCE3/exfsEcwg1WkeqjG+V40yBbBrsD+jW2HDXcs28xOfcbm2jI8Ddw=="],
"get-tsconfig": ["get-tsconfig@4.14.0", "", { "dependencies": { "resolve-pkg-maps": "^1.0.0" } }, "sha512-yTb+8DXzDREzgvYmh6s9vHsSVCHeC0G3PI5bEXNBHtmshPnO+S5O7qgLEOn0I5QvMy6kpZN8K1NKGyilLb93wA=="],
"github-from-package": ["github-from-package@0.0.0", "", {}, "sha512-SyHy3T1v2NUXn29OsWdxmK6RwHD+vkj3v8en8AOBZ1wBQ/hCAQ5bAQTD02kW4W9tUp/3Qh6J8r9EvntiyCmOOw=="],
@ -509,7 +509,7 @@
"node-abi": ["node-abi@3.87.0", "", { "dependencies": { "semver": "^7.3.5" } }, "sha512-+CGM1L1CgmtheLcBuleyYOn7NWPVu0s0EJH2C4puxgEZb9h8QpR9G2dBfZJOAUhi7VQxuBPMd0hiISWcTyiYyQ=="],
"node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
"node-addon-api": ["node-addon-api@8.7.0", "", {}, "sha512-9MdFxmkKaOYVTV+XVRG8ArDwwQ77XIgIPyKASB1k3JPq3M8fGQQQE3YpMOrKm6g//Ktx8ivZr8xo1Qmtqub+GA=="],
"node-api-headers": ["node-api-headers@1.8.0", "", {}, "sha512-jfnmiKWjRAGbdD1yQS28bknFM1tbHC1oucyuMPjmkEs+kpiu76aRs40WlTmBmyEgzDM76ge1DQ7XJ3R5deiVjQ=="],
@ -687,11 +687,11 @@
"toidentifier": ["toidentifier@1.0.1", "", {}, "sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA=="],
"tree-sitter-go": ["tree-sitter-go@0.23.4", "", { "dependencies": { "node-addon-api": "^8.2.1", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-iQaHEs4yMa/hMo/ZCGqLfG61F0miinULU1fFh+GZreCRtKylFLtvn798ocCZjO2r/ungNZgAY1s1hPFyAwkc7w=="],
"tree-sitter-go": ["tree-sitter-go@0.25.0", "", { "dependencies": { "node-addon-api": "^8.3.1", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.25.0" }, "optionalPeers": ["tree-sitter"] }, "sha512-APBc/Dq3xz/e35Xpkhb1blu5UgW+2E3RyGWawZSCNcbGwa7jhSQPS8KsUupuzBla8PCo8+lz9W/JDJjmfRa2tw=="],
"tree-sitter-javascript": ["tree-sitter-javascript@0.23.1", "", { "dependencies": { "node-addon-api": "^8.2.2", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-/bnhbrTD9frUYHQTiYnPcxyHORIw157ERBa6dqzaKxvR/x3PC4Yzd+D1pZIMS6zNg2v3a8BZ0oK7jHqsQo9fWA=="],
"tree-sitter-python": ["tree-sitter-python@0.23.4", "", { "dependencies": { "node-addon-api": "^8.2.1", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-MbmUAl7y5UCUWqHscHke7DdRDwQnVNMNKQYQc4Gq2p09j+fgPxaU8JVsuOI/0HD3BSEEe5k9j3xmdtIWbDtDgw=="],
"tree-sitter-python": ["tree-sitter-python@0.25.0", "", { "dependencies": { "node-addon-api": "^8.5.0", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.25.0" }, "optionalPeers": ["tree-sitter"] }, "sha512-eCmJx6zQa35GxaCtQD+wXHOhYqBxEL+bp71W/s3fcDMu06MrtzkVXR437dRrCrbrDbyLuUDJpAgycs7ncngLXw=="],
"tree-sitter-rust": ["tree-sitter-rust@0.24.0", "", { "dependencies": { "node-addon-api": "^8.2.2", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.22.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-NWemUDf629Tfc90Y0Z55zuwPCAHkLxWnMf2RznYu4iBkkrQl2o/CHGB7Cr52TyN5F1DAx8FmUnDtCy9iUkXZEQ=="],
@ -725,7 +725,7 @@
"vitest": ["vitest@3.2.4", "", { "dependencies": { "@types/chai": "^5.2.2", "@vitest/expect": "3.2.4", "@vitest/mocker": "3.2.4", "@vitest/pretty-format": "^3.2.4", "@vitest/runner": "3.2.4", "@vitest/snapshot": "3.2.4", "@vitest/spy": "3.2.4", "@vitest/utils": "3.2.4", "chai": "^5.2.0", "debug": "^4.4.1", "expect-type": "^1.2.1", "magic-string": "^0.30.17", "pathe": "^2.0.3", "picomatch": "^4.0.2", "std-env": "^3.9.0", "tinybench": "^2.9.0", "tinyexec": "^0.3.2", "tinyglobby": "^0.2.14", "tinypool": "^1.1.1", "tinyrainbow": "^2.0.0", "vite": "^5.0.0 || ^6.0.0 || ^7.0.0-0", "vite-node": "3.2.4", "why-is-node-running": "^2.3.0" }, "peerDependencies": { "@edge-runtime/vm": "*", "@types/debug": "^4.1.12", "@types/node": "^18.0.0 || ^20.0.0 || >=22.0.0", "@vitest/browser": "3.2.4", "@vitest/ui": "3.2.4", "happy-dom": "*", "jsdom": "*" }, "optionalPeers": ["@edge-runtime/vm", "@types/debug", "@types/node", "@vitest/browser", "@vitest/ui", "happy-dom", "jsdom"], "bin": { "vitest": "vitest.mjs" } }, "sha512-LUCP5ev3GURDysTWiP47wRRUpLKMOfPh+yKTx3kVIEiu5KOMeqzpnYNsKyOoVrULivR8tLcks4+lga33Whn90A=="],
"web-tree-sitter": ["web-tree-sitter@0.26.7", "", {}, "sha512-KiZhelTvBA/ziUHEO7Emb75cGVAq8iGZNabYaZm53Zpy50NsXyOW+xSHlwHt5CVg/TRPZBfeVLTTobF0LjFJ1w=="],
"web-tree-sitter": ["web-tree-sitter@0.26.8", "", {}, "sha512-4sUwi7ZyOrIk5KLgYLkc2A/F0LFMQnBhfb+2Cdl7ik4ePJ6JD+fk4ofI2sA5eGawBKBaK4Vntt7Ww5KcEsay4A=="],
"which": ["which@6.0.1", "", { "dependencies": { "isexe": "^4.0.0" }, "bin": { "node-which": "bin/which.js" } }, "sha512-oGLe46MIrCRqX7ytPUf66EAYvdeMIZYn3WaocqqKZAxrBpkqHfL/qvTyJ/bTk5+AqHCjXmrv3CEWgy368zhRUg=="],
@ -739,7 +739,7 @@
"yallist": ["yallist@5.0.0", "", {}, "sha512-YgvUTfwqyc7UXVMrB+SImsVYSmTS8X/tSrtdNZMImM+n7+QTriRXyXim0mBrTXNeqzVF0KWGgHPeiyViFFrNDw=="],
"yaml": ["yaml@2.8.3", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-AvbaCLOO2Otw/lW5bmh9d/WEdcDFdQp2Z2ZUH3pX9U2ihyUY0nvLv7J6TrWowklRGPYbB/IuIMfYgxaCPg5Bpg=="],
"yaml": ["yaml@2.9.0", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA=="],
"yargs": ["yargs@17.7.2", "", { "dependencies": { "cliui": "^8.0.1", "escalade": "^3.1.1", "get-caller-file": "^2.0.5", "require-directory": "^2.1.1", "string-width": "^4.2.3", "y18n": "^5.0.5", "yargs-parser": "^21.1.1" } }, "sha512-7dSzzRQ++CKnNI/krKnYRV7JKKPUXMEh61soaHKg9mrWEhzFWhFnxPxGl+69cD1Ou63C13NUPCnmIcrvqCuM6w=="],
@ -773,8 +773,6 @@
"micromatch/picomatch": ["picomatch@2.3.1", "", {}, "sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA=="],
"node-llama-cpp/node-addon-api": ["node-addon-api@8.7.0", "", {}, "sha512-9MdFxmkKaOYVTV+XVRG8ArDwwQ77XIgIPyKASB1k3JPq3M8fGQQQE3YpMOrKm6g//Ktx8ivZr8xo1Qmtqub+GA=="],
"ora/cli-spinners": ["cli-spinners@3.4.0", "", {}, "sha512-bXfOC4QcT1tKXGorxL3wbJm6XJPDqEnij2gQ2m7ESQuE+/z9YFIWnl/5RpTiKWbMq3EVKR4fRLJGn6DVfu0mpw=="],
"postcss/nanoid": ["nanoid@3.3.11", "", { "bin": { "nanoid": "bin/nanoid.cjs" } }, "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w=="],
@ -793,9 +791,13 @@
"tinyglobby/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"vite/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"tree-sitter-javascript/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
"vitest/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"tree-sitter-rust/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
"tree-sitter-typescript/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
"vite/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"wrap-ansi/ansi-styles": ["ansi-styles@4.3.0", "", { "dependencies": { "color-convert": "^2.0.1" } }, "sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg=="],

View File

@ -127,26 +127,41 @@ Without intent, "performance" is ambiguous (web-perf? team health? fitness?). Wi
- Empty lines are ignored
- Leading/trailing whitespace is trimmed
## MCP/HTTP API
## Scoping
The `query` tool accepts a query document:
Restrict queries to specific collections with `-c` (CLI) or `collections` (MCP/SDK):
```json
{
"q": "lex: CAP theorem\nvec: consistency vs availability",
"collections": ["docs"],
"limit": 10
}
```bash
# CLI — by collection name (see `qmd collection list`)
qmd query -c docs "how does auth work"
qmd query -c docs -c notes $'lex: auth\nvec: authentication flow'
```
Or structured format:
For MCP / HTTP, pass a plural `collections` array (OR match):
```json
{ "searches": [ { "type": "lex", "query": "auth" } ], "collections": ["docs", "notes"] }
```
`-c`/`collections` matches by collection name and works from any directory.
Multiple values are OR-combined. Without scoping, all default-included collections
are searched; collections marked excluded (`qmd collection exclude <name>`) are
skipped unless explicitly named. In MCP the parameter is the plural `collections`
array — a singular `collection` is silently ignored.
## MCP/HTTP API
The `query` tool (and the REST `/query` endpoint) accept a structured query with a
`searches` array. There is no `q` string parameter — `searches` is required:
```json
{
"searches": [
{ "type": "lex", "query": "CAP theorem" },
{ "type": "vec", "query": "consistency vs availability" }
]
],
"collections": ["docs"],
"limit": 10
}
```

View File

@ -44,8 +44,8 @@
});
nodeModulesHashes = {
x86_64-linux = "sha256-D0ezO4vqq4iswcAMU2DCql9ZAQvh3me6N9aDB5roq4w=";
aarch64-darwin = "sha256-qU+9KdR/nTocelyANS09I/4yaQ+7s1LvJNqB27IOK/c=";
x86_64-linux = "sha256-sVXoNWIcx1RYRtRWB4F2j7x8/cabFBKq+plFhPU7tBc=";
aarch64-darwin = "sha256-gDyJ5boyH44SeXlKo+W4G36GSUejyXP5PFvW+dFS1Mk=";
# Populate these on first build for additional hosts if/when needed.
aarch64-linux = pkgs.lib.fakeHash;

View File

@ -1,6 +1,6 @@
{
"name": "@tobilu/qmd",
"version": "2.1.0",
"version": "2.5.3",
"description": "Query Markup Documents - On-device hybrid search for markdown files with BM25, vector search, and LLM reranking",
"type": "module",
"main": "dist/index.js",
@ -17,13 +17,23 @@
"files": [
"bin/",
"dist/",
"skills/",
"scripts/build.mjs",
"scripts/check-package-grammars.mjs",
"scripts/package-smoke.mjs",
"scripts/test-all.mjs",
"LICENSE",
"CHANGELOG.md"
],
"scripts": {
"prepare": "[ -d .git ] && ./scripts/install-hooks.sh || true",
"build": "tsc -p tsconfig.build.json && printf '#!/usr/bin/env node\n' | cat - dist/cli/qmd.js > dist/cli/qmd.tmp && mv dist/cli/qmd.tmp dist/cli/qmd.js && chmod +x dist/cli/qmd.js",
"test": "vitest run --reporter=verbose test/",
"build": "node scripts/build.mjs",
"test": "node scripts/test-all.mjs",
"test:types": "node ./node_modules/typescript/bin/tsc -p tsconfig.build.json --noEmit",
"test:node": "node ./node_modules/vitest/vitest.mjs run --reporter=verbose --testTimeout 60000",
"test:bun": "bun test --timeout 60000 --preload ./src/test-preload.ts",
"test:unit": "CI=true node ./node_modules/vitest/vitest.mjs run --reporter=verbose --testTimeout 60000 test/ && CI=true bun test --timeout 60000 --preload ./src/test-preload.ts test/",
"test:package": "node scripts/package-smoke.mjs",
"qmd": "tsx src/cli/qmd.ts",
"index": "tsx src/cli/qmd.ts index",
"vector": "tsx src/cli/qmd.ts vector",
@ -31,7 +41,8 @@
"vsearch": "tsx src/cli/qmd.ts vsearch",
"rerank": "tsx src/cli/qmd.ts rerank",
"inspector": "npx @modelcontextprotocol/inspector tsx src/cli/qmd.ts mcp",
"release": "./scripts/release.sh"
"release": "./scripts/release.sh",
"smoke:package-grammars": "node scripts/check-package-grammars.mjs"
},
"publishConfig": {
"access": "public"
@ -46,13 +57,17 @@
},
"dependencies": {
"@modelcontextprotocol/sdk": "1.29.0",
"better-sqlite3": "12.8.0",
"better-sqlite3": "12.10.0",
"fast-glob": "3.3.3",
"node-llama-cpp": "3.18.1",
"picomatch": "4.0.4",
"sqlite-vec": "0.1.9",
"web-tree-sitter": "0.26.7",
"yaml": "2.8.3",
"tree-sitter-go": "0.25.0",
"tree-sitter-python": "0.25.0",
"tree-sitter-rust": "0.24.0",
"tree-sitter-typescript": "0.23.2",
"web-tree-sitter": "0.26.8",
"yaml": "2.9.0",
"zod": "4.2.1"
},
"optionalDependencies": {
@ -60,11 +75,7 @@
"sqlite-vec-darwin-x64": "0.1.9",
"sqlite-vec-linux-arm64": "0.1.9",
"sqlite-vec-linux-x64": "0.1.9",
"sqlite-vec-windows-x64": "0.1.9",
"tree-sitter-go": "0.23.4",
"tree-sitter-python": "0.23.4",
"tree-sitter-rust": "0.24.0",
"tree-sitter-typescript": "0.23.2"
"sqlite-vec-windows-x64": "0.1.9"
},
"devDependencies": {
"@types/better-sqlite3": "7.6.13",

133
pnpm-lock.yaml generated
View File

@ -12,8 +12,8 @@ importers:
specifier: 1.29.0
version: 1.29.0(zod@4.2.1)
better-sqlite3:
specifier: 12.8.0
version: 12.8.0
specifier: 12.10.0
version: 12.10.0
fast-glob:
specifier: 3.3.3
version: 3.3.3
@ -26,15 +26,27 @@ importers:
sqlite-vec:
specifier: 0.1.9
version: 0.1.9
tree-sitter-go:
specifier: 0.25.0
version: 0.25.0
tree-sitter-python:
specifier: 0.25.0
version: 0.25.0
tree-sitter-rust:
specifier: 0.24.0
version: 0.24.0
tree-sitter-typescript:
specifier: 0.23.2
version: 0.23.2
typescript:
specifier: ^5.9.3
version: 5.9.3
web-tree-sitter:
specifier: 0.26.7
version: 0.26.7
specifier: 0.26.8
version: 0.26.8
yaml:
specifier: 2.8.3
version: 2.8.3
specifier: 2.9.0
version: 2.9.0
zod:
specifier: 4.2.1
version: 4.2.1
@ -47,7 +59,7 @@ importers:
version: 4.21.0
vitest:
specifier: 3.2.4
version: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
version: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
optionalDependencies:
sqlite-vec-darwin-arm64:
specifier: 0.1.9
@ -64,18 +76,6 @@ importers:
sqlite-vec-windows-x64:
specifier: 0.1.9
version: 0.1.9
tree-sitter-go:
specifier: 0.23.4
version: 0.23.4
tree-sitter-python:
specifier: 0.23.4
version: 0.23.4
tree-sitter-rust:
specifier: 0.24.0
version: 0.24.0
tree-sitter-typescript:
specifier: 0.23.2
version: 0.23.2
packages:
@ -273,36 +273,42 @@ packages:
engines: {node: '>=20.0.0'}
cpu: [arm64, x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/linux-armv7l@3.18.1':
resolution: {integrity: sha512-BrJL2cGo0pN5xd5nw+CzTn2rFMpz9MJyZZPUY81ptGkF2uIuXT2hdCVh56i9ImQrTwBfq1YcZL/l/Qe/1+HR/Q==}
engines: {node: '>=20.0.0'}
cpu: [arm, x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/linux-x64-cuda-ext@3.18.1':
resolution: {integrity: sha512-VqyKhAVHPCpFzh0f1koCBgpThL+04QOXwv0oDQ8s8YcpfMMOXQlBhTB0plgTh0HrPExoObfTS4ohkrbyGgmztQ==}
engines: {node: '>=20.0.0'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/linux-x64-cuda@3.18.1':
resolution: {integrity: sha512-qOaYP4uwsUoBHQ/7xSOvyJIuXapS57Al+Sudgi00f96ldNZLKe1vuSGptAi5LTM2lIj66PKm6h8PlRWctwsZ2g==}
engines: {node: '>=20.0.0'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/linux-x64-vulkan@3.18.1':
resolution: {integrity: sha512-SIaNTK5pUPhwJD0gmiQfHa8OrRctVMmnqu+slJrz2Mzgg/XrwFndJlS9hvc+jSjTXCouwf7sYeQaaJWvQgBh/A==}
engines: {node: '>=20.0.0'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/linux-x64@3.18.1':
resolution: {integrity: sha512-tRmWcsyvAcqJHQHXHsaOkx6muGbcirA9nRdNgH6n7bjGUw4VuoBD3dChyNF3/Ktt7ohB9kz+XhhyZjbDHpXyMA==}
engines: {node: '>=20.0.0'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@node-llama-cpp/mac-arm64-metal@3.18.1':
resolution: {integrity: sha512-cyZTdsUMlvuRlGmkkoBbN3v/DT6NuruEqoQYd9CqIrPyLa1xLNBTSKIZ9SgRnw23iCOj4URfITvRP+2pu63LuQ==}
@ -375,24 +381,28 @@ packages:
engines: {node: '>= 10'}
cpu: [arm64]
os: [linux]
libc: [glibc]
'@reflink/reflink-linux-arm64-musl@0.1.19':
resolution: {integrity: sha512-37iO/Dp6m5DDaC2sf3zPtx/hl9FV3Xze4xoYidrxxS9bgP3S8ALroxRK6xBG/1TtfXKTvolvp+IjrUU6ujIGmA==}
engines: {node: '>= 10'}
cpu: [arm64]
os: [linux]
libc: [musl]
'@reflink/reflink-linux-x64-gnu@0.1.19':
resolution: {integrity: sha512-jbI8jvuYCaA3MVUdu8vLoLAFqC+iNMpiSuLbxlAgg7x3K5bsS8nOpTRnkLF7vISJ+rVR8W+7ThXlXlUQ93ulkw==}
engines: {node: '>= 10'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@reflink/reflink-linux-x64-musl@0.1.19':
resolution: {integrity: sha512-e9FBWDe+lv7QKAwtKOt6A2W/fyy/aEEfr0g6j/hWzvQcrzHCsz07BNQYlNOjTfeytrtLU7k449H1PI95jA4OjQ==}
engines: {node: '>= 10'}
cpu: [x64]
os: [linux]
libc: [musl]
'@reflink/reflink-win32-arm64-msvc@0.1.19':
resolution: {integrity: sha512-09PxnVIQcd+UOn4WAW73WU6PXL7DwGS6wPlkMhMg2zlHHG65F3vHepOw06HFCq+N42qkaNAc8AKIabWvtk6cIQ==}
@ -444,66 +454,79 @@ packages:
resolution: {integrity: sha512-L+34Qqil+v5uC0zEubW7uByo78WOCIrBvci69E7sFASRl0X7b/MB6Cqd1lky/CtcSVTydWa2WZwFuWexjS5o6g==}
cpu: [arm]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-arm-musleabihf@4.60.1':
resolution: {integrity: sha512-n83O8rt4v34hgFzlkb1ycniJh7IR5RCIqt6mz1VRJD6pmhRi0CXdmfnLu9dIUS6buzh60IvACM842Ffb3xd6Gg==}
cpu: [arm]
os: [linux]
libc: [musl]
'@rollup/rollup-linux-arm64-gnu@4.60.1':
resolution: {integrity: sha512-Nql7sTeAzhTAja3QXeAI48+/+GjBJ+QmAH13snn0AJSNL50JsDqotyudHyMbO2RbJkskbMbFJfIJKWA6R1LCJQ==}
cpu: [arm64]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-arm64-musl@4.60.1':
resolution: {integrity: sha512-+pUymDhd0ys9GcKZPPWlFiZ67sTWV5UU6zOJat02M1+PiuSGDziyRuI/pPue3hoUwm2uGfxdL+trT6Z9rxnlMA==}
cpu: [arm64]
os: [linux]
libc: [musl]
'@rollup/rollup-linux-loong64-gnu@4.60.1':
resolution: {integrity: sha512-VSvgvQeIcsEvY4bKDHEDWcpW4Yw7BtlKG1GUT4FzBUlEKQK0rWHYBqQt6Fm2taXS+1bXvJT6kICu5ZwqKCnvlQ==}
cpu: [loong64]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-loong64-musl@4.60.1':
resolution: {integrity: sha512-4LqhUomJqwe641gsPp6xLfhqWMbQV04KtPp7/dIp0nzPxAkNY1AbwL5W0MQpcalLYk07vaW9Kp1PBhdpZYYcEw==}
cpu: [loong64]
os: [linux]
libc: [musl]
'@rollup/rollup-linux-ppc64-gnu@4.60.1':
resolution: {integrity: sha512-tLQQ9aPvkBxOc/EUT6j3pyeMD6Hb8QF2BTBnCQWP/uu1lhc9AIrIjKnLYMEroIz/JvtGYgI9dF3AxHZNaEH0rw==}
cpu: [ppc64]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-ppc64-musl@4.60.1':
resolution: {integrity: sha512-RMxFhJwc9fSXP6PqmAz4cbv3kAyvD1etJFjTx4ONqFP9DkTkXsAMU4v3Vyc5BgzC+anz7nS/9tp4obsKfqkDHg==}
cpu: [ppc64]
os: [linux]
libc: [musl]
'@rollup/rollup-linux-riscv64-gnu@4.60.1':
resolution: {integrity: sha512-QKgFl+Yc1eEk6MmOBfRHYF6lTxiiiV3/z/BRrbSiW2I7AFTXoBFvdMEyglohPj//2mZS4hDOqeB0H1ACh3sBbg==}
cpu: [riscv64]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-riscv64-musl@4.60.1':
resolution: {integrity: sha512-RAjXjP/8c6ZtzatZcA1RaQr6O1TRhzC+adn8YZDnChliZHviqIjmvFwHcxi4JKPSDAt6Uhf/7vqcBzQJy0PDJg==}
cpu: [riscv64]
os: [linux]
libc: [musl]
'@rollup/rollup-linux-s390x-gnu@4.60.1':
resolution: {integrity: sha512-wcuocpaOlaL1COBYiA89O6yfjlp3RwKDeTIA0hM7OpmhR1Bjo9j31G1uQVpDlTvwxGn2nQs65fBFL5UFd76FcQ==}
cpu: [s390x]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-x64-gnu@4.60.1':
resolution: {integrity: sha512-77PpsFQUCOiZR9+LQEFg9GClyfkNXj1MP6wRnzYs0EeWbPcHs02AXu4xuUbM1zhwn3wqaizle3AEYg5aeoohhg==}
cpu: [x64]
os: [linux]
libc: [glibc]
'@rollup/rollup-linux-x64-musl@4.60.1':
resolution: {integrity: sha512-5cIATbk5vynAjqqmyBjlciMJl1+R/CwX9oLk/EyiFXDWd95KpHdrOJT//rnUl4cUcskrd0jCCw3wpZnhIHdD9w==}
cpu: [x64]
os: [linux]
libc: [musl]
'@rollup/rollup-openbsd-x64@4.60.1':
resolution: {integrity: sha512-cl0w09WsCi17mcmWqqglez9Gk8isgeWvoUZ3WiJFYSR3zjBQc2J5/ihSjpl+VLjPqjQ/1hJRcqBfLjssREQILw==}
@ -628,9 +651,9 @@ packages:
base64-js@1.5.1:
resolution: {integrity: sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA==}
better-sqlite3@12.8.0:
resolution: {integrity: sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ==}
engines: {node: 20.x || 22.x || 23.x || 24.x || 25.x}
better-sqlite3@12.10.0:
resolution: {integrity: sha512-CyzaZRQKyHkB2ZInfTTl2nvT33EbDpjkLEbE8/Zck3Ll6O0qqvuGdrJ45HgtH+HykRg88ITY3AdreBGN70aBSQ==}
engines: {node: 20.x || 22.x || 23.x || 24.x || 25.x || 26.x}
bindings@1.5.0:
resolution: {integrity: sha512-p2q/t/mhvuOj/UeLlV6566GD/guowlr0hHxClI0W9m7MWYkL1F0hLo+0Aexs9HSPCtR1SXQ0TD3MMKrXZajbiQ==}
@ -943,8 +966,8 @@ packages:
resolution: {integrity: sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==}
engines: {node: '>= 0.4'}
get-tsconfig@4.13.7:
resolution: {integrity: sha512-7tN6rFgBlMgpBML5j8typ92BKFi2sFQvIdpAqLA2beia5avZDrMs0FLZiM5etShWq5irVyGcGMEA1jcDaK7A/Q==}
get-tsconfig@4.14.0:
resolution: {integrity: sha512-yTb+8DXzDREzgvYmh6s9vHsSVCHeC0G3PI5bEXNBHtmshPnO+S5O7qgLEOn0I5QvMy6kpZN8K1NKGyilLb93wA==}
github-from-package@0.0.0:
resolution: {integrity: sha512-SyHy3T1v2NUXn29OsWdxmK6RwHD+vkj3v8en8AOBZ1wBQ/hCAQ5bAQTD02kW4W9tUp/3Qh6J8r9EvntiyCmOOw==}
@ -1536,10 +1559,10 @@ packages:
resolution: {integrity: sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA==}
engines: {node: '>=0.6'}
tree-sitter-go@0.23.4:
resolution: {integrity: sha512-iQaHEs4yMa/hMo/ZCGqLfG61F0miinULU1fFh+GZreCRtKylFLtvn798ocCZjO2r/ungNZgAY1s1hPFyAwkc7w==}
tree-sitter-go@0.25.0:
resolution: {integrity: sha512-APBc/Dq3xz/e35Xpkhb1blu5UgW+2E3RyGWawZSCNcbGwa7jhSQPS8KsUupuzBla8PCo8+lz9W/JDJjmfRa2tw==}
peerDependencies:
tree-sitter: ^0.21.1
tree-sitter: ^0.25.0
peerDependenciesMeta:
tree-sitter:
optional: true
@ -1552,10 +1575,10 @@ packages:
tree-sitter:
optional: true
tree-sitter-python@0.23.4:
resolution: {integrity: sha512-MbmUAl7y5UCUWqHscHke7DdRDwQnVNMNKQYQc4Gq2p09j+fgPxaU8JVsuOI/0HD3BSEEe5k9j3xmdtIWbDtDgw==}
tree-sitter-python@0.25.0:
resolution: {integrity: sha512-eCmJx6zQa35GxaCtQD+wXHOhYqBxEL+bp71W/s3fcDMu06MrtzkVXR437dRrCrbrDbyLuUDJpAgycs7ncngLXw==}
peerDependencies:
tree-sitter: ^0.21.1
tree-sitter: ^0.25.0
peerDependenciesMeta:
tree-sitter:
optional: true
@ -1691,8 +1714,8 @@ packages:
jsdom:
optional: true
web-tree-sitter@0.26.7:
resolution: {integrity: sha512-KiZhelTvBA/ziUHEO7Emb75cGVAq8iGZNabYaZm53Zpy50NsXyOW+xSHlwHt5CVg/TRPZBfeVLTTobF0LjFJ1w==}
web-tree-sitter@0.26.8:
resolution: {integrity: sha512-4sUwi7ZyOrIk5KLgYLkc2A/F0LFMQnBhfb+2Cdl7ik4ePJ6JD+fk4ofI2sA5eGawBKBaK4Vntt7Ww5KcEsay4A==}
which@2.0.2:
resolution: {integrity: sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA==}
@ -1724,8 +1747,8 @@ packages:
resolution: {integrity: sha512-YgvUTfwqyc7UXVMrB+SImsVYSmTS8X/tSrtdNZMImM+n7+QTriRXyXim0mBrTXNeqzVF0KWGgHPeiyViFFrNDw==}
engines: {node: '>=18'}
yaml@2.8.3:
resolution: {integrity: sha512-AvbaCLOO2Otw/lW5bmh9d/WEdcDFdQp2Z2ZUH3pX9U2ihyUY0nvLv7J6TrWowklRGPYbB/IuIMfYgxaCPg5Bpg==}
yaml@2.9.0:
resolution: {integrity: sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA==}
engines: {node: '>= 14.6'}
hasBin: true
@ -2060,13 +2083,13 @@ snapshots:
chai: 5.3.3
tinyrainbow: 2.0.0
'@vitest/mocker@3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3))':
'@vitest/mocker@3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0))':
dependencies:
'@vitest/spy': 3.2.4
estree-walker: 3.0.3
magic-string: 0.30.21
optionalDependencies:
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
'@vitest/pretty-format@3.2.4':
dependencies:
@ -2130,7 +2153,7 @@ snapshots:
base64-js@1.5.1: {}
better-sqlite3@12.8.0:
better-sqlite3@12.10.0:
dependencies:
bindings: 1.5.0
prebuild-install: 7.1.3
@ -2474,7 +2497,7 @@ snapshots:
dunder-proto: 1.0.1
es-object-atoms: 1.1.1
get-tsconfig@4.13.7:
get-tsconfig@4.14.0:
dependencies:
resolve-pkg-maps: 1.0.0
@ -2654,8 +2677,7 @@ snapshots:
node-api-headers@1.8.0: {}
node-gyp-build@4.8.4:
optional: true
node-gyp-build@4.8.4: {}
node-llama-cpp@3.18.1(typescript@5.9.3):
dependencies:
@ -3113,41 +3135,36 @@ snapshots:
toidentifier@1.0.1: {}
tree-sitter-go@0.23.4:
tree-sitter-go@0.25.0:
dependencies:
node-addon-api: 8.7.0
node-gyp-build: 4.8.4
optional: true
tree-sitter-javascript@0.23.1:
dependencies:
node-addon-api: 8.7.0
node-gyp-build: 4.8.4
optional: true
tree-sitter-python@0.23.4:
tree-sitter-python@0.25.0:
dependencies:
node-addon-api: 8.7.0
node-gyp-build: 4.8.4
optional: true
tree-sitter-rust@0.24.0:
dependencies:
node-addon-api: 8.7.0
node-gyp-build: 4.8.4
optional: true
tree-sitter-typescript@0.23.2:
dependencies:
node-addon-api: 8.7.0
node-gyp-build: 4.8.4
tree-sitter-javascript: 0.23.1
optional: true
tsx@4.21.0:
dependencies:
esbuild: 0.27.7
get-tsconfig: 4.13.7
get-tsconfig: 4.14.0
optionalDependencies:
fsevents: 2.3.3
@ -3177,13 +3194,13 @@ snapshots:
vary@1.1.2: {}
vite-node@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
vite-node@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
dependencies:
cac: 6.7.14
debug: 4.4.3
es-module-lexer: 1.7.0
pathe: 2.0.3
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
transitivePeerDependencies:
- '@types/node'
- jiti
@ -3198,7 +3215,7 @@ snapshots:
- tsx
- yaml
vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
dependencies:
esbuild: 0.27.7
fdir: 6.5.0(picomatch@4.0.4)
@ -3210,13 +3227,13 @@ snapshots:
'@types/node': 25.5.2
fsevents: 2.3.3
tsx: 4.21.0
yaml: 2.8.3
yaml: 2.9.0
vitest@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
vitest@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
dependencies:
'@types/chai': 5.2.3
'@vitest/expect': 3.2.4
'@vitest/mocker': 3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3))
'@vitest/mocker': 3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0))
'@vitest/pretty-format': 3.2.4
'@vitest/runner': 3.2.4
'@vitest/snapshot': 3.2.4
@ -3234,8 +3251,8 @@ snapshots:
tinyglobby: 0.2.15
tinypool: 1.1.1
tinyrainbow: 2.0.0
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
vite-node: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
vite-node: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
why-is-node-running: 2.3.0
optionalDependencies:
'@types/node': 25.5.2
@ -3253,7 +3270,7 @@ snapshots:
- tsx
- yaml
web-tree-sitter@0.26.7: {}
web-tree-sitter@0.26.8: {}
which@2.0.2:
dependencies:
@ -3280,7 +3297,7 @@ snapshots:
yallist@5.0.0: {}
yaml@2.8.3: {}
yaml@2.9.0: {}
yargs-parser@21.1.1: {}

29
scripts/build.mjs Normal file
View File

@ -0,0 +1,29 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { chmodSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
const root = join(fileURLToPath(new URL("..", import.meta.url)));
function run(command, args, options = {}) {
const result = spawnSync(command, args, {
cwd: root,
stdio: "inherit",
shell: process.platform === "win32",
...options,
});
if (result.status !== 0) {
process.exit(result.status ?? 1);
}
}
run(process.execPath, [join(root, "node_modules", "typescript", "bin", "tsc"), "-p", "tsconfig.build.json"]);
const cliPath = join(root, "dist", "cli", "qmd.js");
const tmpPath = `${cliPath}.tmp`;
const built = readFileSync(cliPath, "utf8");
const withoutExistingShebang = built.startsWith("#!") ? built.slice(built.indexOf("\n") + 1) : built;
writeFileSync(tmpPath, `#!/usr/bin/env node\n${withoutExistingShebang}`);
renameSync(tmpPath, cliPath);
chmodSync(cliPath, 0o755);

View File

@ -0,0 +1,29 @@
#!/usr/bin/env node
import { createRequire } from "node:module";
const require = createRequire(import.meta.url);
const grammars = [
"tree-sitter-typescript/tree-sitter-typescript.wasm",
"tree-sitter-typescript/tree-sitter-tsx.wasm",
"tree-sitter-python/tree-sitter-python.wasm",
"tree-sitter-go/tree-sitter-go.wasm",
"tree-sitter-rust/tree-sitter-rust.wasm",
];
let ok = true;
for (const grammar of grammars) {
try {
const resolved = require.resolve(grammar);
console.log(`ok ${grammar} -> ${resolved}`);
} catch (err) {
ok = false;
console.error(`missing ${grammar}`);
console.error(err instanceof Error ? err.message : String(err));
}
}
if (!ok) {
console.error("\nAST grammar package smoke check failed. Run `bun install` locally or repair a broken global install with the matching `bun add tree-sitter-...@<version>` command shown by `qmd status`.");
process.exit(1);
}

65
scripts/package-smoke.mjs Normal file
View File

@ -0,0 +1,65 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { existsSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
const root = fileURLToPath(new URL("..", import.meta.url));
const pkg = JSON.parse(readFileSync(join(root, "package.json"), "utf8"));
function run(label, command, args, options = {}) {
console.log(`==> ${label}`);
const { quiet, ...spawnOptions } = options;
const result = spawnSync(command, args, {
cwd: root,
stdio: quiet ? "pipe" : "inherit",
shell: process.platform === "win32",
...spawnOptions,
});
if (result.status !== 0) {
console.error(`Package smoke failed: ${label}`);
if (quiet) {
if (result.stdout) process.stderr.write(result.stdout);
if (result.stderr) process.stderr.write(result.stderr);
}
process.exit(result.status ?? 1);
}
}
function assertPath(path, label = path) {
const full = join(root, path);
if (!existsSync(full)) {
console.error(`Package smoke failed: missing ${label} (${path})`);
process.exit(1);
}
return full;
}
run("build compiled package", process.execPath, ["scripts/build.mjs"]);
run("AST grammar runtime packages", process.execPath, ["scripts/check-package-grammars.mjs"]);
for (const entry of pkg.files ?? []) {
assertPath(entry.replace(/\/$/, ""), `package.json files[] entry ${entry}`);
}
for (const [name, binPath] of Object.entries(pkg.bin ?? {})) {
const full = assertPath(binPath, `bin ${name}`);
const mode = statSync(full).mode;
if ((mode & 0o111) === 0) {
console.error(`Package smoke failed: bin ${name} is not executable (${binPath})`);
process.exit(1);
}
}
assertPath("dist/index.js", "compiled main export");
assertPath("dist/index.d.ts", "compiled type export");
assertPath("dist/cli/qmd.js", "compiled CLI");
run("compiled CLI under Node", process.execPath, ["dist/cli/qmd.js", "--help"], { quiet: true });
run("package wrapper", "sh", ["bin/qmd", "--help"], { quiet: true });
if (process.env.QMD_SKIP_BUN_SMOKE === "1") {
console.log("==> compiled CLI under Bun (skipped by QMD_SKIP_BUN_SMOKE=1)");
} else {
run("compiled CLI under Bun", "bun", ["dist/cli/qmd.js", "--help"], { quiet: true });
}

View File

@ -93,7 +93,7 @@ echo ""
# --- Rename [Unreleased] -> [X.Y.Z] - date, add fresh [Unreleased] ---
sed -i '' "s/^## \[Unreleased\].*/## [$NEW] - $DATE/" CHANGELOG.md
perl -0pi -e 's/^## \[Unreleased\].*/## ['"$NEW"'] - '"$DATE"'/m' CHANGELOG.md
# Insert a new empty [Unreleased] section after the header
awk '

View File

@ -0,0 +1,118 @@
#!/usr/bin/env node
/**
* Minimal reproduction of llama.cpp issue ggml-org/llama.cpp#22593:
*
* ggml-metal-device.m:612: GGML_ASSERT([rsets->data count] == 0) failed
*
* Root cause (per the upstream issue and proposed fix PR #22595):
* `ggml_metal_buffer_rset_free` releases the per-buffer residency set object
* but does NOT call the symmetric `ggml_metal_device_rsets_rm`. So the
* device's `rsets->data` array accumulates dangling references. When the
* process exits and libc fires the process-static `ggml_metal_device`
* destructor in `__cxa_finalize_ranges`, the destructor asserts the
* array is empty and it isn't.
*
* Observed downstream behavior:
* - With EXPLICIT `dispose()` of every JS handle in order, the assertion
* does NOT fire. node-llama-cpp's dispose path tears the Metal buffers
* down before the static dtor runs, so the device's rsets array is
* empty by exit time. (Tested locally clean exit.)
* - With NO dispose (the typical real-world case: synchronous `exit()`,
* `--watch` mode, `process.exit()` after results are written, or any
* code path where GC + finalizers race with libc exit), the rset
* references linger until the static dtor fires, and the assertion
* trips.
*
* What this script does:
* 1. Load node-llama-cpp + a small GGUF model on the Metal backend.
* This allocates at least one Metal buffer calls rsets_add internally.
* 2. Run an inference (creating an embedding context populates buffers
* that the dispose path would normally clean up).
* 3. Skip explicit dispose. Just let the process exit.
*
* Expected behavior on macOS 15+ with Apple Silicon, current llama.cpp
* (bundled in node-llama-cpp 3.18.1, llama.cpp tag b8390):
* - Without GGML_METAL_NO_RESIDENCY:
* Script writes "ok" and main() returns, then ggml_abort fires the
* assertion, prints a multi-kB backtrace, and the process exits with
* SIGABRT (exit code 134).
* - With GGML_METAL_NO_RESIDENCY=1:
* Clean exit code 0. Residency-set code path is skipped entirely.
* - With --dispose flag (manual cleanup):
* Clean exit code 0 even without the env var, as long as JS dispose()
* runs successfully before libc exit.
*
* Usage:
* # Reproduce the crash (no dispose, no env var)
* node scripts/repro-metal-rsets-crash.mjs
*
* # Verify the documented workaround
* GGML_METAL_NO_RESIDENCY=1 node scripts/repro-metal-rsets-crash.mjs
*
* # Verify that explicit dispose also avoids the crash
* node scripts/repro-metal-rsets-crash.mjs --dispose
*
* Refs:
* https://github.com/ggml-org/llama.cpp/issues/22593 (root-cause analysis)
* https://github.com/ggml-org/llama.cpp/pull/22595 (one-line fix, open)
* https://github.com/tobi/qmd/issues/368 (downstream report)
* https://github.com/tobi/qmd/issues/674 (downstream, current)
* https://github.com/tobi/qmd/pull/600 (downstream workaround PR)
*/
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { resolve } from "node:path";
const DEFAULT_MODEL = resolve(
homedir(),
".cache/qmd/models/hf_ggml-org_embeddinggemma-300M-Q8_0.gguf",
);
const args = process.argv.slice(2);
const wantsDispose = args.includes("--dispose");
const modelPath = args.find((a) => !a.startsWith("--")) ?? DEFAULT_MODEL;
if (!existsSync(modelPath)) {
console.error(`Model not found: ${modelPath}`);
console.error("Pass a path to any local GGUF as argv[1], or run `qmd embed` once to populate the default cache path.");
process.exit(2);
}
console.error(
`[repro] GGML_METAL_NO_RESIDENCY=${process.env.GGML_METAL_NO_RESIDENCY ?? "(unset)"}`,
);
console.error(`[repro] dispose=${wantsDispose}`);
console.error(`[repro] loading: ${modelPath}`);
const { getLlama } = await import("node-llama-cpp");
const llama = await getLlama();
const model = await llama.loadModel({ modelPath });
const context = await model.createEmbeddingContext();
console.error(`[repro] backend: ${llama.gpu}`);
// Run actual inference so the buffer-allocation path is hit.
await context.getEmbeddingFor("repro text");
if (wantsDispose) {
console.error("[repro] explicit dispose…");
await context.dispose();
await model.dispose();
await llama.dispose();
}
console.error("[repro] main() returning via process.exit(0)");
console.log("ok");
// CRITICAL: use process.exit(), not `return`. node-llama-cpp registers a
// `process.once('beforeExit', …)` hook that auto-disposes WeakRef'd Llama
// instances when the event loop empties naturally. `process.exit()` skips
// `beforeExit`, so the rsets stay populated until libc's `exit()` fires the
// static dtor — which is when the upstream assertion bug trips.
//
// CLI tools (qmd query, qmd vsearch, qmd embed, etc.) all call process.exit()
// after writing results, which is why every real downstream report crashes
// even though the minimal "let main return" version does not.
process.exit(0);

38
scripts/test-all.mjs Normal file
View File

@ -0,0 +1,38 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
const root = fileURLToPath(new URL("..", import.meta.url));
// Mirror bin/qmd's darwin Metal residency mitigation for test subprocesses.
// libggml-metal asserts on a non-empty residency set during its static
// destructor (ggml-org/llama.cpp#22593, fix open as #22595) and dumps a
// multi-kB backtrace at process exit even when tests pass. The env var must
// be set BEFORE the subprocess starts because libggml-metal reads it via
// libc getenv at module-load time. Opt out with QMD_METAL_KEEP_RESIDENCY=1.
const darwinMetalEnv =
process.platform === "darwin" && process.env.QMD_METAL_KEEP_RESIDENCY !== "1"
? { GGML_METAL_NO_RESIDENCY: "1" }
: {};
function run(label, command, args, options = {}) {
console.log(`==> ${label}`);
const { env: extraEnv, ...spawnOptions } = options;
const result = spawnSync(command, args, {
cwd: root,
stdio: "inherit",
shell: process.platform === "win32",
env: { ...process.env, ...darwinMetalEnv, ...(extraEnv ?? {}) },
...spawnOptions,
});
if (result.status !== 0) {
console.error(`Test task failed: ${label}`);
process.exit(result.status ?? 1);
}
}
run("TypeScript build typecheck", process.execPath, [join(root, "node_modules", "typescript", "bin", "tsc"), "-p", "tsconfig.build.json", "--noEmit"]);
run("Vitest suite under Node", process.execPath, [join(root, "node_modules", "vitest", "vitest.mjs"), "run", "--reporter=verbose", "--testTimeout", "60000", "test/"], { env: { CI: "true" } });
run("Bun test suite", "bun", ["test", "--timeout", "60000", "--preload", "./src/test-preload.ts", "test/"], { env: { CI: "true" } });
run("Package smoke", process.execPath, ["scripts/package-smoke.mjs"]);

View File

@ -1,144 +1,295 @@
---
name: qmd
description: Search markdown knowledge bases, notes, and documentation using QMD. Use when users ask to search notes, find documents, or look up information.
description: Search local markdown knowledge bases, notes, docs, and wikis with QMD. Use when users ask to find notes, retrieve documents, inspect a wiki, answer from indexed markdown, or set up QMD access.
license: MIT
compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
metadata:
author: tobi
version: "2.0.0"
version: "2.2.0"
allowed-tools: Bash(qmd:*), mcp__qmd__*
---
# QMD - Quick Markdown Search
# QMD - Query Markdown Documents
Local search engine for markdown content.
## How search works
## Status
QMD searches local markdown collections: notes, docs, wikis, transcripts, and
project knowledge bases. Use it before web search when the answer may already be
in indexed local files.
!`qmd status 2>/dev/null || echo "Not installed: npm install -g @tobilu/qmd"`
The workflow is always:
## MCP: `query`
1. Search for candidate documents.
2. Retrieve the full source with `qmd get` or `qmd multi-get`.
3. Answer from retrieved text, citing paths or docids.
Do not answer from snippets alone when the user needs facts, decisions, quotes,
or nuance. Snippets are only leads.
Typical loop:
```bash
qmd search "merchant reality support interviews" -n 5
# leads: #abc123 concepts/customer-proximity.md; #def432 sources/merchant-call.md
qmd multi-get "#abc123,#def432" --format md
```
**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
fields that you write yourself.** You are a better query expander than the
built-in model: you know the user's actual goal, the domain vocabulary, and the
nearby-but-wrong concepts to avoid. Do not just paste the user's words into
`qmd query "..."` and hope the expansion model guesses right — supply the
`intent:` and craft the lexical and semantic terms deliberately (see
[Pick the right search mode](#pick-the-right-search-mode)).
When reporting what you retrieved, a compact note is enough; do not paste whole
files unless needed:
```text
Retrieved:
- #abc123 concepts/customer-proximity.md
- #def432 sources/merchant-call.md
```
## Pick the right search mode
Use **BM25 lexical search** when you know exact words, titles, names, code
symbols, or rare phrases:
```bash
qmd search "cockpit OKR Goodhart" -n 10
qmd search '"AI Before Headcount"' -c concepts -n 5
```
Use **`qmd query` with structured fields** when the user describes an idea
indirectly, uses different wording than the source, or needs conceptual recall.
**This is the default mode — write the fields yourself rather than leaning on
query expansion.** Combine exact anchors with semantic recall:
```bash
qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
```
Structured query fields (you author each one — do not delegate this to the
expansion model):
- `intent:` states what you are trying to find **and what to avoid**. Always
supply this. It steers ranking away from nearby-but-wrong concepts.
- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
in the source. This is your own keyword expansion.
- `vec:` paraphrases the idea in natural language, in source-like wording.
- `hyde:` describes the document or answer that would satisfy the request.
You do not need all four every time, but you should almost always write at least
`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
throws away the context only you have and relies on the built-in expander to
reconstruct it — prefer the structured form.
If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
that is a job for `qmd search`, not bare `qmd query`:
```bash
qmd query --format json --explain $'intent: ...\nlex: ...\nvec: ...' # inspect ranking
```
If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
better lexical terms.
## Retrieve sources
Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:
```bash
qmd get "#abc123"
qmd get qmd://concepts/ai-before-headcount.md
qmd multi-get "#abc123,#def432" --format md
qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --format md
qmd multi-get 'sources/podcast-2025-*.md' -l 80
```
Use `multi-get` when comparing several hits or gathering context across pages.
### Output is line-numbered and carries the docid — cite both
`get` and `multi-get` are **line-numbered by default** and always print the
document's `#docid` and `qmd://` path. So `get` output looks like:
```text
qmd://concepts/note.md #abc123
---
1: # Metrics as instruments
2:
3: Treat dashboards like cockpit instruments...
```
Cite the docid and exact line numbers in your answer, and use the numbers to ask
for the next slice. Pass `--no-line-numbers` only when you need raw content to
copy verbatim (e.g. reproducing a code block).
When you need to open or edit the underlying file (e.g. hand a path to `Read`,
`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
header with the document's on-disk path, falling back to the canonical header if
the file no longer exists on disk:
```text
$ qmd get "#abc123" --full-path
/Users/you/notes/concepts/note.md
---
1: # Metrics as instruments
```
`--full-path` works the same way on `qmd search` and `qmd query`: result paths
become the file's on-disk path — `./`-prefixed relative path when the file is
inside `$PWD`, absolute realpath otherwise — and the per-result `#docid` is
dropped because the path is the identifier. The leading `./` is intentional so
the output is unambiguously a filesystem path and cannot be mistaken for a bare
collection-relative string. Default search/query output still uses `qmd://`
URIs; only opt into `--full-path` when you specifically need a path you can hand
to a non-QMD tool.
### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
resolution, virtual-path lookups, line numbering, and the header, and it is
slower and more error-prone.
The most compact form is a `:from:count` suffix right on the path or docid —
prefer it:
```bash
qmd get "#abc123:120:40" # 40 lines starting at line 120
qmd get qmd://concepts/note.md:200:60 # lines 200259
qmd get "#abc123:120" # from line 120 to end of file
qmd get "#abc123" --from 120 -l 40 # equivalent, using flags
```
Suffix and flags:
- `<path>:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
for reading around a search hit.**
- `<path>:<from>` — start at `<from>`, read to end of file.
- `--from <line>` / `-l <lines>` — flag equivalents. Explicit flags override the
suffix, so `... :5:2 -l 1` reads 1 line.
- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
Wrong: `qmd get "#abc123" | sed -n '120,160p'`
Right: `qmd get "#abc123:120:40"`
Search results include a `:line` anchor on each hit — feed it straight into
`qmd get path:line:<n>` to read a window around the match (line numbers in the
output will start at `line`).
## Discover what is indexed
```bash
qmd collection list
qmd ls
qmd status
```
Add collection filters when broad searches drift into the wrong corpus:
```bash
qmd search "headcount autonomous agents" -c concepts -n 10
qmd query "merchant support product reality" -c concepts -c sources -n 10
```
Omit `-c` to search everything.
## MCP Tool: `query`
When using the MCP server, prefer structured searches:
```json
{
"searches": [
{ "type": "lex", "query": "CAP theorem consistency" },
{ "type": "vec", "query": "tradeoff between consistency and availability" }
{ "type": "lex", "query": "cockpit OKR Goodhart" },
{ "type": "vec", "query": "data informed not metric driven product judgment" },
{ "type": "hyde", "query": "A concept note explains that metrics are useful as instruments, but leaders should not let OKRs or dashboards replace judgment." }
],
"collections": ["docs"],
"intent": "Find the concept note about using metrics as instruments without becoming metric-driven.",
"collections": ["concepts"],
"limit": 10
}
```
### Query Types
Query types:
| Type | Method | Input |
|------|--------|-------|
| `lex` | BM25 | Keywords — exact terms, names, code |
| `vec` | Vector | Question — natural language |
| `hyde` | Vector | Answer — hypothetical result (50-100 words) |
- `lex` — BM25 keyword search. Best for exact terms, names, titles, and code.
- `vec` — vector semantic search. Best for natural-language concepts.
- `hyde` — vector search using a hypothetical answer/document passage.
### Writing Good Queries
## Query craft
**lex (keyword)**
- 2-5 terms, no filler words
- Exact phrase: `"connection pool"` (quoted)
- Exclude terms: `performance -sports` (minus prefix)
- Code identifiers work: `handleError async`
Good QMD searches mix three things:
**vec (semantic)**
- Full natural language question
- Be specific: `"how does the rate limiter handle burst traffic"`
- Include context: `"in the payment service, how are refunds processed"`
1. **Title/alias anchors:** exact page titles, named entities, phrases.
2. **Semantic paraphrase:** how a human would describe the idea.
3. **Negative space:** enough intent to avoid nearby-but-wrong concepts.
**hyde (hypothetical document)**
- Write 50-100 words of what the *answer* looks like
- Use the vocabulary you expect in the result
**expand (auto-expand)**
- Use a single-line query (implicit) or `expand: question` on its own line
- Lets the local LLM generate lex/vec/hyde variations
- Do not mix `expand:` with other typed lines — it's either a standalone expand query or a full query document
### Intent (Disambiguation)
When a query term is ambiguous, add `intent` to steer results:
```json
{
"searches": [
{ "type": "lex", "query": "performance" }
],
"intent": "web page load times and Core Web Vitals"
}
```
Intent affects expansion, reranking, chunk selection, and snippet extraction. It does not search on its own — it's a steering signal that disambiguates queries like "performance" (web-perf vs team health vs fitness).
### Combining Types
| Goal | Approach |
|------|----------|
| Know exact terms | `lex` only |
| Don't know vocabulary | Use a single-line query (implicit `expand:`) or `vec` |
| Best recall | `lex` + `vec` |
| Complex topic | `lex` + `vec` + `hyde` |
| Ambiguous query | Add `intent` to any combination above |
First query gets 2x weight in fusion — put your best guess first.
### Lex Query Syntax
| Syntax | Meaning | Example |
|--------|---------|---------|
| `term` | Prefix match | `perf` matches "performance" |
| `"phrase"` | Exact phrase | `"rate limiter"` |
| `-term` | Exclude | `performance -sports` |
Note: `-term` only works in lex queries, not vec/hyde.
### Collection Filtering
```json
{ "collections": ["docs"] } // Single
{ "collections": ["docs", "notes"] } // Multiple (OR)
```
Omit to search all collections.
## Other MCP Tools
| Tool | Use |
|------|-----|
| `get` | Retrieve doc by path or `#docid` |
| `multi_get` | Retrieve multiple by glob/list |
| `status` | Collections and health |
## CLI
Examples:
```bash
qmd query "question" # Auto-expand + rerank
qmd query $'lex: X\nvec: Y' # Structured
qmd query $'expand: question' # Explicit expand
qmd query --json --explain "q" # Show score traces (RRF + rerank blend)
qmd search "keywords" # BM25 only (no LLM)
qmd get "#abc123" # By docid
qmd multi-get "journals/2026-*.md" -l 40 # Batch pull snippets by glob
qmd multi-get notes/foo.md,notes/bar.md # Comma-separated list, preserves order
# Exact-ish title lookup
qmd search '"arm the rebels" merchants tools big companies' -c concepts
# Semantic concept lookup
qmd query $'intent: Find the customer proximity concept, not generic customer delight.\nlex: support pseudonymous merchant customer interviews\nvec: founder stays close to merchant reality through support and product use'
# Source lookup
qmd search "six-week cadence WhatsApp merchant relationships Shawn Ryan" -c sources -n 10
```
## HTTP API
## Setup and maintenance
```bash
curl -X POST http://localhost:8181/query \
-H "Content-Type: application/json" \
-d '{"searches": [{"type": "lex", "query": "test"}]}'
```
## Setup
Only mutate indexes when the user asked for setup or maintenance. Searching and
retrieving are safe; collection/index mutation is not a casual first step.
```bash
npm install -g @tobilu/qmd
qmd collection add ~/notes --name notes
qmd update
qmd embed
```
Health and diagnostics:
```bash
qmd doctor
qmd status
qmd pull
```
`qmd doctor` checks config, model cache, device/GPU setup, vector fingerprints,
and common environment overrides. If a model-backed command fails, run it before
changing configuration.
## MCP setup
See `references/mcp-setup.md` for Claude Code, Claude Desktop, OpenClaw, and HTTP
server configuration.
## Pitfalls
- **Do not stop at snippets.** Fetch documents before making claims.
- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
line-numbered; piping breaks docid resolution, the header, and virtual paths.
- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
yourself. A bare `qmd query "user sentence"` discards the context only you
have. You expand the query; the model just ranks.
- **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
faster and often better.
- **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and
`qmd embed` change local state and can be expensive.
- **Model-backed commands can be environment-sensitive.** If `qmd query`,
`qmd vsearch`, or reranking fails because local models/GPU are unavailable,
use `qmd search` and stronger lexical/structured terms.
- **Ambiguous user wording needs intent.** Add `intent:` rather than hoping query
expansion guesses the right domain.
- **Collection names matter.** Search `concepts` for synthesized wiki pages,
`sources` for transcripts/raw source pages, and docs collections for code or
project documentation.

View File

@ -63,15 +63,22 @@ export function detectLanguage(filepath: string): SupportedLanguage | null {
/**
* Maps language to the npm package and wasm filename for the grammar.
*/
const GRAMMAR_MAP: Record<SupportedLanguage, { pkg: string; wasm: string }> = {
typescript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm" },
tsx: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-tsx.wasm" },
javascript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm" },
python: { pkg: "tree-sitter-python", wasm: "tree-sitter-python.wasm" },
go: { pkg: "tree-sitter-go", wasm: "tree-sitter-go.wasm" },
rust: { pkg: "tree-sitter-rust", wasm: "tree-sitter-rust.wasm" },
const GRAMMAR_MAP: Record<SupportedLanguage, { pkg: string; wasm: string; version: string }> = {
typescript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm", version: "0.23.2" },
tsx: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-tsx.wasm", version: "0.23.2" },
javascript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm", version: "0.23.2" },
python: { pkg: "tree-sitter-python", wasm: "tree-sitter-python.wasm", version: "0.23.4" },
go: { pkg: "tree-sitter-go", wasm: "tree-sitter-go.wasm", version: "0.23.4" },
rust: { pkg: "tree-sitter-rust", wasm: "tree-sitter-rust.wasm", version: "0.24.0" },
};
export function formatGrammarLoadError(language: SupportedLanguage, err: unknown): string {
const grammar = GRAMMAR_MAP[language];
const detail = err instanceof Error ? err.message : String(err);
return `${grammar.pkg}/${grammar.wasm} failed to load (${detail}); falling back to regex chunking. ` +
`Repair a broken global install with: bun add ${grammar.pkg}@${grammar.version}`;
}
// =============================================================================
// Per-Language Query Definitions
// =============================================================================
@ -176,6 +183,9 @@ let initPromise: Promise<void> | null = null;
/** Languages that have already failed to load — warn only once per process. */
const failedLanguages = new Set<string>();
/** Last grammar load error by language, for status output. */
const grammarLoadErrors = new Map<SupportedLanguage, string>();
/** Cached grammar load promises. */
const grammarCache = new Map<string, Promise<LanguageType>>();
@ -228,7 +238,9 @@ async function loadGrammar(language: SupportedLanguage): Promise<LanguageType |
} catch (err) {
failedLanguages.add(language);
grammarCache.delete(wasmKey);
console.warn(`[qmd] Failed to load tree-sitter grammar for ${language}: ${err}`);
const message = formatGrammarLoadError(language, err);
grammarLoadErrors.set(language, message);
console.warn(`[qmd] AST grammar unavailable for ${language}: ${message}`);
return null;
}
}
@ -345,7 +357,7 @@ export async function getASTStatus(): Promise<{
getQuery(lang, grammar);
languages.push({ language: lang, available: true });
} else {
languages.push({ language: lang, available: false, error: "grammar failed to load" });
languages.push({ language: lang, available: false, error: grammarLoadErrors.get(lang) ?? "grammar failed to load" });
}
} catch (err) {
languages.push({

View File

@ -260,16 +260,18 @@ async function main() {
const r = await benchmarkConfig(model, llama, docs, p, true);
results.push(r);
process.stdout.write(` ${r.medianMs.toFixed(0)}ms (${r.docsPerSec.toFixed(1)} docs/s)\n`);
} catch (e: any) {
process.stdout.write(` failed: ${e.message}\n`);
} catch (e: unknown) {
const message = e instanceof Error ? e.message : String(e);
process.stdout.write(` failed: ${message}\n`);
// Try without flash
process.stdout.write(` [${p} ctx, no flash] running...`);
try {
const r = await benchmarkConfig(model, llama, docs, p, false);
results.push(r);
process.stdout.write(` ${r.medianMs.toFixed(0)}ms (${r.docsPerSec.toFixed(1)} docs/s)\n`);
} catch (e2: any) {
process.stdout.write(` failed: ${e2.message}\n`);
} catch (e2: unknown) {
const message = e2 instanceof Error ? e2.message : String(e2);
process.stdout.write(` failed: ${message}\n`);
}
}
}

View File

@ -22,6 +22,7 @@ import {
type QMDStore,
type SearchResult,
type HybridQueryResult,
type ExpandedQuery,
} from "../index.js";
import { scoreResults } from "./score.js";
import type {
@ -34,35 +35,130 @@ import type {
type Backend = {
name: string;
run: (store: QMDStore, query: string, limit: number, collection?: string) => Promise<string[]>;
run: (store: QMDStore, query: BenchmarkQuery, limit: number, collection?: string) => Promise<string[]>;
};
type ParsedStructuredQuery = {
searches: ExpandedQuery[];
intent?: string;
};
function parseStructuredQuery(query: string): ParsedStructuredQuery | undefined {
const lines = query.split("\n").map((line, idx) => ({
trimmed: line.trim(),
number: idx + 1,
})).filter(line => line.trimmed.length > 0);
if (lines.length === 0) return undefined;
const prefixRe = /^(lex|vec|hyde):\s*/i;
const intentRe = /^intent:\s*/i;
const searches: ExpandedQuery[] = [];
let intent: string | undefined;
for (const line of lines) {
if (intentRe.test(line.trimmed)) {
if (intent !== undefined) {
throw new Error(`Line ${line.number}: only one intent: line is allowed per benchmark query.`);
}
intent = line.trimmed.replace(intentRe, "").trim();
if (!intent) {
throw new Error(`Line ${line.number}: intent: must include text.`);
}
continue;
}
const match = line.trimmed.match(prefixRe);
if (match) {
const type = match[1]!.toLowerCase() as "lex" | "vec" | "hyde";
const text = line.trimmed.slice(match[0].length).trim();
if (!text) {
throw new Error(`Line ${line.number} (${type}:) must include text.`);
}
searches.push({ type, query: text, line: line.number });
continue;
}
if (lines.length === 1) {
return undefined;
}
throw new Error(`Line ${line.number} is missing a lex:/vec:/hyde:/intent: prefix.`);
}
if (intent && searches.length === 0) {
throw new Error("intent: cannot appear alone. Add at least one lex:, vec:, or hyde: line.");
}
return searches.length > 0 ? { searches, intent } : undefined;
}
function uniqueFiles(files: string[], limit: number): string[] {
const seen = new Set<string>();
const out: string[] = [];
for (const file of files) {
if (seen.has(file)) continue;
seen.add(file);
out.push(file);
if (out.length >= limit) break;
}
return out;
}
const BACKENDS: Backend[] = [
{
name: "bm25",
run: async (store, query, limit, collection) => {
const results = await store.searchLex(query, { limit, collection });
const structured = parseStructuredQuery(query.query);
const lexQueries = structured?.searches.filter(q => q.type === "lex");
if (structured) {
const files: string[] = [];
for (const lex of lexQueries ?? []) {
const results = await store.searchLex(lex.query, { limit, collection });
files.push(...results.map((r: SearchResult) => r.filepath));
}
return uniqueFiles(files, limit);
}
const results = await store.searchLex(query.query, { limit, collection });
return results.map((r: SearchResult) => r.filepath);
},
},
{
name: "vector",
run: async (store, query, limit, collection) => {
const results = await store.searchVector(query, { limit, collection });
const structured = parseStructuredQuery(query.query);
const vectorQueries = structured?.searches.filter(q => q.type === "vec" || q.type === "hyde");
if (structured) {
const files: string[] = [];
for (const vectorQuery of vectorQueries ?? []) {
const results = await store.searchVector(vectorQuery.query, { limit, collection });
files.push(...results.map((r: SearchResult) => r.filepath));
}
return uniqueFiles(files, limit);
}
const results = await store.searchVector(query.query, { limit, collection });
return results.map((r: SearchResult) => r.filepath);
},
},
{
name: "hybrid",
run: async (store, query, limit, collection) => {
const results = await store.search({ query, limit, collection, rerank: false });
const structured = parseStructuredQuery(query.query);
const results = structured
? await store.search({ queries: structured.searches, intent: structured.intent, limit, collection, rerank: false })
: await store.search({ query: query.query, limit, collection, rerank: false });
return results.map((r: HybridQueryResult) => r.file);
},
},
{
name: "full",
run: async (store, query, limit, collection) => {
const results = await store.search({ query, limit, collection, rerank: true });
const structured = parseStructuredQuery(query.query);
const results = structured
? await store.search({ queries: structured.searches, intent: structured.intent, limit, collection, rerank: true })
: await store.search({ query: query.query, limit, collection, rerank: true });
return results.map((r: HybridQueryResult) => r.file);
},
},
@ -79,18 +175,23 @@ async function runQuery(
let resultFiles: string[];
try {
resultFiles = await backend.run(store, query.query, limit, collection);
} catch (err: any) {
resultFiles = await backend.run(store, query, limit, collection);
} catch {
// Backend may not be available (e.g., no embeddings for vector search)
return {
precision_at_k: 0,
recall: 0,
recall_at_1: 0,
recall_at_3: 0,
recall_at_5: 0,
mrr: 0,
f1: 0,
hits_at_k: 0,
total_expected: query.expected_files.length,
latency_ms: Date.now() - start,
top_files: [],
matched_files: [],
unmatched_expected_files: query.expected_files,
};
}
@ -111,14 +212,14 @@ function formatTable(results: QueryResult[]): string {
const num = (n: number) => n.toFixed(2).padStart(5);
lines.push(
`${pad("Query", 25)} ${pad("Backend", 8)} ${pad("P@k", 6)} ${pad("Recall", 7)} ${pad("MRR", 6)} ${pad("F1", 6)} ${pad("ms", 8)}`
`${pad("Query", 25)} ${pad("Backend", 8)} ${pad("P@k", 6)} ${pad("R@1", 6)} ${pad("R@3", 6)} ${pad("R@5", 6)} ${pad("MRR", 6)} ${pad("F1", 6)} ${pad("ms", 8)}`
);
lines.push("-".repeat(70));
lines.push("-".repeat(88));
for (const r of results) {
for (const [backend, br] of Object.entries(r.backends)) {
lines.push(
`${pad(r.id, 25)} ${pad(backend, 8)} ${num(br.precision_at_k)} ${num(br.recall)} ${num(br.mrr)} ${num(br.f1)} ${String(Math.round(br.latency_ms)).padStart(7)}ms`
`${pad(r.id, 25)} ${pad(backend, 8)} ${num(br.precision_at_k)} ${num(br.recall_at_1)} ${num(br.recall_at_3)} ${num(br.recall_at_5)} ${num(br.mrr)} ${num(br.f1)} ${String(Math.round(br.latency_ms)).padStart(7)}ms`
);
}
lines.push("");
@ -138,13 +239,16 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
}
}
for (const name of backendNames) {
let totalP = 0, totalR = 0, totalMrr = 0, totalF1 = 0, totalLat = 0, count = 0;
for (const name of Array.from(backendNames)) {
let totalP = 0, totalR = 0, totalR1 = 0, totalR3 = 0, totalR5 = 0, totalMrr = 0, totalF1 = 0, totalLat = 0, count = 0;
for (const r of results) {
const br = r.backends[name];
if (!br) continue;
totalP += br.precision_at_k;
totalR += br.recall;
totalR1 += br.recall_at_1;
totalR3 += br.recall_at_3;
totalR5 += br.recall_at_5;
totalMrr += br.mrr;
totalF1 += br.f1;
totalLat += br.latency_ms;
@ -154,6 +258,9 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
summary[name] = {
avg_precision: totalP / count,
avg_recall: totalR / count,
avg_recall_at_1: totalR1 / count,
avg_recall_at_3: totalR3 / count,
avg_recall_at_5: totalR5 / count,
avg_mrr: totalMrr / count,
avg_f1: totalF1 / count,
avg_latency_ms: totalLat / count,
@ -166,7 +273,7 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
export async function runBenchmark(
fixturePath: string,
options: { json?: boolean; collection?: string; backends?: string[] } = {},
options: { json?: boolean; collection?: string; backends?: string[]; dbPath?: string; configPath?: string } = {},
): Promise<BenchmarkResult> {
// Load fixture
const raw = readFileSync(resolve(fixturePath), "utf-8");
@ -177,7 +284,10 @@ export async function runBenchmark(
}
// Open store
const store = await createStore({ dbPath: getDefaultDbPath() });
const store = await createStore({
dbPath: options.dbPath ?? getDefaultDbPath(),
...(options.configPath ? { configPath: options.configPath } : {}),
});
// Filter backends if requested
const activeBackends = options.backends
@ -232,7 +342,7 @@ export async function runBenchmark(
const num = (n: number) => n.toFixed(3).padStart(6);
for (const [name, s] of Object.entries(summary)) {
console.log(
` ${pad(name, 8)} P@k=${num(s.avg_precision)} Recall=${num(s.avg_recall)} MRR=${num(s.avg_mrr)} F1=${num(s.avg_f1)} Avg=${Math.round(s.avg_latency_ms)}ms`
` ${pad(name, 8)} P@k=${num(s.avg_precision)} R@1=${num(s.avg_recall_at_1)} R@3=${num(s.avg_recall_at_3)} R@5=${num(s.avg_recall_at_5)} MRR=${num(s.avg_mrr)} F1=${num(s.avg_f1)} Avg=${Math.round(s.avg_latency_ms)}ms`
);
}
}

View File

@ -11,7 +11,7 @@
*/
export function normalizePath(p: string): string {
if (p.startsWith("qmd://")) {
// qmd://collection/path/to/file → path/to/file
// qmd://collection/docs/readme.md → docs/readme.md
const withoutScheme = p.slice("qmd://".length);
const slashIdx = withoutScheme.indexOf("/");
p = slashIdx >= 0 ? withoutScheme.slice(slashIdx + 1) : withoutScheme;
@ -31,6 +31,30 @@ export function pathsMatch(result: string, expected: string): boolean {
return false;
}
type ScoreMetrics = {
precision_at_k: number;
recall: number;
recall_at_1: number;
recall_at_3: number;
recall_at_5: number;
mrr: number;
f1: number;
hits_at_k: number;
matched_files: string[];
unmatched_expected_files: string[];
};
function hitsWithin(resultFiles: string[], expectedFiles: string[], k: number): number {
const topKResults = resultFiles.slice(0, k);
let hits = 0;
for (const expected of expectedFiles) {
if (topKResults.some(r => pathsMatch(r, expected))) {
hits++;
}
}
return hits;
}
/**
* Score a set of search results against expected files.
*/
@ -38,21 +62,18 @@ export function scoreResults(
resultFiles: string[],
expectedFiles: string[],
topK: number,
): { precision_at_k: number; recall: number; mrr: number; f1: number; hits_at_k: number } {
): ScoreMetrics {
// Count hits in top-k
const topKResults = resultFiles.slice(0, topK);
let hitsAtK = 0;
for (const expected of expectedFiles) {
if (topKResults.some(r => pathsMatch(r, expected))) {
hitsAtK++;
}
}
const hitsAtK = hitsWithin(resultFiles, expectedFiles, topK);
const matchedFiles: string[] = [];
const unmatchedExpectedFiles: string[] = [];
// Count total hits anywhere
let totalHits = 0;
for (const expected of expectedFiles) {
if (resultFiles.some(r => pathsMatch(r, expected))) {
totalHits++;
matchedFiles.push(expected);
} else {
unmatchedExpectedFiles.push(expected);
}
}
@ -67,10 +88,24 @@ export function scoreResults(
const denominator = Math.min(topK, expectedFiles.length);
const precision_at_k = denominator > 0 ? hitsAtK / denominator : 0;
const recall = expectedFiles.length > 0 ? totalHits / expectedFiles.length : 0;
const recall = expectedFiles.length > 0 ? matchedFiles.length / expectedFiles.length : 0;
const recall_at_1 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 1) / expectedFiles.length : 0;
const recall_at_3 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 3) / expectedFiles.length : 0;
const recall_at_5 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 5) / expectedFiles.length : 0;
const f1 = precision_at_k + recall > 0
? 2 * (precision_at_k * recall) / (precision_at_k + recall)
: 0;
return { precision_at_k, recall, mrr, f1, hits_at_k: hitsAtK };
return {
precision_at_k,
recall,
recall_at_1,
recall_at_3,
recall_at_5,
mrr,
f1,
hits_at_k: hitsAtK,
matched_files: matchedFiles,
unmatched_expected_files: unmatchedExpectedFiles,
};
}

View File

@ -37,6 +37,12 @@ export interface BackendResult {
precision_at_k: number;
/** Fraction of expected files found anywhere in results */
recall: number;
/** Fraction of expected files found in the first result */
recall_at_1: number;
/** Fraction of expected files found in the top 3 results */
recall_at_3: number;
/** Fraction of expected files found in the top 5 results */
recall_at_5: number;
/** Reciprocal rank of first relevant result (1/rank, 0 if not found) */
mrr: number;
/** Harmonic mean of precision_at_k and recall */
@ -49,6 +55,10 @@ export interface BackendResult {
latency_ms: number;
/** Top result file paths (for inspection) */
top_files: string[];
/** Expected files that were found anywhere in the returned result set */
matched_files: string[];
/** Expected files missing from the returned result set */
unmatched_expected_files: string[];
}
export interface QueryResult {
@ -65,6 +75,9 @@ export interface BenchmarkResult {
summary: Record<string, {
avg_precision: number;
avg_recall: number;
avg_recall_at_1: number;
avg_recall_at_3: number;
avg_recall_at_5: number;
avg_mrr: number;
avg_f1: number;
avg_latency_ms: number;

View File

@ -185,8 +185,9 @@ export function searchResultsToMarkdown(
if (opts.lineNumbers) {
content = addLineNumbers(content);
}
const fileLine = `**file:** \`${row.displayPath}\`\n`;
const contextLine = row.context ? `**context:** ${row.context}\n` : "";
return `---\n# ${heading}\n\n**docid:** \`#${row.docid}\`\n${contextLine}\n${content}\n`;
return `---\n# ${heading}\n\n${fileLine}**docid:** \`#${row.docid}\`\n${contextLine}\n${content}\n`;
}).join("\n");
}

File diff suppressed because it is too large Load Diff

View File

@ -6,8 +6,8 @@
*/
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
import { join, dirname } from "path";
import { homedir } from "os";
import { join, dirname, resolve } from "path";
import { qmdHomedir } from "./paths.js";
import YAML from "yaml";
// ============================================================================
@ -101,9 +101,7 @@ export function setConfigSource(source?: { configPath?: string; config?: Collect
export function setConfigIndexName(name: string): void {
// Resolve relative paths to absolute paths and sanitize for use as filename
if (name.includes('/')) {
const { resolve } = require('path');
const { cwd } = require('process');
const absolutePath = resolve(cwd(), name);
const absolutePath = resolve(process.cwd(), name);
// Replace path separators with underscores to create a valid filename
currentIndexName = absolutePath.replace(/\//g, '_').replace(/^_/, '');
} else {
@ -120,13 +118,41 @@ function getConfigDir(): string {
if (process.env.XDG_CONFIG_HOME) {
return join(process.env.XDG_CONFIG_HOME, "qmd");
}
return join(homedir(), ".config", "qmd");
return join(qmdHomedir(), ".config", "qmd");
}
function getConfigFilePath(): string {
return join(getConfigDir(), `${currentIndexName}.yml`);
}
/**
* Find a project-local QMD config by walking upward from startDir.
* The local config lives at .qmd/index.yaml or .qmd/index.yml and,
* when used by the CLI, keeps both config and index DB writes inside
* the project instead of the global ~/.config / ~/.cache locations.
*/
export function findLocalConfigPath(startDir: string = process.cwd()): string | undefined {
let dir = resolve(startDir);
while (true) {
const qmdDir = join(dir, ".qmd");
const yamlPath = join(qmdDir, "index.yaml");
if (existsSync(yamlPath)) return yamlPath;
const ymlPath = join(qmdDir, "index.yml");
if (existsSync(ymlPath)) return ymlPath;
const parent = dirname(dir);
if (parent === dir) return undefined;
dir = parent;
}
}
/** Return the local SQLite index path paired with a local .qmd/index.yaml file. */
export function getLocalDbPath(configPath: string): string {
return join(dirname(configPath), "index.sqlite");
}
/**
* Ensure config directory exists
*/
@ -161,7 +187,8 @@ export function loadConfig(): CollectionConfig {
try {
const content = readFileSync(configPath, "utf-8");
const config = YAML.parse(content) as CollectionConfig;
const parsed = YAML.parse(content) as CollectionConfig | null | undefined;
const config = parsed ?? { collections: {} };
// Ensure collections object exists
if (!config.collections) {

View File

@ -11,10 +11,16 @@
* SQLite build before creating any database instances.
*/
export const isBun = typeof globalThis.Bun !== "undefined";
export const isBun = "Bun" in globalThis;
let _Database: any;
let _sqliteVecLoad: ((db: any) => void) | null;
export type SQLiteValue = string | number | bigint | Buffer | Uint8Array | Float32Array | null;
export type SQLiteParams = readonly SQLiteValue[];
type DatabaseConstructor = new (path: string) => Database;
type LoadableSqliteDatabase = Pick<Database, "loadExtension">;
let _Database: DatabaseConstructor;
let _sqliteVecLoad: ((db: LoadableSqliteDatabase) => void) | null;
if (isBun) {
// Dynamic string prevents tsc from resolving bun:sqlite on Node.js builds
@ -44,15 +50,15 @@ if (isBun) {
const testDb = new BunDatabase(":memory:");
testDb.loadExtension(vecPath);
testDb.close();
_sqliteVecLoad = (db: any) => db.loadExtension(vecPath);
_sqliteVecLoad = (db: LoadableSqliteDatabase) => db.loadExtension(vecPath);
} catch {
// Vector search won't work, but BM25 and other operations are unaffected.
_sqliteVecLoad = null;
}
} else {
_Database = (await import("better-sqlite3")).default;
_Database = (await import("better-sqlite3")).default as unknown as DatabaseConstructor;
const sqliteVec = await import("sqlite-vec");
_sqliteVecLoad = (db: any) => sqliteVec.load(db);
_sqliteVecLoad = (db: LoadableSqliteDatabase) => sqliteVec.load(db as Parameters<typeof sqliteVec.load>[0]);
}
/**
@ -70,13 +76,14 @@ export interface Database {
prepare(sql: string): Statement;
transaction<T extends (...args: any[]) => any>(fn: T): T;
loadExtension(path: string): void;
transaction<T extends (...args: SQLiteValue[]) => unknown>(fn: T): T;
close(): void;
}
export interface Statement {
run(...params: any[]): { changes: number; lastInsertRowid: number | bigint };
get(...params: any[]): any;
all(...params: any[]): any[];
run(...params: SQLiteValue[]): { changes: number; lastInsertRowid: number | bigint };
get<T = unknown>(...params: SQLiteValue[]): T | undefined;
all<T = unknown>(...params: SQLiteValue[]): T[];
}
/**

File diff suppressed because one or more lines are too long

View File

@ -23,7 +23,6 @@ import {
structuredSearch,
extractSnippet,
addLineNumbers,
DEFAULT_EMBED_MODEL,
DEFAULT_MULTI_GET_MAX_BYTES,
reindexCollection,
generateEmbeddings,
@ -159,6 +158,8 @@ export interface SearchOptions {
collections?: string[];
/** Max results (default: 10) */
limit?: number;
/** Max candidates to rerank (default: 40) */
candidateLimit?: number;
/** Minimum score threshold */
minScore?: number;
/** Include explain traces */
@ -290,6 +291,8 @@ export interface QMDStore {
embed(options?: {
force?: boolean;
model?: string;
/** Restrict embedding to documents in one collection. */
collection?: string;
maxDocsPerBatch?: number;
maxBatchBytes?: number;
chunkStrategy?: ChunkStrategy;
@ -400,6 +403,7 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
minScore: opts.minScore,
explain: opts.explain,
intent: opts.intent,
candidateLimit: opts.candidateLimit,
skipRerank,
chunkStrategy: opts.chunkStrategy,
});
@ -412,12 +416,13 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
minScore: opts.minScore,
explain: opts.explain,
intent: opts.intent,
candidateLimit: opts.candidateLimit,
skipRerank,
chunkStrategy: opts.chunkStrategy,
});
},
searchLex: async (q, opts) => internal.searchFTS(q, opts?.limit, opts?.collection),
searchVector: async (q, opts) => internal.searchVec(q, DEFAULT_EMBED_MODEL, opts?.limit, opts?.collection),
searchVector: async (q, opts) => internal.searchVec(q, llm.embedModelName, opts?.limit, opts?.collection),
expandQuery: async (q, opts) => internal.expandQuery(q, undefined, opts?.intent),
get: async (pathOrDocid, opts) => internal.findDocument(pathOrDocid, opts),
getDocumentBody: async (pathOrDocid, opts) => {
@ -516,6 +521,7 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
return generateEmbeddings(internal, {
force: embedOpts?.force,
model: embedOpts?.model,
collection: embedOpts?.collection,
maxDocsPerBatch: embedOpts?.maxDocsPerBatch,
maxBatchBytes: embedOpts?.maxBatchBytes,
chunkStrategy: embedOpts?.chunkStrategy,

View File

@ -5,16 +5,72 @@
* local GGUF embeddings plus local text generation and reranking via node-llama-cpp.
*/
import {
getLlama,
resolveModelFile,
LlamaChatSession,
LlamaLogLevel,
type Llama,
type LlamaModel,
type LlamaEmbeddingContext,
type Token as LlamaToken,
import type {
Llama,
LlamaModel,
LlamaEmbeddingContext,
Token as LlamaToken,
} from "node-llama-cpp";
type StdoutChunk = string | Uint8Array;
type WriteCallback = (err?: Error | null) => void;
type NodeLlamaCppModule = {
getLlama: (options: Record<string, unknown>) => Promise<Llama>;
getLlamaGpuTypes?: (include?: "supported" | "allValid") => Promise<LlamaGpuMode[]>;
resolveModelFile: (model: string, cacheDir: string) => Promise<string>;
LlamaChatSession: new (options: { contextSequence: unknown }) => {
prompt: (prompt: string, options?: Record<string, unknown>) => Promise<string>;
};
LlamaLogLevel: { error: unknown };
};
let nodeLlamaCppImport: Promise<NodeLlamaCppModule> | null = null;
async function loadNodeLlamaCpp(): Promise<NodeLlamaCppModule> {
nodeLlamaCppImport ??= withNativeStdoutRedirectedToStderr(
() => import("node-llama-cpp") as Promise<NodeLlamaCppModule>
);
return nodeLlamaCppImport;
}
export function setNodeLlamaCppModuleForTest(module: NodeLlamaCppModule | null): void {
nodeLlamaCppImport = module ? Promise.resolve(module) : null;
failedGpuInitModes.clear();
noGpuAccelerationWarningShown = false;
cpuForcedPrebuiltFallbackWarningShown = false;
}
type StdoutWrite = typeof process.stdout.write;
let nativeStdoutRedirectDepth = 0;
let originalStdoutWrite: StdoutWrite | null = null;
/**
* Some node-llama-cpp native build/probe paths write library noise to stdout.
* JSON APIs must reserve stdout for machine-readable payloads, so route that
* noise to stderr while native llama initialization is in progress.
*/
export async function withNativeStdoutRedirectedToStderr<T>(fn: () => Promise<T>): Promise<T> {
if (nativeStdoutRedirectDepth === 0) {
originalStdoutWrite = process.stdout.write.bind(process.stdout) as StdoutWrite;
process.stdout.write = ((chunk: StdoutChunk, encodingOrCallback?: BufferEncoding | WriteCallback, callback?: WriteCallback) => {
if (typeof encodingOrCallback === "function") {
return process.stderr.write(chunk, encodingOrCallback);
}
return process.stderr.write(chunk, encodingOrCallback, callback);
}) as StdoutWrite;
}
nativeStdoutRedirectDepth++;
try {
return await fn();
} finally {
nativeStdoutRedirectDepth--;
if (nativeStdoutRedirectDepth === 0 && originalStdoutWrite) {
process.stdout.write = originalStdoutWrite;
originalStdoutWrite = null;
}
}
}
import { homedir } from "os";
import { join } from "path";
import { existsSync, mkdirSync, statSync, unlinkSync, readdirSync, readFileSync, writeFileSync, openSync, readSync, closeSync } from "fs";
@ -37,7 +93,7 @@ export function isQwen3EmbeddingModel(modelUri: string): boolean {
* Uses Qwen3-Embedding instruct format when a Qwen embedding model is active.
*/
export function formatQueryForEmbedding(query: string, modelUri?: string): string {
const uri = modelUri ?? process.env.QMD_EMBED_MODEL ?? DEFAULT_EMBED_MODEL;
const uri = modelUri ?? resolveEmbedModel();
if (isQwen3EmbeddingModel(uri)) {
return `Instruct: Retrieve relevant documents for the given query\nQuery: ${query}`;
}
@ -50,7 +106,7 @@ export function formatQueryForEmbedding(query: string, modelUri?: string): strin
* Qwen3-Embedding encodes documents as raw text without special prefixes.
*/
export function formatDocForEmbedding(text: string, title?: string, modelUri?: string): string {
const uri = modelUri ?? process.env.QMD_EMBED_MODEL ?? DEFAULT_EMBED_MODEL;
const uri = modelUri ?? resolveEmbedModel();
if (isQwen3EmbeddingModel(uri)) {
// Qwen3-Embedding: documents are raw text, no task prefix
return title ? `${title}\n${text}` : text;
@ -208,6 +264,32 @@ export const DEFAULT_EMBED_MODEL_URI = DEFAULT_EMBED_MODEL;
export const DEFAULT_RERANK_MODEL_URI = DEFAULT_RERANK_MODEL;
export const DEFAULT_GENERATE_MODEL_URI = DEFAULT_GENERATE_MODEL;
export type ModelResolutionConfig = {
embed?: string;
generate?: string;
rerank?: string;
};
export function resolveEmbedModel(config?: ModelResolutionConfig): string {
return config?.embed || process.env.QMD_EMBED_MODEL || DEFAULT_EMBED_MODEL;
}
export function resolveGenerateModel(config?: ModelResolutionConfig): string {
return config?.generate || process.env.QMD_GENERATE_MODEL || DEFAULT_GENERATE_MODEL;
}
export function resolveRerankModel(config?: ModelResolutionConfig): string {
return config?.rerank || process.env.QMD_RERANK_MODEL || DEFAULT_RERANK_MODEL;
}
export function resolveModels(config?: ModelResolutionConfig): Required<ModelResolutionConfig> {
return {
embed: resolveEmbedModel(config),
generate: resolveGenerateModel(config),
rerank: resolveRerankModel(config),
};
}
// Local model cache directory
const MODEL_CACHE_DIR = process.env.XDG_CACHE_HOME
? join(process.env.XDG_CACHE_HOME, "qmd", "models")
@ -270,37 +352,106 @@ async function getRemoteEtag(ref: HfRef): Promise<string | null> {
const GGUF_MAGIC = Buffer.from("GGUF");
export type GgufFileInspection = {
exists: boolean;
valid: boolean;
kind: "missing" | "gguf" | "html" | "invalid";
sizeBytes?: number;
magic?: string;
details: string;
};
function formatModelFileSize(sizeBytes: number): string {
return `${(sizeBytes / 1024).toFixed(0)} KB`;
}
function printableMagic(header: Buffer): string {
const text = header.toString("utf-8");
return /^[\x20-\x7e]{1,4}$/.test(text) ? text : `0x${header.toString("hex")}`;
}
/**
* Inspect a potential GGUF model file without mutating it.
* Used by doctor for early diagnostics and by runtime validation before load.
*/
export function inspectGgufFile(filePath: string): GgufFileInspection {
if (!existsSync(filePath)) {
return { exists: false, valid: false, kind: "missing", details: "file does not exist" };
}
let sizeBytes = 0;
try {
sizeBytes = statSync(filePath).size;
const fd = openSync(filePath, "r");
const sniff = Buffer.alloc(512);
try {
readSync(fd, sniff, 0, 512, 0);
} finally {
closeSync(fd);
}
const header = sniff.subarray(0, 4);
if (header.equals(GGUF_MAGIC)) {
return {
exists: true,
valid: true,
kind: "gguf",
sizeBytes,
magic: "GGUF",
details: `valid GGUF (${formatModelFileSize(sizeBytes)})`,
};
}
const magic = printableMagic(header);
const text = sniff.toString("utf-8").toLowerCase();
const isHtml = text.includes("<!doctype") || text.includes("<html");
if (isHtml) {
return {
exists: true,
valid: false,
kind: "html",
sizeBytes,
magic,
details: `HTML page, not a GGUF model (${formatModelFileSize(sizeBytes)}); likely proxy/firewall/captive portal response`,
};
}
return {
exists: true,
valid: false,
kind: "invalid",
sizeBytes,
magic,
details: `not valid GGUF (expected magic "GGUF", got "${magic}", ${formatModelFileSize(sizeBytes)})`,
};
} catch (error) {
return {
exists: true,
valid: false,
kind: "invalid",
sizeBytes,
details: `cannot read model file: ${error instanceof Error ? error.message : String(error)}`,
};
}
}
/**
* Validate that a file is actually a GGUF model, not an HTML error page
* from a proxy, firewall, or failed download.
* Throws a descriptive error if the file is not valid GGUF.
*/
function validateGgufFile(filePath: string, modelUri: string): void {
if (!existsSync(filePath)) return; // let downstream handle missing files
// Read header + sniff bytes in one go, then close immediately
const fd = openSync(filePath, "r");
const sniff = Buffer.alloc(512);
try {
readSync(fd, sniff, 0, 512, 0);
} finally {
closeSync(fd);
}
const header = sniff.subarray(0, 4);
if (header.equals(GGUF_MAGIC)) return; // valid GGUF
const text = sniff.toString("utf-8").toLowerCase();
const isHtml = text.includes("<!doctype") || text.includes("<html");
const got = header.toString("utf-8");
const sizeKB = (statSync(filePath).size / 1024).toFixed(0);
const inspection = inspectGgufFile(filePath);
if (!inspection.exists || inspection.valid) return; // let downstream handle missing files
// Remove the bad file so the next attempt re-downloads
unlinkSync(filePath);
try {
unlinkSync(filePath);
} catch { /* best effort */ }
if (isHtml) {
if (inspection.kind === "html") {
throw new Error(
`Downloaded model file is an HTML page, not a GGUF model (${sizeKB} KB).\n` +
`Downloaded model file is an HTML page, not a GGUF model (${formatModelFileSize(inspection.sizeBytes ?? 0)}).\n` +
`Something is intercepting the download from huggingface.co (a proxy, firewall, or captive portal).\n\n` +
`Model: ${modelUri}\n` +
`Path: ${filePath}\n\n` +
@ -313,7 +464,7 @@ function validateGgufFile(filePath: string, modelUri: string): void {
}
throw new Error(
`Model file is not valid GGUF (expected magic "GGUF", got "${got}", file is ${sizeKB} KB).\n` +
`Model file is not valid GGUF (expected magic "GGUF", got "${inspection.magic ?? "unknown"}", file is ${formatModelFileSize(inspection.sizeBytes ?? 0)}).\n` +
`Model: ${modelUri}\n` +
`Path: ${filePath}\n\n` +
`The file has been removed. Run the command again to re-download.`
@ -364,6 +515,7 @@ export async function pullModels(
}
}
const { resolveModelFile } = await loadNodeLlamaCpp();
const path = await resolveModelFile(model, cacheDir);
validateGgufFile(path, model);
const sizeBytes = existsSync(path) ? statSync(path).size : 0;
@ -460,9 +612,51 @@ export type LlamaCppConfig = {
const DEFAULT_INACTIVITY_TIMEOUT_MS = 5 * 60 * 1000;
const DEFAULT_EXPAND_CONTEXT_SIZE = 2048;
type LlamaGpuMode = "auto" | "metal" | "vulkan" | "cuda" | false;
export type LlamaGpuMode = "auto" | "metal" | "vulkan" | "cuda" | false;
type ParallelismOptions = {
gpu: string | false;
platform?: NodeJS.Platform;
computed: number;
envValue?: string;
};
export function resolveParallelismOverride(envValue = process.env.QMD_EMBED_PARALLELISM): number | undefined {
const normalized = envValue?.trim() ?? "";
if (!normalized) return undefined;
const parsed = Number(normalized);
if (!Number.isInteger(parsed) || parsed < 1) {
process.stderr.write(`QMD Warning: invalid QMD_EMBED_PARALLELISM="${envValue}", using automatic parallelism.\n`);
return undefined;
}
return Math.min(8, parsed);
}
export function resolveSafeParallelism(options: ParallelismOptions): number {
const override = resolveParallelismOverride(options.envValue);
if (override !== undefined) return override;
// node-llama-cpp/llama.cpp CUDA on Windows is unstable with multiple
// simultaneous contexts (ggml-cuda.cu:98 in #519). Vulkan and CPU do not
// show the same failure mode, so only serialize Windows CUDA by default.
if ((options.platform ?? process.platform) === "win32" && options.gpu === "cuda") {
return 1;
}
return Math.max(1, options.computed);
}
export function resolveLlamaGpuMode(
envValue = process.env.QMD_LLAMA_GPU,
forceCpuValue = process.env.QMD_FORCE_CPU
): LlamaGpuMode {
const forceCpu = forceCpuValue?.trim().toLowerCase() ?? "";
if (forceCpu && !["false", "off", "none", "disable", "disabled", "0"].includes(forceCpu)) {
return false;
}
export function resolveLlamaGpuMode(envValue = process.env.QMD_LLAMA_GPU): LlamaGpuMode {
const normalized = envValue?.trim().toLowerCase() ?? "";
if (!normalized) return "auto";
if (["false", "off", "none", "disable", "disabled", "0"].includes(normalized)) return false;
@ -472,6 +666,23 @@ export function resolveLlamaGpuMode(envValue = process.env.QMD_LLAMA_GPU): Llama
return "auto";
}
async function disposeWithTimeout(resourceName: string, dispose: () => Promise<void>, timeoutMs = 1000): Promise<void> {
const timeoutPromise = new Promise<"timeout">((resolve) => {
setTimeout(() => resolve("timeout"), timeoutMs).unref();
});
try {
const result = await Promise.race([dispose(), timeoutPromise]);
if (result === "timeout") {
process.stderr.write(`QMD Warning: timed out disposing ${resourceName}; continuing shutdown.\n`);
}
} catch (error) {
process.stderr.write(
`QMD Warning: failed to dispose ${resourceName} (${error instanceof Error ? error.message : String(error)}); continuing shutdown.\n`
);
}
}
function resolveExpandContextSize(configValue?: number): number {
if (configValue !== undefined) {
if (!Number.isInteger(configValue) || configValue <= 0) {
@ -493,6 +704,14 @@ function resolveExpandContextSize(configValue?: number): number {
return parsed;
}
const failedGpuInitModes = new Set<LlamaGpuMode>();
let noGpuAccelerationWarningShown = false;
let cpuForcedPrebuiltFallbackWarningShown = false;
function isCpuModeRequested(): boolean {
return resolveLlamaGpuMode() === false;
}
export class LlamaCpp implements LLM {
private readonly _ciMode = !!process.env.CI;
private llama: Llama | null = null;
@ -530,6 +749,15 @@ export class LlamaCpp implements LLM {
this.embedApiKey = config.embedApiKey || process.env.QMD_EMBED_API_KEY || process.env.NVIDIA_API_KEY || process.env.OPENAI_API_KEY;
this.generateModelUri = config.generateModel || process.env.QMD_GENERATE_MODEL || DEFAULT_GENERATE_MODEL;
this.rerankModelUri = config.rerankModel || process.env.QMD_RERANK_MODEL || DEFAULT_RERANK_MODEL;
// STRUCTURAL INVARIANT: the launcher (bin/qmd) sets GGML_METAL_NO_RESIDENCY=1
// on darwin BEFORE the native binding loads, which prevents the libggml-metal
// static destructor assertion at process exit (ggml-org/llama.cpp#22593).
// See isDarwinMetalMitigationActive() for the runtime check exposed to
// diagnostics. No constructor-time guard installation is needed.
this.embedModelUri = resolveEmbedModel({ embed: config.embedModel });
this.generateModelUri = resolveGenerateModel({ generate: config.generateModel });
this.rerankModelUri = resolveRerankModel({ rerank: config.rerankModel });
this.modelCacheDir = config.modelCacheDir || MODEL_CACHE_DIR;
this.expandContextSize = resolveExpandContextSize(config.expandContextSize);
this.inactivityTimeoutMs = config.inactivityTimeoutMs ?? DEFAULT_INACTIVITY_TIMEOUT_MS;
@ -542,6 +770,13 @@ export class LlamaCpp implements LLM {
get usesLocalEmbedding(): boolean {
return isLocalEmbeddingModel(this.embedModelUri);
get generateModelName(): string {
return this.generateModelUri;
}
get rerankModelName(): string {
return this.rerankModelUri;
}
/**
@ -649,33 +884,89 @@ export class LlamaCpp implements LLM {
if (!this.llama) {
const gpuMode = resolveLlamaGpuMode();
const loadLlama = async (gpu: LlamaGpuMode) =>
await getLlama({
build: allowBuild ? "autoAttempt" : "never",
const { getLlama, getLlamaGpuTypes, LlamaLogLevel } = await loadNodeLlamaCpp();
const loadLlama = async (gpu: LlamaGpuMode, sourceBuildAllowed = allowBuild, buildOverride?: "auto" | "never") =>
await withNativeStdoutRedirectedToStderr(() => getLlama({
// Prefer packaged prebuilt bindings before compiling llama.cpp locally.
// node-llama-cpp documents gpu:"auto" as the best default: Metal on
// Apple Silicon, CUDA when fully available, Vulkan where available,
// then CPU. Use build:"auto" for normal loads and build:"never" for
// diagnostic/probe paths that must not compile llama.cpp.
build: buildOverride ?? (sourceBuildAllowed ? "auto" : "never"),
logLevel: LlamaLogLevel.error,
gpu,
skipDownload: !allowBuild,
});
progressLogs: false,
skipDownload: !sourceBuildAllowed,
}));
const loadCpuCompatibleLlama = async () => {
try {
return await loadLlama(false, false);
} catch (err) {
// Some platforms, notably Apple Silicon, ship a Metal prebuilt but no
// CPU-only prebuilt. Do a fast no-build lookup for an actual CPU
// binding first; if it does not exist, use the packaged auto/Metal
// binding and disable model offloading via gpuLayers: 0.
if (!cpuForcedPrebuiltFallbackWarningShown) {
cpuForcedPrebuiltFallbackWarningShown = true;
process.stderr.write(
`QMD Warning: CPU-only llama.cpp prebuilt not available (${err instanceof Error ? err.message : String(err)}); using packaged backend with GPU offloading disabled.\n`
);
}
return await loadLlama("auto", false);
}
};
let llama: Llama;
if (gpuMode === false) {
llama = await loadLlama(false);
llama = await loadCpuCompatibleLlama();
} else if (failedGpuInitModes.has(gpuMode)) {
process.stderr.write(
`QMD Warning: skipping previously failed GPU init${gpuMode === "auto" ? "" : ` for QMD_LLAMA_GPU=${gpuMode}`}, using CPU.\n`
);
llama = await loadCpuCompatibleLlama();
} else {
try {
llama = await loadLlama(gpuMode);
// If node-llama-cpp auto-detection chose CPU, do one no-build pass
// over all OS-valid packaged GPU backends. This preserves the
// documented auto mode for Metal/CUDA/Vulkan while recovering on
// systems where a packaged backend can load but detection is too
// conservative. Never compile during these extra probes.
if (gpuMode === "auto" && llama.gpu === false && getLlamaGpuTypes) {
const candidates = (await getLlamaGpuTypes("allValid"))
.filter((candidate): candidate is Exclude<LlamaGpuMode, "auto" | false> => candidate !== false && candidate !== "auto");
for (const candidate of candidates) {
if (failedGpuInitModes.has(candidate)) continue;
try {
const gpuLlama = await loadLlama(candidate, false, "never");
if (gpuLlama.gpu !== false) {
await disposeWithTimeout("CPU llama runtime", () => llama.dispose());
llama = gpuLlama;
break;
}
await disposeWithTimeout(`${candidate} probe runtime`, () => gpuLlama.dispose());
} catch {
failedGpuInitModes.add(candidate);
}
}
}
} catch (err) {
// GPU backend (e.g. Vulkan on headless/driverless machines) can throw at init.
// Fall back to CPU so qmd still works.
// GPU backend (e.g. Vulkan/CUDA on headless/driverless machines) can throw at init.
// Fall back to CPU so qmd still works, and cache the failure to avoid repeated
// expensive native build/probe attempts in this process.
failedGpuInitModes.add(gpuMode);
process.stderr.write(
`QMD Warning: GPU init failed${gpuMode === "auto" ? "" : ` for QMD_LLAMA_GPU=${gpuMode}`} (${err instanceof Error ? err.message : String(err)}), falling back to CPU.\n`
);
llama = await loadLlama(false);
llama = await loadCpuCompatibleLlama();
}
}
if (llama.gpu === false) {
if (llama.gpu === false && !noGpuAccelerationWarningShown) {
noGpuAccelerationWarningShown = true;
process.stderr.write(
"QMD Warning: no GPU acceleration, running on CPU (slow). Run 'qmd status' for details.\n"
"QMD Warning: no GPU acceleration, running on CPU (slow). Run 'qmd doctor' for device diagnostics.\n"
);
}
this.llama = llama;
@ -683,6 +974,17 @@ export class LlamaCpp implements LLM {
return this.llama;
}
private isCpuOffloadForced(): boolean {
return isCpuModeRequested();
}
private modelLoadOptions(modelPath: string): { modelPath: string; gpuLayers?: number } {
return {
modelPath,
...(this.isCpuOffloadForced() ? { gpuLayers: 0 } : {}),
};
}
/**
* Resolve a model URI to a local path, downloading if needed.
* Validates the downloaded file is actually a GGUF model (not an HTML error page
@ -691,6 +993,7 @@ export class LlamaCpp implements LLM {
private async resolveModel(modelUri: string): Promise<string> {
this.ensureModelCacheDir();
// resolveModelFile handles HF URIs and downloads to the cache dir
const { resolveModelFile } = await loadNodeLlamaCpp();
const modelPath = await resolveModelFile(modelUri, this.modelCacheDir);
validateGgufFile(modelPath, modelUri);
return modelPath;
@ -713,7 +1016,7 @@ export class LlamaCpp implements LLM {
this.embedModelLoadPromise = (async () => {
const llama = await this.ensureLlama();
const modelPath = await this.resolveModel(this.embedModelUri);
const model = await llama.loadModel({ modelPath });
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
this.embedModel = model;
// Model loading counts as activity - ping to keep alive
this.touchActivity();
@ -739,21 +1042,23 @@ export class LlamaCpp implements LLM {
private async computeParallelism(perContextMB: number): Promise<number> {
const llama = await this.ensureLlama();
if (llama.gpu) {
if (!this.isCpuOffloadForced() && llama.gpu) {
try {
const vram = await llama.getVramState();
const freeMB = vram.free / (1024 * 1024);
const maxByVram = Math.floor((freeMB * 0.25) / perContextMB);
return Math.max(1, Math.min(8, maxByVram));
const computed = Math.max(1, Math.min(8, maxByVram));
return resolveSafeParallelism({ gpu: llama.gpu, computed });
} catch {
return 2;
return resolveSafeParallelism({ gpu: llama.gpu, computed: 2 });
}
}
// CPU: split cores across contexts. At least 4 threads per context.
const cores = llama.cpuMathCores || 4;
const maxContexts = Math.floor(cores / 4);
return Math.max(1, Math.min(4, maxContexts));
const computed = Math.max(1, Math.min(4, maxContexts));
return resolveSafeParallelism({ gpu: false, computed });
}
/**
@ -762,7 +1067,7 @@ export class LlamaCpp implements LLM {
*/
private async threadsPerContext(parallelism: number): Promise<number> {
const llama = await this.ensureLlama();
if (llama.gpu) return 0; // GPU: let the library decide
if (!this.isCpuOffloadForced() && llama.gpu) return 0; // GPU: let the library decide
const cores = llama.cpuMathCores || 4;
return Math.max(1, Math.floor(cores / parallelism));
}
@ -830,7 +1135,7 @@ export class LlamaCpp implements LLM {
this.generateModelLoadPromise = (async () => {
const llama = await this.ensureLlama();
const modelPath = await this.resolveModel(this.generateModelUri);
const model = await llama.loadModel({ modelPath });
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
this.generateModel = model;
return model;
})();
@ -862,7 +1167,7 @@ export class LlamaCpp implements LLM {
this.rerankModelLoadPromise = (async () => {
const llama = await this.ensureLlama();
const modelPath = await this.resolveModel(this.rerankModelUri);
const model = await llama.loadModel({ modelPath });
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
this.rerankModel = model;
// Model loading counts as activity - ping to keep alive
this.touchActivity();
@ -911,9 +1216,8 @@ export class LlamaCpp implements LLM {
try {
this.rerankContexts.push(await model.createRankingContext({
contextSize: LlamaCpp.RERANK_CONTEXT_SIZE,
flashAttention: true,
...(threads > 0 ? { threads } : {}),
} as any));
}));
} catch {
if (this.rerankContexts.length === 0) {
// Flash attention might not be supported — retry without it
@ -1194,6 +1498,7 @@ export class LlamaCpp implements LLM {
// Create fresh context -> sequence -> session for each call
const context = await this.generateModel!.createContext();
const sequence = context.getSequence();
const { LlamaChatSession } = await loadNodeLlamaCpp();
const session = new LlamaChatSession({ contextSequence: sequence });
const maxTokens = options.maxTokens ?? 150;
@ -1208,7 +1513,7 @@ export class LlamaCpp implements LLM {
temperature,
topK: 20,
topP: 0.8,
onTextChunk: (text) => {
onTextChunk: (text: string) => {
result += text;
},
});
@ -1274,6 +1579,7 @@ export class LlamaCpp implements LLM {
contextSize: this.expandContextSize,
});
const sequence = genContext.getSequence();
const { LlamaChatSession } = await loadNodeLlamaCpp();
const session = new LlamaChatSession({ contextSequence: sequence });
try {
@ -1452,17 +1758,18 @@ export class LlamaCpp implements LLM {
cpuCores: number;
}> {
const llama = await this.ensureLlama(options.allowBuild ?? true);
const gpuDevices = await llama.getGpuDeviceNames();
const cpuForced = this.isCpuOffloadForced();
const gpuDevices = cpuForced ? [] : await llama.getGpuDeviceNames();
let vram: { total: number; used: number; free: number } | undefined;
if (llama.gpu) {
if (!cpuForced && llama.gpu) {
try {
const state = await llama.getVramState();
vram = { total: state.total, used: state.used, free: state.free };
} catch { /* no vram info */ }
}
return {
gpu: llama.gpu,
gpuOffloading: llama.supportsGpuOffloading,
gpu: cpuForced ? false : llama.gpu,
gpuOffloading: !cpuForced && llama.supportsGpuOffloading,
gpuDevices,
vram,
cpuCores: llama.cpuMathCores,
@ -1482,22 +1789,37 @@ export class LlamaCpp implements LLM {
this.inactivityTimer = null;
}
// Disposing llama cascades to models and contexts automatically
// See: https://node-llama-cpp.withcat.ai/guide/objects-lifecycle
// Note: llama.dispose() can hang indefinitely, so we use a timeout
if (this.llama) {
const disposePromise = this.llama.dispose();
const timeoutPromise = new Promise<void>((resolve) => setTimeout(resolve, 1000));
await Promise.race([disposePromise, timeoutPromise]);
// Explicitly dispose in dependency order: contexts first, then models, then llama.
// Relying only on llama.dispose() leaves Metal resource sets alive until process
// finalization on Apple Silicon, where ggml_metal_device_free can abort after
// otherwise-successful CLI output (#368).
for (const ctx of this.embedContexts) {
await disposeWithTimeout("embedding context", () => ctx.dispose());
}
this.embedContexts = [];
for (const ctx of this.rerankContexts) {
await disposeWithTimeout("rerank context", () => ctx.dispose());
}
this.rerankContexts = [];
if (this.embedModel) {
await disposeWithTimeout("embedding model", () => this.embedModel!.dispose());
this.embedModel = null;
}
if (this.generateModel) {
await disposeWithTimeout("generation model", () => this.generateModel!.dispose());
this.generateModel = null;
}
if (this.rerankModel) {
await disposeWithTimeout("rerank model", () => this.rerankModel!.dispose());
this.rerankModel = null;
}
// Clear references
this.embedContexts = [];
this.rerankContexts = [];
this.embedModel = null;
this.generateModel = null;
this.rerankModel = null;
this.llama = null;
if (this.llama) {
await disposeWithTimeout("llama runtime", () => this.llama!.dispose());
this.llama = null;
}
// Clear any in-flight load/create promises
this.embedModelLoadPromise = null;
@ -1752,6 +2074,66 @@ export function canUnloadLLM(): boolean {
return defaultSessionManager.canUnload();
}
// =============================================================================
// Darwin Metal exit-crash mitigation
// =============================================================================
//
// libggml-metal on macOS keeps allocated model memory wired via "residency
// sets" with a 180-second keep_alive timer (added in ggml-org/llama.cpp#11427).
// The process-static `std::vector<std::unique_ptr<ggml_metal_device>>`
// destructor fires during libc `exit()` → `__cxa_finalize_ranges` and asserts
// `[rsets->data count] == 0` — but the keep_alive hasn't expired, so the
// assertion fails and `ggml_abort` dumps a multi-kilobyte stack trace to
// stderr after the user-visible output. See ggml-org/llama.cpp#22593.
//
// No JS-side dispose call (`llama.dispose()`, `model.dispose()`, etc.) can
// prevent it: the static destructor runs after every JS-reachable cleanup,
// and `process.reallyExit` on Node calls libc `exit()` not `_exit()` (it
// does NOT skip C++ static destructors — verified in
// node/src/api/environment.cc).
//
// The actual fix is to disable residency sets via `GGML_METAL_NO_RESIDENCY=1`,
// which we set from `bin/qmd` before Node loads the native binding. For QMD's
// short-lived CLI workflow this has no measurable cost (subsequent calls
// don't reuse the warm mapping). The functions below report whether that
// mitigation is in effect — kept here, in the module that depends on the
// underlying resource, so doctor can answer "is the protection active?"
// without reaching into env handling directly.
//
// Setting `QMD_METAL_KEEP_RESIDENCY=1` opts back into residency sets (with
// the visible-noise consequences). The legacy `QMD_DISABLE_DARWIN_SAFE_EXIT`
// env var is accepted as a no-op alias for back-compat; it had no effect on
// Node prior to this fix.
/**
* Whether QMD's darwin Metal exit-crash mitigation is active in this process:
* true residency sets disabled, process exit completes silently
* false either non-darwin, or `QMD_METAL_KEEP_RESIDENCY=1` overrode it,
* in which case the libggml-metal teardown assertion may fire
*/
export function isDarwinMetalMitigationActive(): boolean {
if (process.platform !== "darwin") return false;
if (process.env.QMD_METAL_KEEP_RESIDENCY === "1") return false;
return process.env.GGML_METAL_NO_RESIDENCY === "1";
}
/**
* Compatibility shim: previous releases installed a `process.on('exit')` hook
* that tried to skip the C++ static destructor by calling `process.reallyExit`.
* That mechanism didn't work on Node (Environment::Exit still calls libc
* `exit()`), so it was replaced by `GGML_METAL_NO_RESIDENCY=1` from bin/qmd.
* Kept as a no-op for code paths that still call it; safe to remove once no
* production launcher predates the residency-set fix.
*/
export function installDarwinExitGuard(): void {
// Intentional no-op. See isDarwinMetalMitigationActive() for the real check.
}
/** @deprecated Replaced by isDarwinMetalMitigationActive. */
export function isDarwinExitGuardInstalled(): boolean {
return isDarwinMetalMitigationActive();
}
// =============================================================================
// Singleton for default LlamaCpp instance
// =============================================================================
@ -1759,7 +2141,9 @@ export function canUnloadLLM(): boolean {
let defaultLlamaCpp: LlamaCpp | null = null;
/**
* Get the default LlamaCpp instance (creates one if needed)
* Get the default LlamaCpp instance (creates one if needed). The LlamaCpp
* constructor installs the darwin exit guard, so any code path that obtains
* the singleton is protected.
*/
export function getDefaultLlamaCpp(): LlamaCpp {
if (!defaultLlamaCpp) {
@ -1769,12 +2153,24 @@ export function getDefaultLlamaCpp(): LlamaCpp {
}
/**
* Set a custom default LlamaCpp instance (useful for testing)
* Set a custom default LlamaCpp instance (useful for testing). Setting a
* non-null instance also ensures the darwin exit guard is installed keeps
* the invariant intact for test doubles that didn't go through the real
* constructor.
*/
export function setDefaultLlamaCpp(llm: LlamaCpp | null): void {
if (llm !== null) installDarwinExitGuard();
defaultLlamaCpp = llm;
}
/**
* Peek at the default LlamaCpp instance without instantiating one. Used by
* doctor and lifecycle diagnostics.
*/
export function hasDefaultLlamaCpp(): boolean {
return defaultLlamaCpp !== null;
}
/**
* Dispose the default LlamaCpp instance if it exists.
* Call this before process exit to prevent NAPI crashes.

View File

@ -32,8 +32,6 @@ import {
import { getConfigPath } from "../collections.js";
import { enableProductionMode } from "../store.js";
enableProductionMode();
// =============================================================================
// Types for structured content
// =============================================================================
@ -44,6 +42,7 @@ type SearchResultItem = {
title: string;
score: number;
context: string | null;
line: number; // Absolute line in source markdown
snippet: string;
};
@ -108,7 +107,6 @@ function getPackageVersion(): string {
*/
async function buildInstructions(store: QMDStore): Promise<string> {
const status = await store.getStatus();
const contexts = await store.listContexts();
const globalCtx = await store.getGlobalContext();
const lines: string[] = [];
@ -117,15 +115,13 @@ async function buildInstructions(store: QMDStore): Promise<string> {
if (globalCtx) lines.push(`Context: ${globalCtx}`);
// --- What's searchable? ---
// Emit names only — the per-collection doc counts and descriptions can run to ~1.5 KB
// across a dozen collections, and the same info is available on demand via the `status` tool.
if (status.collections.length > 0) {
lines.push("");
lines.push("Collections (scope with `collection` parameter):");
for (const col of status.collections) {
// Find root context for this collection
const rootCtx = contexts.find(c => c.collection === col.name && (c.path === "" || c.path === "/"));
const desc = rootCtx ? `${rootCtx.context}` : "";
lines.push(` - "${col.name}" (${col.documents} docs)${desc}`);
}
const names = status.collections.map(c => c.name).join(", ");
lines.push(`Collections (scope with \`collections\` parameter): ${names}`);
lines.push("Call the `status` tool for collection descriptions, paths, and per-collection doc counts.");
}
// --- Capability gaps ---
@ -155,7 +151,7 @@ async function buildInstructions(store: QMDStore): Promise<string> {
// --- Retrieval workflow ---
lines.push("");
lines.push("Retrieval:");
lines.push(" - `get` — single document by path or docid (#abc123). Supports line offset (`file.md:100`).");
lines.push(" - `get` — single document by path or docid (#abc123). Supports a line-range suffix: `file.md:100` (from line 100) or `file.md:100:40` (40 lines from line 100).");
lines.push(" - `multi_get` — batch retrieve by glob (`journals/2025-05*.md`) or comma-separated list.");
// --- Non-obvious things that prevent mistakes ---
@ -244,6 +240,8 @@ async function createMcpServer(store: QMDStore): Promise<McpServer> {
title: "Query",
description: `Search the knowledge base using a query document — one or more typed sub-queries combined for best recall.
Each result includes a \`line\` field with the absolute 1-indexed line of the best match in the source markdown. To read more context around a hit, call \`get(file, fromLine = max(1, line - 20), maxLines = 80, lineNumbers = true)\`.
## Query Types
**lex** BM25 keyword search. Fast, exact, no LLM needed.
@ -333,6 +331,7 @@ Intent-aware lex (C++ performance, not sports):
collections: effectiveCollections.length > 0 ? effectiveCollections : undefined,
limit,
minScore,
candidateLimit,
rerank,
intent,
});
@ -343,13 +342,14 @@ Intent-aware lex (C++ performance, not sports):
|| searches[0]?.query || "";
const filtered: SearchResultItem[] = results.map(r => {
const { line, snippet } = extractSnippet(r.bestChunk, primaryQuery, 300, undefined, undefined, intent);
const { line, snippet } = extractSnippet(r.body, primaryQuery, 300, r.bestChunkPos, r.bestChunk.length, intent);
return {
docid: `#${r.docid}`,
file: r.displayPath,
title: r.title,
score: Math.round(r.score * 100) / 100,
context: r.context,
line,
snippet: addLineNumbers(snippet, line),
};
});
@ -372,21 +372,31 @@ Intent-aware lex (C++ performance, not sports):
description: "Retrieve the full content of a document by its file path or docid. Use paths or docids (#abc123) from search results. Suggests similar files if not found.",
annotations: { readOnlyHint: true, openWorldHint: false },
inputSchema: {
file: z.string().describe("File path or docid from search results (e.g., 'pages/meeting.md', '#abc123', or 'pages/meeting.md:100' to start at line 100)"),
file: z.string().describe("File path or docid from search results. Supports a line-range suffix: 'pages/meeting.md:100' starts at line 100; 'pages/meeting.md:100:40' (or '#abc123:100:40') reads 40 lines from line 100."),
fromLine: z.number().optional().describe("Start from this line number (1-indexed)"),
maxLines: z.number().optional().describe("Maximum number of lines to return"),
lineNumbers: z.boolean().optional().default(false).describe("Add line numbers to output (format: 'N: content')"),
lineNumbers: z.boolean().optional().default(true).describe("Add line numbers to output (format: 'N: content'). On by default; set false for raw content."),
},
},
async ({ file, fromLine, maxLines, lineNumbers }) => {
// Support :line suffix in `file` (e.g. "foo.md:120") when fromLine isn't provided
// Support :line and :from:count suffixes in `file` (e.g. "foo.md:120" or
// "foo.md:120:40"). Explicit fromLine/maxLines args take precedence.
let parsedFromLine = fromLine;
let parsedMaxLines = maxLines;
let lookup = file;
const colonMatch = lookup.match(/:(\d+)$/);
if (colonMatch && colonMatch[1] && parsedFromLine === undefined) {
parsedFromLine = parseInt(colonMatch[1], 10);
lookup = lookup.slice(0, -colonMatch[0].length);
const rangeMatch = lookup.match(/:(\d+):(\d+)$/);
if (rangeMatch) {
if (parsedFromLine === undefined) parsedFromLine = parseInt(rangeMatch[1]!, 10);
if (parsedMaxLines === undefined) parsedMaxLines = parseInt(rangeMatch[2]!, 10);
lookup = lookup.slice(0, -rangeMatch[0].length);
} else {
const colonMatch = lookup.match(/:(\d+)$/);
if (colonMatch && colonMatch[1] && parsedFromLine === undefined) {
parsedFromLine = parseInt(colonMatch[1], 10);
lookup = lookup.slice(0, -colonMatch[0].length);
}
}
if (parsedFromLine !== undefined) parsedFromLine = Math.max(1, parsedFromLine);
const result = await store.get(lookup, { includeBody: false });
@ -401,7 +411,7 @@ Intent-aware lex (C++ performance, not sports):
};
}
const body = await store.getDocumentBody(result.filepath, { fromLine: parsedFromLine, maxLines }) ?? "";
const body = await store.getDocumentBody(result.filepath, { fromLine: parsedFromLine, maxLines: parsedMaxLines }) ?? "";
let text = body;
if (lineNumbers) {
const startLine = parsedFromLine || 1;
@ -440,7 +450,7 @@ Intent-aware lex (C++ performance, not sports):
pattern: z.string().describe("Glob pattern or comma-separated list of file paths"),
maxLines: z.number().optional().describe("Maximum lines per file"),
maxBytes: z.number().optional().default(10240).describe("Skip files larger than this (default: 10240 = 10KB)"),
lineNumbers: z.boolean().optional().default(false).describe("Add line numbers to output (format: 'N: content')"),
lineNumbers: z.boolean().optional().default(true).describe("Add line numbers to output (format: 'N: content'). On by default; set false for raw content."),
},
},
async ({ pattern, maxLines, maxBytes, lineNumbers }) => {
@ -540,10 +550,20 @@ Intent-aware lex (C++ performance, not sports):
// Transport: stdio (default)
// =============================================================================
export async function startMcpServer(): Promise<void> {
export type McpStartupOptions = {
dbPath?: string;
};
export async function startMcpServer(options: McpStartupOptions = {}): Promise<void> {
// Opt into production mode when the MCP server is actually started, not
// when this module is merely imported for its exports. Importing the module
// at the top level flipped the global production flag and broke test
// isolation for downstream suites that expect the default (development)
// database path behaviour.
enableProductionMode();
const configPath = getConfigPath();
const store = await createStore({
dbPath: getDefaultDbPath(),
dbPath: options.dbPath ?? getDefaultDbPath(),
...(existsSync(configPath) ? { configPath } : {}),
});
const server = await createMcpServer(store);
@ -565,10 +585,17 @@ export type HttpServerHandle = {
* Start MCP server over Streamable HTTP (JSON responses, no SSE).
* Binds to localhost only. Returns a handle for shutdown and port discovery.
*/
export async function startMcpHttpServer(port: number, options?: { quiet?: boolean }): Promise<HttpServerHandle> {
export async function startMcpHttpServer(
port: number,
options: ({ quiet?: boolean } & McpStartupOptions) = {},
): Promise<HttpServerHandle> {
// See startMcpServer() for the rationale — flip production mode here so the
// HTTP transport resolves the real database path, without leaking state into
// callers that only import this module for its exports (e.g. tests).
enableProductionMode();
const configPath = getConfigPath();
const store = await createStore({
dbPath: getDefaultDbPath(),
dbPath: options.dbPath ?? getDefaultDbPath(),
...(existsSync(configPath) ? { configPath } : {}),
});
@ -608,9 +635,21 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
return new Date().toISOString().slice(11, 23); // HH:mm:ss.SSS
}
type JsonRpcLikeBody = {
method?: unknown;
params?: {
name?: unknown;
arguments?: Record<string, unknown>;
};
};
type RestSearchInput = {
type?: unknown;
query?: unknown;
};
/** Extract a human-readable label from a JSON-RPC body */
function describeRequest(body: any): string {
const method = body?.method ?? "unknown";
function describeRequest(body: JsonRpcLikeBody): string {
const method = typeof body.method === "string" ? body.method : "unknown";
if (method === "tools/call") {
const tool = body.params?.name ?? "?";
const args = body.params?.arguments;
@ -654,7 +693,7 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
// REST endpoint: POST /query (alias: /search) — structured search without MCP protocol
if ((pathname === "/query" || pathname === "/search") && nodeReq.method === "POST") {
const rawBody = await collectBody(nodeReq);
const params = JSON.parse(rawBody);
const params = JSON.parse(rawBody) as Record<string, unknown>;
// Validate required fields
if (!params.searches || !Array.isArray(params.searches)) {
@ -664,35 +703,39 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
}
// Map to internal format
const queries: ExpandedQuery[] = params.searches.map((s: any) => ({
const searches = params.searches as RestSearchInput[];
const queries: ExpandedQuery[] = searches.map((s) => ({
type: s.type as 'lex' | 'vec' | 'hyde',
query: String(s.query || ""),
}));
// Use default collections if none specified
const effectiveCollections = params.collections ?? defaultCollectionNames;
const effectiveCollections = Array.isArray(params.collections) ? params.collections.map(String) : defaultCollectionNames;
const results = await store.search({
queries,
collections: effectiveCollections.length > 0 ? effectiveCollections : undefined,
limit: params.limit ?? 10,
minScore: params.minScore ?? 0,
intent: params.intent,
limit: typeof params.limit === "number" ? params.limit : 10,
minScore: typeof params.minScore === "number" ? params.minScore : 0,
candidateLimit: typeof params.candidateLimit === "number" ? params.candidateLimit : undefined,
intent: typeof params.intent === "string" ? params.intent : undefined,
rerank: typeof params.rerank === "boolean" ? params.rerank : undefined,
});
// Use first lex or vec query for snippet extraction
const primaryQuery = params.searches.find((s: any) => s.type === 'lex')?.query
|| params.searches.find((s: any) => s.type === 'vec')?.query
|| params.searches[0]?.query || "";
const primaryQuery = searches.find((s) => s.type === 'lex')?.query
|| searches.find((s) => s.type === 'vec')?.query
|| searches[0]?.query || "";
const formatted = results.map(r => {
const { line, snippet } = extractSnippet(r.bestChunk, primaryQuery, 300);
const { line, snippet } = extractSnippet(r.body, String(primaryQuery), 300, r.bestChunkPos, r.bestChunk.length, typeof params.intent === "string" ? params.intent : undefined);
return {
docid: `#${r.docid}`,
file: r.displayPath,
file: `qmd://${encodeQmdPath(r.displayPath)}`,
title: r.title,
score: Math.round(r.score * 100) / 100,
context: r.context,
line,
snippet: addLineNumbers(snippet, line),
};
});

5
src/paths.ts Normal file
View File

@ -0,0 +1,5 @@
import { homedir as osHomedir } from "node:os";
export function qmdHomedir(): string {
return process.env.HOME || process.env.USERPROFILE || osHomedir() || "/tmp";
}

File diff suppressed because it is too large Load Diff

View File

@ -1,8 +1,19 @@
/**
* Test preload file to ensure proper cleanup of native resources.
*
* Uses bun:test afterAll to properly dispose of llama.cpp Metal
* resources before the process exits, avoiding GGML_ASSERT failures.
* Uses bun:test afterAll to dispose of llama.cpp Metal resources before
* the process exits necessary on darwin to avoid the upstream rsets
* destructor assertion (ggml-org/llama.cpp#22593, fix open as #22595).
*
* The runner-level mitigation `GGML_METAL_NO_RESIDENCY=1` must be set
* BEFORE bun/node starts (libggml-metal reads it via libc getenv at
* module load). Bun does not propagate `process.env` writes to libc
* setenv, so setting it from here would be a no-op for the native
* binding. The env var is injected by:
* - bin/qmd for production CLI runs
* - scripts/test-all.mjs for `npm test`
* - package.json test:bun / test:unit scripts for direct invocation
* See CLAUDE.md for invoking `bun test` manually on darwin.
*/
import { afterAll } from "bun:test";
import { disposeDefaultLlamaCpp } from "./llm";

4
src/types/picomatch.d.ts vendored Normal file
View File

@ -0,0 +1,4 @@
declare module "picomatch" {
export type Matcher = (input: string) => boolean;
export default function picomatch(pattern: string | string[], options?: Record<string, unknown>): Matcher;
}

View File

@ -13,10 +13,12 @@ ENV PATH="/root/.local/bin:$PATH"
# Pre-install node and bun
RUN mise use -g node@latest bun@latest
# Copy the packed tarball and install via both package managers
COPY tobilu-qmd-*.tgz /tmp/
RUN mise exec node@latest -- npm install -g /tmp/tobilu-qmd-*.tgz
RUN mise exec bun@latest -- bun install -g /tmp/tobilu-qmd-*.tgz
# Copy the packed tarball and install via both package managers. Keep a stable
# tarball path for npm-exec/npx-style smoke scenarios.
COPY tobilu-qmd-*.tgz /tmp/qmd-package.tgz
RUN cp /tmp/qmd-package.tgz /tmp/tobilu-qmd.tgz
RUN mise exec node@latest -- npm install -g /tmp/qmd-package.tgz
RUN mise exec bun@latest -- bun install -g /tmp/qmd-package.tgz
# Copy test project (src + test + configs) and install deps
COPY test-src/ /opt/qmd/

View File

@ -6,7 +6,7 @@
*/
import { describe, test, expect } from "vitest";
import { detectLanguage, getASTBreakPoints, extractSymbols } from "../src/ast.js";
import { detectLanguage, getASTBreakPoints, extractSymbols, formatGrammarLoadError } from "../src/ast.js";
import type { SupportedLanguage } from "../src/ast.js";
// =============================================================================
@ -315,6 +315,16 @@ describe("getASTBreakPoints - error handling", () => {
// Should either return some partial break points or empty array — not throw
expect(Array.isArray(points)).toBe(true);
});
test("explains missing grammar packages with a repair command", () => {
const msg = formatGrammarLoadError(
"typescript",
new Error("Cannot find module 'tree-sitter-typescript/tree-sitter-typescript.wasm'"),
);
expect(msg).toContain("tree-sitter-typescript");
expect(msg).toContain("bun add tree-sitter-typescript@0.23.2");
expect(msg).toContain("falling back to regex");
});
});
// =============================================================================

View File

@ -99,6 +99,20 @@ describe("scoreResults", () => {
expect(result.mrr).toBeCloseTo(0.5); // 1/2
});
test("reports recall@1/3/5 and matched documents", () => {
const result = scoreResults(
["x.md", "qmd://concepts/a.md", "docs/b.md", "docs/c.md", "docs/d.md"],
["concepts/a.md", "b.md", "missing.md"],
3,
);
expect(result.recall_at_1).toBe(0);
expect(result.recall_at_3).toBeCloseTo(2 / 3);
expect(result.recall_at_5).toBeCloseTo(2 / 3);
expect(result.matched_files).toEqual(["concepts/a.md", "b.md"]);
expect(result.unmatched_expected_files).toEqual(["missing.md"]);
});
test("empty results", () => {
const result = scoreResults([], ["a.md"], 1);
expect(result.precision_at_k).toBe(0);

263
test/bin-wrapper.test.ts Normal file
View File

@ -0,0 +1,263 @@
import { afterEach, describe, expect, test } from "vitest";
import { chmodSync, copyFileSync, mkdtempSync, mkdirSync, readFileSync, realpathSync, rmSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, relative } from "node:path";
import { execFileSync, spawnSync } from "node:child_process";
import { fileURLToPath } from "node:url";
const repoRoot = fileURLToPath(new URL("..", import.meta.url));
const fixtures: string[] = [];
function makeTempFixture() {
const root = mkdtempSync(join(tmpdir(), "qmd-bin-wrapper-"));
fixtures.push(root);
const capturePath = join(root, "capture.txt");
const runtimeBin = join(root, "runtime-bin");
mkdirSync(runtimeBin, { recursive: true });
for (const runtime of ["node", "bun"]) {
const runtimePath = join(runtimeBin, runtime);
if (runtime === "node") {
writeFileSync(
runtimePath,
`#!/bin/sh
if [ "$(basename "$1")" = "qmd" ]; then
exec "${process.execPath}" "$@"
else
{
printf '%s\\n' 'node'
printf '%s\\n' "$1"
shift
printf '%s\\n' "$@"
} > "$QMD_WRAPPER_CAPTURE"
fi
`,
);
} else {
writeFileSync(
runtimePath,
`#!/bin/sh\n{\n printf '%s\\n' '${runtime}'\n printf '%s\\n' "$1"\n shift\n printf '%s\\n' "$@"\n} > "$QMD_WRAPPER_CAPTURE"\n`,
);
}
chmodSync(runtimePath, 0o755);
}
return { root, capturePath, runtimeBin };
}
function makePackage(root: string, packagePath: string, lockfiles: string[] = [], options: { dist?: boolean; source?: boolean; tsx?: boolean; git?: boolean } = {}) {
const packageRoot = join(root, packagePath);
const includeDist = options.dist ?? true;
mkdirSync(join(packageRoot, "bin"), { recursive: true });
copyFileSync(join(repoRoot, "bin", "qmd"), join(packageRoot, "bin", "qmd"));
chmodSync(join(packageRoot, "bin", "qmd"), 0o755);
if (includeDist) {
mkdirSync(join(packageRoot, "dist", "cli"), { recursive: true });
writeFileSync(join(packageRoot, "dist", "cli", "qmd.js"), "// fixture\n");
}
if (options.source) {
mkdirSync(join(packageRoot, "src", "cli"), { recursive: true });
writeFileSync(join(packageRoot, "src", "cli", "qmd.ts"), "// source fixture\n");
}
if (options.tsx) {
mkdirSync(join(packageRoot, "node_modules", "tsx", "dist"), { recursive: true });
writeFileSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs"), "// tsx fixture\n");
}
if (options.git) {
mkdirSync(join(packageRoot, ".git"), { recursive: true });
}
for (const lockfile of lockfiles) {
writeFileSync(join(packageRoot, lockfile), "");
}
return packageRoot;
}
function symlinkRelative(target: string, linkPath: string) {
mkdirSync(dirname(linkPath), { recursive: true });
symlinkSync(relative(dirname(linkPath), target), linkPath);
}
function runWrapper(commandPath: string, runtimeBin: string, capturePath: string, env: Record<string, string> = {}) {
rmSync(capturePath, { force: true });
execFileSync(commandPath, ["--version"], {
env: {
...process.env,
...env,
PATH: `${runtimeBin}:${process.env.PATH ?? ""}`,
QMD_WRAPPER_CAPTURE: capturePath,
},
stdio: ["ignore", "pipe", "pipe"],
});
const [runtime, scriptPath, ...args] = readFileSync(capturePath, "utf8").trimEnd().split("\n");
return { runtime, scriptPath, args };
}
afterEach(() => {
for (const fixture of fixtures.splice(0)) {
rmSync(fixture, { recursive: true, force: true });
}
});
describe("bin/qmd package wrapper", () => {
test("direct package invocation resolves dist/cli/qmd.js from the package root", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "node_modules/@tobilu/qmd");
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
expect(result.args).toEqual(["--version"]);
});
test("npm/Homebrew global bin symlink resolves scoped package path", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);
const result = runWrapper(globalBin, runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("multi-hop global bin symlink chain resolves to the real package root", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
const shim = join(root, "opt", "homebrew", "Cellar", "qmd", "current", "bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), shim);
symlinkRelative(shim, globalBin);
const result = runWrapper(globalBin, runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("linuxbrew global bin symlink resolves lib/node_modules scoped package path", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "home/linuxbrew/.linuxbrew/lib/node_modules/@tobilu/qmd");
const globalBin = join(root, "home", "linuxbrew", ".linuxbrew", "bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);
const result = runWrapper(globalBin, runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("npx scoped package .bin symlink resolves @tobilu/qmd package path", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "npm/_npx/abc123/node_modules/@tobilu/qmd");
const npxBin = join(root, "npm", "_npx", "abc123", "node_modules", ".bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), npxBin);
const result = runWrapper(npxBin, runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("bun global symlink uses bun when package-local bun lockfile exists", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "home/user/.bun/install/global/node_modules/@tobilu/qmd", ["bun.lock"]);
const bunBin = join(root, "home", "user", ".bun", "bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), bunBin);
const result = runWrapper(bunBin, runtimeBin, capturePath);
expect(result.runtime).toBe("bun");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("ambient BUN_INSTALL alone does not select bun for an npm-installed package", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);
const result = runWrapper(globalBin, runtimeBin, capturePath, { BUN_INSTALL: join(root, ".bun") });
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("package-lock.json takes priority over bun lockfiles", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "node_modules/@tobilu/qmd", ["package-lock.json", "bun.lock"]);
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("packaged tree uses dist even if source files are present", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "node_modules/@tobilu/qmd", ["bun.lock"], { source: true });
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("bun");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
});
test("prefers source with bun in a Bun checkout even when dist exists", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "qmd", ["bun.lock"], { source: true, git: true });
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("bun");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "src", "cli", "qmd.ts")));
expect(result.args).toEqual(["--version"]);
});
test("prefers source through tsx in a Node checkout even when dist exists", () => {
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "qmd", [], { source: true, tsx: true, git: true });
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs")));
expect(result.args).toEqual([realpathSync(join(packageRoot, "src", "cli", "qmd.ts")), "--version"]);
});
test("source checkout with both bun.lock and package-lock.json prefers node+tsx", () => {
// Mirrors the dist-mode "npm priority" rule: a working tree that has both
// lockfiles (because the user ran `npm install` against a repo that also
// ships bun.lock) installed native modules for Node's ABI, so source mode
// must route through tsx to avoid better-sqlite3 / sqlite-vec mismatches.
const { root, runtimeBin, capturePath } = makeTempFixture();
const packageRoot = makePackage(root, "qmd", ["bun.lock", "package-lock.json"], { source: true, tsx: true, git: true });
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
expect(result.runtime).toBe("node");
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs")));
expect(result.args).toEqual([realpathSync(join(packageRoot, "src", "cli", "qmd.ts")), "--version"]);
});
test("explains how to build when dist is missing and source cannot run", () => {
const { root, runtimeBin } = makeTempFixture();
const packageRoot = makePackage(root, "qmd", [], { dist: false });
const result = spawnSync(join(packageRoot, "bin", "qmd"), ["--version"], {
env: {
...process.env,
PATH: `${runtimeBin}:${process.env.PATH ?? ""}`,
},
encoding: "utf8",
stdio: ["ignore", "pipe", "pipe"],
});
expect(result.status).toBe(1);
expect(result.stderr).toContain("qmd is not built");
expect(result.stderr).toContain("bun install && bun run build");
expect(result.stderr).toContain("npm install && npm run build");
expect(result.stderr).toContain("qmd doctor");
});
});

View File

@ -0,0 +1,128 @@
import { describe, expect, test } from "vitest";
import { finishSuccessfulCliCommand } from "../src/cli/qmd.ts";
import { LlamaCpp, isDarwinMetalMitigationActive } from "../src/llm.ts";
describe("CLI successful-exit lifecycle", () => {
test("exits 0 after successful output when post-output LLM cleanup fails", async () => {
const exitCodes: number[] = [];
const stderr: string[] = [];
const flushed: string[] = [];
await finishSuccessfulCliCommand({
command: "query",
format: "json",
cleanup: async () => {
throw new Error("ggml_metal_device_free abort simulation");
},
exit: (code) => {
exitCodes.push(code);
},
stdout: { write: (chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { flushed.push(String(chunk)); cb?.(); return true; } },
stderr: { write: (chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { stderr.push(String(chunk)); cb?.(); return true; } },
});
expect(exitCodes).toEqual([0]);
expect(stderr.join("")).toContain("QMD Warning: cleanup after successful output failed");
expect(flushed).toEqual([""]);
});
test("flushes stdout, runs cleanup, flushes stderr, then exits (when exit is provided)", async () => {
// The legacy lifecycle order is preserved for callers that pass an
// explicit `exit` function — primarily this test, which needs an
// observable terminating step.
const calls: string[] = [];
await finishSuccessfulCliCommand({
command: "query",
format: "json",
cleanup: async () => { calls.push("cleanup"); },
exit: (code) => { calls.push(`exit:${code}`); },
stdout: { write: (_chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stdout-flush"); cb?.(); return true; } },
stderr: { write: (_chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stderr-flush"); cb?.(); return true; } },
});
expect(calls).toEqual(["stdout-flush", "cleanup", "stderr-flush", "exit:0"]);
});
test("production path: sets process.exitCode=0 and returns instead of calling process.exit", async () => {
// The real CLI does NOT pass `exit` — finishSuccessfulCliCommand should set
// process.exitCode and return, letting Node's `beforeExit` fire so
// node-llama-cpp's auto-dispose runs BEFORE libc's static destructor.
// process.exit() skips `beforeExit`, which is what trips the libggml-metal
// assertion (ggml-org/llama.cpp#22593) even with explicit dispose.
const prevCode = process.exitCode;
process.exitCode = 1; // poison the state to verify we set it
try {
const calls: string[] = [];
await finishSuccessfulCliCommand({
command: "query",
format: "json",
cleanup: async () => { calls.push("cleanup"); },
stdout: { write: (_c: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stdout-flush"); cb?.(); return true; } },
stderr: { write: (_c: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stderr-flush"); cb?.(); return true; } },
});
expect(calls).toEqual(["stdout-flush", "cleanup", "stderr-flush"]);
expect(process.exitCode).toBe(0);
} finally {
process.exitCode = prevCode;
}
});
test("darwin Metal mitigation reflects launcher-exported env on darwin", () => {
// The real mitigation lives in bin/qmd, which sets GGML_METAL_NO_RESIDENCY=1
// before Node loads the llama.cpp native binding. The JS-side predicate
// just reports whether that env was set (and not overridden by
// QMD_METAL_KEEP_RESIDENCY). On non-darwin the function returns false.
const expected =
process.platform === "darwin" &&
process.env.QMD_METAL_KEEP_RESIDENCY !== "1" &&
process.env.GGML_METAL_NO_RESIDENCY === "1";
expect(isDarwinMetalMitigationActive()).toBe(expected);
});
test("QMD_METAL_KEEP_RESIDENCY=1 disables the mitigation even when GGML_METAL_NO_RESIDENCY is set", () => {
const prevKeep = process.env.QMD_METAL_KEEP_RESIDENCY;
const prevNoRes = process.env.GGML_METAL_NO_RESIDENCY;
try {
process.env.QMD_METAL_KEEP_RESIDENCY = "1";
process.env.GGML_METAL_NO_RESIDENCY = "1";
expect(isDarwinMetalMitigationActive()).toBe(false);
} finally {
if (prevKeep === undefined) delete process.env.QMD_METAL_KEEP_RESIDENCY;
else process.env.QMD_METAL_KEEP_RESIDENCY = prevKeep;
if (prevNoRes === undefined) delete process.env.GGML_METAL_NO_RESIDENCY;
else process.env.GGML_METAL_NO_RESIDENCY = prevNoRes;
}
});
test("disposes Llama resources in dependency order before CLI exit", async () => {
const calls: string[] = [];
const llm = new LlamaCpp({ inactivityTimeoutMs: 0 });
const disposable = (name: string) => ({
dispose: async () => {
calls.push(name);
},
});
Object.assign(llm as unknown as Record<string, unknown>, {
embedContexts: [disposable("embed-context")],
rerankContexts: [disposable("rerank-context")],
embedModel: disposable("embed-model"),
generateModel: disposable("generate-model"),
rerankModel: disposable("rerank-model"),
llama: disposable("llama"),
});
await llm.dispose();
expect(calls).toEqual([
"embed-context",
"rerank-context",
"embed-model",
"generate-model",
"rerank-model",
"llama",
]);
});
});

View File

@ -0,0 +1,20 @@
import { describe, expect, test } from "vitest";
import { readFileSync } from "fs";
import { join } from "path";
describe("LLM module loading", () => {
test("node-llama-cpp is only dynamically imported by LLM operations", () => {
const source = readFileSync(join(process.cwd(), "src", "llm.ts"), "utf-8");
expect(source).not.toMatch(/import\s+(?!type\b)[\s\S]*?from\s+["']node-llama-cpp["']/);
expect(source).toContain('import("node-llama-cpp")');
});
test("importing the CLI for lightweight commands succeeds", async () => {
const mod = await import("../src/cli/qmd.ts");
expect(mod).toMatchObject({
buildEditorUri: expect.any(Function),
termLink: expect.any(Function),
});
});
});

File diff suppressed because it is too large Load Diff

View File

@ -6,15 +6,19 @@
*/
import { describe, test, expect, beforeEach, afterEach } from "vitest";
import { mkdtemp, rm, writeFile } from "fs/promises";
import { tmpdir } from "os";
import { join } from "path";
import { homedir } from "os";
import { getConfigPath, setConfigIndexName } from "../src/collections.js";
import { qmdHomedir } from "../src/paths.js";
import { getConfigPath, loadConfig, setConfigIndexName } from "../src/collections.js";
// Save/restore env vars around each test
let savedEnv: Record<string, string | undefined>;
beforeEach(() => {
savedEnv = {
HOME: process.env.HOME,
USERPROFILE: process.env.USERPROFILE,
QMD_CONFIG_DIR: process.env.QMD_CONFIG_DIR,
XDG_CONFIG_HOME: process.env.XDG_CONFIG_HOME,
};
@ -38,7 +42,16 @@ describe("getConfigDir via getConfigPath", () => {
test("defaults to ~/.config/qmd when no env vars are set", () => {
delete process.env.QMD_CONFIG_DIR;
delete process.env.XDG_CONFIG_HOME;
expect(getConfigPath()).toBe(join(homedir(), ".config", "qmd", "index.yml"));
expect(getConfigPath()).toBe(join(qmdHomedir(), ".config", "qmd", "index.yml"));
});
test("uses the same USERPROFILE fallback as default DB path when HOME is unset", () => {
delete process.env.HOME;
delete process.env.QMD_CONFIG_DIR;
delete process.env.XDG_CONFIG_HOME;
process.env.USERPROFILE = "/Users/windows-user";
expect(getConfigPath()).toBe(join("/Users/windows-user", ".config", "qmd", "index.yml"));
});
test("QMD_CONFIG_DIR takes highest priority", () => {
@ -71,4 +84,15 @@ describe("getConfigDir via getConfigPath", () => {
setConfigIndexName("myindex");
expect(getConfigPath()).toBe(join("/xdg/config", "qmd", "myindex.yml"));
});
test("loadConfig treats an empty YAML file as an empty config", async () => {
const dir = await mkdtemp(join(tmpdir(), "qmd-empty-config-"));
try {
process.env.QMD_CONFIG_DIR = dir;
await writeFile(join(dir, "index.yml"), "");
expect(loadConfig()).toEqual({ collections: {} });
} finally {
await rm(dir, { recursive: true, force: true });
}
});
});

View File

@ -0,0 +1,27 @@
import { describe, expect, test } from "vitest";
import { execFileSync } from "child_process";
import { mkdtempSync } from "fs";
import { tmpdir } from "os";
import { dirname, join, resolve } from "path";
import { fileURLToPath } from "url";
const repoRoot = resolve(dirname(fileURLToPath(import.meta.url)), "..");
describe("Node ESM entrypoints", () => {
test("CLI --index path normalizes via setIndexName/setConfigIndexName under Node 22+", () => {
execFileSync(process.execPath, ["scripts/build.mjs"], {
cwd: repoRoot,
encoding: "utf-8",
stdio: "pipe",
});
const indexPath = join(mkdtempSync(join(tmpdir(), "qmd-index-")), "nested", "idx");
const output = execFileSync(process.execPath, ["dist/cli/qmd.js", "--index", indexPath, "--version"], {
cwd: repoRoot,
encoding: "utf-8",
stdio: "pipe",
});
expect(output).toContain("qmd ");
}, 120_000);
});

View File

@ -12,6 +12,14 @@ import {
getDefaultLlamaCpp,
disposeDefaultLlamaCpp,
resolveLlamaGpuMode,
setNodeLlamaCppModuleForTest,
withNativeStdoutRedirectedToStderr,
resolveParallelismOverride,
resolveSafeParallelism,
resolveEmbedModel,
resolveGenerateModel,
resolveRerankModel,
resolveModels,
withLLMSession,
canUnloadLLM,
SessionReleasedError,
@ -19,6 +27,63 @@ import {
type ILLMSession,
} from "../src/llm.js";
describe("model name resolution", () => {
function withModelEnv(env: Record<string, string | undefined>, fn: () => void): void {
const previous = {
QMD_EMBED_MODEL: process.env.QMD_EMBED_MODEL,
QMD_GENERATE_MODEL: process.env.QMD_GENERATE_MODEL,
QMD_RERANK_MODEL: process.env.QMD_RERANK_MODEL,
};
try {
for (const [key, value] of Object.entries(env)) {
if (value === undefined) delete process.env[key];
else process.env[key] = value;
}
fn();
} finally {
for (const [key, value] of Object.entries(previous)) {
if (value === undefined) delete process.env[key];
else process.env[key] = value;
}
}
}
test("all model roles resolve config hints before env fallbacks", () => {
withModelEnv({
QMD_EMBED_MODEL: "env-embed",
QMD_GENERATE_MODEL: "env-generate",
QMD_RERANK_MODEL: "env-rerank",
}, () => {
const config = {
embed: "config-embed",
generate: "config-generate",
rerank: "config-rerank",
};
expect(resolveEmbedModel(config)).toBe("config-embed");
expect(resolveGenerateModel(config)).toBe("config-generate");
expect(resolveRerankModel(config)).toBe("config-rerank");
expect(resolveModels(config)).toEqual(config);
});
});
test("LlamaCpp constructor uses the same resolver as status/embed/query helpers", () => {
withModelEnv({
QMD_EMBED_MODEL: "env-embed",
QMD_GENERATE_MODEL: "env-generate",
QMD_RERANK_MODEL: "env-rerank",
}, () => {
const llm = new LlamaCpp({
embedModel: "config-embed",
generateModel: "config-generate",
rerankModel: "config-rerank",
});
expect(llm.embedModelName).toBe(resolveEmbedModel({ embed: "config-embed" }));
expect(llm.generateModelName).toBe(resolveGenerateModel({ generate: "config-generate" }));
expect(llm.rerankModelName).toBe(resolveRerankModel({ rerank: "config-rerank" }));
});
});
});
// =============================================================================
// Singleton Tests (no model loading required)
// =============================================================================
@ -75,6 +140,29 @@ describe("QMD_LLAMA_GPU resolution", () => {
expect(resolveLlamaGpuMode(" cuda ")).toBe("cuda");
});
test("QMD_FORCE_CPU disables GPU before QMD_LLAMA_GPU auto-detection", () => {
const prevForceCpu = process.env.QMD_FORCE_CPU;
process.env.QMD_FORCE_CPU = "1";
try {
expect(resolveLlamaGpuMode(undefined)).toBe(false);
expect(resolveLlamaGpuMode("cuda")).toBe(false);
} finally {
if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
else process.env.QMD_FORCE_CPU = prevForceCpu;
}
});
test("QMD_FORCE_CPU ignores false-ish values", () => {
const prevForceCpu = process.env.QMD_FORCE_CPU;
process.env.QMD_FORCE_CPU = "0";
try {
expect(resolveLlamaGpuMode(undefined)).toBe("auto");
} finally {
if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
else process.env.QMD_FORCE_CPU = prevForceCpu;
}
});
test("warns and falls back to auto for unsupported values", () => {
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
try {
@ -87,6 +175,201 @@ describe("QMD_LLAMA_GPU resolution", () => {
});
});
describe("native llama stdout containment", () => {
test("redirects native stdout noise to stderr while JSON callers are initializing llama", async () => {
const stdoutSpy = vi.spyOn(process.stdout, "write").mockReturnValue(true);
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
try {
await withNativeStdoutRedirectedToStderr(async () => {
process.stdout.write("cmake build spam\n");
return "ok";
});
expect(stdoutSpy).not.toHaveBeenCalled();
expect(stderrSpy).toHaveBeenCalledWith("cmake build spam\n", undefined, undefined);
} finally {
stdoutSpy.mockRestore();
stderrSpy.mockRestore();
}
});
test("keeps native GPU failure noise off stdout and caches failed GPU init", async () => {
const prevGpu = process.env.QMD_LLAMA_GPU;
const prevForceCpu = process.env.QMD_FORCE_CPU;
process.env.QMD_LLAMA_GPU = "cuda";
delete process.env.QMD_FORCE_CPU;
const calls: unknown[] = [];
const fakeLlama = { gpu: false, cpuMathCores: 4 };
setNodeLlamaCppModuleForTest({
LlamaLogLevel: { error: "error" },
resolveModelFile: vi.fn(),
LlamaChatSession: vi.fn() as any,
getLlama: vi.fn(async (options: Record<string, unknown>) => {
calls.push(options.gpu);
if (options.gpu === "cuda") {
process.stdout.write("cmake build spam\n");
throw new Error("CUDA unavailable");
}
return fakeLlama as any;
}),
});
const stdoutSpy = vi.spyOn(process.stdout, "write").mockReturnValue(true);
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
try {
const first = new LlamaCpp();
const second = new LlamaCpp();
await (first as any).ensureLlama();
await (second as any).ensureLlama();
expect(stdoutSpy).not.toHaveBeenCalled();
expect(stderrSpy).toHaveBeenCalledWith("cmake build spam\n", undefined, undefined);
expect(calls).toEqual(["cuda", false, false]);
expect(String(stderrSpy.mock.calls.map(call => call[0]).join(""))).toContain("skipping previously failed GPU init");
} finally {
stdoutSpy.mockRestore();
stderrSpy.mockRestore();
setNodeLlamaCppModuleForTest(null);
if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
else process.env.QMD_LLAMA_GPU = prevGpu;
if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
else process.env.QMD_FORCE_CPU = prevForceCpu;
}
});
test("warns about CPU fallback only once per process", async () => {
const prevGpu = process.env.QMD_LLAMA_GPU;
const prevForceCpu = process.env.QMD_FORCE_CPU;
process.env.QMD_LLAMA_GPU = "false";
delete process.env.QMD_FORCE_CPU;
setNodeLlamaCppModuleForTest({
LlamaLogLevel: { error: "error" },
resolveModelFile: vi.fn(),
LlamaChatSession: vi.fn() as any,
getLlama: vi.fn(async () => ({ gpu: false, cpuMathCores: 4 }) as any),
});
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
try {
const first = new LlamaCpp();
const second = new LlamaCpp();
await (first as any).ensureLlama();
await (second as any).ensureLlama();
const stderr = String(stderrSpy.mock.calls.map(call => call[0]).join(""));
expect(stderr.match(/no GPU acceleration/g)?.length).toBe(1);
expect(stderr).toContain("qmd doctor");
expect(stderr).not.toContain("QMD_STATUS_DEVICE_PROBE");
} finally {
stderrSpy.mockRestore();
setNodeLlamaCppModuleForTest(null);
if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
else process.env.QMD_LLAMA_GPU = prevGpu;
if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
else process.env.QMD_FORCE_CPU = prevForceCpu;
}
});
test("embeds hello world with QMD_FORCE_CPU=1 without throwing", async () => {
const prevGpu = process.env.QMD_LLAMA_GPU;
const prevForceCpu = process.env.QMD_FORCE_CPU;
process.env.QMD_FORCE_CPU = "1";
process.env.QMD_LLAMA_GPU = "metal";
const getEmbeddingFor = vi.fn(async (text: string) => ({
vector: new Float32Array([0.1, 0.2, 0.3]),
text,
}));
const createEmbeddingContext = vi.fn(async () => ({
getEmbeddingFor,
dispose: vi.fn(async () => {}),
}));
const loadModel = vi.fn(async () => ({
trainContextSize: 2048,
tokenize: (text: string) => Array.from(text),
detokenize: (tokens: string[]) => tokens.join(""),
createEmbeddingContext,
dispose: vi.fn(async () => {}),
}));
const getLlama = vi.fn(async (options: Record<string, unknown>) => ({
gpu: false,
cpuMathCores: 4,
loadModel,
dispose: vi.fn(async () => {}),
}) as any);
setNodeLlamaCppModuleForTest({
LlamaLogLevel: { error: "error" },
resolveModelFile: vi.fn(async () => "/tmp/nonexistent-model.gguf"),
LlamaChatSession: vi.fn() as any,
getLlama,
});
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
const llm = new LlamaCpp();
try {
const result = await llm.embed("hello world");
expect(result).toEqual({
embedding: [0.10000000149011612, 0.20000000298023224, 0.30000001192092896],
model: llm.embedModelName,
});
expect(getLlama).toHaveBeenCalledWith(expect.objectContaining({ gpu: false, build: "never" }));
expect(loadModel).toHaveBeenCalledWith(expect.objectContaining({ gpuLayers: 0 }));
expect(getEmbeddingFor).toHaveBeenCalledWith("hello world");
} finally {
await llm.dispose();
stderrSpy.mockRestore();
setNodeLlamaCppModuleForTest(null);
if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
else process.env.QMD_LLAMA_GPU = prevGpu;
if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
else process.env.QMD_FORCE_CPU = prevForceCpu;
}
});
});
describe("LLM context parallelism safety", () => {
test("defaults Windows CUDA to one context to avoid ggml-cuda.cu:98 crashes", () => {
expect(resolveSafeParallelism({
gpu: "cuda",
platform: "win32",
computed: 8,
envValue: undefined,
})).toBe(1);
});
test("keeps non-Windows and non-CUDA backends on computed parallelism", () => {
expect(resolveSafeParallelism({ gpu: "cuda", platform: "linux", computed: 8 })).toBe(8);
expect(resolveSafeParallelism({ gpu: "vulkan", platform: "win32", computed: 8 })).toBe(8);
expect(resolveSafeParallelism({ gpu: false, platform: "win32", computed: 4 })).toBe(4);
});
test("QMD_EMBED_PARALLELISM overrides the Windows CUDA safety default", () => {
expect(resolveSafeParallelism({
gpu: "cuda",
platform: "win32",
computed: 8,
envValue: "2",
})).toBe(2);
});
test("QMD_EMBED_PARALLELISM clamps invalid values and warns", () => {
const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
try {
expect(resolveParallelismOverride("0")).toBeUndefined();
expect(resolveParallelismOverride("bad")).toBeUndefined();
expect(stderrSpy).toHaveBeenCalledTimes(2);
expect(String(stderrSpy.mock.calls[0]?.[0] || "")).toContain("QMD_EMBED_PARALLELISM");
} finally {
stderrSpy.mockRestore();
}
});
});
describe("LlamaCpp expand context size config", () => {
const defaultExpandContextSize = 2048;
@ -820,7 +1103,7 @@ describe.skipIf(!!process.env.CI)("LlamaCpp Integration", () => {
for (const doc of result.results) {
console.log(` ${doc.file}: ${doc.score.toFixed(4)}`);
}
});
}, 30000);
});
describe("expandQuery", () => {

98
test/local-config.test.ts Normal file
View File

@ -0,0 +1,98 @@
import { existsSync, mkdtempSync, mkdirSync, writeFileSync, rmSync, realpathSync } from "node:fs";
import { execFileSync } from "node:child_process";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { afterEach, describe, expect, test } from "vitest";
import { findLocalConfigPath, getLocalDbPath } from "../src/collections.js";
function cliCommandArgs(command: string): { bin: string; args: string[] } {
const cliPath = join(process.cwd(), "src/cli/qmd.ts");
if (process.versions.bun) {
return { bin: process.execPath, args: [cliPath, command] };
}
return {
bin: process.execPath,
args: [join(process.cwd(), "node_modules/tsx/dist/cli.mjs"), cliPath, command],
};
}
const roots: string[] = [];
function tempProject(): string {
const root = mkdtempSync(join(tmpdir(), "qmd-local-config-"));
roots.push(root);
return root;
}
afterEach(() => {
for (const root of roots.splice(0)) {
rmSync(root, { recursive: true, force: true });
}
});
describe("local .qmd project config", () => {
test("finds .qmd/index.yaml from nested working directories", () => {
const root = tempProject();
const configPath = join(root, ".qmd", "index.yaml");
mkdirSync(join(root, ".qmd"), { recursive: true });
writeFileSync(configPath, "collections: {}\n");
const nested = join(root, "wiki", "Shopify");
mkdirSync(nested, { recursive: true });
expect(findLocalConfigPath(nested)).toBe(configPath);
});
test("prefers index.yaml over index.yml when both exist", () => {
const root = tempProject();
mkdirSync(join(root, ".qmd"), { recursive: true });
const yaml = join(root, ".qmd", "index.yaml");
const yml = join(root, ".qmd", "index.yml");
writeFileSync(yaml, "collections: {}\n");
writeFileSync(yml, "collections: {}\n");
expect(findLocalConfigPath(root)).toBe(yaml);
});
test("uses .qmd/index.sqlite next to the local config", () => {
const root = tempProject();
mkdirSync(join(root, ".qmd"), { recursive: true });
const configPath = join(root, ".qmd", "index.yaml");
writeFileSync(configPath, "collections: {}\n");
expect(getLocalDbPath(configPath)).toBe(join(root, ".qmd", "index.sqlite"));
});
test("CLI uses local .qmd config and index instead of global cache", () => {
const root = tempProject();
mkdirSync(join(root, ".qmd"), { recursive: true });
mkdirSync(join(root, "docs"), { recursive: true });
writeFileSync(join(root, "docs", "a.md"), "# A\n\nLocal test document.\n");
writeFileSync(join(root, ".qmd", "index.yaml"), `collections:\n docs:\n path: ${JSON.stringify(join(root, "docs"))}\n pattern: "**/*.md"\n context:\n /: Local test docs\nmodels:\n embed: local-embed-model\n rerank: local-rerank-model\n generate: local-generate-model\n`);
const home = join(root, "home");
const { bin, args } = cliCommandArgs("status");
const output = execFileSync(bin, args, {
cwd: root,
encoding: "utf-8",
env: {
...process.env,
HOME: home,
XDG_CONFIG_HOME: join(home, ".config"),
XDG_CACHE_HOME: join(home, ".cache"),
QMD_EMBED_MODEL: "env-embed-model",
QMD_RERANK_MODEL: "env-rerank-model",
QMD_GENERATE_MODEL: "env-generate-model",
},
});
const localIndex = join(root, ".qmd", "index.sqlite");
expect(output).toContain(`Index: ${realpathSync(localIndex)}`);
expect(output).toContain("docs (qmd://docs/)");
expect(output).toContain("Embedding: local-embed-model");
expect(output).toContain("Reranking: local-rerank-model");
expect(output).toContain("Generation: local-generate-model");
expect(output).not.toContain("env-embed-model");
expect(existsSync(localIndex)).toBe(true);
expect(existsSync(join(home, ".cache", "qmd", "index.sqlite"))).toBe(false);
});
});

View File

@ -80,6 +80,7 @@ function initTestDatabase(db: Database): void {
seq INTEGER NOT NULL DEFAULT 0,
pos INTEGER NOT NULL DEFAULT 0,
model TEXT NOT NULL,
embed_fingerprint TEXT NOT NULL DEFAULT '',
embedded_at TEXT NOT NULL,
PRIMARY KEY (hash, seq)
)
@ -186,7 +187,7 @@ function seedTestData(db: Database): void {
for (let i = 0; i < 768; i++) embedding[i] = Math.random();
for (const doc of docs.slice(0, 4)) { // Skip large file for embeddings
db.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, embedded_at) VALUES (?, 0, 0, 'embeddinggemma', ?)`).run(doc.hash, now);
db.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, embed_fingerprint, embedded_at) VALUES (?, 0, 0, ?, ?, ?)`).run(doc.hash, DEFAULT_EMBED_MODEL, getEmbeddingFingerprint(DEFAULT_EMBED_MODEL), now);
db.prepare(`INSERT INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${doc.hash}_0`, embedding);
}
}
@ -211,6 +212,7 @@ import {
findDocuments,
getStatus,
DEFAULT_EMBED_MODEL,
getEmbeddingFingerprint,
DEFAULT_QUERY_MODEL,
DEFAULT_RERANK_MODEL,
DEFAULT_MULTI_GET_MAX_BYTES,
@ -887,6 +889,33 @@ describe("MCP Server", () => {
expect(typeof col.documents).toBe("number");
}
});
test("REST /query and /search file field uses qmd:// URI prefix (#576)", () => {
// Regression test: the HTTP REST endpoint was returning r.displayPath (e.g.
// "docs/readme.md") instead of "qmd://docs/readme.md", while the CLI and MCP
// resource URIs always use the qmd:// scheme. This simulates the fix: the REST
// handler now applies encodeQmdPath and prepends "qmd://".
const results = searchFTS(testDb, "readme", 5);
expect(results.length).toBeGreaterThan(0);
// Simulate what the fixed REST handler produces for each result
const restResponseItems = results.map(r => ({
docid: `#${r.docid}`,
file: `qmd://${r.displayPath.split('/').map(s => encodeURIComponent(s)).join('/')}`,
title: r.title,
score: Math.round(r.score * 100) / 100,
}));
// Every file field must start with qmd://
for (const item of restResponseItems) {
expect(item.file).toMatch(/^qmd:\/\//);
}
// Spot-check the readme result
const readmeItem = restResponseItems.find(item => item.file.includes("readme"));
expect(readmeItem).toBeDefined();
expect(readmeItem!.file).toBe("qmd://docs/readme.md");
});
});
});
@ -913,6 +942,22 @@ describe.skipIf(!!process.env.CI)("MCP HTTP Transport", () => {
initTestDatabase(db);
seedTestData(db);
// 300 pad lines (37 chars each = 11100 chars) puts the marker past the
// first chunk boundary at CHUNK_SIZE_CHARS = 3600.
{
const padLine = "Pad line for chunk boundary coverage\n";
const absLineFixtureBody =
padLine.repeat(300) +
"UNIQUE_KEYWORD_XYZ marker\n" +
padLine.repeat(20);
const fixtureHash = "hash-abslines";
const now = new Date().toISOString();
db.prepare(`INSERT OR IGNORE INTO content (hash, doc, created_at) VALUES (?, ?, ?)`)
.run(fixtureHash, absLineFixtureBody, now);
db.prepare(`INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active) VALUES ('docs', ?, ?, ?, ?, ?, 1)`)
.run("absolute-line-fixture.md", "Absolute Line Fixture", fixtureHash, now, now);
}
// Sync config into SQLite
const httpTestConfig: CollectionConfig = {
collections: {
@ -1074,4 +1119,29 @@ describe.skipIf(!!process.env.CI)("MCP HTTP Transport", () => {
expect(json.result).toBeDefined();
expect(json.result.content.length).toBeGreaterThan(0);
});
test("POST /mcp tools/call query returns absolute source-file line numbers, not chunk-local", async () => {
await mcpRequest({
jsonrpc: "2.0", id: 1, method: "initialize",
params: { protocolVersion: "2025-03-26", capabilities: {}, clientInfo: { name: "test", version: "1.0" } },
});
const { status, json } = await mcpRequest({
jsonrpc: "2.0", id: 5, method: "tools/call",
params: {
name: "query",
arguments: {
searches: [{ type: "lex", query: "UNIQUE_KEYWORD_XYZ" }],
rerank: false,
},
},
});
expect(status).toBe(200);
const results = json.result.structuredContent.results;
expect(results.length).toBeGreaterThan(0);
const hit = results.find((r: any) => r.file === "docs/absolute-line-fixture.md");
expect(hit).toBeDefined();
expect(hit.line).toBe(301);
expect(hit.snippet).toMatch(/^\d+: @@ -3\d\d,/);
});
});

71
test/package.test.ts Normal file
View File

@ -0,0 +1,71 @@
import { describe, expect, test } from "vitest";
import { readFileSync } from "node:fs";
import { join } from "node:path";
const root = new URL("..", import.meta.url);
const pkg = JSON.parse(readFileSync(new URL("package.json", root), "utf8"));
describe("package test task", () => {
test("runs typecheck, unit tests, and package smoke checks", () => {
expect(pkg.scripts.test).toContain("scripts/test-all.mjs");
expect(pkg.scripts["test:types"]).toContain("tsconfig.build.json --noEmit");
expect(pkg.scripts["test:unit"]).toContain("vitest.mjs");
expect(pkg.scripts["test:unit"]).toContain("bun test");
expect(pkg.scripts["test:unit"]).toContain("CI=true");
expect(pkg.scripts["test:package"]).toContain("scripts/package-smoke.mjs");
const testAllScript = readFileSync(new URL("scripts/test-all.mjs", root), "utf8");
expect(testAllScript).toContain("TypeScript build typecheck");
expect(testAllScript).toContain("Vitest suite under Node");
expect(testAllScript).toContain("Bun test suite");
expect(testAllScript).toContain("Package smoke");
const packageSmokeScript = readFileSync(new URL("scripts/package-smoke.mjs", root), "utf8");
expect(packageSmokeScript).toContain("scripts/build.mjs");
expect(packageSmokeScript).toContain("scripts/check-package-grammars.mjs");
expect(packageSmokeScript).toContain("compiled CLI under Node");
expect(packageSmokeScript).toContain("compiled CLI under Bun");
expect(packageSmokeScript).toContain("package wrapper");
});
});
describe("package grammar distribution", () => {
test("installs AST grammar wasm packages as required runtime dependencies", () => {
for (const dep of ["tree-sitter-typescript", "tree-sitter-python", "tree-sitter-go", "tree-sitter-rust"]) {
expect(pkg.dependencies, `${dep} should be a required dependency`).toHaveProperty(dep);
expect(pkg.optionalDependencies ?? {}, `${dep} should not be optional`).not.toHaveProperty(dep);
}
});
test("documents a packaging smoke check for grammar wasm availability", () => {
expect(pkg.scripts, "package.json scripts").toHaveProperty("smoke:package-grammars");
expect(String(pkg.scripts["smoke:package-grammars"])).toContain("check-package-grammars");
expect(pkg.files, "published package files").toContain("scripts/build.mjs");
expect(pkg.files, "published package files").toContain("scripts/check-package-grammars.mjs");
expect(pkg.files, "published package files").toContain("scripts/package-smoke.mjs");
expect(pkg.files, "published package files").toContain("scripts/test-all.mjs");
expect(pkg.files, "published package files").toContain("skills/");
const qmdSkill = readFileSync(new URL("skills/qmd/SKILL.md", root), "utf8");
expect(qmdSkill).toContain("# QMD - Query Markdown Documents");
expect(qmdSkill).toContain("## How search works");
expect(qmdSkill).toContain("## MCP Tool: `query`");
expect(qmdSkill).not.toContain("This file is a discovery stub");
const firstSixtyLines = qmdSkill.split(/\r?\n/).slice(0, 60).join("\n");
expect(firstSixtyLines).toContain("Search for candidate documents");
expect(firstSixtyLines).toContain("qmd search");
expect(firstSixtyLines).toContain('qmd multi-get "#abc123,#def432"');
expect(firstSixtyLines).toContain("Retrieved:");
expect(firstSixtyLines).toContain("qmd query");
// The skill must teach structured, self-authored queries near the top.
expect(firstSixtyLines).toContain("Default to structured");
const scriptPath = join(root.pathname, "scripts", "check-package-grammars.mjs");
const script = readFileSync(scriptPath, "utf8");
expect(script).toContain("tree-sitter-typescript/tree-sitter-typescript.wasm");
expect(script).toContain("tree-sitter-typescript/tree-sitter-tsx.wasm");
});
});

414
test/path-fidelity.test.ts Normal file
View File

@ -0,0 +1,414 @@
/**
* Path Fidelity Tests
*
* Verifies that QMD stores literal filesystem paths (not handalized slugs) so
* that paths with special characters spaces, #, &, @, [], (), etc. round-
* trip correctly through index search get full-path.
*
* This covers the five breakage points found before the literal-path fix:
* 1. search --json `file` field shows handalized slug instead of real path
* 2. `qmd get --full-path` silently falls back (resolveVirtualPath built
* a non-existent path from the slug, existsSync returned false)
* 3. `qmd get <actual-fs-path>` returns "Document not found"
* 4. `qmd ls` shows handalized slugs
* 5. `toVirtualPath(db, absPath)` returns null
*
* Also covers backward-compat migration: an index created with the old
* handalize-at-index-time code can be updated with `qmd update` and the paths
* are renamed to their literal forms in-place.
*/
import { describe, test, expect, beforeAll, afterAll } from "vitest";
import { mkdir, mkdtemp, rm, writeFile } from "fs/promises";
import { existsSync, realpathSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";
import { spawn } from "child_process";
import { fileURLToPath } from "url";
import { dirname } from "path";
import YAML from "yaml";
import { openDatabase } from "../src/db.js";
import type { Database } from "../src/db.js";
import {
createStore,
toVirtualPath,
insertDocument,
insertContent,
hashContent,
handelize,
normalizePathSeparators,
syncConfigToDb,
} from "../src/store.js";
import type { CollectionConfig } from "../src/collections.js";
const thisDir = dirname(fileURLToPath(import.meta.url));
const projectRoot = join(thisDir, "..");
const qmdScript = join(projectRoot, "src", "cli", "qmd.ts");
const isBunRuntime = typeof (globalThis as { Bun?: unknown }).Bun !== "undefined";
const tsxCli = join(projectRoot, "node_modules", "tsx", "dist", "cli.mjs");
async function runQmd(
args: string[],
opts: { cwd: string; dbPath: string; configDir: string; env?: Record<string, string> }
): Promise<{ stdout: string; stderr: string; exitCode: number }> {
const runner = isBunRuntime
? { command: process.execPath, args: [qmdScript, ...args] }
: { command: process.execPath, args: [tsxCli, qmdScript, ...args] };
const proc = spawn(runner.command, runner.args, {
cwd: opts.cwd,
env: {
...process.env,
INDEX_PATH: opts.dbPath,
QMD_CONFIG_DIR: opts.configDir,
PWD: opts.cwd,
QMD_DOCTOR_DEVICE_PROBE: "0",
...(opts.env ?? {}),
},
stdio: ["ignore", "pipe", "pipe"],
});
let stdout = "";
let stderr = "";
proc.stdout?.on("data", (c: Buffer) => { stdout += c.toString(); });
proc.stderr?.on("data", (c: Buffer) => { stderr += c.toString(); });
const exitCode = await new Promise<number>((res, rej) => {
proc.once("error", rej);
proc.on("close", (code) => res(code ?? 1));
});
return { stdout, stderr, exitCode };
}
// ---------------------------------------------------------------------------
// Test environment setup
// ---------------------------------------------------------------------------
let testDir: string;
// Files with names that previously broke due to handalize() at index time.
const crazyFiles: Array<{ name: string; content: string }> = [
{
name: "# Meeting - 234232 3432 __ 5.md",
content: "# Meeting - 234232 3432 // 5\n\nSome meeting content with searchterm-alpha.\n",
},
{
name: "Budget & Revenue (Q4) [2024].md",
content: "# Budget & Revenue Q4 2024\n\nFinancial overview searchterm-beta.\n",
},
{
name: "normal-file.md",
content: "# Normal File\n\nPlain filename, should always work.\n",
},
];
const crazySubFiles: Array<{ name: string; content: string }> = [
{
name: "Notes #42 - foo@bar.md",
content: "# Notes #42\n\nSubdir file with searchterm-gamma.\n",
},
];
beforeAll(async () => {
testDir = await mkdtemp(join(tmpdir(), "qmd-path-fidelity-"));
});
afterAll(async () => {
await rm(testDir, { recursive: true, force: true });
});
// Helper: create a fresh isolated test environment with a corpus of crazy filenames.
async function createCrazyCollection(prefix: string): Promise<{
collectionDir: string;
dbPath: string;
configDir: string;
}> {
const envDir = join(testDir, prefix);
const collectionDir = join(envDir, "corpus");
const dbPath = join(envDir, "test.sqlite");
const configDir = join(envDir, "config");
await mkdir(collectionDir, { recursive: true });
await mkdir(join(collectionDir, "subdir"), { recursive: true });
await mkdir(configDir, { recursive: true });
// Resolve symlinks so the path matches what getRealPath() stores in the DB.
// On macOS /tmp is a symlink to /private/tmp; without this normalisation
// toVirtualPath() and --full-path resolution fail.
const realCollectionDir = realpathSync(collectionDir);
for (const f of crazyFiles) {
await writeFile(join(collectionDir, f.name), f.content);
}
for (const f of crazySubFiles) {
await writeFile(join(collectionDir, "subdir", f.name), f.content);
}
// Write empty YAML config — `collection add` will populate it
await writeFile(join(configDir, "index.yml"), "collections: {}\n");
return { collectionDir: realCollectionDir, dbPath, configDir };
}
// ---------------------------------------------------------------------------
// Unit tests: store-level path storage
// ---------------------------------------------------------------------------
describe("Path fidelity — store level", () => {
test("reindexCollection stores literal relative paths, not handalized slugs", async () => {
const { collectionDir, dbPath, configDir } = await createCrazyCollection("store-unit");
// Run `collection add` to index
const add = await runQmd(
["collection", "add", collectionDir, "--name", "crazytest"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(add.exitCode, `collection add failed: ${add.stderr}`).toBe(0);
// Inspect the DB directly
const db = openDatabase(dbPath);
const rows = db.prepare(
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
).all() as { path: string }[];
db.close();
const paths = rows.map((r) => r.path);
// Must contain literal filenames — not handalized slugs
expect(paths).toContain("# Meeting - 234232 3432 __ 5.md");
expect(paths).toContain("Budget & Revenue (Q4) [2024].md");
expect(paths).toContain("normal-file.md");
expect(paths).toContain("subdir/Notes #42 - foo@bar.md");
// Must NOT contain handalized versions
expect(paths).not.toContain("Meeting-234232-3432-5.md");
expect(paths).not.toContain("Budget-Revenue-Q4-2024.md");
expect(paths).not.toContain("subdir/Notes-42-foo-bar.md");
});
test("toVirtualPath returns non-null for crazy-named files", async () => {
const { collectionDir, dbPath, configDir } = await createCrazyCollection("store-to-virtual");
const add = await runQmd(
["collection", "add", collectionDir, "--name", "crazytest"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(add.exitCode).toBe(0);
const rawDb = openDatabase(dbPath);
const result = toVirtualPath(rawDb, join(collectionDir, "Budget & Revenue (Q4) [2024].md"));
rawDb.close();
expect(result).not.toBeNull();
expect(result).toBe(`qmd://crazytest/Budget & Revenue (Q4) [2024].md`);
});
});
// ---------------------------------------------------------------------------
// CLI integration tests — the five original breakage points
// ---------------------------------------------------------------------------
describe("Path fidelity — CLI integration", () => {
let collectionDir: string;
let dbPath: string;
let configDir: string;
// Index once for the whole describe block (read-only tests share it)
beforeAll(async () => {
({ collectionDir, dbPath, configDir } = await createCrazyCollection("cli-shared"));
const add = await runQmd(
["collection", "add", collectionDir, "--name", "crazytest"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(add.exitCode, `collection add failed: ${add.stderr}`).toBe(0);
});
test("(1) search --json file field contains literal path, not handalized slug", async () => {
const { stdout, exitCode } = await runQmd(
["search", "searchterm-alpha", "--json"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode).toBe(0);
const results = JSON.parse(stdout) as Array<{ file: string }>;
expect(results.length).toBeGreaterThan(0);
const meetingResult = results.find((r) => r.file.includes("Meeting"));
expect(meetingResult).toBeDefined();
// Must contain the literal filename fragment
expect(meetingResult!.file).toContain("# Meeting - 234232 3432 __ 5.md");
// Must not contain the handalized version
expect(meetingResult!.file).not.toContain("Meeting-234232-3432-5.md");
});
test("(2) get --full-path resolves to real filesystem path for crazy-named file", async () => {
const virtualPath = `qmd://crazytest/Budget & Revenue (Q4) [2024].md`;
const { stdout, exitCode } = await runQmd(
["get", virtualPath, "--full-path"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode, `get failed: ${stdout}`).toBe(0);
const header = stdout.split("\n")[0]!;
// Should show a real filesystem path, not a qmd:// virtual path
expect(header).not.toMatch(/^qmd:\/\//);
// Should include the literal filename
expect(header).toContain("Budget & Revenue (Q4) [2024].md");
// The resolved filesystem path should exist — strip the trailing docid (#abc123)
const fsPath = header.trim().replace(/\s+#[a-f0-9]{6}$/, "");
// Path may be absolute or relative-to-collectionDir; resolve against collectionDir
const absPath = fsPath.startsWith("/") ? fsPath : join(collectionDir, fsPath.replace(/^\.\//, ""));
expect(existsSync(absPath), `resolved path does not exist: ${absPath}`).toBe(true);
});
test("(3) get <actual-fs-path> finds the document", async () => {
const fsPath = join(collectionDir, "Budget & Revenue (Q4) [2024].md");
const { stdout, exitCode, stderr } = await runQmd(
["get", fsPath],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode, `get by fs path failed: ${stderr}`).toBe(0);
// Header should contain the document identifier
expect(stdout).toContain("Budget & Revenue (Q4) [2024].md");
});
test("(3b) get <actual-fs-path> finds subdir file with crazy name", async () => {
const fsPath = join(collectionDir, "subdir", "Notes #42 - foo@bar.md");
const { stdout, exitCode, stderr } = await runQmd(
["get", fsPath],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode, `get subdir file failed: ${stderr}`).toBe(0);
expect(stdout).toContain("Notes #42 - foo@bar.md");
});
test("(4) ls shows literal paths, not handalized slugs", async () => {
const { stdout, exitCode } = await runQmd(
["ls", "crazytest"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode).toBe(0);
// Literal paths must appear
expect(stdout).toContain("# Meeting - 234232 3432 __ 5.md");
expect(stdout).toContain("Budget & Revenue (Q4) [2024].md");
expect(stdout).toContain("Notes #42 - foo@bar.md");
// Handalized slugs must NOT appear
expect(stdout).not.toContain("Meeting-234232-3432-5.md");
expect(stdout).not.toContain("Budget-Revenue-Q4-2024.md");
expect(stdout).not.toContain("Notes-42-foo-bar.md");
});
test("(5) search --json returns docid that can be fetched back", async () => {
const { stdout: searchOut, exitCode: searchExit } = await runQmd(
["search", "searchterm-beta", "--json"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(searchExit).toBe(0);
const results = JSON.parse(searchOut) as Array<{ docid: string; file: string }>;
expect(results.length).toBeGreaterThan(0);
const hit = results[0]!;
expect(hit.docid).toMatch(/^#[a-f0-9]{6}$/);
// Fetch by docid — must work
const { stdout: getOut, exitCode: getExit } = await runQmd(
["get", hit.docid],
{ cwd: collectionDir, dbPath, configDir }
);
expect(getExit, `get by docid failed`).toBe(0);
expect(getOut).toContain("Budget & Revenue (Q4) [2024].md");
});
test("normal filenames are still stored correctly (regression)", async () => {
const { stdout, exitCode } = await runQmd(
["search", "Plain filename", "--json"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(exitCode).toBe(0);
const results = JSON.parse(stdout) as Array<{ file: string }>;
const hit = results.find((r) => r.file.includes("normal-file"));
expect(hit).toBeDefined();
expect(hit!.file).toContain("normal-file.md");
});
});
// ---------------------------------------------------------------------------
// Migration test: old handalized DB upgraded by `qmd update`
// ---------------------------------------------------------------------------
describe("Path fidelity — migration from handalized index", () => {
test("qmd update migrates handalized paths to literal paths in existing index", async () => {
const { collectionDir, dbPath, configDir } = await createCrazyCollection("migration");
// Manually build an old-style DB using handalize() (simulates pre-fix index)
const store = createStore(dbPath);
const now = new Date().toISOString();
// Write and sync a config that points at the collection so `qmd update` knows where it is
const migrationYaml = `collections:\n crazytest:\n path: "${collectionDir}"\n mask: "**/*.md"\n`;
await writeFile(join(configDir, "index.yml"), migrationYaml);
const config = YAML.parse(migrationYaml) as CollectionConfig;
syncConfigToDb(store.db, config);
// Insert documents with handalized paths (old behavior)
for (const f of crazyFiles) {
const relPath = normalizePathSeparators(f.name);
const handleized = handelize(relPath);
const hash = await hashContent(f.content);
insertContent(store.db, hash, f.content, now);
insertDocument(store.db, "crazytest", handleized, `Title ${f.name}`, hash, now, now);
}
const subFile = crazySubFiles[0]!;
const subRel = `subdir/${subFile.name}`;
const subHandelized = handelize(subRel);
const subHash = await hashContent(subFile.content);
insertContent(store.db, subHash, subFile.content, now);
insertDocument(store.db, "crazytest", subHandelized, "Sub title", subHash, now, now);
store.close();
// Verify the old DB has handalized paths
const dbBefore = openDatabase(dbPath);
const pathsBefore = (dbBefore.prepare(
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
).all() as { path: string }[]).map((r) => r.path);
dbBefore.close();
expect(pathsBefore).toContain("Meeting-234232-3432-5.md");
expect(pathsBefore).toContain("Budget-Revenue-Q4-2024.md");
expect(pathsBefore).not.toContain("# Meeting - 234232 3432 __ 5.md");
// Run `qmd update` with the new code — should migrate paths in-place
const update = await runQmd(
["update"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(update.exitCode, `qmd update failed: ${update.stderr}`).toBe(0);
// Verify the DB now has literal paths
const dbAfter = openDatabase(dbPath);
const pathsAfter = (dbAfter.prepare(
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
).all() as { path: string }[]).map((r) => r.path);
dbAfter.close();
expect(pathsAfter).toContain("# Meeting - 234232 3432 __ 5.md");
expect(pathsAfter).toContain("Budget & Revenue (Q4) [2024].md");
expect(pathsAfter).toContain("normal-file.md");
expect(pathsAfter).toContain("subdir/Notes #42 - foo@bar.md");
// Handalized slugs must be gone
expect(pathsAfter).not.toContain("Meeting-234232-3432-5.md");
expect(pathsAfter).not.toContain("Budget-Revenue-Q4-2024.md");
// Search must work after migration
const { stdout: searchOut, exitCode: searchExit } = await runQmd(
["search", "searchterm-alpha", "--json"],
{ cwd: collectionDir, dbPath, configDir }
);
expect(searchExit).toBe(0);
const results = JSON.parse(searchOut) as Array<{ file: string }>;
expect(results.length).toBeGreaterThan(0);
const meetingResult = results.find((r) => r.file.includes("Meeting"));
expect(meetingResult).toBeDefined();
expect(meetingResult!.file).toContain("# Meeting - 234232 3432 __ 5.md");
});
});

View File

@ -614,6 +614,20 @@ describe("search (unified API)", () => {
expect(results.length).toBeGreaterThan(0);
});
test("search() forwards candidateLimit to structured search", async () => {
const results = await store.search({
queries: [
{ type: "lex", query: "authentication" },
{ type: "lex", query: "meeting" },
],
limit: 5,
candidateLimit: 1,
rerank: false,
});
expect(results).toHaveLength(1);
});
// Tests below use search({ query: ... }) which triggers LLM query expansion
describe.skipIf(!!process.env.CI)("with LLM query expansion", () => {
test("search() with query and rerank:false returns results", async () => {
@ -982,6 +996,92 @@ describe("embed", () => {
}
});
test("store.embed scopes pending documents to the requested collection", async () => {
const store = await createStore({
dbPath: freshDbPath(),
config: {
collections: {
docs: { path: docsDir, pattern: "**/*.md" },
notes: { path: notesDir, pattern: "**/*.md" },
},
},
});
const fakeLlm = createFakeEmbedLlm();
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.internal.llm = fakeLlm as any;
try {
await store.update();
const result = await store.embed({ collection: "docs" });
const vectorCounts = store.internal.db.prepare(`
SELECT d.collection, COUNT(DISTINCT v.hash) AS count
FROM documents d
LEFT JOIN content_vectors v ON v.hash = d.hash AND v.seq = 0
WHERE d.active = 1
GROUP BY d.collection
ORDER BY d.collection
`).all() as Array<{ collection: string; count: number }>;
expect(result.docsProcessed).toBe(3);
expect(result.chunksEmbedded).toBe(3);
expect(vectorCounts).toEqual([
{ collection: "docs", count: 3 },
{ collection: "notes", count: 0 },
]);
} finally {
setDefaultLlamaCpp(null);
await store.close();
}
});
test("store.embed with force only clears the requested collection", async () => {
const store = await createStore({
dbPath: freshDbPath(),
config: {
collections: {
docs: { path: docsDir, pattern: "**/*.md" },
notes: { path: notesDir, pattern: "**/*.md" },
},
},
});
const fakeLlm = createFakeEmbedLlm();
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.internal.llm = fakeLlm as any;
const vectorCounts = () => store.internal.db.prepare(`
SELECT d.collection, COUNT(DISTINCT v.hash) AS count
FROM documents d
LEFT JOIN content_vectors v ON v.hash = d.hash AND v.seq = 0
WHERE d.active = 1
GROUP BY d.collection
ORDER BY d.collection
`).all() as Array<{ collection: string; count: number }>;
try {
await store.update();
await store.embed();
expect(vectorCounts()).toEqual([
{ collection: "docs", count: 3 },
{ collection: "notes", count: 3 },
]);
const result = await store.embed({ force: true, collection: "docs" });
expect(result.docsProcessed).toBe(3);
expect(result.chunksEmbedded).toBe(3);
expect(vectorCounts()).toEqual([
{ collection: "docs", count: 3 },
{ collection: "notes", count: 3 },
]);
} finally {
setDefaultLlamaCpp(null);
await store.close();
}
});
test("store.embed rejects invalid batch limits", async () => {
const store = await createStore({
dbPath: freshDbPath(),

View File

@ -1,17 +1,28 @@
#!/usr/bin/env bash
# Build a container image with qmd installed via npm and bun, then run smoke tests.
# Works with docker or podman (whichever is available).
# Build a clean container image from the current checkout package and exercise
# install/runtime scenarios under npm, npx, and Bun. Supports optional qmd embed
# and GPU probes, but keeps those expensive/device-specific checks opt-in.
#
# Usage:
# test/smoke-install.sh # build + run all smoke tests
# test/smoke-install.sh --build # build image only
# test/smoke-install.sh --shell # drop into container shell
# test/smoke-install.sh -- CMD... # run arbitrary command in container
# test/smoke-install.sh # build + run default smoke scenarios
# test/smoke-install.sh --build # build image only
# test/smoke-install.sh --shell # drop into container shell
# test/smoke-install.sh --scenario node # run one scenario (node|npx|bun|all)
# test/smoke-install.sh --with-embed # also run tiny qmd embed smoke tests
# test/smoke-install.sh --with-gpu # also probe GPU in doctor/embed scenarios
# QMD_SMOKE_GPU_BACKEND=cuda|vulkan|auto # backend for --with-gpu (default: auto)
# test/smoke-install.sh --no-build # reuse existing image
# test/smoke-install.sh -- CMD... # run arbitrary command in container
#
# GPU notes:
# Docker uses: --gpus all
# Podman uses: --device nvidia.com/gpu=all
# If your podman setup uses a different CDI device name, override with:
# QMD_SMOKE_GPU_ARGS='--device nvidia.com/gpu=all' test/smoke-install.sh --with-gpu
set -euo pipefail
cd "$(dirname "$0")/.."
# Pick container runtime
if command -v podman &>/dev/null; then
CTR=podman
elif command -v docker &>/dev/null; then
@ -21,10 +32,50 @@ else
exit 1
fi
IMAGE=qmd-smoke
IMAGE=${QMD_SMOKE_IMAGE:-qmd-smoke}
SCENARIO=all
DO_BUILD=1
WITH_EMBED=0
WITH_GPU=0
GPU_BACKEND=${QMD_SMOKE_GPU_BACKEND:-auto}
declare -a ARBITRARY_CMD=()
usage() {
sed -n '2,20p' "$0" | sed 's/^# \{0,1\}//'
}
while [[ $# -gt 0 ]]; do
case "$1" in
--build) DO_BUILD=1; BUILD_ONLY=1; shift ;;
--no-build) DO_BUILD=0; shift ;;
--shell) SHELL_ONLY=1; shift ;;
--scenario) SCENARIO="${2:-}"; shift 2 ;;
--with-embed) WITH_EMBED=1; shift ;;
--with-gpu) WITH_GPU=1; shift ;;
--help|-h) usage; exit 0 ;;
--) shift; ARBITRARY_CMD=("$@"); break ;;
*) echo "Unknown argument: $1" >&2; usage >&2; exit 1 ;;
esac
done
BUILD_ONLY=${BUILD_ONLY:-0}
SHELL_ONLY=${SHELL_ONLY:-0}
gpu_args() {
if [[ $WITH_GPU -ne 1 ]]; then return 0; fi
if [[ -n "${QMD_SMOKE_GPU_ARGS:-}" ]]; then
# shellcheck disable=SC2206
echo ${QMD_SMOKE_GPU_ARGS}
return 0
fi
case "$CTR" in
docker) echo "--gpus all" ;;
podman) echo "--device nvidia.com/gpu=all" ;;
esac
}
build_image() {
echo "==> Building TypeScript..."
echo "==> Building TypeScript package..."
npm run build --silent
echo "==> Packing tarball..."
@ -32,32 +83,35 @@ build_image() {
TARBALL=$(npm pack --pack-destination test/ 2>/dev/null | tail -1)
echo " $TARBALL"
# Copy project files into build context so vitest/bun tests can run inside
echo "==> Preparing container test project..."
rm -rf test/test-src
mkdir -p test/test-src/src test/test-src/test
cp src/*.ts test/test-src/src/
mkdir -p test/test-src/test
cp -r src test/test-src/
cp -r dist test/test-src/
cp test/*.test.ts test/test-src/test/
cp -r test/*.test.ts test/test-src/test/
cp package.json tsconfig.json tsconfig.build.json test/test-src/
echo "==> Building container image ($CTR)..."
echo "==> Building container image ($CTR): $IMAGE"
$CTR build -f test/Containerfile -t "$IMAGE" test/
# Clean up
rm -f test/tobilu-qmd-*.tgz
rm -rf test/test-src
echo "==> Image ready: $IMAGE"
}
run() {
$CTR run --rm "$IMAGE" bash -c "$*"
local args=()
# Intentionally word-split GPU args: container CLIs expect separate flags.
# shellcheck disable=SC2206
args=( $(gpu_args) )
$CTR run --rm "${args[@]}" "$IMAGE" bash -lc "$*"
}
PASS=0
FAIL=0
ok() { printf " %-50s OK\n" "$1"; PASS=$((PASS + 1)); }
fail() { printf " %-50s FAIL\n" "$1"; FAIL=$((FAIL + 1)); echo "$2" | sed 's/^/ /'; }
ok() { printf " %-58s OK\n" "$1"; PASS=$((PASS + 1)); }
fail() { printf " %-58s FAIL\n" "$1"; FAIL=$((FAIL + 1)); echo "$2" | sed 's/^/ /'; }
smoke_test() {
local label="$1"; shift
@ -73,97 +127,136 @@ smoke_test_output() {
local label="$1"; local expect="$2"; shift 2
local out
out=$(run "$@" 2>&1) || true
if echo "$out" | grep -q "$expect"; then
if grep -q "$expect" <<<"$out"; then
ok "$label"
else
fail "$label" "$out"
fi
}
run_smoke_tests() {
# ------------------------------------------------------------------
# Node (npm-installed qmd)
# ------------------------------------------------------------------
fixture_setup='rm -rf /tmp/qmd-fixture /tmp/qmd-cache /tmp/qmd-config /tmp/qmd-models; mkdir -p /tmp/qmd-fixture; printf "# Smoke Doc\n\nGPU and CPU embedding smoke test.\n" > /tmp/qmd-fixture/doc.md; export XDG_CACHE_HOME=/tmp/qmd-cache QMD_CONFIG_DIR=/tmp/qmd-config'
gpu_env() {
case "$GPU_BACKEND" in
auto|"") echo "" ;;
cuda|vulkan|metal) echo "QMD_LLAMA_GPU=$GPU_BACKEND" ;;
*) echo "Unsupported QMD_SMOKE_GPU_BACKEND=$GPU_BACKEND" >&2; exit 1 ;;
esac
}
run_doctor_smoke() {
local label="$1" bin="$2" extra_env="${3:-}"
smoke_test_output "$label doctor" "QMD Doctor" \
"$fixture_setup; $extra_env $bin doctor"
}
run_collection_smoke() {
local label="$1" bin="$2" extra_env="${3:-}"
smoke_test "$label collection add/list/status" \
"$fixture_setup; cd /tmp/qmd-fixture; $extra_env $bin collection add . --name smoke; $extra_env $bin collection list; $extra_env $bin status"
}
run_embed_smoke() {
local label="$1" bin="$2" extra_env="${3:-}"
[[ $WITH_EMBED -eq 1 ]] || return 0
smoke_test "$label qmd embed tiny fixture" \
"$fixture_setup; cd /tmp/qmd-fixture; $extra_env $bin collection add . --name smoke; $extra_env $bin embed --max-docs-per-batch 1 --max-batch-mb 1; $extra_env $bin doctor"
}
run_runtime_matrix() {
local label="$1" bin="$2" path_env="$3"
smoke_test_output "$label qmd help" "Usage:" "$path_env; $bin"
run_doctor_smoke "$label auto" "$path_env; $bin"
run_doctor_smoke "$label force-cpu" "$path_env; $bin" "QMD_FORCE_CPU=1"
run_collection_smoke "$label" "$path_env; $bin" "QMD_FORCE_CPU=1"
run_embed_smoke "$label force-cpu" "$path_env; $bin" "QMD_FORCE_CPU=1"
run_embed_smoke "$label auto" "$path_env; $bin"
if [[ $WITH_GPU -eq 1 ]]; then
local ge
ge=$(gpu_env)
run_doctor_smoke "$label gpu-$GPU_BACKEND" "$path_env; $bin" "$ge"
run_embed_smoke "$label gpu-$GPU_BACKEND" "$path_env; $bin" "$ge"
fi
}
run_node_scenario() {
local NODE_BIN='$(mise where node@latest)/bin'
echo "=== Node (npm install) ==="
smoke_test_output "qmd shows help" "Usage:" \
"export PATH=$NODE_BIN:\$PATH; qmd"
smoke_test "qmd collection list" \
"export PATH=$NODE_BIN:\$PATH; qmd collection list"
smoke_test "qmd status" \
"export PATH=$NODE_BIN:\$PATH; qmd status"
smoke_test "sqlite-vec loads" \
"export PATH=$NODE_BIN:\$PATH;
NPM_GLOBAL=\$(npm root -g);
node -e \"
const {openDatabase, loadSqliteVec} = await import('\$NPM_GLOBAL/@tobilu/qmd/dist/db.js');
local bin='qmd'
echo "=== Node: npm install -g packed tarball ==="
run_runtime_matrix "node" "$bin" "export PATH=$NODE_BIN:\$PATH"
smoke_test "node sqlite-vec loads" \
"export PATH=$NODE_BIN:\$PATH; NPM_GLOBAL=\$(npm root -g); node -e \"
const {openDatabase, loadSqliteVec} = await import('\\$NPM_GLOBAL/@tobilu/qmd/dist/db.js');
const db = openDatabase(':memory:');
loadSqliteVec(db);
const r = db.prepare('SELECT vec_version() as v').get();
console.log('sqlite-vec', r.v);
if (!r.v) process.exit(1);
\""
smoke_test "vitest (node)" \
smoke_test "node vitest store subset" \
"export PATH=$NODE_BIN:\$PATH; cd /opt/qmd && npx vitest run --reporter=verbose test/store.test.ts 2>&1 | tail -5"
}
# ------------------------------------------------------------------
# Bun (bun-installed qmd)
# ------------------------------------------------------------------
run_npx_scenario() {
local NODE_BIN='$(mise where node@latest)/bin'
local bin='npm exec --yes --package /tmp/tobilu-qmd.tgz -- qmd'
echo "=== Node: npm exec/npx-style packed tarball ==="
run_runtime_matrix "npx-style" "$bin" "export PATH=$NODE_BIN:\$PATH"
}
run_bun_scenario() {
local NODE_BIN='$(mise where node@latest)/bin'
local BUN_BIN='$(mise where bun@latest)/bin'
echo ""
echo "=== Bun (bun install) ==="
smoke_test_output "qmd shows help" "Usage:" \
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd"
smoke_test "qmd collection list" \
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd collection list"
smoke_test "qmd status" \
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd status"
smoke_test "sqlite-vec loads (bun)" \
local bin='$HOME/.bun/bin/qmd'
echo "=== Bun: bun install -g packed tarball ==="
run_runtime_matrix "bun" "$bin" "export PATH=$BUN_BIN:$NODE_BIN:\$PATH"
smoke_test "bun sqlite-vec loads" \
"export PATH=$BUN_BIN:\$PATH; bun -e \"
const {openDatabase, loadSqliteVec} = await import('\$HOME/.bun/install/global/node_modules/@tobilu/qmd/dist/db.js');
const {openDatabase, loadSqliteVec} = await import('\\$HOME/.bun/install/global/node_modules/@tobilu/qmd/dist/db.js');
const db = openDatabase(':memory:');
loadSqliteVec(db);
const r = db.prepare('SELECT vec_version() as v').get();
console.log('sqlite-vec', r.v);
if (!r.v) process.exit(1);
\""
smoke_test "bun test store" \
smoke_test "bun test store subset" \
"export PATH=$BUN_BIN:\$PATH; cd /opt/qmd && bun test --preload ./src/test-preload.ts --timeout 30000 test/store.test.ts 2>&1 | tail -10"
}
# ------------------------------------------------------------------
run_smoke_tests() {
case "$SCENARIO" in
node) run_node_scenario ;;
npx) run_npx_scenario ;;
bun) run_bun_scenario ;;
all) run_node_scenario; echo; run_npx_scenario; echo; run_bun_scenario ;;
*) echo "Unknown scenario: $SCENARIO" >&2; exit 1 ;;
esac
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
[[ $FAIL -eq 0 ]]
}
# Parse arguments
case "${1:-}" in
--build)
build_image
;;
--shell)
build_image
echo "==> Dropping into container shell..."
$CTR run --rm -it "$IMAGE" bash
;;
--)
shift
run "$@"
;;
*)
build_image
echo ""
echo "==> Running smoke tests..."
run_smoke_tests
;;
esac
if [[ $DO_BUILD -eq 1 ]]; then
build_image
fi
if [[ ${#ARBITRARY_CMD[@]} -gt 0 ]]; then
run "${ARBITRARY_CMD[*]}"
exit $?
fi
if [[ $BUILD_ONLY -eq 1 ]]; then
exit 0
fi
if [[ $SHELL_ONLY -eq 1 ]]; then
echo "==> Dropping into container shell..."
# shellcheck disable=SC2206
gpu=( $(gpu_args) )
$CTR run --rm -it "${gpu[@]}" "$IMAGE" bash
exit $?
fi
echo ""
echo "==> Running smoke tests..."
run_smoke_tests

View File

@ -9,7 +9,7 @@
import { describe, test, expect, beforeAll, afterAll, beforeEach, afterEach, vi } from "vitest";
import { openDatabase, loadSqliteVec } from "../src/db.js";
import type { Database } from "../src/db.js";
import { unlink, mkdtemp, rmdir, writeFile } from "node:fs/promises";
import { unlink, mkdtemp, rmdir, writeFile, rm, mkdir, rename } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import YAML from "yaml";
@ -26,6 +26,7 @@ import {
extractTitle,
formatQueryForEmbedding,
formatDocForEmbedding,
getEmbeddingFingerprint,
chunkDocument,
chunkDocumentByTokens,
chunkDocumentAsync,
@ -46,13 +47,22 @@ import {
normalizeDocid,
isDocid,
syncConfigToDb,
reindexCollection,
STRONG_SIGNAL_MIN_SCORE,
STRONG_SIGNAL_MIN_GAP,
insertContent,
insertDocument,
generateEmbeddings,
getHybridRrfWeights,
_resetProductionModeForTesting,
hybridQuery,
structuredSearch,
vectorSearchQuery,
type Store,
type DocumentResult,
type SearchResult,
type RankedResult,
type RankedListMeta,
} from "../src/store.js";
import type { CollectionConfig } from "../src/collections.js";
@ -156,18 +166,18 @@ async function insertTestDocument(
const hash = opts.hash || await hashContent(body);
// Insert content (with OR IGNORE for deduplication)
db.prepare(`
INSERT OR IGNORE INTO content (hash, doc, created_at)
VALUES (?, ?, ?)
`).run(hash, body, now);
insertContent(db, hash, body, now);
// Insert document
const result = db.prepare(`
INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run(collectionName, path, title, hash, now, now, active);
insertDocument(db, collectionName, path, title, hash, now, now);
const row = db.prepare(`
SELECT id FROM documents WHERE collection = ? AND path = ?
`).get(collectionName, path) as { id: number } | undefined;
return Number(result.lastInsertRowid);
if (active === 0 && row) {
db.prepare(`UPDATE documents SET active = 0 WHERE id = ?`).run(row.id);
}
return row?.id ?? 0;
}
/** Sync YAML config file to SQLite store_collections in the current test store */
@ -277,7 +287,9 @@ afterAll(async () => {
describe("Store Creation", () => {
test("createStore throws without explicit path in test mode", () => {
// In test mode, createStore without path should throw to prevent accidental writes
// In test mode, createStore without path should throw to prevent accidental writes.
// Other tests may enable production mode in the same Bun process, so reset first.
_resetProductionModeForTesting();
const originalIndexPath = process.env.INDEX_PATH;
delete process.env.INDEX_PATH;
@ -300,19 +312,127 @@ describe("Store Creation", () => {
// Check tables exist
const tables = store.db.prepare(`
SELECT name FROM sqlite_master WHERE type='table' ORDER BY name
SELECT name FROM sqlite_master
WHERE type='table'
ORDER BY name
`).all() as { name: string }[];
const tableNames = tables.map(t => t.name);
expect(tableNames).toContain("documents");
expect(tableNames).toContain("documents_fts");
expect(tableNames).toContain("content_vectors");
expect(tableNames).toContain("content");
expect(tableNames).toContain("llm_cache");
// Note: path_contexts table removed in favor of YAML-based context storage
await cleanupTestDb(store);
});
test("createStore defers content_vectors embed_fingerprint migration until embedding health needs it", async () => {
const dbPath = join(testDir, `legacy-${Date.now()}-${Math.random().toString(36).slice(2)}.sqlite`);
const model = "hf:test/embed-model.gguf";
const legacyDb = openDatabase(dbPath);
legacyDb.exec(`
CREATE TABLE content (
hash TEXT PRIMARY KEY,
doc TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE documents (
id INTEGER PRIMARY KEY AUTOINCREMENT,
collection TEXT NOT NULL,
path TEXT NOT NULL,
title TEXT,
hash TEXT NOT NULL,
created_at TEXT NOT NULL,
modified_at TEXT NOT NULL,
active INTEGER NOT NULL DEFAULT 1,
FOREIGN KEY (hash) REFERENCES content(hash) ON DELETE CASCADE,
UNIQUE(collection, path)
);
CREATE TABLE content_vectors (
hash TEXT NOT NULL,
seq INTEGER NOT NULL DEFAULT 0,
pos INTEGER NOT NULL DEFAULT 0,
model TEXT NOT NULL,
total_chunks INTEGER NOT NULL DEFAULT 1,
embedded_at TEXT NOT NULL,
PRIMARY KEY (hash, seq)
)
`);
const now = new Date().toISOString();
legacyDb.prepare(`INSERT INTO content (hash, doc, created_at) VALUES (?, ?, ?)`).run("hash1", "# Legacy\nbody", now);
legacyDb.prepare(`INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active) VALUES (?, ?, ?, ?, ?, ?, 1)`).run("test", "legacy.md", "Legacy", "hash1", now, now);
legacyDb.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, total_chunks, embedded_at) VALUES (?, ?, ?, ?, ?, ?)`).run("hash1", 0, 0, model, 1, now);
legacyDb.close();
const store = createStore(dbPath);
let columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
expect(columns.map(col => col.name)).not.toContain("embed_fingerprint");
expect(store.getHashesNeedingEmbedding(model)).toBe(1);
columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
const migratedRow = store.db.prepare(`SELECT embed_fingerprint FROM content_vectors WHERE hash = ?`).get("hash1") as { embed_fingerprint: string };
expect(columns.map(col => col.name)).toContain("embed_fingerprint");
expect(migratedRow.embed_fingerprint).toBe("");
await cleanupTestDb(store);
});
test("content_vectors column repair runs the full ALTER series and retries the failed operation", async () => {
const dbPath = join(testDir, `legacy-no-seq-${Date.now()}-${Math.random().toString(36).slice(2)}.sqlite`);
const model = "hf:test/embed-model.gguf";
const legacyDb = openDatabase(dbPath);
legacyDb.exec(`
CREATE TABLE content (
hash TEXT PRIMARY KEY,
doc TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE documents (
id INTEGER PRIMARY KEY AUTOINCREMENT,
collection TEXT NOT NULL,
path TEXT NOT NULL,
title TEXT,
hash TEXT NOT NULL,
created_at TEXT NOT NULL,
modified_at TEXT NOT NULL,
active INTEGER NOT NULL DEFAULT 1,
FOREIGN KEY (hash) REFERENCES content(hash) ON DELETE CASCADE,
UNIQUE(collection, path)
);
CREATE TABLE content_vectors (
hash TEXT NOT NULL,
model TEXT NOT NULL,
embed_fingerprint TEXT NOT NULL DEFAULT '',
total_chunks INTEGER NOT NULL DEFAULT 1,
embedded_at TEXT NOT NULL
)
`);
legacyDb.close();
const store = createStore(dbPath);
let columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
expect(columns.map(col => col.name)).not.toContain("seq");
expect(columns.map(col => col.name)).not.toContain("pos");
store.ensureVecTable(3);
store.insertEmbedding("hash1", 1, 42, new Float32Array([1, 2, 3]), model, new Date().toISOString(), 2);
columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
const columnNames = columns.map(col => col.name);
expect(columnNames).toEqual(expect.arrayContaining(["seq", "pos", "model", "embed_fingerprint", "total_chunks", "embedded_at"]));
expect(store.db.prepare(`SELECT seq, pos, model, total_chunks FROM content_vectors WHERE hash = ?`).get("hash1")).toEqual({
seq: 1,
pos: 42,
model,
total_chunks: 2,
});
await cleanupTestDb(store);
});
test("createStore sets WAL journal mode", async () => {
const store = await createTestStore();
const result = store.db.prepare("PRAGMA journal_mode").get() as { journal_mode: string };
@ -1250,6 +1370,61 @@ describe("FTS Search", () => {
await cleanupTestDb(store);
});
test("searchFTS finds CJK documents by exact and mixed queries", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
await insertTestDocument(store.db, collectionName, {
name: "zh",
title: "中文检索说明",
body: "这里介绍 vector 数据库和关键词检索。",
displayPath: "cjk/zh.md",
});
await insertTestDocument(store.db, collectionName, {
name: "ja",
title: "日本語検索メモ",
body: "この文書は検索品質とトークン化について説明します。",
displayPath: "cjk/ja.md",
});
await insertTestDocument(store.db, collectionName, {
name: "ko",
title: "한국어 검색 노트",
body: "이 문서는 검색 품질과 토큰화 문제를 설명합니다.",
displayPath: "cjk/ko.md",
});
expect(store.searchFTS("关键词检索", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/zh.md`);
expect(store.searchFTS("検索品質", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/ja.md`);
expect(store.searchFTS("검색 품질", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/ko.md`);
expect(store.searchFTS("vector 关键词", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/zh.md`);
await cleanupTestDb(store);
});
test("searchFTS keeps English behavior while indexing CJK text", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
await insertTestDocument(store.db, collectionName, {
name: "english",
title: "Vector Search Notes",
body: "The quick brown fox explains vector search and BM25 ranking.",
displayPath: "english.md",
});
await insertTestDocument(store.db, collectionName, {
name: "zh",
title: "中文检索说明",
body: "这里介绍向量数据库和关键词检索。",
displayPath: "zh.md",
});
const foxResults = store.searchFTS("quick fox", 10);
expect(foxResults.map(r => r.displayPath)).toContain(`${collectionName}/english.md`);
expect(foxResults.map(r => r.displayPath)).not.toContain(`${collectionName}/zh.md`);
await cleanupTestDb(store);
});
test("searchFTS handles special characters in query", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
@ -1429,6 +1604,39 @@ describe("FTS Search", () => {
await cleanupTestDb(store);
});
test("searchFTS matches dotted version strings like 2026.4.10 (#563)", async () => {
// Regression test: porter unicode61 tokenizer splits on dots, so the index
// stores "2026", "4", "10" as separate tokens. Before the fix, sanitizeFTS5Term
// stripped the dots producing "2026410" which never matched anything.
const store = await createTestStore();
const collectionName = await createTestCollection();
await insertTestDocument(store.db, collectionName, {
name: "release-notes",
title: "Release Notes",
body: "## Release 2026.4.10\n\nThis version introduces new features and bug fixes.",
displayPath: "test/release-notes.md",
});
// A document that does NOT contain the version string
await insertTestDocument(store.db, collectionName, {
name: "other-doc",
title: "Other Document",
body: "Unrelated content about gardening and cooking.",
displayPath: "test/other.md",
});
const results = store.searchFTS("2026.4.10", 10);
expect(results.length).toBeGreaterThan(0);
expect(results.map(r => r.displayPath)).toContain(`${collectionName}/test/release-notes.md`);
// Partial version should also work
const partial = store.searchFTS("2026.4", 10);
expect(partial.map(r => r.displayPath)).toContain(`${collectionName}/test/release-notes.md`);
await cleanupTestDb(store);
});
});
// =============================================================================
@ -1647,6 +1855,21 @@ describe("Document Retrieval", () => {
expect(body).toBeNull();
await cleanupTestDb(store);
});
test("getDocumentBody clamps negative fromLine to top of document", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection({ pwd: "/path" });
await insertTestDocument(store.db, collectionName, {
name: "mydoc",
displayPath: "mydoc.md",
body: "Line 1\nLine 2\nLine 3\nLine 4\nLine 5",
});
const body = store.getDocumentBody({ filepath: "/path/mydoc.md" }, -19, 80);
expect(body).toBe("Line 1\nLine 2\nLine 3\nLine 4\nLine 5");
await cleanupTestDb(store);
});
});
describe("findDocuments (multi-get)", () => {
@ -1903,6 +2126,26 @@ describe("Snippet Extraction", () => {
expect(linesAfter).toBe(2); // Fourth, Fifth
});
test("extractSnippet with leading blank/frontmatter lines reports 1 before, not 0", () => {
// Regression: a user looked at `@@ -2,4 @@ (1 before, 72 after)` and
// suspected "1 before" was wrong because the match appeared to be the
// topmost visible line. The math takes "before" from the absolute file
// line, not from the visible portion of the snippet — so when the
// snippet starts at line 2, "1 before" is the correct count. Lock that
// in with a 77-line document whose match sits on line 3.
const otherLines = Array.from({ length: 72 }, (_, i) => `body line ${i + 6}`).join("\n");
const body = `---\ntitle: Notes\n# Heading with keyword\nIntro paragraph.\nMore intro lines.\n${otherLines}`;
const { line, linesBefore, snippetLines, linesAfter, snippet } =
extractSnippet(body, "keyword", 500);
expect(line).toBe(3); // match is on line 3
expect(linesBefore).toBe(1); // exactly one line above the 4-line snippet window
expect(snippetLines).toBe(4); // lines 2..5 form the snippet
expect(linesAfter).toBe(72); // remaining body
expect(snippet).toContain("@@ -2,4 @@ (1 before, 72 after)");
});
test("extractSnippet at document end shows 0 after", () => {
const body = "First\nSecond\nThird\nFourth\nFifth keyword";
const { linesBefore, linesAfter, snippetLines, line } = extractSnippet(body, "keyword", 500);
@ -1935,6 +2178,33 @@ describe("Snippet Extraction", () => {
expect(line).toBe(51); // "Target keyword" is line 51
expect(linesBefore).toBeGreaterThan(40); // Many lines before
});
test("extractSnippet anchors on chunkPos when lexical scoring finds no match", () => {
// The snippet tokenizer does not strip FTS5 syntax, so a quoted-phrase query
// tokenises into terms with embedded quotes that never appear in body text.
// bestScore stays at 0 even though the reranker correctly identified a chunk;
// the fallback should anchor on chunkPos rather than defaulting to line 1.
const padLine = "Lorem ipsum dolor sit amet\n";
const padding = padLine.repeat(100);
const body = padding + "chunk content here\nmore chunk content\n" + padding;
const chunkPos = padding.length;
const { line } = extractSnippet(body, '"unrelated quoted phrase"', 200, chunkPos);
expect(line).toBeGreaterThan(50);
expect(line).toBeLessThan(110);
});
test("extractSnippet with chunkPos=0 falls back to full-body scan when chunk has no match", () => {
// chunkPos=0 may be the chunk selector's bestIdx=0 default rather than a real
// first-chunk hit, so the fallback must consider matches outside chunk 0.
const padding = "Lorem ipsum dolor sit amet\n".repeat(200);
const body = padding + "TARGET_KEYWORD line content\ntail line\n";
const { line } = extractSnippet(body, "TARGET_KEYWORD", 200, 0);
expect(line).toBe(201);
});
});
// =============================================================================
@ -1988,6 +2258,38 @@ describe("Reciprocal Rank Fusion", () => {
expect(fused[0]!.file).toBe("doc1");
});
test("hybrid RRF weights boost original vector evidence over expansion-only hits", () => {
const originalFtsOnly = makeResult("original-fts-only.md", 0.95);
const expansionOnly = makeResult("lex-expansion-only.md", 0.95);
const originalVector = makeResult("original-vector.md", 0.95);
// Mirrors hybridQuery's common list order when a lex expansion exists:
// original FTS, lex expansion FTS, original vector.
const rankedLists = [
[originalFtsOnly],
[expansionOnly],
[originalVector],
];
const rankedListMeta: RankedListMeta[] = [
{ source: "fts", queryType: "original", query: "user query" },
{ source: "fts", queryType: "lex", query: "lex expansion" },
{ source: "vec", queryType: "original", query: "user query" },
];
const positionBasedWeights = rankedLists.map((_, i) => i < 2 ? 2.0 : 1.0);
const buggyOrder = reciprocalRankFusion(rankedLists, positionBasedWeights);
expect(buggyOrder.findIndex(r => r.file === "lex-expansion-only.md"))
.toBeLessThan(buggyOrder.findIndex(r => r.file === "original-vector.md"));
const semanticWeights = getHybridRrfWeights(rankedListMeta);
const fixedOrder = reciprocalRankFusion(rankedLists, semanticWeights);
expect(semanticWeights).toEqual([2.0, 1.0, 2.0]);
expect(fixedOrder.findIndex(r => r.file === "original-vector.md"))
.toBeLessThan(fixedOrder.findIndex(r => r.file === "lex-expansion-only.md"));
});
test("RRF adds top-rank bonus", () => {
// doc1 is #1 in list1, doc2 is #2 in list1
const list1 = [makeResult("doc1", 0.9), makeResult("doc2", 0.8)];
@ -2020,6 +2322,65 @@ describe("Reciprocal Rank Fusion", () => {
});
});
// =============================================================================
// Reindex Collection Tests
// =============================================================================
describe("Reindex Collection", () => {
test("preserves document id and embeddings when file path changes only by case", async () => {
const store = await createTestStore();
const collectionName = "docs";
const collectionPath = join(testDir, `case-rename-${Date.now()}-${Math.random().toString(36).slice(2)}`);
await mkdir(collectionPath, { recursive: true });
const originalPath = join(collectionPath, "README.md");
const renamedPath = join(collectionPath, "readme.md");
const body = "# Case Rename\n\nContent that should keep the same embedding.";
await writeFile(originalPath, body);
const firstResult = await reindexCollection(store, collectionPath, "**/*.md", collectionName);
expect(firstResult.indexed).toBe(1);
const before = store.db.prepare(`
SELECT id, path, hash FROM documents
WHERE collection = ? AND active = 1
`).get(collectionName) as { id: number; path: string; hash: string };
expect(before.path).toBe("README.md");
store.db.prepare(`
INSERT INTO content_vectors (hash, seq, pos, model, embedded_at)
VALUES (?, 0, 0, 'test-model', ?)
`).run(before.hash, new Date().toISOString());
await rename(originalPath, renamedPath);
const secondResult = await reindexCollection(store, collectionPath, "**/*.md", collectionName);
expect(secondResult.indexed).toBe(0);
expect(secondResult.unchanged).toBe(1);
expect(secondResult.removed).toBe(0);
const afterRows = store.db.prepare(`
SELECT id, path, hash, active FROM documents
WHERE collection = ?
ORDER BY id
`).all(collectionName) as { id: number; path: string; hash: string; active: number }[];
expect(afterRows).toHaveLength(1);
expect(afterRows[0]).toMatchObject({ id: before.id, path: "readme.md", hash: before.hash, active: 1 });
const vectorCount = store.db.prepare(`
SELECT COUNT(*) AS count FROM content_vectors WHERE hash = ?
`).get(before.hash) as { count: number };
expect(vectorCount.count).toBe(1);
const ftsRows = store.db.prepare(`
SELECT rowid, filepath FROM documents_fts WHERE rowid = ?
`).all(before.id) as { rowid: number; filepath: string }[];
expect(ftsRows).toEqual([{ rowid: before.id, filepath: "docs/readme.md" }]);
await cleanupTestDb(store);
});
});
// =============================================================================
// Index Status Tests
// =============================================================================
@ -2082,6 +2443,43 @@ describe("Index Status", () => {
await cleanupTestDb(store);
});
test("embedding health is scoped to the active embed model", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
const activeModel = "hf:active/embed-model.gguf";
const staleModel = "hf:stale/embed-model.gguf";
const now = new Date().toISOString();
store.llm = { embedModelName: activeModel } as any;
store.ensureVecTable(3);
await insertTestDocument(store.db, collectionName, { name: "doc1", hash: "hash1" });
store.insertEmbedding("hash1", 0, 0, new Float32Array([1, 2, 3]), staleModel, now, 1);
expect(store.getHashesNeedingEmbedding()).toBe(1);
expect(store.getStatus().needsEmbedding).toBe(1);
expect(store.getIndexHealth().needsEmbedding).toBe(1);
expect(store.getHashesNeedingEmbedding(staleModel)).toBe(0);
await cleanupTestDb(store);
});
test("embedding health treats stale fingerprints as needing re-embedding", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
const model = "hf:test/embed-model.gguf";
const now = new Date().toISOString();
store.llm = { embedModelName: model } as any;
store.ensureVecTable(3);
await insertTestDocument(store.db, collectionName, { name: "doc1", hash: "hash1" });
store.insertEmbedding("hash1", 0, 0, new Float32Array([1, 2, 3]), model, now, 1, "stale1");
expect(getEmbeddingFingerprint(model)).toMatch(/^[a-f0-9]{6}$/);
expect(store.getHashesNeedingEmbedding()).toBe(1);
await cleanupTestDb(store);
});
test("getIndexHealth returns health info", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection();
@ -2256,6 +2654,33 @@ describe("Vector Table", () => {
await cleanupTestDb(store);
});
test("insertEmbedding is idempotent for an existing vec0 hash_seq (#598)", async () => {
const store = await createTestStore();
store.ensureVecTable(2);
const hash = "existinghashseq";
const first = new Float32Array([0.1, 0.2]);
const second = new Float32Array([0.3, 0.4]);
const now = new Date().toISOString();
store.db.prepare(`INSERT INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${hash}_0`, first);
// Reproduces sqlite-vec's broken conflict handling: vec0 does not honor OR REPLACE.
expect(() => {
store.db.prepare(`INSERT OR REPLACE INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${hash}_0`, second);
}).toThrow(/UNIQUE constraint failed/i);
// QMD must therefore use DELETE + INSERT when upserting the vector row.
expect(() => store.insertEmbedding(hash, 0, 0, second, "test-model", now)).not.toThrow();
const vectorCount = store.db.prepare(`SELECT COUNT(*) AS count FROM vectors_vec WHERE hash_seq = ?`).get(`${hash}_0`) as { count: number };
const metadataCount = store.db.prepare(`SELECT COUNT(*) AS count FROM content_vectors WHERE hash = ? AND seq = 0`).get(hash) as { count: number };
expect(vectorCount.count).toBe(1);
expect(metadataCount.count).toBe(1);
await cleanupTestDb(store);
});
});
// =============================================================================
@ -2263,6 +2688,47 @@ describe("Vector Table", () => {
// =============================================================================
describe("Integration", () => {
test("reindexCollection soft-deletes removed files and preserves inactive content (#585)", async () => {
const store = await createTestStore();
const collectionDir = await mkdtemp(join(testDir, "orphan-regression-"));
const collectionName = "orphan-regression";
try {
for (let i = 1; i <= 5; i++) {
await writeFile(join(collectionDir, `doc-${i}.md`), `# Doc ${i}\n\nUnique body ${i}`);
}
await createTestCollection({ pwd: collectionDir, glob: "**/*.md", name: collectionName });
const initial = await reindexCollection(store, collectionDir, "**/*.md", collectionName);
expect(initial.indexed).toBe(5);
expect(initial.removed).toBe(0);
await rm(join(collectionDir, "doc-3.md"));
await rm(join(collectionDir, "doc-4.md"));
await rm(join(collectionDir, "doc-5.md"));
const afterDelete = await reindexCollection(store, collectionDir, "**/*.md", collectionName);
expect(afterDelete.removed).toBe(3);
const counts = store.db.prepare(`
SELECT
SUM(CASE WHEN active = 1 THEN 1 ELSE 0 END) AS active,
SUM(CASE WHEN active = 0 THEN 1 ELSE 0 END) AS inactive,
COUNT(*) AS total
FROM documents
WHERE collection = ?
`).get(collectionName) as { active: number; inactive: number; total: number };
const contentCount = store.db.prepare(`SELECT COUNT(*) AS count FROM content`).get() as { count: number };
expect(counts).toEqual({ active: 2, inactive: 3, total: 5 });
expect(contentCount.count).toBe(5);
} finally {
await rm(collectionDir, { recursive: true, force: true });
await cleanupTestDb(store);
}
});
test("full document lifecycle: create, search, retrieve", async () => {
const store = await createTestStore();
const collectionName = await createTestCollection({ pwd: "/test/notes", glob: "**/*.md" });
@ -2802,6 +3268,219 @@ describe("Embedding batching", () => {
}
});
test("generateEmbeddings uses the active llm embed model when no explicit model is passed", async () => {
const store = await createTestStore();
const db = store.db;
const fakeLlm = createFakeEmbedLlm();
const model = "hf:env/embed-model.gguf";
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.llm = { ...fakeLlm, embedModelName: model } as any;
try {
await insertTestDocument(db, "docs", { name: "one", body: "# One\n\nAlpha" });
const result = await generateEmbeddings(store);
expect(result.chunksEmbedded).toBe(1);
expect(fakeLlm.embedCalls[0]?.options?.model).toBe(model);
expect(fakeLlm.embedBatchModelCalls).toEqual([{ model }]);
expect(db.prepare(`SELECT DISTINCT model FROM content_vectors`).all()).toEqual([{ model }]);
} finally {
setDefaultLlamaCpp(null);
await cleanupTestDb(store);
}
});
test("generateEmbeddings does not mark a partially embedded multi-chunk document complete", async () => {
const store = await createTestStore();
const db = store.db;
let embedCalls = 0;
const fakeLlm = {
async embed(_text: string, _options?: { model?: string }) {
embedCalls++;
return embedCalls === 1
? { embedding: [0.1, 0.2, 0.3], model: "fake-embed" }
: null;
},
async embedBatch(texts: string[], _options?: { model?: string }) {
return texts.map((_text, index) => index === 0
? { embedding: [1, 2, 3], model: "fake-embed" }
: null
);
},
};
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.llm = fakeLlm as any;
try {
await insertTestDocument(db, "docs", {
name: "long-doc",
body: "# Long doc\n\n" + "partial embedding regression ".repeat(260),
});
const result = await generateEmbeddings(store);
expect(result.errors).toBeGreaterThan(0);
expect(result.failures?.[0]?.attempts).toBe(3);
expect(db.prepare(`SELECT COUNT(*) as count FROM content_vectors`).get()).toEqual({ count: 0 });
expect(db.prepare(`SELECT COUNT(*) as count FROM vectors_vec`).get()).toEqual({ count: 0 });
expect(store.getHashesNeedingEmbedding()).toBe(1);
expect(store.getStatus().needsEmbedding).toBe(1);
} finally {
setDefaultLlamaCpp(null);
await cleanupTestDb(store);
}
});
test("generateEmbeddings clears chunk errors after successful retry", async () => {
const store = await createTestStore();
const db = store.db;
const fakeLlm = {
async embed(_text: string, _options?: { model?: string }) {
return { embedding: [0.1, 0.2, 0.3], model: "fake-embed" };
},
async embedBatch(texts: string[], _options?: { model?: string }) {
return texts.map((_text, index) => index === 0
? { embedding: [1, 2, 3], model: "fake-embed" }
: null
);
},
};
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.llm = fakeLlm as any;
try {
await insertTestDocument(db, "docs", {
name: "retry-doc",
body: "# Retry doc\n\n" + "transient embedding failure ".repeat(260),
});
const result = await generateEmbeddings(store);
expect(result.errors).toBe(0);
expect(result.failures).toEqual([]);
expect(db.prepare(`SELECT COUNT(*) as count FROM content_vectors`).get()).toEqual({ count: result.chunksEmbedded });
expect(store.getHashesNeedingEmbedding()).toBe(0);
} finally {
setDefaultLlamaCpp(null);
await cleanupTestDb(store);
}
});
test("generateEmbeddings opens a long-lived LLM session for embed runs", async () => {
const store = await createTestStore();
const fakeLlm = createFakeEmbedLlm();
const sessionSpy = vi.spyOn(llmModule, "withLLMSessionForLlm");
setDefaultLlamaCpp(createFakeTokenizer() as any);
store.llm = fakeLlm as any;
try {
await insertTestDocument(store.db, "docs", { name: "one", body: "# One\n\nAlpha" });
await generateEmbeddings(store);
expect(sessionSpy).toHaveBeenCalledWith(
fakeLlm,
expect.any(Function),
expect.objectContaining({ maxDuration: 30 * 60 * 1000, name: "generateEmbeddings" }),
);
} finally {
sessionSpy.mockRestore();
setDefaultLlamaCpp(null);
await cleanupTestDb(store);
}
});
test("vectorSearchQuery uses the active llm embed model for vector lookups", async () => {
const store = await createTestStore();
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
store.llm = { embedModelName: model } as any;
store.searchVec = searchVecSpy as any;
store.expandQuery = vi.fn(async () => []) as any;
try {
await vectorSearchQuery(store, "custom query", { limit: 7, minScore: 0 });
expect(searchVecSpy).toHaveBeenCalledTimes(1);
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("custom query");
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
expect(searchVecSpy.mock.calls[0]?.[2]).toBe(7);
} finally {
await cleanupTestDb(store);
}
});
test("hybridQuery uses the active llm embed model for precomputed vector lookups", async () => {
const store = await createTestStore();
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
const embedBatchSpy = vi.fn(async (texts: string[]) => texts.map(() => ({
embedding: [1, 2, 3],
model,
})));
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
store.llm = {
embedModelName: model,
embedBatch: embedBatchSpy,
} as any;
store.searchVec = searchVecSpy as any;
store.searchFTS = vi.fn(() => []) as any;
store.expandQuery = vi.fn(async () => []) as any;
try {
await hybridQuery(store, "hybrid query", { limit: 5, minScore: 0, skipRerank: true });
expect(embedBatchSpy).toHaveBeenCalledTimes(1);
expect(searchVecSpy).toHaveBeenCalledTimes(1);
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("hybrid query");
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
expect(searchVecSpy.mock.calls[0]?.[5]).toEqual([1, 2, 3]);
} finally {
await cleanupTestDb(store);
}
});
test("structuredSearch uses the active llm embed model for precomputed vector lookups", async () => {
const store = await createTestStore();
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
const embedBatchSpy = vi.fn(async (texts: string[]) => texts.map(() => ({
embedding: [1, 2, 3],
model,
})));
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
store.llm = {
embedModelName: model,
embedBatch: embedBatchSpy,
} as any;
store.searchVec = searchVecSpy as any;
try {
await structuredSearch(store, [{ type: "vec", query: "structured query" }], {
limit: 5,
minScore: 0,
skipRerank: true,
});
expect(embedBatchSpy).toHaveBeenCalledTimes(1);
expect(searchVecSpy).toHaveBeenCalledTimes(1);
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("structured query");
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
expect(searchVecSpy.mock.calls[0]?.[5]).toEqual([1, 2, 3]);
} finally {
await cleanupTestDb(store);
}
});
test("generateEmbeddings rejects invalid batch limits", async () => {
const store = await createTestStore();

View File

@ -361,17 +361,73 @@ describe("lex query syntax", () => {
expect(validateSemanticQuery("what is the CAP theorem")).toBeNull();
});
test("rejects negation syntax", () => {
test("rejects negation at start of query", () => {
expect(validateSemanticQuery("-redis connection pooling")).toContain("Negation");
});
test("rejects negation after space", () => {
expect(validateSemanticQuery("performance -sports")).toContain("Negation");
});
test("rejects negated quoted phrase", () => {
expect(validateSemanticQuery('-"exact phrase"')).toContain("Negation");
});
test("rejects multiple negations", () => {
expect(validateSemanticQuery("error handling -java -python")).toContain("Negation");
});
test("rejects negation after leading whitespace", () => {
expect(validateSemanticQuery(" -term at start")).toContain("Negation");
});
test("rejects negation after tab", () => {
expect(validateSemanticQuery("foo\t-bar")).toContain("Negation");
});
test("accepts hyphenated compound words", () => {
expect(validateSemanticQuery("long-lived server shared across clients")).toBeNull();
expect(validateSemanticQuery("real-time voice processing pipeline")).toBeNull();
expect(validateSemanticQuery("how does the rate-limiter handle burst traffic")).toBeNull();
expect(validateSemanticQuery("self-hosted deployment options")).toBeNull();
expect(validateSemanticQuery("multi-client session architecture")).toBeNull();
expect(validateSemanticQuery("cross-platform compatibility")).toBeNull();
expect(validateSemanticQuery("non-blocking I/O model")).toBeNull();
expect(validateSemanticQuery("in-memory caching strategy")).toBeNull();
expect(validateSemanticQuery("write-ahead log for crash recovery")).toBeNull();
expect(validateSemanticQuery("copy-on-write semantics")).toBeNull();
});
test("accepts multiple hyphens in a phrase", () => {
expect(validateSemanticQuery("state-of-the-art embedding models")).toBeNull();
expect(validateSemanticQuery("end-to-end testing")).toBeNull();
expect(validateSemanticQuery("man-in-the-middle attack prevention")).toBeNull();
});
test("accepts multiple hyphenated words in one query", () => {
expect(validateSemanticQuery("built-in vs add-on features")).toBeNull();
});
test("accepts short hyphenated terms", () => {
expect(validateSemanticQuery("A-B testing for ML models")).toBeNull();
expect(validateSemanticQuery("e-commerce platform")).toBeNull();
});
test("accepts bare hyphen without word character", () => {
expect(validateSemanticQuery("-")).toBeNull();
});
test("accepts hyde-style hypothetical answers", () => {
expect(validateSemanticQuery(
"The CAP theorem states that a distributed system cannot simultaneously provide consistency, availability, and partition tolerance."
)).toBeNull();
});
test("accepts hyde with hyphenated words", () => {
expect(validateSemanticQuery(
"HTTP transport runs a single long-lived daemon shared across all clients, avoiding per-session model re-loading."
)).toBeNull();
});
});
describe("validateLexQuery", () => {

View File

@ -4,7 +4,7 @@
"noEmit": false,
"outDir": "dist",
"declaration": true,
"noImplicitAny": false
"noImplicitAny": true
},
"include": ["src/**/*.ts"],
"exclude": ["src/**/*.test.ts", "src/test-preload.ts", "src/bench-*.ts"]