Merge branch 'main' into feat-nvidia-embedding-remote-sync
commit
b19f486d50
5
.github/workflows/publish.yml
vendored
@ -32,13 +32,12 @@ jobs:
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          node-version: 24
          registry-url: https://registry.npmjs.org
          package-manager-cache: false

      - run: npm run build
      - run: npm publish --provenance --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

      - name: Extract release notes
        id: notes
204
CHANGELOG.md
@ -19,6 +19,198 @@
and query expansion models.
- Embedding: use approximate token counts in external embedding mode so
  chunking does not load a local GGUF tokenizer.

### Documentation

- README: documented collection filtering (`-c` semantics), the
  `collection show`/`include`/`exclude`/`update-cmd` subcommands, the
  `--intent`/`--no-rerank`/`-C`/`--full-path` search flags, the `--format <kind>`
  output selector (with the legacy `--json`/`--csv`/`--md`/`--xml`/`--files`
  booleans noted as aliases), `vector-search`/`deep-search` aliases, embed
  memory flags (`--max-docs-per-batch`/`--max-batch-mb`), a sample `--explain`
  score trace, the `qmd doctor`/`qmd init` commands, the `get` `:from:count`
  suffix and `--no-line-numbers`, an MCP tool parameter reference, and a
  Benchmarking section for `qmd bench`.
- docs/SYNTAX.md: removed the non-existent `q` MCP parameter example (the `query`
  tool and REST endpoint accept only the `searches` array) and added a Scoping
  section.
- README: removed the misleading `qmd update --pull` example. The `--pull` flag is
  parsed but never consumed (`updateCollections()` ignores it); the real mechanism
  for running `git pull` before re-indexing is a per-collection `update` command,
  set via `qmd collection update-cmd`.

### Fixed

- MCP server instructions now tell agents to scope with the plural `collections`
  parameter (matching the schema). The previous singular `collection` hint led
  agents to pass a parameter that Zod silently strips, producing unscoped results.
  The `get` instruction line also now documents the full `file.md:from:count`
  range suffix instead of only the single-line `file.md:100` offset.

- Filesystem paths with special characters (`#`, `&`, spaces, `[]`, `()`, etc.)
  now round-trip correctly through index → search → get. Previously
  `reindexCollection` called `handelize()` on relative paths before storing
  them, turning `# Meeting - 234232 3432 __ 5.md` into
  `Meeting-234232-3432-5.md` and making `qmd get <actual-path>`,
  `qmd get --full-path`, and `qmd ls` return dead or garbled paths. Paths are
  now stored verbatim. Existing indexes auto-migrate on the next `qmd update`.

- FTS5 search now correctly matches dotted version strings like `2026.4.10`. The
  `porter unicode61` tokenizer splits on dots (storing `2026`, `4`, `10` as
  separate tokens), but the query sanitizer was stripping dots and producing
  `2026410` which never matched. Dotted terms are now split and ANDed together
  so version-string searches work as expected (#563).
- HTTP REST endpoints `/query` and `/search` now return `qmd://collection/path`
  URIs in the `file` field, matching the output format used by the CLI and MCP
  resource URIs. Previously the raw `displayPath` (`collection/path`) was
  returned without the scheme prefix (#576).
- The embed session `maxDuration` is now env-configurable via
  `QMD_EMBED_MAX_DURATION_MS` (default: 30 min). This prevents large-corpus
  embeddings from being aborted by the hardcoded 30-minute ceiling (#673).

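The dotted-version fix can be pictured with a small sketch. This is not QMD's actual sanitizer: `buildFtsQuery` is a hypothetical helper, and treating whitespace-separated terms as ANDed is an assumption. It only shows the idea of splitting each term on dots and joining the quoted pieces with FTS5's `AND` operator instead of collapsing them into one token.

```javascript
// Hypothetical helper for illustration; not QMD's real query sanitizer.
// Splits each whitespace-separated term on dots, quotes every piece as an
// FTS5 string (doubling embedded quotes), and joins the pieces with AND.
function buildFtsQuery(input) {
  return input
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .flatMap((term) => term.split(".").filter(Boolean))
    .map((token) => `"${token.replace(/"/g, '""')}"`)
    .join(" AND ");
}

console.log(buildFtsQuery("2026.4.10")); // "2026" AND "4" AND "10"
```

With the old behavior described above, the same input would have become the single token `2026410`, which is never present in the index.
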
## [2.5.3] - 2026-05-28

### Features

- `qmd get` now accepts a `:from:count` suffix on a path or docid (e.g.
  `qmd get "#abc123:120:40"` reads 40 lines starting at line 120). Explicit
  `--from`/`-l` flags still override the suffix. The MCP `get` tool accepts the
  same suffix.
- `qmd get` and `qmd multi-get` are now **line-numbered by default** and print
  the document's `#docid` and `qmd://` path in the output header. Disable line
  numbers with `--no-line-numbers`. The MCP `get`/`multi_get` tools default
  `lineNumbers` to `true` to match.
- `qmd multi-get` now includes the `#docid` in every output format
  (`--md`, `--json`, `--csv`, `--xml`, `--files`, and the default CLI view),
  consistent with `qmd search`.
- `qmd get` and `qmd multi-get` accept `--full-path`, which replaces the
  `qmd://` path + `#docid` with the document's on-disk filesystem path (handy for
  piping into `Read`/`Edit`/an editor). Falls back to the canonical `qmd://` +
  docid header when the file no longer exists on disk.
- `qmd search` / `qmd query` now show a clearer hit identifier: the default CLI
  view (and the new `**file:**` line in `--md` output) always prints the full
  `qmd://collection/path` URI so you can pipe it straight back into `qmd get`.
- `qmd search` / `qmd query` accept `--full-path` with the same semantics as
  `qmd get`: the result label becomes the file's on-disk path — `./`-prefixed
  relative path when the file lives in a subfolder of `$PWD`, absolute realpath
  otherwise — and the per-result `#docid` is dropped because the path is the
  identifier. The leading `./` is intentional so the output is unambiguously a
  filesystem path. Applies to all output formats.
- `qmd get` and `qmd multi-get` now also use the `./`-prefixed convention when
  `--full-path` renders a path under `$PWD`, matching `search`/`query`.
- New `--format <kind>` flag selects the output format (`cli` | `json` | `csv` |
  `md` | `xml` | `files`) for `search`, `query`, and `multi-get`. The legacy
  boolean aliases (`--json`/`--csv`/`--md`/`--xml`/`--files`) still work but are
  no longer in `--help`; prefer `--format`.

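To make the `:from:count` suffix concrete, here is a minimal sketch of how such a target could be parsed. `parseGetTarget` and its `flags` object are hypothetical names, not QMD's implementation; the only behavior taken from the changelog is the suffix shape and the rule that explicit flags override it.

```javascript
// Hypothetical parser for "<path-or-docid>[:from[:count]]"; illustration only.
function parseGetTarget(target, flags = {}) {
  // Lazy prefix, then an optional ":<from>" that may carry ":<count>".
  const match = /^(.+?)(?::(\d+)(?::(\d+))?)?$/.exec(target);
  if (!match) throw new Error(`Cannot parse get target: ${target}`);
  const [, file, from, count] = match;
  return {
    file,
    // Explicit flags win over the suffix, as the changelog describes.
    from: flags.from ?? (from ? Number(from) : undefined),
    maxLines: flags.maxLines ?? (count ? Number(count) : undefined),
  };
}

console.log(parseGetTarget("#abc123:120:40"));
console.log(parseGetTarget("notes/meeting.md:120:40", { from: 10 }));
```

The first call yields `file: "#abc123"`, `from: 120`, `maxLines: 40`; in the second, the explicit `from: 10` replaces the suffix value while the count of 40 is kept.
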
### Fixes

- Launcher: source-mode runner selection now prefers Node + tsx over Bun when
  both `package-lock.json` and `bun.lock` are present in the package root,
  mirroring the dist-mode "npm priority" rule. Fixes pnpm-global installs that
  copy the entire working tree (including `.git` and `bun.lock`) into the
  install dir and previously routed through Bun, causing ABI mismatches with
  the Node-built `better-sqlite3` / `sqlite-vec` native modules.
- Darwin Metal: llama-using commands (`query`, `vsearch`, `embed`) no longer
  dump a multi-kB GGML/Metal backtrace at process exit even when output
  succeeded. The libggml-metal static `ggml_metal_device` destructor asserts
  `[rsets->data count] == 0` during `__cxa_finalize_ranges`, but the
  buffer-free path never calls the symmetric `ggml_metal_device_rsets_rm`
  to remove released rsets from the device collection (upstream
  ggml-org/llama.cpp#22593, one-line fix open as PR #22595). The assertion
  only fires when `process.exit()` skips Node's `beforeExit` hook, which is
  what node-llama-cpp uses to auto-dispose Metal contexts. Primary fix:
  `finishSuccessfulCliCommand` now sets `process.exitCode = 0` and returns
  instead of calling `process.exit(0)`, so `beforeExit` fires and the native
  binding cleans up before libc's static destructor runs. Defense-in-depth:
  the launcher (`bin/qmd`) and the npm test driver (`scripts/test-all.mjs`
  + the `test:bun` / `test:unit` package.json scripts) also set
  `GGML_METAL_NO_RESIDENCY=1` on darwin before spawning node/bun, covering
  error paths and tests that still terminate via `process.exit()`. The env
  var must be set before node/bun start — libggml-metal reads it via libc
  `getenv` at module-load time, and Bun does not propagate `process.env`
  mutations to libc `setenv` — so it lives in the launcher rather than in
  test-preload. Residency sets give no measurable speedup for QMD's
  short-lived CLI workflow (benchmarked on M3 Pro). Opt back in with
  `QMD_METAL_KEEP_RESIDENCY=1` for long-lived qmd processes (e.g. the MCP
  daemon may benefit on hot reload) or to triage the upstream fix.
  `qmd doctor` reports the mitigation state. Minimal reproduction:
  `scripts/repro-metal-rsets-crash.mjs`.

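The difference between `process.exitCode` and `process.exit()` described in the Metal fix can be sketched in a few lines. The function name below is a hypothetical stand-in, not the real `finishSuccessfulCliCommand`, and the `beforeExit` listener only simulates the cleanup that a native binding would register.

```javascript
// Stand-in for a native binding's auto-dispose hook (simulated here).
process.once("beforeExit", () => {
  console.log("cleanup ran before exit");
});

// Hypothetical equivalent of the "primary fix": record the status and return,
// letting the event loop drain so "beforeExit" listeners still run.
function finishSuccessfully() {
  process.exitCode = 0;
  // Calling process.exit(0) here instead would terminate immediately and
  // Node would not emit "beforeExit" at all.
}

finishSuccessfully();
```

Run as a script, this prints the cleanup message and exits with status 0; swapping in `process.exit(0)` ends the process without the message.
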
### Docs

- qmd skill: emphasize reading line ranges with `get`'s built-in
  `:from:count` suffix / `--from`/`-l` flags instead of piping through
  `sed`/`head`/`tail`; cite the docid and line numbers now present in retrieval
  output; and author structured `intent:`/`lex:`/`vec:`/`hyde:` queries yourself
  rather than relying on built-in query expansion.

## [2.5.2] - 2026-05-22

### Fixes

- Launcher: Rewrite `bin/qmd` as a Node-based shebang polyglot to fix global npm installation execution failures on Windows (#668 / #452), while supporting seamless fallback to Bun in Node-less environments.

## [2.5.1] - 2026-05-20

### Changes

- Release: publish from GitHub Actions via npm Trusted Publishing/OIDC instead of a long-lived `NPM_TOKEN` secret.

## [2.5.0] - 2026-05-19

### Changes

- Dependencies: update core SQLite/config/chunking packages (`better-sqlite3`, `yaml`, `web-tree-sitter`, `tree-sitter-go`, and `tree-sitter-python`) while keeping incompatible `zod`, `tsx`, and `vitest` majors pinned.
- Agent skills: add `qmd skills list|get|path` to serve version-matched runtime skill instructions from the installed CLI, and make `qmd skill install` write a stable discovery stub so installed agent skills do not go stale after QMD upgrades.
- CLI: add `qmd doctor` for index/runtime diagnostics, including SQLite/sqlite-vec versions, embedding fingerprint freshness, mixed-fingerprint detection, safe legacy fingerprint adoption, and content-hash sampling.

### Fixes

- Launcher: prefer runnable TypeScript source in git checkouts even when ignored `dist/` artifacts exist, while packaged installs continue to run `dist/`.
- GPU: keep node-llama-cpp's documented `gpu: "auto"` initialization as the primary path, then perform no-build packaged CUDA/Vulkan/Metal probes only if auto falls back to CPU.
- CLI: move GPU/CPU runtime diagnostics out of `qmd status`; use `qmd doctor` for device probing and related environment guidance.
- CLI: point unexpected command/setup failures toward `qmd doctor` so diagnostics are the default next step when QMD behaves incorrectly.
- Doctor: explicitly warn when `content_vectors` contains multiple non-empty embedding fingerprint names, with the per-fingerprint document/chunk breakdown.
- Embed: make the TTY progress line label byte-based input progress explicitly, show embedded chunks as a count, and shorten the displayed model name.
- Embed: retain per-chunk failure details, retry failed chunks after later successful embeds and again when no other chunks remain, clear recovered errors, and cap retries to avoid endless loops.
- Tests: expand the container smoke harness to cover npm-global, npx-style, and Bun-global install scenarios, always checking auto and `QMD_FORCE_CPU=1` doctor modes, with opt-in tiny `qmd embed` and GPU probe runs for supported container runtimes.
- Embedding: fingerprint vector metadata using the active embedding model and formatting/chunking parameters so stale vectors are treated as pending after search semantics change. Legacy `content_vectors` columns are migrated lazily on first vector-health/write use to preserve fast QMD startup.

- Skill: expand the packaged QMD skill with retrieval-first workflows, structured query examples, wiki/source collection guidance, and safe fallbacks when model-backed search is unavailable.
- Tests: make `bun run test` execute the local unit suite under both Node/Vitest and Bun (`test:node` + `test:bun`) so runtime-specific regressions are caught before CI.
- Model config: centralize embedding/rerank/generation model resolution so `qmd embed`, `status`, `query`, `vsearch`, `pull`, SDK vector search, and `bench` use the same active `.qmd/index.yaml` model hints and environment fallbacks.
- GPU/status: `qmd status` now uses the same embedding model identity as `qmd embed` when computing pending embeddings, so URI-backed embeddings are not incorrectly reported as pending under the legacy `embeddinggemma` alias.
- GPU status: `qmd status` now always shows GPU mode/configuration without unsafe native probing, and CPU-fallback warnings point to `QMD_STATUS_DEVICE_PROBE=1 qmd status` for an actual backend probe. The no-GPU warning is emitted once per process instead of once per LLM instance during benchmarks.
- GPU: add `QMD_FORCE_CPU=1` / `--no-gpu` to bypass CUDA/Vulkan/Metal probing entirely, and route native llama.cpp stdout noise to stderr so JSON output stays parseable during search/query commands.
- Snippet line numbers: `qmd_query` (MCP), HTTP `/query`, and `qmd query`
  (CLI JSON output and snippet headers) now return absolute source-file
  line numbers instead of chunk-local ones, so the `line` field can be
  passed back to `qmd_get` as `fromLine` without a separate lookup.
  Snippet selection remains scoped to the best matching chunk
  (preserves #149).
- CLI: `qmd query --full` now emits the full document body in all output
  formats (json, csv, md, xml), restoring the documented behavior of the
  flag. Previously it returned only the best matching chunk (~3.6KB max
  per result). Output payload for `--full` queries is now proportional
  to total document size.
- macOS Metal: `qmd query --json` now flushes successful JSON output and uses a safe immediate-exit path on Darwin to avoid ggml Metal finalizer aborts; other commands still dispose LLM contexts/models before the llama runtime. #368
- Embedding: require complete chunk coverage before treating a document as
  embedded, remove partial vectors when chunk/session failures leave a
  document incomplete, and keep `qmd status` pending counts honest after
  interrupted long embed runs. #637 #378
- Embedding: `qmd embed -c <collection>` now scopes pending-doc selection
  to the requested collection instead of embedding global pending work.
  Scoped `--force` clears only collection-owned vectors, preserves shared
  hashes referenced by sibling collections, and drops `vectors_vec` only
  when the scoped clear empties all vectors.
- Hybrid search: weight RRF lists by query type so original FTS and original vector evidence get the intended 2x boost, instead of accidentally boosting the first lexical expansion. #591
- MCP: seed llama.cpp/GGML quiet env vars before launching `qmd mcp` so native logs cannot pollute stdio JSON-RPC framing. #593
- CLI: remove CommonJS `require()` calls from ESM index path normalization so `qmd --index <path>` no longer crashes with `ERR_AMBIGUOUS_MODULE_SYNTAX` on Node 22+. #634
- Windows CUDA: serialize llama.cpp embedding/reranking contexts by default to avoid intermittent `ggml-cuda.cu:98` crashes in `qmd query`; set `QMD_EMBED_PARALLELISM` to opt back into parallel contexts if your driver is stable. #519
- MCP: make `qmd mcp --index <name>` use the selected index for both foreground and daemon HTTP servers instead of falling back to the default store. #343
- Embedding: respect `QMD_EMBED_MODEL` consistently for vector indexing and vector-backed search, with default-model fallback when unset.
- Config: use one home-directory resolver for YAML config and the default SQLite cache path, avoiding Windows CLI/MCP split-brain when `HOME` is unset.
- GPU: respect explicit `QMD_LLAMA_GPU=metal|vulkan|cuda` backend overrides instead of always using auto GPU selection. #529
- Fix: preserve original filename case in `handelize()`. The previous
  `.toLowerCase()` call made indexed paths unreachable on case-sensitive
@ -27,6 +219,18 @@
- CLI: make `qmd status` skip native `node-llama-cpp` device probing by
  default so status stays safe on machines with broken or unsupported GPU
  drivers. Set `QMD_STATUS_DEVICE_PROBE=1` to opt in.
- CLI: lazy-load `node-llama-cpp` so lightweight commands such as
  `qmd status` do not import native ML dependencies or trigger llama.cpp
  builds on ARM/no-GPU machines. #491
- Store: keep content rows referenced by inactive documents during orphan
  cleanup so `qmd update` preserves soft-deleted tombstones for removed
  files. #585
- Packaging: install AST grammar WASM packages as required dependencies so
  Bun global installs include TypeScript/TSX/JavaScript grammars, and add a
  `smoke:package-grammars` verification command. #595
- Launcher: add wrapper smoke coverage for scoped package, npm/npx,
  Homebrew/Linuxbrew, Bun global symlink layouts, and `$BUN_INSTALL`
  false-positive runtime selection regressions. #351 #353 #354 #356 #358 #359

## [2.1.0] - 2026-04-05

195
README.md
@ -135,6 +135,30 @@ LLM models stay loaded in VRAM across requests. Embedding/reranking contexts are

Point any MCP client at `http://localhost:8181/mcp` to connect.

#### MCP Tool Parameters

| Tool | Parameter | Type | Notes |
|------|-----------|------|-------|
| `query` | `searches` | array | Typed sub-queries (`lex`/`vec`/`hyde`), 1–10. **Required.** First gets 2x weight. |
| `query` | `collections` | string[] | Filter by collection names (OR). **Array only** — singular `collection` is silently ignored. |
| `query` | `intent` | string | Disambiguation context (does not search on its own) |
| `query` | `limit` | number | Max results (default 10) |
| `query` | `minScore` | number | Minimum relevance 0–1 (default 0) |
| `query` | `candidateLimit` | number | Max candidates to rerank (default 40) |
| `query` | `rerank` | boolean | Run LLM reranking (default **true**); set false for RRF-only |
| `get` | `file` | string | Path, docid (`#abc123`), or `path:from:count` (e.g. `#abc123:120:40`) |
| `get` | `fromLine` | number | Start line (1-indexed); overrides the `:from` suffix |
| `get` | `maxLines` | number | Limit returned lines |
| `get` | `lineNumbers` | boolean | Prefix lines with numbers (default **true**) |
| `multi_get` | `pattern` | string | Glob pattern or comma-separated list |
| `multi_get` | `maxBytes` | number | Skip files larger than N (default 10240) |
| `multi_get` | `maxLines` | number | Limit lines per file |
| `multi_get` | `lineNumbers` | boolean | Prefix lines with numbers (default **true**) |

Unknown parameters are silently ignored (not rejected) — double-check names if
results seem unscoped. The HTTP `/query` and `/search` endpoints return
`qmd://collection/path` URIs in the `file` field, matching the CLI and MCP output.

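Because unknown parameters are dropped rather than rejected, a caller may want to check names before sending a request. The guard below is a hypothetical client-side sketch, not part of QMD: the allowed names come from the `query` rows of the parameter table, while the function name and the shape of the `searches` entries are assumptions made for illustration.

```javascript
// Parameter names copied from the `query` rows of the table; the guard itself
// is a hypothetical client-side helper.
const QUERY_PARAMS = new Set([
  "searches", "collections", "intent", "limit",
  "minScore", "candidateLimit", "rerank",
]);

function assertKnownQueryParams(args) {
  const unknown = Object.keys(args).filter((key) => !QUERY_PARAMS.has(key));
  if (unknown.length > 0) {
    throw new Error(`Unknown query parameter(s): ${unknown.join(", ")}`);
  }
  return args;
}

// The entry shape inside `searches` is assumed here, not documented above.
const args = assertKnownQueryParams({
  searches: [{ type: "lex", query: "auth" }],
  collections: ["notes"],
});
console.log(Object.keys(args).join(","));
```

Passing `collection: "notes"` (singular) to this guard throws, which surfaces the mistake that would otherwise produce silently unscoped results.
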
### SDK / Library Usage

Use QMD as a library in your own Node.js or Bun applications.
@ -575,6 +599,17 @@ qmd collection rename myproject my-project
# List files in a collection
qmd ls notes
qmd ls notes/subfolder

# Show collection details (path, glob mask, include status, context count)
qmd collection show notes

# Include or exclude a collection from default (unscoped) queries
qmd collection include notes
qmd collection exclude notes

# Run a command before every `qmd update` (e.g. git pull); empty arg clears it
qmd collection update-cmd notes 'git pull --rebase'
qmd collection update-cmd notes
```

### Generate Vector Embeddings
@ -591,6 +626,10 @@ qmd embed --chunk-strategy auto

# Also works with query for consistent chunk selection
qmd query "auth flow" --chunk-strategy auto

# Memory control for large corpora / constrained systems
qmd embed --max-docs-per-batch 50 # cap docs per embedding batch
qmd embed --max-batch-mb 64 # cap batch size in MB
```

**AST-aware chunking** (`--chunk-strategy auto`) uses tree-sitter to chunk code
@ -652,6 +691,9 @@ qmd vsearch "how to login"
qmd query "user authentication"
```

Two aliases exist for the semantic/hybrid modes: `vector-search` (→ `vsearch`)
and `deep-search` (→ `query`).

### Options

```sh
@ -664,24 +706,45 @@ qmd query "user authentication"
--line-numbers # Add line numbers to output
--explain # Include retrieval score traces (query, JSON/CLI output)
--index <name> # Use named index
--intent "<text>" # Disambiguation context (e.g. "web page load times")
--no-rerank # Skip LLM reranking (RRF scores only; faster on CPU)
-C, --candidate-limit <n> # Max candidates to rerank (default: 40)
--full-path # Emit on-disk filesystem paths instead of qmd:// URIs

# Output formats (for search and multi-get)
--files # Output: docid,score,filepath,context
--json # JSON output with snippets
--csv # CSV output
--md # Markdown output
--xml # XML output
--format <kind> # cli (default) | json | csv | md | xml | files
# (--json, --csv, --md, --xml, --files are legacy aliases)

# Get options
qmd get <file>[:line] # Get document, optionally starting at line
-l <num> # Maximum lines to return
--from <num> # Start from line number
qmd get <file>[:from[:count]] # Get document; optional start line and count
-l <num> # Maximum lines to return
--from <num> # Start line (overrides the :from suffix)
--no-line-numbers # Disable line numbering (on by default)

# Multi-get options
-l <num> # Maximum lines per file
--max-bytes <num> # Skip files larger than N bytes (default: 10KB)
```

### Collection Filtering

The `-c`/`--collection` flag filters results by collection **name** (as shown by
`qmd collection list`). Collections are a global registry — you can search any
collection from any directory:

```sh
qmd search "auth" -c notes # single collection
qmd search "auth" -c notes -c docs # multiple collections (OR)
```

With no `-c` flag, all default-included collections are searched. Collections
marked excluded (`qmd collection exclude <name>`) are skipped unless named
explicitly with `-c`.

> **Note:** With multiple `-c` flags, results come from a global top-K pool and are
> then filtered. If one collection dominates the rankings, matches from smaller
> collections may not appear at the default limit — raise `-n` or use `--all`.

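The note about the global top-K pool is easier to see with toy data. The sketch below is an illustration of the ordering problem only, with made-up scores and a hypothetical `topKThenFilter` function; it does not reproduce QMD's ranking code.

```javascript
// Take the top-k hits first, then keep only the requested collections.
function topKThenFilter(hits, collections, k) {
  return [...hits]
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .filter((hit) => collections.includes(hit.collection));
}

// Made-up scores: "notes" dominates the ranking, "docs" has one weaker match.
const hits = [
  { file: "notes/a.md", collection: "notes", score: 0.9 },
  { file: "notes/b.md", collection: "notes", score: 0.8 },
  { file: "notes/c.md", collection: "notes", score: 0.7 },
  { file: "docs/auth.md", collection: "docs", score: 0.6 },
];

console.log(topKThenFilter(hits, ["notes", "docs"], 3).map((h) => h.file));
console.log(topKThenFilter(hits, ["notes", "docs"], 4).map((h) => h.file));
```

With a limit of 3 the `docs` match never enters the pool; raising the limit to 4 brings it back, which is the effect of raising `-n` described in the note.
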
### Output Format

Default output is colorized CLI format (respects `NO_COLOR` env).
@ -759,17 +822,48 @@ qmd query --json --explain "quarterly reports"
qmd --index work search "quarterly reports"
```

The `--explain` flag attaches a score breakdown to each result: the FTS/vector
backend scores plus the RRF fusion math (rank, weight, top-rank bonus) and every
sub-query's contribution. Abbreviated:

```json
{
  "docid": "#6c90f0",
  "score": 0.89,
  "file": "qmd://qmd/README.md",
  "explain": {
    "ftsScores": [0.892, 0.907],
    "vectorScores": [0.540, 0.484],
    "rrf": {
      "rank": 1,
      "weight": 0.75,
      "baseScore": 0.123,
      "topRankBonus": 0.05,
      "totalScore": 0.173,
      "contributions": [
        { "source": "fts", "queryType": "original", "query": "reranking",
          "rank": 1, "weight": 2, "backendScore": 0.892, "rrfContribution": 0.0328 }
      ]
    }
  }
}
```

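The numbers in the sample trace are consistent with the usual weighted reciprocal rank fusion formula, `weight / (k + rank)`. The sketch below assumes a smoothing constant of `k = 60`, which is inferred from the sample (`2 / (60 + 1) ≈ 0.0328`) rather than stated anywhere in this README, and assumes `totalScore` is `baseScore` plus `topRankBonus`, as the sample's `0.123 + 0.05 = 0.173` suggests. Function names are hypothetical.

```javascript
// Assumed smoothing constant, inferred from the sample trace only.
const RRF_K = 60;

// One list's contribution for a document at a 1-based rank.
function rrfContribution(rank, weight, k = RRF_K) {
  return weight / (k + rank);
}

// Sum the contributions, then add the top-rank bonus on top.
function fuse(contributions, topRankBonus = 0) {
  const baseScore = contributions.reduce(
    (sum, c) => sum + rrfContribution(c.rank, c.weight),
    0,
  );
  return { baseScore, totalScore: baseScore + topRankBonus };
}

console.log(rrfContribution(1, 2).toFixed(4)); // 0.0328
```

The weight of 2 on the original query is what gives it the "2x" influence mentioned in the parameter notes: the same rank contributes twice as much as it would from an expansion with weight 1.
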
### Index Maintenance

```sh
# Show index status and collections with contexts
qmd status

# Re-index all collections
# Re-index all collections. If a collection has a configured update command
# (e.g. `git pull`), it runs first — set one with `qmd collection update-cmd`.
qmd update

# Re-index with git pull first (for remote repos)
qmd update --pull
# Diagnose the install (runtime, sqlite-vec, embedding fingerprints, GPU probe)
qmd doctor

# Initialize a project-local index in the current directory
qmd init

# Get document by filepath (with fuzzy matching suggestions)
qmd get notes/meeting.md
@ -780,6 +874,13 @@ qmd get "#abc123"
# Get document starting at line 50, max 100 lines
qmd get notes/meeting.md:50 -l 100

# Read 40 lines starting at line 120 via the :from:count suffix (works with docids)
qmd get notes/meeting.md:120:40
qmd get "#abc123:120:40"

# get / multi-get are line-numbered by default; disable with --no-line-numbers
qmd get notes/meeting.md --no-line-numbers

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

@ -796,6 +897,75 @@ qmd multi-get "docs/*.md" --json
qmd cleanup
```

### Benchmarking

Measure search quality across all four backends with `qmd bench` and a fixture file
of queries with known-relevant documents.

**From a git checkout**, an example fixture and its test corpus ship in the repo:

```sh
# One-time setup (indexes the repo's test corpus into its own collection)
qmd collection add test/eval-docs --name eval-docs
qmd embed -c eval-docs

# Run the benchmark (table output)
qmd bench src/bench/fixtures/example.json

# JSON output for programmatic analysis
qmd bench src/bench/fixtures/example.json --json
```

> The example fixture (`src/bench/fixtures/example.json`) and its test corpus
> (`test/eval-docs/`) exist only in a git checkout — they are **not** part of the
> published npm package. If you installed via `npm`/`npx`, write your own fixture
> (see below) against a collection you have already indexed:
>
> ```sh
> qmd bench my-fixture.json -c my-collection
> ```

Each query runs against four backends, reporting precision@k, recall, MRR, and F1:

| Backend | What it tests | LLM required |
|---------|---------------|--------------|
| `bm25` | Keyword search only (FTS5) | No |
| `vector` | Semantic similarity only | Embedding model |
| `hybrid` | BM25 + vector fusion (no reranking) | Embedding model |
| `full` | Full pipeline with LLM reranking | All three models |

**Score interpretation:** `1.00` = perfect (all expected docs in top results),
`0.00` = complete miss. The example fixture typically shows bm25 ~0.50, vector
~0.70, and hybrid/full ~1.00 — a concrete demonstration of why hybrid search beats
either backend alone.

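For readers unfamiliar with the reported metrics, here is a minimal sketch using the textbook definitions of precision@k, recall, reciprocal rank, and F1 for a single query. Whether `qmd bench` computes them exactly this way is not stated in this section, so treat `scoreQuery` and its field names as assumptions; MRR is the mean of the reciprocal rank over all queries.

```javascript
// Textbook per-query retrieval metrics; not necessarily qmd bench's formulas.
function scoreQuery(rankedFiles, expectedFiles, k) {
  const relevant = new Set(expectedFiles);
  const topK = rankedFiles.slice(0, k);
  const hits = topK.filter((file) => relevant.has(file)).length;

  const precision = k > 0 ? hits / k : 0;
  const recall = relevant.size > 0 ? hits / relevant.size : 0;
  // Reciprocal rank of the first relevant result (0 when none is found).
  const firstHit = rankedFiles.findIndex((file) => relevant.has(file));
  const reciprocalRank = firstHit === -1 ? 0 : 1 / (firstHit + 1);
  const f1 =
    precision + recall > 0 ? (2 * precision * recall) / (precision + recall) : 0;

  return { precision, recall, reciprocalRank, f1 };
}

const result = scoreQuery(
  ["docs/intro.md", "docs/auth-design.md", "docs/faq.md"],
  ["docs/auth-design.md"],
  3,
);
console.log(result);
```

In this example the one expected file is found at rank 2 within the top 3, so recall is 1, precision@3 is 1/3, and the reciprocal rank is 0.5.
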
**Custom fixtures** are JSON:

```json
{
  "description": "My benchmark",
  "version": 1,
  "collection": "my-collection",
  "queries": [
    {
      "id": "find-auth",
      "query": "authentication flow",
      "type": "semantic",
      "expected_files": ["docs/auth-design.md"],
      "expected_in_top_k": 3
    }
  ]
}
```

`expected_files` are collection-relative paths as shown by `qmd ls`. The `type`
field (`exact`, `semantic`, `topical`, `cross-domain`, `alias`) labels queries for
grouping — it does not change search behavior.

> **Heads-up:** if the fixture's collection isn't indexed, bench currently runs to
> completion and reports all zeros with no warning. Verify setup with
> `qmd ls <collection>` first.

## Data Storage

Index stored in: `~/.cache/qmd/index.sqlite`
@ -817,6 +987,9 @@ llm_cache -- Cached LLM responses (query expansion, rerank scores)
| Variable | Default | Description |
|----------|---------|-------------|
| `XDG_CACHE_HOME` | `~/.cache` | Cache directory location |
| `QMD_LLAMA_GPU` | `auto` | Force llama.cpp GPU backend (`metal`, `vulkan`, `cuda`) or disable GPU with `false` |
| `QMD_FORCE_CPU` | unset | Set to `1`/`true` to force CPU mode before any CUDA/Vulkan/Metal probing. Equivalent CLI flag: `--no-gpu`. |
| `QMD_EMBED_PARALLELISM` | automatic | Override embedding/reranking context parallelism (1-8). Windows CUDA defaults to `1` because parallel CUDA contexts can crash with `ggml-cuda.cu:98`; use Vulkan or raise this only if your driver is stable. |

## How It Works

162
bin/qmd
@ -31,3 +31,165 @@ elif [ -x "$HOME/.bun/bin/bun" ]; then
else
  exec node "$DIR/dist/cli/qmd.js" "$@"
fi
#!/usr/bin/env node
// 2>/dev/null; if command -v node >/dev/null 2>&1; then exec node "$0" "$@"; else exec bun "$0" "$@"; fi
// Cross-platform launcher for qmd.
//
// Previously this was a POSIX shell script with `#!/bin/sh`, which meant npm
// on Windows generated shims that tried to route through `/bin/sh` — a path
// that doesn't exist on Windows, so `qmd` failed immediately after a global
// install. Rewriting the launcher in Node.js lets npm generate native
// cmd/ps1/sh shims that invoke `node` directly on every platform.

import { spawn, spawnSync } from "node:child_process";
import { existsSync, realpathSync } from "node:fs";
import { dirname, resolve } from "node:path";
import { fileURLToPath } from "node:url";

// Resolve symlinks so global installs (npm link / npm install -g) can find
// the actual package directory instead of the global bin directory.
const self = realpathSync(fileURLToPath(import.meta.url));
const pkgDir = resolve(dirname(self), "..");
const jsEntry = resolve(pkgDir, "dist/cli/qmd.js");
const tsEntry = resolve(pkgDir, "src/cli/qmd.ts");

// MCP stdio reserves stdout exclusively for JSON-RPC frames. node-llama-cpp
// / llama.cpp / ggml can write native logs directly to stdout before JS-level
// log handlers are attached, so seed the native quiet env before Node/Bun imports
// the CLI and its LLM modules. Preserve explicit user values when provided.
if (process.argv[2] === "mcp") {
  process.env.LLAMA_LOG_LEVEL = process.env.LLAMA_LOG_LEVEL || "error";
  process.env.GGML_LOG_LEVEL = process.env.GGML_LOG_LEVEL || "error";
  process.env.GGML_BACKEND_SILENT = process.env.GGML_BACKEND_SILENT || "1";
}

// libggml-metal on macOS uses "residency sets" to keep allocated model memory
// resident across inference requests (180-second keep_alive timer). The
// process-static device destructor that runs during libc exit() asserts the
// residency set is empty (ggml-org/llama.cpp#22593); the keep_alive hasn't
// expired by exit, so the assertion fails and ggml_abort dumps a multi-kB
// stack trace to stderr even when the user-visible results were already
// emitted correctly. No JS-side dispose can prevent it because the static
// destructor runs in __cxa_finalize_ranges, after every JS-reachable cleanup.
//
// For QMD's short-lived CLI workflow, residency sets provide no observable
// performance benefit (subsequent requests don't reuse the warm mapping —
// measured: identical wall time with and without on M3 Pro), so disable them
// by default on darwin. The env var must be set BEFORE the native llama.cpp
// binding loads, which is why it lives here in the launcher rather than in
// the JS entry point. Opt back in with QMD_METAL_KEEP_RESIDENCY=1 if you
// run long-lived qmd processes (the MCP daemon may benefit on hot reload)
// or are triaging an upstream Metal teardown fix.
if (process.platform === "darwin" && process.env.QMD_METAL_KEEP_RESIDENCY !== "1") {
  process.env.GGML_METAL_NO_RESIDENCY = process.env.GGML_METAL_NO_RESIDENCY || "1";
}

function hasBun() {
  try {
    const res = spawnSync("bun", ["--version"], { stdio: "ignore", shell: process.platform === "win32" });
    return res.status === 0;
  } catch {
    return false;
  }
}

// In published packages, bin/qmd must run dist/. In a git checkout, however,
// dist/ is often ignored and can be stale after git reset or branch switches.
// Prefer source mode only for checkouts so ./bin/qmd reflects the checked-out
// source without changing packaged/runtime behavior.
//
// Critical: source-mode detection must NOT trigger when a package manager
// installed us. `pnpm install -g .` (and `npm install -g .`) copy the entire
||||
// working tree — including .git/, bun.lock, package-lock.json, src/, and even
|
||||
// node_modules/ — into <prefix>/node_modules/@tobilu/qmd/, so .git and a
|
||||
// lockfile being present is not a reliable "this is a working tree" signal.
|
||||
// What IS reliable: a package-manager install always lands the package
|
||||
// directory inside a `node_modules/` segment; a bare working-tree checkout
|
||||
// (with `bun link` or a direct path invocation) does not. Gate source mode
|
||||
// on that. Allow QMD_SOURCE_MODE=1 / =0 as an explicit override for the
|
||||
// rare case where the heuristic disagrees with the user.
|
||||
const sourceOverride = process.env.QMD_SOURCE_MODE;
|
||||
const looksInstalled = pkgDir.split("/").includes("node_modules");
|
||||
const sourceAllowed = sourceOverride === "1"
|
||||
|| (sourceOverride !== "0" && !looksInstalled);
|
||||
|
||||
let useSourceMode = false;
|
||||
let sourceRunner = null;
|
||||
let sourceArgs = [];
|
||||
|
||||
if (sourceAllowed && existsSync(resolve(pkgDir, ".git")) && existsSync(tsEntry)) {
|
||||
// Lockfile-driven runner selection — mirror the dist-mode logic below so
|
||||
// source mode picks the same runtime the user's deps were installed for.
|
||||
// package-lock.json wins over bun.lock when both are present: pnpm/npm
|
||||
// installs ship the Node-ABI native modules (better-sqlite3, sqlite-vec),
|
||||
// and running Bun against them produces ABI mismatches. This also fixes
|
||||
// pnpm-global installs, which copy the whole working tree — including .git
|
||||
// and bun.lock — into the install dir and used to route through Bun even
|
||||
// when the user installed via npm/pnpm.
|
||||
const hasNpmLock = existsSync(resolve(pkgDir, "package-lock.json"));
|
||||
const hasBunLock = existsSync(resolve(pkgDir, "bun.lock")) || existsSync(resolve(pkgDir, "bun.lockb"));
|
||||
const tsxEntry = resolve(pkgDir, "node_modules/tsx/dist/cli.mjs");
|
||||
const tsxAvailable = existsSync(tsxEntry);
|
||||
|
||||
if (hasNpmLock && tsxAvailable) {
|
||||
useSourceMode = true;
|
||||
sourceRunner = "node";
|
||||
sourceArgs = [tsxEntry, tsEntry, ...process.argv.slice(2)];
|
||||
} else if (hasBunLock && hasBun()) {
|
||||
useSourceMode = true;
|
||||
sourceRunner = "bun";
|
||||
sourceArgs = [tsEntry, ...process.argv.slice(2)];
|
||||
} else if (tsxAvailable) {
|
||||
useSourceMode = true;
|
||||
sourceRunner = "node";
|
||||
sourceArgs = [tsxEntry, tsEntry, ...process.argv.slice(2)];
|
||||
}
|
||||
}
|
||||
|
||||
if (!useSourceMode && !existsSync(jsEntry)) {
|
||||
console.error(`qmd is not built: missing ${jsEntry}`);
|
||||
console.error("Run: bun install && bun run build");
|
||||
console.error("Or: npm install && npm run build");
|
||||
console.error("After building, run: qmd doctor");
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Detect the package manager that installed dependencies by checking lockfiles.
|
||||
// $BUN_INSTALL is intentionally NOT checked — it only indicates that bun exists
|
||||
// on the system, not that it was used to install this package (see #361).
|
||||
//
|
||||
// package-lock.json takes priority: if it exists, npm installed the native
|
||||
// modules for Node. The repo ships bun.lock, so without this check, source
|
||||
// builds that use npm would be incorrectly routed to bun, causing ABI
|
||||
// mismatches with better-sqlite3 / sqlite-vec (see #381).
|
||||
let runnerName = "node";
|
||||
if (existsSync(resolve(pkgDir, "package-lock.json"))) {
|
||||
runnerName = "node";
|
||||
} else if (existsSync(resolve(pkgDir, "bun.lock")) || existsSync(resolve(pkgDir, "bun.lockb"))) {
|
||||
runnerName = "bun";
|
||||
} else {
|
||||
runnerName = "node";
|
||||
}
|
||||
|
||||
const runner = useSourceMode ? sourceRunner : (runnerName === "node" ? "node" : "bun");
|
||||
const args = useSourceMode ? sourceArgs : [jsEntry, ...process.argv.slice(2)];
|
||||
const needsShell = (runner === "bun") && process.platform === "win32";
|
||||
|
||||
const child = spawn(runner, args, {
|
||||
stdio: "inherit",
|
||||
shell: needsShell,
|
||||
});
|
||||
|
||||
child.on("exit", (code, signal) => {
|
||||
if (signal) {
|
||||
process.kill(process.pid, signal);
|
||||
} else {
|
||||
process.exit(code ?? 0);
|
||||
}
|
||||
});
|
||||
|
||||
child.on("error", (err) => {
|
||||
const name = useSourceMode ? sourceRunner : runnerName;
|
||||
console.error(`qmd: failed to launch ${name}: ${err.message}`);
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
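The source-mode gate and the lockfile-based runner choice described in the launcher comments can be restated as pure functions, which makes the precedence rules easier to check in isolation. This is a minimal sketch for illustration, not the launcher's actual code: the function names `looksInstalled`, `sourceAllowed`, and `pickDistRunner` are hypothetical, and the path check here splits on both `/` and `\`, which is a deliberate variation on the launcher's `split("/")`.

```javascript
// Hypothetical helpers that mirror the launcher's decision rules.

// True when any path segment is exactly "node_modules".
// Assumption: both "/" and "\" are treated as separators so that
// Windows-style paths are handled too (the launcher splits on "/").
function looksInstalled(pkgDir) {
  return pkgDir.split(/[\\/]+/).includes("node_modules");
}

// QMD_SOURCE_MODE=1 forces source mode on, =0 forces it off;
// any other value falls back to the node_modules heuristic.
function sourceAllowed(pkgDir, override) {
  if (override === "1") return true;
  if (override === "0") return false;
  return !looksInstalled(pkgDir);
}

// Dist-mode runner choice: package-lock.json wins over bun.lock,
// and node is the default when neither lockfile is present.
function pickDistRunner({ hasNpmLock, hasBunLock }) {
  if (hasNpmLock) return "node";
  if (hasBunLock) return "bun";
  return "node";
}
```

Keeping the decisions free of filesystem access means the caller supplies the `existsSync` results, so the precedence (npm lockfile first, then Bun lockfile, then the Node default) can be exercised without a real install tree.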
38
bun.lock
@ -6,13 +6,17 @@
|
||||
"name": "2025-12-07-bm25-q",
|
||||
"dependencies": {
|
||||
"@modelcontextprotocol/sdk": "1.29.0",
|
||||
"better-sqlite3": "12.8.0",
|
||||
"better-sqlite3": "12.10.0",
|
||||
"fast-glob": "3.3.3",
|
||||
"node-llama-cpp": "3.18.1",
|
||||
"picomatch": "4.0.4",
|
||||
"sqlite-vec": "0.1.9",
|
||||
"web-tree-sitter": "0.26.7",
|
||||
"yaml": "2.8.3",
|
||||
"tree-sitter-go": "0.25.0",
|
||||
"tree-sitter-python": "0.25.0",
|
||||
"tree-sitter-rust": "0.24.0",
|
||||
"tree-sitter-typescript": "0.23.2",
|
||||
"web-tree-sitter": "0.26.8",
|
||||
"yaml": "2.9.0",
|
||||
"zod": "4.2.1",
|
||||
},
|
||||
"devDependencies": {
|
||||
@ -26,10 +30,6 @@
|
||||
"sqlite-vec-linux-arm64": "0.1.9",
|
||||
"sqlite-vec-linux-x64": "0.1.9",
|
||||
"sqlite-vec-windows-x64": "0.1.9",
|
||||
"tree-sitter-go": "0.23.4",
|
||||
"tree-sitter-python": "0.23.4",
|
||||
"tree-sitter-rust": "0.24.0",
|
||||
"tree-sitter-typescript": "0.23.2",
|
||||
},
|
||||
"peerDependencies": {
|
||||
"typescript": "^5.9.3",
|
||||
@ -247,7 +247,7 @@
|
||||
|
||||
"base64-js": ["base64-js@1.5.1", "", {}, "sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA=="],
|
||||
|
||||
"better-sqlite3": ["better-sqlite3@12.8.0", "", { "dependencies": { "bindings": "^1.5.0", "prebuild-install": "^7.1.1" } }, "sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ=="],
|
||||
"better-sqlite3": ["better-sqlite3@12.10.0", "", { "dependencies": { "bindings": "^1.5.0", "prebuild-install": "^7.1.1" } }, "sha512-CyzaZRQKyHkB2ZInfTTl2nvT33EbDpjkLEbE8/Zck3Ll6O0qqvuGdrJ45HgtH+HykRg88ITY3AdreBGN70aBSQ=="],
|
||||
|
||||
"bindings": ["bindings@1.5.0", "", { "dependencies": { "file-uri-to-path": "1.0.0" } }, "sha512-p2q/t/mhvuOj/UeLlV6566GD/guowlr0hHxClI0W9m7MWYkL1F0hLo+0Aexs9HSPCtR1SXQ0TD3MMKrXZajbiQ=="],
|
||||
|
||||
@ -401,7 +401,7 @@
|
||||
|
||||
"get-proto": ["get-proto@1.0.1", "", { "dependencies": { "dunder-proto": "^1.0.1", "es-object-atoms": "^1.0.0" } }, "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g=="],
|
||||
|
||||
"get-tsconfig": ["get-tsconfig@4.13.6", "", { "dependencies": { "resolve-pkg-maps": "^1.0.0" } }, "sha512-shZT/QMiSHc/YBLxxOkMtgSid5HFoauqCE3/exfsEcwg1WkeqjG+V40yBbBrsD+jW2HDXcs28xOfcbm2jI8Ddw=="],
|
||||
"get-tsconfig": ["get-tsconfig@4.14.0", "", { "dependencies": { "resolve-pkg-maps": "^1.0.0" } }, "sha512-yTb+8DXzDREzgvYmh6s9vHsSVCHeC0G3PI5bEXNBHtmshPnO+S5O7qgLEOn0I5QvMy6kpZN8K1NKGyilLb93wA=="],
|
||||
|
||||
"github-from-package": ["github-from-package@0.0.0", "", {}, "sha512-SyHy3T1v2NUXn29OsWdxmK6RwHD+vkj3v8en8AOBZ1wBQ/hCAQ5bAQTD02kW4W9tUp/3Qh6J8r9EvntiyCmOOw=="],
|
||||
|
||||
@ -509,7 +509,7 @@
|
||||
|
||||
"node-abi": ["node-abi@3.87.0", "", { "dependencies": { "semver": "^7.3.5" } }, "sha512-+CGM1L1CgmtheLcBuleyYOn7NWPVu0s0EJH2C4puxgEZb9h8QpR9G2dBfZJOAUhi7VQxuBPMd0hiISWcTyiYyQ=="],
|
||||
|
||||
"node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
|
||||
"node-addon-api": ["node-addon-api@8.7.0", "", {}, "sha512-9MdFxmkKaOYVTV+XVRG8ArDwwQ77XIgIPyKASB1k3JPq3M8fGQQQE3YpMOrKm6g//Ktx8ivZr8xo1Qmtqub+GA=="],
|
||||
|
||||
"node-api-headers": ["node-api-headers@1.8.0", "", {}, "sha512-jfnmiKWjRAGbdD1yQS28bknFM1tbHC1oucyuMPjmkEs+kpiu76aRs40WlTmBmyEgzDM76ge1DQ7XJ3R5deiVjQ=="],
|
||||
|
||||
@ -687,11 +687,11 @@
|
||||
|
||||
"toidentifier": ["toidentifier@1.0.1", "", {}, "sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA=="],
|
||||
|
||||
"tree-sitter-go": ["tree-sitter-go@0.23.4", "", { "dependencies": { "node-addon-api": "^8.2.1", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-iQaHEs4yMa/hMo/ZCGqLfG61F0miinULU1fFh+GZreCRtKylFLtvn798ocCZjO2r/ungNZgAY1s1hPFyAwkc7w=="],
|
||||
"tree-sitter-go": ["tree-sitter-go@0.25.0", "", { "dependencies": { "node-addon-api": "^8.3.1", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.25.0" }, "optionalPeers": ["tree-sitter"] }, "sha512-APBc/Dq3xz/e35Xpkhb1blu5UgW+2E3RyGWawZSCNcbGwa7jhSQPS8KsUupuzBla8PCo8+lz9W/JDJjmfRa2tw=="],
|
||||
|
||||
"tree-sitter-javascript": ["tree-sitter-javascript@0.23.1", "", { "dependencies": { "node-addon-api": "^8.2.2", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-/bnhbrTD9frUYHQTiYnPcxyHORIw157ERBa6dqzaKxvR/x3PC4Yzd+D1pZIMS6zNg2v3a8BZ0oK7jHqsQo9fWA=="],
|
||||
|
||||
"tree-sitter-python": ["tree-sitter-python@0.23.4", "", { "dependencies": { "node-addon-api": "^8.2.1", "node-gyp-build": "^4.8.2" }, "peerDependencies": { "tree-sitter": "^0.21.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-MbmUAl7y5UCUWqHscHke7DdRDwQnVNMNKQYQc4Gq2p09j+fgPxaU8JVsuOI/0HD3BSEEe5k9j3xmdtIWbDtDgw=="],
|
||||
"tree-sitter-python": ["tree-sitter-python@0.25.0", "", { "dependencies": { "node-addon-api": "^8.5.0", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.25.0" }, "optionalPeers": ["tree-sitter"] }, "sha512-eCmJx6zQa35GxaCtQD+wXHOhYqBxEL+bp71W/s3fcDMu06MrtzkVXR437dRrCrbrDbyLuUDJpAgycs7ncngLXw=="],
|
||||
|
||||
"tree-sitter-rust": ["tree-sitter-rust@0.24.0", "", { "dependencies": { "node-addon-api": "^8.2.2", "node-gyp-build": "^4.8.4" }, "peerDependencies": { "tree-sitter": "^0.22.1" }, "optionalPeers": ["tree-sitter"] }, "sha512-NWemUDf629Tfc90Y0Z55zuwPCAHkLxWnMf2RznYu4iBkkrQl2o/CHGB7Cr52TyN5F1DAx8FmUnDtCy9iUkXZEQ=="],
|
||||
|
||||
@ -725,7 +725,7 @@
|
||||
|
||||
"vitest": ["vitest@3.2.4", "", { "dependencies": { "@types/chai": "^5.2.2", "@vitest/expect": "3.2.4", "@vitest/mocker": "3.2.4", "@vitest/pretty-format": "^3.2.4", "@vitest/runner": "3.2.4", "@vitest/snapshot": "3.2.4", "@vitest/spy": "3.2.4", "@vitest/utils": "3.2.4", "chai": "^5.2.0", "debug": "^4.4.1", "expect-type": "^1.2.1", "magic-string": "^0.30.17", "pathe": "^2.0.3", "picomatch": "^4.0.2", "std-env": "^3.9.0", "tinybench": "^2.9.0", "tinyexec": "^0.3.2", "tinyglobby": "^0.2.14", "tinypool": "^1.1.1", "tinyrainbow": "^2.0.0", "vite": "^5.0.0 || ^6.0.0 || ^7.0.0-0", "vite-node": "3.2.4", "why-is-node-running": "^2.3.0" }, "peerDependencies": { "@edge-runtime/vm": "*", "@types/debug": "^4.1.12", "@types/node": "^18.0.0 || ^20.0.0 || >=22.0.0", "@vitest/browser": "3.2.4", "@vitest/ui": "3.2.4", "happy-dom": "*", "jsdom": "*" }, "optionalPeers": ["@edge-runtime/vm", "@types/debug", "@types/node", "@vitest/browser", "@vitest/ui", "happy-dom", "jsdom"], "bin": { "vitest": "vitest.mjs" } }, "sha512-LUCP5ev3GURDysTWiP47wRRUpLKMOfPh+yKTx3kVIEiu5KOMeqzpnYNsKyOoVrULivR8tLcks4+lga33Whn90A=="],
|
||||
|
||||
"web-tree-sitter": ["web-tree-sitter@0.26.7", "", {}, "sha512-KiZhelTvBA/ziUHEO7Emb75cGVAq8iGZNabYaZm53Zpy50NsXyOW+xSHlwHt5CVg/TRPZBfeVLTTobF0LjFJ1w=="],
|
||||
"web-tree-sitter": ["web-tree-sitter@0.26.8", "", {}, "sha512-4sUwi7ZyOrIk5KLgYLkc2A/F0LFMQnBhfb+2Cdl7ik4ePJ6JD+fk4ofI2sA5eGawBKBaK4Vntt7Ww5KcEsay4A=="],
|
||||
|
||||
"which": ["which@6.0.1", "", { "dependencies": { "isexe": "^4.0.0" }, "bin": { "node-which": "bin/which.js" } }, "sha512-oGLe46MIrCRqX7ytPUf66EAYvdeMIZYn3WaocqqKZAxrBpkqHfL/qvTyJ/bTk5+AqHCjXmrv3CEWgy368zhRUg=="],
|
||||
|
||||
@ -739,7 +739,7 @@
|
||||
|
||||
"yallist": ["yallist@5.0.0", "", {}, "sha512-YgvUTfwqyc7UXVMrB+SImsVYSmTS8X/tSrtdNZMImM+n7+QTriRXyXim0mBrTXNeqzVF0KWGgHPeiyViFFrNDw=="],
|
||||
|
||||
"yaml": ["yaml@2.8.3", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-AvbaCLOO2Otw/lW5bmh9d/WEdcDFdQp2Z2ZUH3pX9U2ihyUY0nvLv7J6TrWowklRGPYbB/IuIMfYgxaCPg5Bpg=="],
|
||||
"yaml": ["yaml@2.9.0", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA=="],
|
||||
|
||||
"yargs": ["yargs@17.7.2", "", { "dependencies": { "cliui": "^8.0.1", "escalade": "^3.1.1", "get-caller-file": "^2.0.5", "require-directory": "^2.1.1", "string-width": "^4.2.3", "y18n": "^5.0.5", "yargs-parser": "^21.1.1" } }, "sha512-7dSzzRQ++CKnNI/krKnYRV7JKKPUXMEh61soaHKg9mrWEhzFWhFnxPxGl+69cD1Ou63C13NUPCnmIcrvqCuM6w=="],
|
||||
|
||||
@ -773,8 +773,6 @@
|
||||
|
||||
"micromatch/picomatch": ["picomatch@2.3.1", "", {}, "sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA=="],
|
||||
|
||||
"node-llama-cpp/node-addon-api": ["node-addon-api@8.7.0", "", {}, "sha512-9MdFxmkKaOYVTV+XVRG8ArDwwQ77XIgIPyKASB1k3JPq3M8fGQQQE3YpMOrKm6g//Ktx8ivZr8xo1Qmtqub+GA=="],
|
||||
|
||||
"ora/cli-spinners": ["cli-spinners@3.4.0", "", {}, "sha512-bXfOC4QcT1tKXGorxL3wbJm6XJPDqEnij2gQ2m7ESQuE+/z9YFIWnl/5RpTiKWbMq3EVKR4fRLJGn6DVfu0mpw=="],
|
||||
|
||||
"postcss/nanoid": ["nanoid@3.3.11", "", { "bin": { "nanoid": "bin/nanoid.cjs" } }, "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w=="],
|
||||
@ -793,9 +791,13 @@
|
||||
|
||||
"tinyglobby/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
|
||||
|
||||
"vite/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
|
||||
"tree-sitter-javascript/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
|
||||
|
||||
"vitest/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
|
||||
"tree-sitter-rust/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
|
||||
|
||||
"tree-sitter-typescript/node-addon-api": ["node-addon-api@8.5.0", "", {}, "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A=="],
|
||||
|
||||
"vite/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
|
||||
|
||||
"wrap-ansi/ansi-styles": ["ansi-styles@4.3.0", "", { "dependencies": { "color-convert": "^2.0.1" } }, "sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg=="],
|
||||
|
||||
|
||||
@ -127,26 +127,41 @@ Without intent, "performance" is ambiguous (web-perf? team health? fitness?). Wi
|
||||
- Empty lines are ignored
|
||||
- Leading/trailing whitespace is trimmed
|
||||
|
||||
## MCP/HTTP API
|
||||
## Scoping
|
||||
|
||||
The `query` tool accepts a query document:
|
||||
Restrict queries to specific collections with `-c` (CLI) or `collections` (MCP/SDK):
|
||||
|
||||
```json
|
||||
{
|
||||
"q": "lex: CAP theorem\nvec: consistency vs availability",
|
||||
"collections": ["docs"],
|
||||
"limit": 10
|
||||
}
|
||||
```bash
|
||||
# CLI — by collection name (see `qmd collection list`)
|
||||
qmd query -c docs "how does auth work"
|
||||
qmd query -c docs -c notes $'lex: auth\nvec: authentication flow'
|
||||
```
|
||||
|
||||
Or structured format:
|
||||
For MCP / HTTP, pass a plural `collections` array (OR match):
|
||||
|
||||
```json
|
||||
{ "searches": [ { "type": "lex", "query": "auth" } ], "collections": ["docs", "notes"] }
|
||||
```
|
||||
|
||||
`-c`/`collections` matches by collection name and works from any directory.
|
||||
Multiple values are OR-combined. Without scoping, all default-included collections
|
||||
are searched; collections marked excluded (`qmd collection exclude <name>`) are
|
||||
skipped unless explicitly named. In MCP the parameter is the plural `collections`
|
||||
array — a singular `collection` is silently ignored.
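The scoping rules above can be sketched as a small function. This is only an illustration of the described semantics under an assumed data model: `resolveScope` and the `{ name, excluded }` shape are hypothetical and not part of qmd, and silently dropping requested names that match no collection is an assumption of the sketch.

```javascript
// Hypothetical model: each collection is { name: string, excluded: boolean }.
function resolveScope(allCollections, requested = []) {
  if (requested.length > 0) {
    // Explicit names are OR-combined; naming an excluded collection
    // searches it anyway.
    const wanted = new Set(requested);
    return allCollections
      .filter((c) => wanted.has(c.name))
      .map((c) => c.name);
  }
  // No scoping: every collection that is not marked excluded.
  return allCollections.filter((c) => !c.excluded).map((c) => c.name);
}
```

For example, with collections `docs`, `notes`, and an excluded `archive`, an unscoped call returns `docs` and `notes`, while requesting `archive` by name returns it despite the exclusion.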
|
||||
|
||||
## MCP/HTTP API
|
||||
|
||||
The `query` tool (and the REST `/query` endpoint) accept a structured query with a
|
||||
`searches` array. There is no `q` string parameter — `searches` is required:
|
||||
|
||||
```json
|
||||
{
|
||||
"searches": [
|
||||
{ "type": "lex", "query": "CAP theorem" },
|
||||
{ "type": "vec", "query": "consistency vs availability" }
|
||||
]
|
||||
],
|
||||
"collections": ["docs"],
|
||||
"limit": 10
|
||||
}
|
||||
```
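A client can assemble this request body with a small helper. The sketch below only builds the JSON object; it does not call any endpoint. `buildQueryRequest` is a hypothetical name, and rejecting an empty `searches` array or a non-array `collections` value are assumptions made for the sketch rather than documented server behavior.

```javascript
// Hypothetical helper: builds the body for the `query` tool / REST endpoint.
function buildQueryRequest(searches, { collections, limit } = {}) {
  // `searches` is required; there is no `q` string parameter.
  if (!Array.isArray(searches) || searches.length === 0) {
    throw new TypeError("searches must be a non-empty array");
  }
  // Keep only the two documented fields of each search entry.
  const body = {
    searches: searches.map(({ type, query }) => ({ type, query })),
  };
  if (collections !== undefined) {
    // The parameter is the plural `collections` array.
    if (!Array.isArray(collections)) {
      throw new TypeError("collections must be an array of names");
    }
    body.collections = collections;
  }
  if (limit !== undefined) body.limit = limit;
  return body;
}

const request = buildQueryRequest(
  [
    { type: "lex", query: "CAP theorem" },
    { type: "vec", query: "consistency vs availability" },
  ],
  { collections: ["docs"], limit: 10 },
);
```

Failing early on a string passed as `collections` guards against the singular-versus-plural mix-up on the client side, before the request is sent.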
|
||||
|
||||
|
||||
@ -44,8 +44,8 @@
|
||||
});
|
||||
|
||||
nodeModulesHashes = {
|
||||
x86_64-linux = "sha256-D0ezO4vqq4iswcAMU2DCql9ZAQvh3me6N9aDB5roq4w=";
|
||||
aarch64-darwin = "sha256-qU+9KdR/nTocelyANS09I/4yaQ+7s1LvJNqB27IOK/c=";
|
||||
x86_64-linux = "sha256-sVXoNWIcx1RYRtRWB4F2j7x8/cabFBKq+plFhPU7tBc=";
|
||||
aarch64-darwin = "sha256-gDyJ5boyH44SeXlKo+W4G36GSUejyXP5PFvW+dFS1Mk=";
|
||||
|
||||
# Populate these on first build for additional hosts if/when needed.
|
||||
aarch64-linux = pkgs.lib.fakeHash;
|
||||
|
||||
35
package.json
@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "@tobilu/qmd",
|
||||
"version": "2.1.0",
|
||||
"version": "2.5.3",
|
||||
"description": "Query Markup Documents - On-device hybrid search for markdown files with BM25, vector search, and LLM reranking",
|
||||
"type": "module",
|
||||
"main": "dist/index.js",
|
||||
@ -17,13 +17,23 @@
|
||||
"files": [
|
||||
"bin/",
|
||||
"dist/",
|
||||
"skills/",
|
||||
"scripts/build.mjs",
|
||||
"scripts/check-package-grammars.mjs",
|
||||
"scripts/package-smoke.mjs",
|
||||
"scripts/test-all.mjs",
|
||||
"LICENSE",
|
||||
"CHANGELOG.md"
|
||||
],
|
||||
"scripts": {
|
||||
"prepare": "[ -d .git ] && ./scripts/install-hooks.sh || true",
|
||||
"build": "tsc -p tsconfig.build.json && printf '#!/usr/bin/env node\n' | cat - dist/cli/qmd.js > dist/cli/qmd.tmp && mv dist/cli/qmd.tmp dist/cli/qmd.js && chmod +x dist/cli/qmd.js",
|
||||
"test": "vitest run --reporter=verbose test/",
|
||||
"build": "node scripts/build.mjs",
|
||||
"test": "node scripts/test-all.mjs",
|
||||
"test:types": "node ./node_modules/typescript/bin/tsc -p tsconfig.build.json --noEmit",
|
||||
"test:node": "node ./node_modules/vitest/vitest.mjs run --reporter=verbose --testTimeout 60000",
|
||||
"test:bun": "bun test --timeout 60000 --preload ./src/test-preload.ts",
|
||||
"test:unit": "CI=true node ./node_modules/vitest/vitest.mjs run --reporter=verbose --testTimeout 60000 test/ && CI=true bun test --timeout 60000 --preload ./src/test-preload.ts test/",
|
||||
"test:package": "node scripts/package-smoke.mjs",
|
||||
"qmd": "tsx src/cli/qmd.ts",
|
||||
"index": "tsx src/cli/qmd.ts index",
|
||||
"vector": "tsx src/cli/qmd.ts vector",
|
||||
@ -31,7 +41,8 @@
|
||||
"vsearch": "tsx src/cli/qmd.ts vsearch",
|
||||
"rerank": "tsx src/cli/qmd.ts rerank",
|
||||
"inspector": "npx @modelcontextprotocol/inspector tsx src/cli/qmd.ts mcp",
|
||||
"release": "./scripts/release.sh"
|
||||
"release": "./scripts/release.sh",
|
||||
"smoke:package-grammars": "node scripts/check-package-grammars.mjs"
|
||||
},
|
||||
"publishConfig": {
|
||||
"access": "public"
|
||||
@ -46,13 +57,17 @@
|
||||
},
|
||||
"dependencies": {
|
||||
"@modelcontextprotocol/sdk": "1.29.0",
|
||||
"better-sqlite3": "12.8.0",
|
||||
"better-sqlite3": "12.10.0",
|
||||
"fast-glob": "3.3.3",
|
||||
"node-llama-cpp": "3.18.1",
|
||||
"picomatch": "4.0.4",
|
||||
"sqlite-vec": "0.1.9",
|
||||
"web-tree-sitter": "0.26.7",
|
||||
"yaml": "2.8.3",
|
||||
"tree-sitter-go": "0.25.0",
|
||||
"tree-sitter-python": "0.25.0",
|
||||
"tree-sitter-rust": "0.24.0",
|
||||
"tree-sitter-typescript": "0.23.2",
|
||||
"web-tree-sitter": "0.26.8",
|
||||
"yaml": "2.9.0",
|
||||
"zod": "4.2.1"
|
||||
},
|
||||
"optionalDependencies": {
|
||||
@ -60,11 +75,7 @@
|
||||
"sqlite-vec-darwin-x64": "0.1.9",
|
||||
"sqlite-vec-linux-arm64": "0.1.9",
|
||||
"sqlite-vec-linux-x64": "0.1.9",
|
||||
"sqlite-vec-windows-x64": "0.1.9",
|
||||
"tree-sitter-go": "0.23.4",
|
||||
"tree-sitter-python": "0.23.4",
|
||||
"tree-sitter-rust": "0.24.0",
|
||||
"tree-sitter-typescript": "0.23.2"
|
||||
"sqlite-vec-windows-x64": "0.1.9"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@types/better-sqlite3": "7.6.13",
|
||||
|
||||
133
pnpm-lock.yaml
generated
@ -12,8 +12,8 @@ importers:
|
||||
specifier: 1.29.0
|
||||
version: 1.29.0(zod@4.2.1)
|
||||
better-sqlite3:
|
||||
specifier: 12.8.0
|
||||
version: 12.8.0
|
||||
specifier: 12.10.0
|
||||
version: 12.10.0
|
||||
fast-glob:
|
||||
specifier: 3.3.3
|
||||
version: 3.3.3
|
||||
@ -26,15 +26,27 @@ importers:
|
||||
sqlite-vec:
|
||||
specifier: 0.1.9
|
||||
version: 0.1.9
|
||||
tree-sitter-go:
|
||||
specifier: 0.25.0
|
||||
version: 0.25.0
|
||||
tree-sitter-python:
|
||||
specifier: 0.25.0
|
||||
version: 0.25.0
|
||||
tree-sitter-rust:
|
||||
specifier: 0.24.0
|
||||
version: 0.24.0
|
||||
tree-sitter-typescript:
|
||||
specifier: 0.23.2
|
||||
version: 0.23.2
|
||||
typescript:
|
||||
specifier: ^5.9.3
|
||||
version: 5.9.3
|
||||
web-tree-sitter:
|
||||
specifier: 0.26.7
|
||||
version: 0.26.7
|
||||
specifier: 0.26.8
|
||||
version: 0.26.8
|
||||
yaml:
|
||||
specifier: 2.8.3
|
||||
version: 2.8.3
|
||||
specifier: 2.9.0
|
||||
version: 2.9.0
|
||||
zod:
|
||||
specifier: 4.2.1
|
||||
version: 4.2.1
|
||||
@ -47,7 +59,7 @@ importers:
|
||||
version: 4.21.0
|
||||
vitest:
|
||||
specifier: 3.2.4
|
||||
version: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
|
||||
version: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
|
||||
optionalDependencies:
|
||||
sqlite-vec-darwin-arm64:
|
||||
specifier: 0.1.9
|
||||
@ -64,18 +76,6 @@ importers:
|
||||
sqlite-vec-windows-x64:
|
||||
specifier: 0.1.9
|
||||
version: 0.1.9
|
||||
tree-sitter-go:
|
||||
specifier: 0.23.4
|
||||
version: 0.23.4
|
||||
tree-sitter-python:
|
||||
specifier: 0.23.4
|
||||
version: 0.23.4
|
||||
tree-sitter-rust:
|
||||
specifier: 0.24.0
|
||||
version: 0.24.0
|
||||
tree-sitter-typescript:
|
||||
specifier: 0.23.2
|
||||
version: 0.23.2
|
||||
|
||||
packages:
|
||||
|
||||
@ -273,36 +273,42 @@ packages:
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [arm64, x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/linux-armv7l@3.18.1':
|
||||
resolution: {integrity: sha512-BrJL2cGo0pN5xd5nw+CzTn2rFMpz9MJyZZPUY81ptGkF2uIuXT2hdCVh56i9ImQrTwBfq1YcZL/l/Qe/1+HR/Q==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [arm, x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/linux-x64-cuda-ext@3.18.1':
|
||||
resolution: {integrity: sha512-VqyKhAVHPCpFzh0f1koCBgpThL+04QOXwv0oDQ8s8YcpfMMOXQlBhTB0plgTh0HrPExoObfTS4ohkrbyGgmztQ==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/linux-x64-cuda@3.18.1':
|
||||
resolution: {integrity: sha512-qOaYP4uwsUoBHQ/7xSOvyJIuXapS57Al+Sudgi00f96ldNZLKe1vuSGptAi5LTM2lIj66PKm6h8PlRWctwsZ2g==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/linux-x64-vulkan@3.18.1':
|
||||
resolution: {integrity: sha512-SIaNTK5pUPhwJD0gmiQfHa8OrRctVMmnqu+slJrz2Mzgg/XrwFndJlS9hvc+jSjTXCouwf7sYeQaaJWvQgBh/A==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/linux-x64@3.18.1':
|
||||
resolution: {integrity: sha512-tRmWcsyvAcqJHQHXHsaOkx6muGbcirA9nRdNgH6n7bjGUw4VuoBD3dChyNF3/Ktt7ohB9kz+XhhyZjbDHpXyMA==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@node-llama-cpp/mac-arm64-metal@3.18.1':
|
||||
resolution: {integrity: sha512-cyZTdsUMlvuRlGmkkoBbN3v/DT6NuruEqoQYd9CqIrPyLa1xLNBTSKIZ9SgRnw23iCOj4URfITvRP+2pu63LuQ==}
|
||||
@ -375,24 +381,28 @@ packages:
|
||||
engines: {node: '>= 10'}
|
||||
cpu: [arm64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@reflink/reflink-linux-arm64-musl@0.1.19':
|
||||
resolution: {integrity: sha512-37iO/Dp6m5DDaC2sf3zPtx/hl9FV3Xze4xoYidrxxS9bgP3S8ALroxRK6xBG/1TtfXKTvolvp+IjrUU6ujIGmA==}
|
||||
engines: {node: '>= 10'}
|
||||
cpu: [arm64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@reflink/reflink-linux-x64-gnu@0.1.19':
|
||||
resolution: {integrity: sha512-jbI8jvuYCaA3MVUdu8vLoLAFqC+iNMpiSuLbxlAgg7x3K5bsS8nOpTRnkLF7vISJ+rVR8W+7ThXlXlUQ93ulkw==}
|
||||
engines: {node: '>= 10'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@reflink/reflink-linux-x64-musl@0.1.19':
|
||||
resolution: {integrity: sha512-e9FBWDe+lv7QKAwtKOt6A2W/fyy/aEEfr0g6j/hWzvQcrzHCsz07BNQYlNOjTfeytrtLU7k449H1PI95jA4OjQ==}
|
||||
engines: {node: '>= 10'}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@reflink/reflink-win32-arm64-msvc@0.1.19':
|
||||
resolution: {integrity: sha512-09PxnVIQcd+UOn4WAW73WU6PXL7DwGS6wPlkMhMg2zlHHG65F3vHepOw06HFCq+N42qkaNAc8AKIabWvtk6cIQ==}
|
||||
@ -444,66 +454,79 @@ packages:
|
||||
resolution: {integrity: sha512-L+34Qqil+v5uC0zEubW7uByo78WOCIrBvci69E7sFASRl0X7b/MB6Cqd1lky/CtcSVTydWa2WZwFuWexjS5o6g==}
|
||||
cpu: [arm]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-arm-musleabihf@4.60.1':
|
||||
resolution: {integrity: sha512-n83O8rt4v34hgFzlkb1ycniJh7IR5RCIqt6mz1VRJD6pmhRi0CXdmfnLu9dIUS6buzh60IvACM842Ffb3xd6Gg==}
|
||||
cpu: [arm]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-linux-arm64-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-Nql7sTeAzhTAja3QXeAI48+/+GjBJ+QmAH13snn0AJSNL50JsDqotyudHyMbO2RbJkskbMbFJfIJKWA6R1LCJQ==}
|
||||
cpu: [arm64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-arm64-musl@4.60.1':
|
||||
resolution: {integrity: sha512-+pUymDhd0ys9GcKZPPWlFiZ67sTWV5UU6zOJat02M1+PiuSGDziyRuI/pPue3hoUwm2uGfxdL+trT6Z9rxnlMA==}
|
||||
cpu: [arm64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-linux-loong64-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-VSvgvQeIcsEvY4bKDHEDWcpW4Yw7BtlKG1GUT4FzBUlEKQK0rWHYBqQt6Fm2taXS+1bXvJT6kICu5ZwqKCnvlQ==}
|
||||
cpu: [loong64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-loong64-musl@4.60.1':
|
||||
resolution: {integrity: sha512-4LqhUomJqwe641gsPp6xLfhqWMbQV04KtPp7/dIp0nzPxAkNY1AbwL5W0MQpcalLYk07vaW9Kp1PBhdpZYYcEw==}
|
||||
cpu: [loong64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-linux-ppc64-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-tLQQ9aPvkBxOc/EUT6j3pyeMD6Hb8QF2BTBnCQWP/uu1lhc9AIrIjKnLYMEroIz/JvtGYgI9dF3AxHZNaEH0rw==}
|
||||
cpu: [ppc64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-ppc64-musl@4.60.1':
|
||||
resolution: {integrity: sha512-RMxFhJwc9fSXP6PqmAz4cbv3kAyvD1etJFjTx4ONqFP9DkTkXsAMU4v3Vyc5BgzC+anz7nS/9tp4obsKfqkDHg==}
|
||||
cpu: [ppc64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-linux-riscv64-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-QKgFl+Yc1eEk6MmOBfRHYF6lTxiiiV3/z/BRrbSiW2I7AFTXoBFvdMEyglohPj//2mZS4hDOqeB0H1ACh3sBbg==}
|
||||
cpu: [riscv64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-riscv64-musl@4.60.1':
|
||||
resolution: {integrity: sha512-RAjXjP/8c6ZtzatZcA1RaQr6O1TRhzC+adn8YZDnChliZHviqIjmvFwHcxi4JKPSDAt6Uhf/7vqcBzQJy0PDJg==}
|
||||
cpu: [riscv64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-linux-s390x-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-wcuocpaOlaL1COBYiA89O6yfjlp3RwKDeTIA0hM7OpmhR1Bjo9j31G1uQVpDlTvwxGn2nQs65fBFL5UFd76FcQ==}
|
||||
cpu: [s390x]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-x64-gnu@4.60.1':
|
||||
resolution: {integrity: sha512-77PpsFQUCOiZR9+LQEFg9GClyfkNXj1MP6wRnzYs0EeWbPcHs02AXu4xuUbM1zhwn3wqaizle3AEYg5aeoohhg==}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [glibc]
|
||||
|
||||
'@rollup/rollup-linux-x64-musl@4.60.1':
|
||||
resolution: {integrity: sha512-5cIATbk5vynAjqqmyBjlciMJl1+R/CwX9oLk/EyiFXDWd95KpHdrOJT//rnUl4cUcskrd0jCCw3wpZnhIHdD9w==}
|
||||
cpu: [x64]
|
||||
os: [linux]
|
||||
libc: [musl]
|
||||
|
||||
'@rollup/rollup-openbsd-x64@4.60.1':
|
||||
resolution: {integrity: sha512-cl0w09WsCi17mcmWqqglez9Gk8isgeWvoUZ3WiJFYSR3zjBQc2J5/ihSjpl+VLjPqjQ/1hJRcqBfLjssREQILw==}
|
||||
@ -628,9 +651,9 @@ packages:
|
||||
base64-js@1.5.1:
|
||||
resolution: {integrity: sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA==}
|
||||
|
||||
better-sqlite3@12.8.0:
|
||||
resolution: {integrity: sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ==}
|
||||
engines: {node: 20.x || 22.x || 23.x || 24.x || 25.x}
|
||||
better-sqlite3@12.10.0:
|
||||
resolution: {integrity: sha512-CyzaZRQKyHkB2ZInfTTl2nvT33EbDpjkLEbE8/Zck3Ll6O0qqvuGdrJ45HgtH+HykRg88ITY3AdreBGN70aBSQ==}
|
||||
engines: {node: 20.x || 22.x || 23.x || 24.x || 25.x || 26.x}
|
||||
|
||||
bindings@1.5.0:
|
||||
resolution: {integrity: sha512-p2q/t/mhvuOj/UeLlV6566GD/guowlr0hHxClI0W9m7MWYkL1F0hLo+0Aexs9HSPCtR1SXQ0TD3MMKrXZajbiQ==}
|
||||
@ -943,8 +966,8 @@ packages:
|
||||
resolution: {integrity: sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==}
|
||||
engines: {node: '>= 0.4'}
|
||||
|
||||
get-tsconfig@4.13.7:
|
||||
resolution: {integrity: sha512-7tN6rFgBlMgpBML5j8typ92BKFi2sFQvIdpAqLA2beia5avZDrMs0FLZiM5etShWq5irVyGcGMEA1jcDaK7A/Q==}
|
||||
get-tsconfig@4.14.0:
|
||||
resolution: {integrity: sha512-yTb+8DXzDREzgvYmh6s9vHsSVCHeC0G3PI5bEXNBHtmshPnO+S5O7qgLEOn0I5QvMy6kpZN8K1NKGyilLb93wA==}
|
||||
|
||||
github-from-package@0.0.0:
|
||||
resolution: {integrity: sha512-SyHy3T1v2NUXn29OsWdxmK6RwHD+vkj3v8en8AOBZ1wBQ/hCAQ5bAQTD02kW4W9tUp/3Qh6J8r9EvntiyCmOOw==}
|
||||
@ -1536,10 +1559,10 @@ packages:
|
||||
resolution: {integrity: sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA==}
|
||||
engines: {node: '>=0.6'}
|
||||
|
||||
tree-sitter-go@0.23.4:
|
||||
resolution: {integrity: sha512-iQaHEs4yMa/hMo/ZCGqLfG61F0miinULU1fFh+GZreCRtKylFLtvn798ocCZjO2r/ungNZgAY1s1hPFyAwkc7w==}
|
||||
tree-sitter-go@0.25.0:
|
||||
resolution: {integrity: sha512-APBc/Dq3xz/e35Xpkhb1blu5UgW+2E3RyGWawZSCNcbGwa7jhSQPS8KsUupuzBla8PCo8+lz9W/JDJjmfRa2tw==}
|
||||
peerDependencies:
|
||||
tree-sitter: ^0.21.1
|
||||
tree-sitter: ^0.25.0
|
||||
peerDependenciesMeta:
|
||||
tree-sitter:
|
||||
optional: true
|
||||
@ -1552,10 +1575,10 @@ packages:
|
||||
tree-sitter:
|
||||
optional: true
|
||||
|
||||
tree-sitter-python@0.23.4:
|
||||
resolution: {integrity: sha512-MbmUAl7y5UCUWqHscHke7DdRDwQnVNMNKQYQc4Gq2p09j+fgPxaU8JVsuOI/0HD3BSEEe5k9j3xmdtIWbDtDgw==}
|
||||
tree-sitter-python@0.25.0:
|
||||
resolution: {integrity: sha512-eCmJx6zQa35GxaCtQD+wXHOhYqBxEL+bp71W/s3fcDMu06MrtzkVXR437dRrCrbrDbyLuUDJpAgycs7ncngLXw==}
|
||||
peerDependencies:
|
||||
tree-sitter: ^0.21.1
|
||||
tree-sitter: ^0.25.0
|
||||
peerDependenciesMeta:
|
||||
tree-sitter:
|
||||
optional: true
|
||||
@ -1691,8 +1714,8 @@ packages:
|
||||
jsdom:
|
||||
optional: true
|
||||
|
||||
web-tree-sitter@0.26.7:
|
||||
resolution: {integrity: sha512-KiZhelTvBA/ziUHEO7Emb75cGVAq8iGZNabYaZm53Zpy50NsXyOW+xSHlwHt5CVg/TRPZBfeVLTTobF0LjFJ1w==}
|
||||
web-tree-sitter@0.26.8:
|
||||
resolution: {integrity: sha512-4sUwi7ZyOrIk5KLgYLkc2A/F0LFMQnBhfb+2Cdl7ik4ePJ6JD+fk4ofI2sA5eGawBKBaK4Vntt7Ww5KcEsay4A==}
|
||||
|
||||
which@2.0.2:
|
||||
resolution: {integrity: sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA==}
|
||||
@ -1724,8 +1747,8 @@ packages:
|
||||
resolution: {integrity: sha512-YgvUTfwqyc7UXVMrB+SImsVYSmTS8X/tSrtdNZMImM+n7+QTriRXyXim0mBrTXNeqzVF0KWGgHPeiyViFFrNDw==}
|
||||
engines: {node: '>=18'}
|
||||
|
||||
yaml@2.8.3:
|
||||
resolution: {integrity: sha512-AvbaCLOO2Otw/lW5bmh9d/WEdcDFdQp2Z2ZUH3pX9U2ihyUY0nvLv7J6TrWowklRGPYbB/IuIMfYgxaCPg5Bpg==}
|
||||
yaml@2.9.0:
|
||||
resolution: {integrity: sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA==}
|
||||
engines: {node: '>= 14.6'}
|
||||
hasBin: true
|
||||
|
||||
@ -2060,13 +2083,13 @@ snapshots:
|
||||
chai: 5.3.3
|
||||
tinyrainbow: 2.0.0
|
||||
|
||||
'@vitest/mocker@3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3))':
|
||||
'@vitest/mocker@3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0))':
|
||||
dependencies:
|
||||
'@vitest/spy': 3.2.4
|
||||
estree-walker: 3.0.3
|
||||
magic-string: 0.30.21
|
||||
optionalDependencies:
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
|
||||
|
||||
'@vitest/pretty-format@3.2.4':
|
||||
dependencies:
|
||||
@ -2130,7 +2153,7 @@ snapshots:
|
||||
|
||||
base64-js@1.5.1: {}
|
||||
|
||||
better-sqlite3@12.8.0:
|
||||
better-sqlite3@12.10.0:
|
||||
dependencies:
|
||||
bindings: 1.5.0
|
||||
prebuild-install: 7.1.3
|
||||
@ -2474,7 +2497,7 @@ snapshots:
|
||||
dunder-proto: 1.0.1
|
||||
es-object-atoms: 1.1.1
|
||||
|
||||
get-tsconfig@4.13.7:
|
||||
get-tsconfig@4.14.0:
|
||||
dependencies:
|
||||
resolve-pkg-maps: 1.0.0
|
||||
|
||||
@ -2654,8 +2677,7 @@ snapshots:
|
||||
|
||||
node-api-headers@1.8.0: {}
|
||||
|
||||
node-gyp-build@4.8.4:
|
||||
optional: true
|
||||
node-gyp-build@4.8.4: {}
|
||||
|
||||
node-llama-cpp@3.18.1(typescript@5.9.3):
|
||||
dependencies:
|
||||
@ -3113,41 +3135,36 @@ snapshots:
|
||||
|
||||
toidentifier@1.0.1: {}
|
||||
|
||||
tree-sitter-go@0.23.4:
|
||||
tree-sitter-go@0.25.0:
|
||||
dependencies:
|
||||
node-addon-api: 8.7.0
|
||||
node-gyp-build: 4.8.4
|
||||
optional: true
|
||||
|
||||
tree-sitter-javascript@0.23.1:
|
||||
dependencies:
|
||||
node-addon-api: 8.7.0
|
||||
node-gyp-build: 4.8.4
|
||||
optional: true
|
||||
|
||||
tree-sitter-python@0.23.4:
|
||||
tree-sitter-python@0.25.0:
|
||||
dependencies:
|
||||
node-addon-api: 8.7.0
|
||||
node-gyp-build: 4.8.4
|
||||
optional: true
|
||||
|
||||
tree-sitter-rust@0.24.0:
|
||||
dependencies:
|
||||
node-addon-api: 8.7.0
|
||||
node-gyp-build: 4.8.4
|
||||
optional: true
|
||||
|
||||
tree-sitter-typescript@0.23.2:
|
||||
dependencies:
|
||||
node-addon-api: 8.7.0
|
||||
node-gyp-build: 4.8.4
|
||||
tree-sitter-javascript: 0.23.1
|
||||
optional: true
|
||||
|
||||
tsx@4.21.0:
|
||||
dependencies:
|
||||
esbuild: 0.27.7
|
||||
get-tsconfig: 4.13.7
|
||||
get-tsconfig: 4.14.0
|
||||
optionalDependencies:
|
||||
fsevents: 2.3.3
|
||||
|
||||
@ -3177,13 +3194,13 @@ snapshots:
|
||||
|
||||
vary@1.1.2: {}
|
||||
|
||||
vite-node@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
|
||||
vite-node@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
|
||||
dependencies:
|
||||
cac: 6.7.14
|
||||
debug: 4.4.3
|
||||
es-module-lexer: 1.7.0
|
||||
pathe: 2.0.3
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
|
||||
transitivePeerDependencies:
|
||||
- '@types/node'
|
||||
- jiti
|
||||
@ -3198,7 +3215,7 @@ snapshots:
|
||||
- tsx
|
||||
- yaml
|
||||
|
||||
vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
|
||||
vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
|
||||
dependencies:
|
||||
esbuild: 0.27.7
|
||||
fdir: 6.5.0(picomatch@4.0.4)
|
||||
@ -3210,13 +3227,13 @@ snapshots:
|
||||
'@types/node': 25.5.2
|
||||
fsevents: 2.3.3
|
||||
tsx: 4.21.0
|
||||
yaml: 2.8.3
|
||||
yaml: 2.9.0
|
||||
|
||||
vitest@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3):
|
||||
vitest@3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0):
|
||||
dependencies:
|
||||
'@types/chai': 5.2.3
|
||||
'@vitest/expect': 3.2.4
|
||||
'@vitest/mocker': 3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3))
|
||||
'@vitest/mocker': 3.2.4(vite@7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0))
|
||||
'@vitest/pretty-format': 3.2.4
|
||||
'@vitest/runner': 3.2.4
|
||||
'@vitest/snapshot': 3.2.4
|
||||
@ -3234,8 +3251,8 @@ snapshots:
|
||||
tinyglobby: 0.2.15
|
||||
tinypool: 1.1.1
|
||||
tinyrainbow: 2.0.0
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
|
||||
vite-node: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3)
|
||||
vite: 7.3.2(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
|
||||
vite-node: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.9.0)
|
||||
why-is-node-running: 2.3.0
|
||||
optionalDependencies:
|
||||
'@types/node': 25.5.2
|
||||
@ -3253,7 +3270,7 @@ snapshots:
|
||||
- tsx
|
||||
- yaml
|
||||
|
||||
web-tree-sitter@0.26.7: {}
|
||||
web-tree-sitter@0.26.8: {}
|
||||
|
||||
which@2.0.2:
|
||||
dependencies:
|
||||
@ -3280,7 +3297,7 @@ snapshots:
|
||||
|
||||
yallist@5.0.0: {}
|
||||
|
||||
yaml@2.8.3: {}
|
||||
yaml@2.9.0: {}
|
||||
|
||||
yargs-parser@21.1.1: {}
|
||||
|
||||
|
||||
29
scripts/build.mjs
Normal file
@ -0,0 +1,29 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { chmodSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";

const root = join(fileURLToPath(new URL("..", import.meta.url)));

function run(command, args, options = {}) {
  const result = spawnSync(command, args, {
    cwd: root,
    stdio: "inherit",
    shell: process.platform === "win32",
    ...options,
  });
  if (result.status !== 0) {
    process.exit(result.status ?? 1);
  }
}

run(process.execPath, [join(root, "node_modules", "typescript", "bin", "tsc"), "-p", "tsconfig.build.json"]);

const cliPath = join(root, "dist", "cli", "qmd.js");
const tmpPath = `${cliPath}.tmp`;
const built = readFileSync(cliPath, "utf8");
const withoutExistingShebang = built.startsWith("#!") ? built.slice(built.indexOf("\n") + 1) : built;
writeFileSync(tmpPath, `#!/usr/bin/env node\n${withoutExistingShebang}`);
renameSync(tmpPath, cliPath);
chmodSync(cliPath, 0o755);
29
scripts/check-package-grammars.mjs
Normal file
@ -0,0 +1,29 @@
#!/usr/bin/env node
import { createRequire } from "node:module";

const require = createRequire(import.meta.url);

const grammars = [
  "tree-sitter-typescript/tree-sitter-typescript.wasm",
  "tree-sitter-typescript/tree-sitter-tsx.wasm",
  "tree-sitter-python/tree-sitter-python.wasm",
  "tree-sitter-go/tree-sitter-go.wasm",
  "tree-sitter-rust/tree-sitter-rust.wasm",
];

let ok = true;
for (const grammar of grammars) {
  try {
    const resolved = require.resolve(grammar);
    console.log(`ok ${grammar} -> ${resolved}`);
  } catch (err) {
    ok = false;
    console.error(`missing ${grammar}`);
    console.error(err instanceof Error ? err.message : String(err));
  }
}

if (!ok) {
  console.error("\nAST grammar package smoke check failed. Run `bun install` locally or repair a broken global install with the matching `bun add tree-sitter-...@<version>` command shown by `qmd status`.");
  process.exit(1);
}
65
scripts/package-smoke.mjs
Normal file
@ -0,0 +1,65 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { existsSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";

const root = fileURLToPath(new URL("..", import.meta.url));
const pkg = JSON.parse(readFileSync(join(root, "package.json"), "utf8"));

function run(label, command, args, options = {}) {
  console.log(`==> ${label}`);
  const { quiet, ...spawnOptions } = options;
  const result = spawnSync(command, args, {
    cwd: root,
    stdio: quiet ? "pipe" : "inherit",
    shell: process.platform === "win32",
    ...spawnOptions,
  });
  if (result.status !== 0) {
    console.error(`Package smoke failed: ${label}`);
    if (quiet) {
      if (result.stdout) process.stderr.write(result.stdout);
      if (result.stderr) process.stderr.write(result.stderr);
    }
    process.exit(result.status ?? 1);
  }
}

function assertPath(path, label = path) {
  const full = join(root, path);
  if (!existsSync(full)) {
    console.error(`Package smoke failed: missing ${label} (${path})`);
    process.exit(1);
  }
  return full;
}

run("build compiled package", process.execPath, ["scripts/build.mjs"]);
run("AST grammar runtime packages", process.execPath, ["scripts/check-package-grammars.mjs"]);

for (const entry of pkg.files ?? []) {
  assertPath(entry.replace(/\/$/, ""), `package.json files[] entry ${entry}`);
}

for (const [name, binPath] of Object.entries(pkg.bin ?? {})) {
  const full = assertPath(binPath, `bin ${name}`);
  const mode = statSync(full).mode;
  if ((mode & 0o111) === 0) {
    console.error(`Package smoke failed: bin ${name} is not executable (${binPath})`);
    process.exit(1);
  }
}

assertPath("dist/index.js", "compiled main export");
assertPath("dist/index.d.ts", "compiled type export");
assertPath("dist/cli/qmd.js", "compiled CLI");

run("compiled CLI under Node", process.execPath, ["dist/cli/qmd.js", "--help"], { quiet: true });
run("package wrapper", "sh", ["bin/qmd", "--help"], { quiet: true });

if (process.env.QMD_SKIP_BUN_SMOKE === "1") {
  console.log("==> compiled CLI under Bun (skipped by QMD_SKIP_BUN_SMOKE=1)");
} else {
  run("compiled CLI under Bun", "bun", ["dist/cli/qmd.js", "--help"], { quiet: true });
}
@ -93,7 +93,7 @@ echo ""
|
||||
|
||||
# --- Rename [Unreleased] -> [X.Y.Z] - date, add fresh [Unreleased] ---
|
||||
|
||||
sed -i '' "s/^## \[Unreleased\].*/## [$NEW] - $DATE/" CHANGELOG.md
|
||||
perl -0pi -e 's/^## \[Unreleased\].*/## ['"$NEW"'] - '"$DATE"'/m' CHANGELOG.md
|
||||
|
||||
# Insert a new empty [Unreleased] section after the header
|
||||
awk '
|
||||
|
||||
118
scripts/repro-metal-rsets-crash.mjs
Normal file
@ -0,0 +1,118 @@
#!/usr/bin/env node
/**
 * Minimal reproduction of llama.cpp issue ggml-org/llama.cpp#22593:
 *
 *   ggml-metal-device.m:612: GGML_ASSERT([rsets->data count] == 0) failed
 *
 * Root cause (per the upstream issue and proposed fix PR #22595):
 *   `ggml_metal_buffer_rset_free` releases the per-buffer residency set object
 *   but does NOT call the symmetric `ggml_metal_device_rsets_rm`. So the
 *   device's `rsets->data` array accumulates dangling references. When the
 *   process exits and libc fires the process-static `ggml_metal_device`
 *   destructor in `__cxa_finalize_ranges`, the destructor asserts the
 *   array is empty — and it isn't.
 *
 * Observed downstream behavior:
 *   - With EXPLICIT `dispose()` of every JS handle in order, the assertion
 *     does NOT fire. node-llama-cpp's dispose path tears the Metal buffers
 *     down before the static dtor runs, so the device's rsets array is
 *     empty by exit time. (Tested locally — clean exit.)
 *   - With NO dispose (the typical real-world case: synchronous `exit()`,
 *     `--watch` mode, `process.exit()` after results are written, or any
 *     code path where GC + finalizers race with libc exit), the rset
 *     references linger until the static dtor fires, and the assertion
 *     trips.
 *
 * What this script does:
 *   1. Load node-llama-cpp + a small GGUF model on the Metal backend.
 *      This allocates at least one Metal buffer → calls rsets_add internally.
 *   2. Run an inference (creating an embedding context populates buffers
 *      that the dispose path would normally clean up).
 *   3. Skip explicit dispose. Just let the process exit.
 *
 * Expected behavior on macOS 15+ with Apple Silicon, current llama.cpp
 * (bundled in node-llama-cpp 3.18.1, llama.cpp tag b8390):
 *   - Without GGML_METAL_NO_RESIDENCY:
 *     Script writes "ok" and main() returns, then ggml_abort fires the
 *     assertion, prints a multi-kB backtrace, and the process exits with
 *     SIGABRT (exit code 134).
 *   - With GGML_METAL_NO_RESIDENCY=1:
 *     Clean exit code 0. Residency-set code path is skipped entirely.
 *   - With --dispose flag (manual cleanup):
 *     Clean exit code 0 even without the env var, as long as JS dispose()
 *     runs successfully before libc exit.
 *
 * Usage:
 *   # Reproduce the crash (no dispose, no env var)
 *   node scripts/repro-metal-rsets-crash.mjs
 *
 *   # Verify the documented workaround
 *   GGML_METAL_NO_RESIDENCY=1 node scripts/repro-metal-rsets-crash.mjs
 *
 *   # Verify that explicit dispose also avoids the crash
 *   node scripts/repro-metal-rsets-crash.mjs --dispose
 *
 * Refs:
 *   https://github.com/ggml-org/llama.cpp/issues/22593 (root-cause analysis)
 *   https://github.com/ggml-org/llama.cpp/pull/22595 (one-line fix, open)
 *   https://github.com/tobi/qmd/issues/368 (downstream report)
 *   https://github.com/tobi/qmd/issues/674 (downstream, current)
 *   https://github.com/tobi/qmd/pull/600 (downstream workaround PR)
 */

import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { resolve } from "node:path";

const DEFAULT_MODEL = resolve(
  homedir(),
  ".cache/qmd/models/hf_ggml-org_embeddinggemma-300M-Q8_0.gguf",
);

const args = process.argv.slice(2);
const wantsDispose = args.includes("--dispose");
const modelPath = args.find((a) => !a.startsWith("--")) ?? DEFAULT_MODEL;

if (!existsSync(modelPath)) {
  console.error(`Model not found: ${modelPath}`);
  console.error("Pass a path to any local GGUF as argv[1], or run `qmd embed` once to populate the default cache path.");
  process.exit(2);
}

console.error(
  `[repro] GGML_METAL_NO_RESIDENCY=${process.env.GGML_METAL_NO_RESIDENCY ?? "(unset)"}`,
);
console.error(`[repro] dispose=${wantsDispose}`);
console.error(`[repro] loading: ${modelPath}`);

const { getLlama } = await import("node-llama-cpp");

const llama = await getLlama();
const model = await llama.loadModel({ modelPath });
const context = await model.createEmbeddingContext();

console.error(`[repro] backend: ${llama.gpu}`);

// Run actual inference so the buffer-allocation path is hit.
await context.getEmbeddingFor("repro text");

if (wantsDispose) {
  console.error("[repro] explicit dispose…");
  await context.dispose();
  await model.dispose();
  await llama.dispose();
}

console.error("[repro] main() returning via process.exit(0)");
console.log("ok");

// CRITICAL: use process.exit(), not `return`. node-llama-cpp registers a
// `process.once('beforeExit', …)` hook that auto-disposes WeakRef'd Llama
// instances when the event loop empties naturally. `process.exit()` skips
// `beforeExit`, so the rsets stay populated until libc's `exit()` fires the
// static dtor — which is when the upstream assertion bug trips.
//
// CLI tools (qmd query, qmd vsearch, qmd embed, etc.) all call process.exit()
// after writing results, which is why every real downstream report crashes
// even though the minimal "let main return" version does not.
process.exit(0);
38
scripts/test-all.mjs
Normal file
@ -0,0 +1,38 @@
#!/usr/bin/env node
import { spawnSync } from "node:child_process";
import { join } from "node:path";
import { fileURLToPath } from "node:url";

const root = fileURLToPath(new URL("..", import.meta.url));

// Mirror bin/qmd's darwin Metal residency mitigation for test subprocesses.
// libggml-metal asserts on a non-empty residency set during its static
// destructor (ggml-org/llama.cpp#22593, fix open as #22595) and dumps a
// multi-kB backtrace at process exit even when tests pass. The env var must
// be set BEFORE the subprocess starts because libggml-metal reads it via
// libc getenv at module-load time. Opt out with QMD_METAL_KEEP_RESIDENCY=1.
const darwinMetalEnv =
  process.platform === "darwin" && process.env.QMD_METAL_KEEP_RESIDENCY !== "1"
    ? { GGML_METAL_NO_RESIDENCY: "1" }
    : {};

function run(label, command, args, options = {}) {
  console.log(`==> ${label}`);
  const { env: extraEnv, ...spawnOptions } = options;
  const result = spawnSync(command, args, {
    cwd: root,
    stdio: "inherit",
    shell: process.platform === "win32",
    env: { ...process.env, ...darwinMetalEnv, ...(extraEnv ?? {}) },
    ...spawnOptions,
  });
  if (result.status !== 0) {
    console.error(`Test task failed: ${label}`);
    process.exit(result.status ?? 1);
  }
}

run("TypeScript build typecheck", process.execPath, [join(root, "node_modules", "typescript", "bin", "tsc"), "-p", "tsconfig.build.json", "--noEmit"]);
run("Vitest suite under Node", process.execPath, [join(root, "node_modules", "vitest", "vitest.mjs"), "run", "--reporter=verbose", "--testTimeout", "60000", "test/"], { env: { CI: "true" } });
run("Bun test suite", "bun", ["test", "--timeout", "60000", "--preload", "./src/test-preload.ts", "test/"], { env: { CI: "true" } });
run("Package smoke", process.execPath, ["scripts/package-smoke.mjs"]);
@ -1,144 +1,295 @@
|
||||
---
|
||||
name: qmd
|
||||
description: Search markdown knowledge bases, notes, and documentation using QMD. Use when users ask to search notes, find documents, or look up information.
|
||||
description: Search local markdown knowledge bases, notes, docs, and wikis with QMD. Use when users ask to find notes, retrieve documents, inspect a wiki, answer from indexed markdown, or set up QMD access.
|
||||
license: MIT
|
||||
compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
|
||||
metadata:
|
||||
author: tobi
|
||||
version: "2.0.0"
|
||||
version: "2.2.0"
|
||||
allowed-tools: Bash(qmd:*), mcp__qmd__*
|
||||
---
|
||||
|
||||
# QMD - Quick Markdown Search
|
||||
# QMD - Query Markdown Documents
|
||||
|
||||
Local search engine for markdown content.
|
||||
## How search works
|
||||
|
||||
## Status
|
||||
QMD searches local markdown collections: notes, docs, wikis, transcripts, and
|
||||
project knowledge bases. Use it before web search when the answer may already be
|
||||
in indexed local files.
|
||||
|
||||
!`qmd status 2>/dev/null || echo "Not installed: npm install -g @tobilu/qmd"`
|
||||
The workflow is always:
|
||||
|
||||
## MCP: `query`
|
||||
1. Search for candidate documents.
|
||||
2. Retrieve the full source with `qmd get` or `qmd multi-get`.
|
||||
3. Answer from retrieved text, citing paths or docids.
|
||||
|
||||
Do not answer from snippets alone when the user needs facts, decisions, quotes,
|
||||
or nuance. Snippets are only leads.
|
||||
|
||||
Typical loop:
|
||||
|
||||
```bash
|
||||
qmd search "merchant reality support interviews" -n 5
|
||||
# leads: #abc123 concepts/customer-proximity.md; #def432 sources/merchant-call.md
|
||||
qmd multi-get "#abc123,#def432" --format md
|
||||
```
|
||||
|
||||
**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
|
||||
fields that you write yourself.** You are a better query expander than the
|
||||
built-in model: you know the user's actual goal, the domain vocabulary, and the
|
||||
nearby-but-wrong concepts to avoid. Do not just paste the user's words into
|
||||
`qmd query "..."` and hope the expansion model guesses right — supply the
|
||||
`intent:` and craft the lexical and semantic terms deliberately (see
|
||||
[Pick the right search mode](#pick-the-right-search-mode)).
|
||||
|
||||
When reporting what you retrieved, a compact note is enough; do not paste whole
|
||||
files unless needed:
|
||||
|
||||
```text
|
||||
Retrieved:
|
||||
- #abc123 concepts/customer-proximity.md
|
||||
- #def432 sources/merchant-call.md
|
||||
```
|
||||
|
||||
## Pick the right search mode
|
||||
|
||||
Use **BM25 lexical search** when you know exact words, titles, names, code
|
||||
symbols, or rare phrases:
|
||||
|
||||
```bash
|
||||
qmd search "cockpit OKR Goodhart" -n 10
|
||||
qmd search '"AI Before Headcount"' -c concepts -n 5
|
||||
```
|
||||
|
||||
Use **`qmd query` with structured fields** when the user describes an idea
|
||||
indirectly, uses different wording than the source, or needs conceptual recall.
|
||||
**This is the default mode — write the fields yourself rather than leaning on
|
||||
query expansion.** Combine exact anchors with semantic recall:
|
||||
|
||||
```bash
|
||||
qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
|
||||
```
|
||||
|
||||
Structured query fields (you author each one — do not delegate this to the
|
||||
expansion model):
|
||||
|
||||
- `intent:` states what you are trying to find **and what to avoid**. Always
|
||||
supply this. It steers ranking away from nearby-but-wrong concepts.
|
||||
- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
|
||||
in the source. This is your own keyword expansion.
|
||||
- `vec:` paraphrases the idea in natural language, in source-like wording.
|
||||
- `hyde:` describes the document or answer that would satisfy the request.
|
||||
|
||||
You do not need all four every time, but you should almost always write at least
|
||||
`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
|
||||
throws away the context only you have and relies on the built-in expander to
|
||||
reconstruct it — prefer the structured form.
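
As a rough illustration of the structured form, the sketch below assembles a multi-line query document from separately authored fields. The helper name `buildQueryDocument` and its option names are hypothetical and not part of QMD; the only thing taken from the text above is the `intent:`/`lex:`/`vec:`/`hyde:` line prefixes.

```javascript
// Hypothetical helper (not part of QMD): joins separately authored fields
// into a multi-line query document, one `name: value` line per field.
function buildQueryDocument({ intent, lex, vec, hyde } = {}) {
  const fields = { intent, lex, vec, hyde };
  return Object.entries(fields)
    // Skip fields that were not supplied or are blank.
    .filter(([, value]) => typeof value === "string" && value.trim() !== "")
    // Collapse internal whitespace so each field stays on a single line.
    .map(([name, value]) => `${name}: ${value.trim().replace(/\s+/g, " ")}`)
    .join("\n");
}

const doc = buildQueryDocument({
  intent: "Find the concept note about metrics as instruments.",
  lex: "cockpit instruments OKR Goodhart",
  vec: "data informed not metric driven product judgment",
});
console.log(doc);
```

Passing the resulting string as a single argument (for example through `spawnSync("qmd", ["query", doc])`) would sidestep shell quoting of the embedded newlines.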
|
||||
|
||||
If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
|
||||
that is a job for `qmd search`, not bare `qmd query`:
|
||||
|
||||
```bash
|
||||
qmd query --format json --explain $'intent: ...\nlex: ...\nvec: ...' # inspect ranking
|
||||
```
|
||||
|
||||
If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
|
||||
better lexical terms.
|
||||
|
||||
## Retrieve sources
|
||||
|
||||
Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:
|
||||
|
||||
```bash
|
||||
qmd get "#abc123"
|
||||
qmd get qmd://concepts/ai-before-headcount.md
|
||||
qmd multi-get "#abc123,#def432" --format md
|
||||
qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --format md
|
||||
qmd multi-get 'sources/podcast-2025-*.md' -l 80
|
||||
```
|
||||
|
||||
Use `multi-get` when comparing several hits or gathering context across pages.
|
||||
|
||||
### Output is line-numbered and carries the docid — cite both
|
||||
|
||||
`get` and `multi-get` are **line-numbered by default** and always print the
|
||||
document's `#docid` and `qmd://` path. So `get` output looks like:
|
||||
|
||||
```text
|
||||
qmd://concepts/note.md #abc123
|
||||
---
|
||||
|
||||
1: # Metrics as instruments
|
||||
2:
|
||||
3: Treat dashboards like cockpit instruments...
|
||||
```
|
||||
|
||||
Cite the docid and exact line numbers in your answer, and use the numbers to ask
|
||||
for the next slice. Pass `--no-line-numbers` only when you need raw content to
|
||||
copy verbatim (e.g. reproducing a code block).
|
||||
|
||||
When you need to open or edit the underlying file (e.g. hand a path to `Read`,
|
||||
`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
|
||||
header with the document's on-disk path, falling back to the canonical header if
|
||||
the file no longer exists on disk:
|
||||
|
||||
```text
|
||||
$ qmd get "#abc123" --full-path
|
||||
/Users/you/notes/concepts/note.md
|
||||
---
|
||||
|
||||
1: # Metrics as instruments
|
||||
```
|
||||
|
||||
`--full-path` works the same way on `qmd search` and `qmd query`: result paths
|
||||
become the file's on-disk path — `./`-prefixed relative path when the file is
|
||||
inside `$PWD`, absolute realpath otherwise — and the per-result `#docid` is
|
||||
dropped because the path is the identifier. The leading `./` is intentional so
|
||||
the output is unambiguously a filesystem path and cannot be mistaken for a bare
|
||||
collection-relative string. Default search/query output still uses `qmd://`
|
||||
URIs; only opt into `--full-path` when you specifically need a path you can hand
|
||||
to a non-QMD tool.
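
The path rule described above can be sketched in a few lines. This is only an illustration of the stated behavior, not QMD's implementation: `displayPath` is a hypothetical name, and the sketch assumes POSIX-style absolute paths that have already been resolved to their real location.

```javascript
// Illustrative sketch only; QMD's real --full-path logic may differ.
// Assumes POSIX-style absolute paths that are already resolved (realpath).
function displayPath(absPath, cwd) {
  const base = cwd.endsWith("/") ? cwd : `${cwd}/`;
  // Inside the working directory: emit a "./"-prefixed relative path.
  if (absPath.startsWith(base)) {
    return `./${absPath.slice(base.length)}`;
  }
  // Anywhere else: keep the absolute path.
  return absPath;
}

console.log(displayPath("/Users/you/notes/concepts/note.md", "/Users/you/notes"));
console.log(displayPath("/srv/wiki/page.md", "/Users/you/notes"));
```

Comparing against `cwd` plus a trailing slash keeps a sibling directory such as `/Users/you/notes-old` from being treated as inside `/Users/you/notes`.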
|
||||
|
||||
### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
|
||||
|
||||
`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
|
||||
`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
|
||||
resolution, virtual-path lookups, line numbering, and the header, and it is
|
||||
slower and more error-prone.
|
||||
|
||||
The most compact form is a `:from:count` suffix right on the path or docid —
|
||||
prefer it:
|
||||
|
||||
```bash
|
||||
qmd get "#abc123:120:40" # 40 lines starting at line 120
|
||||
qmd get qmd://concepts/note.md:200:60 # lines 200–259
|
||||
qmd get "#abc123:120" # from line 120 to end of file
|
||||
qmd get "#abc123" --from 120 -l 40 # equivalent, using flags
|
||||
```
|
||||
|
||||
Suffix and flags:
|
||||
|
||||
- `<path>:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
|
||||
for reading around a search hit.**
|
||||
- `<path>:<from>` — start at `<from>`, read to end of file.
|
||||
- `--from <line>` / `-l <lines>` — flag equivalents. Explicit flags override the
|
||||
suffix, so `... :5:2 -l 1` reads 1 line.
|
||||
- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
|
||||
|
||||
Wrong: `qmd get "#abc123" | sed -n '120,160p'`
|
||||
Right: `qmd get "#abc123:120:40"`
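
To make the suffix rules concrete, here is a minimal sketch of how a `<path>:<from>:<count>` target could be split and how explicit flags could take precedence. It is an assumption-laden illustration rather than QMD's parser: `parseGetTarget` and the `flags` object shape are hypothetical stand-ins for whatever `--from` / `-l` parsing the CLI actually does.

```javascript
// Illustrative parser for the `<path>:<from>:<count>` suffix; not QMD's own code.
// `flags` stands in for values parsed from `--from` / `-l` (hypothetical shape).
function parseGetTarget(input, flags = {}) {
  // Lazily match the target, then an optional `:from` and an optional `:count`.
  const [, target, from, count] = /^(.*?)(?::(\d+)(?::(\d+))?)?$/s.exec(input);
  return {
    target,
    // Explicit flags override the suffix.
    from: flags.from ?? (from ? Number(from) : undefined),
    count: flags.count ?? (count ? Number(count) : undefined),
  };
}

console.log(parseGetTarget("#abc123:120:40"));
console.log(parseGetTarget("qmd://concepts/note.md:200:60"));
console.log(parseGetTarget("#abc123:5:2", { count: 1 }));
```

Only trailing all-digit segments are treated as a range, so the `//` after the `qmd:` scheme does not confuse the split.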
|
||||
|
||||
Search results include a `:line` anchor on each hit — feed it straight into
|
||||
`qmd get path:line:<n>` to read a window around the match (line numbers in the
|
||||
output will start at `line`).
|
||||
|
||||
## Discover what is indexed
|
||||
|
||||
```bash
|
||||
qmd collection list
|
||||
qmd ls
|
||||
qmd status
|
||||
```
|
||||
|
||||
Add collection filters when broad searches drift into the wrong corpus:
|
||||
|
||||
```bash
|
||||
qmd search "headcount autonomous agents" -c concepts -n 10
|
||||
qmd query "merchant support product reality" -c concepts -c sources -n 10
|
||||
```
|
||||
|
||||
Omit `-c` to search everything.
|
||||
|
||||
## MCP Tool: `query`
|
||||
|
||||
When using the MCP server, prefer structured searches:
|
||||
|
||||
```json
|
||||
{
|
||||
"searches": [
|
||||
{ "type": "lex", "query": "CAP theorem consistency" },
|
||||
{ "type": "vec", "query": "tradeoff between consistency and availability" }
|
||||
{ "type": "lex", "query": "cockpit OKR Goodhart" },
|
||||
{ "type": "vec", "query": "data informed not metric driven product judgment" },
|
||||
{ "type": "hyde", "query": "A concept note explains that metrics are useful as instruments, but leaders should not let OKRs or dashboards replace judgment." }
|
||||
],
|
||||
"collections": ["docs"],
|
||||
"intent": "Find the concept note about using metrics as instruments without becoming metric-driven.",
|
||||
"collections": ["concepts"],
|
||||
"limit": 10
|
||||
}
|
||||
```
|
||||
|
||||
### Query Types
|
||||
Query types:
|
||||
|
||||
| Type | Method | Input |
|
||||
|------|--------|-------|
|
||||
| `lex` | BM25 | Keywords — exact terms, names, code |
|
||||
| `vec` | Vector | Question — natural language |
|
||||
| `hyde` | Vector | Answer — hypothetical result (50-100 words) |
|
||||
- `lex` — BM25 keyword search. Best for exact terms, names, titles, and code.
|
||||
- `vec` — vector semantic search. Best for natural-language concepts.
|
||||
- `hyde` — vector search using a hypothetical answer/document passage.
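
A client assembling the `query` arguments might validate the search types before sending them. The sketch below is hypothetical: `buildQueryArgs` is not a QMD function, and the default `limit` of 10 is only borrowed from the example payload; the field names (`searches`, `intent`, `collections`, `limit`) are the ones shown in the JSON above.

```javascript
// Hypothetical builder for the `query` tool arguments; not part of QMD.
const SEARCH_TYPES = new Set(["lex", "vec", "hyde"]);

function buildQueryArgs({ searches, intent, collections, limit = 10 }) {
  for (const { type, query } of searches) {
    // Reject anything other than the three documented search types.
    if (!SEARCH_TYPES.has(type)) throw new Error(`unknown search type: ${type}`);
    if (typeof query !== "string" || query.trim() === "") {
      throw new Error(`empty ${type} query`);
    }
  }
  const args = { searches, limit };
  // Only include optional fields when they carry a value.
  if (intent) args.intent = intent;
  if (collections?.length) args.collections = collections;
  return args;
}

console.log(JSON.stringify(buildQueryArgs({
  searches: [
    { type: "lex", query: "cockpit OKR Goodhart" },
    { type: "vec", query: "data informed not metric driven product judgment" },
  ],
  intent: "Find the concept note about using metrics as instruments.",
  collections: ["concepts"],
}), null, 2));
```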
|
||||
|
||||
### Writing Good Queries
|
||||
## Query craft
|
||||
|
||||
**lex (keyword)**
|
||||
- 2-5 terms, no filler words
|
||||
- Exact phrase: `"connection pool"` (quoted)
|
||||
- Exclude terms: `performance -sports` (minus prefix)
|
||||
- Code identifiers work: `handleError async`
|
||||
Good QMD searches mix three things:
|
||||
|
||||
**vec (semantic)**
|
||||
- Full natural language question
|
||||
- Be specific: `"how does the rate limiter handle burst traffic"`
|
||||
- Include context: `"in the payment service, how are refunds processed"`
|
||||
1. **Title/alias anchors:** exact page titles, named entities, phrases.
|
||||
2. **Semantic paraphrase:** how a human would describe the idea.
|
||||
3. **Negative space:** enough intent to avoid nearby-but-wrong concepts.
|
||||
|
||||
**hyde (hypothetical document)**
|
||||
- Write 50-100 words of what the *answer* looks like
|
||||
- Use the vocabulary you expect in the result
|
||||
|
||||
**expand (auto-expand)**
|
||||
- Use a single-line query (implicit) or `expand: question` on its own line
|
||||
- Lets the local LLM generate lex/vec/hyde variations
|
||||
- Do not mix `expand:` with other typed lines — it's either a standalone expand query or a full query document
|
||||
|
||||
### Intent (Disambiguation)
|
||||
|
||||
When a query term is ambiguous, add `intent` to steer results:
|
||||
|
||||
```json
|
||||
{
|
||||
"searches": [
|
||||
{ "type": "lex", "query": "performance" }
|
||||
],
|
||||
"intent": "web page load times and Core Web Vitals"
|
||||
}
|
||||
```
|
||||
|
||||
Intent affects expansion, reranking, chunk selection, and snippet extraction. It does not search on its own — it's a steering signal that disambiguates queries like "performance" (web-perf vs team health vs fitness).
|
||||
|
||||
### Combining Types
|
||||
|
||||
| Goal | Approach |
|
||||
|------|----------|
|
||||
| Know exact terms | `lex` only |
|
||||
| Don't know vocabulary | Use a single-line query (implicit `expand:`) or `vec` |
|
||||
| Best recall | `lex` + `vec` |
|
||||
| Complex topic | `lex` + `vec` + `hyde` |
|
||||
| Ambiguous query | Add `intent` to any combination above |
|
||||
|
||||
First query gets 2x weight in fusion — put your best guess first.
|
||||
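One common way to give the first query extra weight is weighted reciprocal rank fusion (RRF). The sketch below is only an illustration of that idea: the name `fuseRankings`, the constant `k = 60`, and the exact formula are assumptions, not QMD's actual fusion code.

```typescript
// Hypothetical sketch of weighted reciprocal rank fusion (RRF).
// Each inner array is one query's ranked result list, best match first.
function fuseRankings(lists: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  lists.forEach((list, listIdx) => {
    // Assumption: the first list counts double, every other list counts once.
    const weight = listIdx === 0 ? 2 : 1;
    list.forEach((doc, rank) => {
      const contribution = weight / (k + rank + 1);
      scores.set(doc, (scores.get(doc) ?? 0) + contribution);
    });
  });
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

const fused = fuseRankings([
  ["a.md", "b.md"], // first query (double weight)
  ["c.md", "b.md"], // second query
]);
const fusedOrder = fused.map(entry => entry[0]);
```

In this example `b.md` ranks first because both queries return it, and `a.md` outranks `c.md` even though each is the top hit of its own list, because `a.md` comes from the first query.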
|
||||
### Lex Query Syntax
|
||||
|
||||
| Syntax | Meaning | Example |
|
||||
|--------|---------|---------|
|
||||
| `term` | Prefix match | `perf` matches "performance" |
|
||||
| `"phrase"` | Exact phrase | `"rate limiter"` |
|
||||
| `-term` | Exclude | `performance -sports` |
|
||||
|
||||
Note: `-term` only works in lex queries, not vec/hyde.
|
||||
|
||||
### Collection Filtering
|
||||
|
||||
```json
|
||||
{ "collections": ["docs"] } // Single
|
||||
{ "collections": ["docs", "notes"] } // Multiple (OR)
|
||||
```
|
||||
|
||||
Omit to search all collections.
|
||||
|
||||
## Other MCP Tools
|
||||
|
||||
| Tool | Use |
|
||||
|------|-----|
|
||||
| `get` | Retrieve doc by path or `#docid` |
|
||||
| `multi_get` | Retrieve multiple by glob/list |
|
||||
| `status` | Collections and health |
|
||||
|
||||
## CLI
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
qmd query "question" # Auto-expand + rerank
|
||||
qmd query $'lex: X\nvec: Y' # Structured
|
||||
qmd query $'expand: question' # Explicit expand
|
||||
qmd query --json --explain "q" # Show score traces (RRF + rerank blend)
|
||||
qmd search "keywords" # BM25 only (no LLM)
|
||||
qmd get "#abc123" # By docid
|
||||
qmd multi-get "journals/2026-*.md" -l 40 # Batch pull snippets by glob
|
||||
qmd multi-get notes/foo.md,notes/bar.md # Comma-separated list, preserves order
|
||||
# Exact-ish title lookup
|
||||
qmd search '"arm the rebels" merchants tools big companies' -c concepts
|
||||
|
||||
# Semantic concept lookup
|
||||
qmd query $'intent: Find the customer proximity concept, not generic customer delight.\nlex: support pseudonymous merchant customer interviews\nvec: founder stays close to merchant reality through support and product use'
|
||||
|
||||
# Source lookup
|
||||
qmd search "six-week cadence WhatsApp merchant relationships Shawn Ryan" -c sources -n 10
|
||||
```
|
||||
|
||||
## HTTP API
|
||||
## Setup and maintenance
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:8181/query \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"searches": [{"type": "lex", "query": "test"}]}'
|
||||
```
|
||||
|
||||
## Setup
|
||||
Only mutate indexes when the user asked for setup or maintenance. Searching and
|
||||
retrieving are safe; collection/index mutation is not a casual first step.
|
||||
|
||||
```bash
|
||||
npm install -g @tobilu/qmd
|
||||
qmd collection add ~/notes --name notes
|
||||
qmd update
|
||||
qmd embed
|
||||
```
|
||||
|
||||
Health and diagnostics:
|
||||
|
||||
```bash
|
||||
qmd doctor
|
||||
qmd status
|
||||
qmd pull
|
||||
```
|
||||
|
||||
`qmd doctor` checks config, model cache, device/GPU setup, vector fingerprints,
|
||||
and common environment overrides. If a model-backed command fails, run it before
|
||||
changing configuration.
|
||||
|
||||
## MCP setup
|
||||
|
||||
See `references/mcp-setup.md` for Claude Code, Claude Desktop, OpenClaw, and HTTP
|
||||
server configuration.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- **Do not stop at snippets.** Fetch documents before making claims.
|
||||
- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
|
||||
suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
|
||||
line-numbered; piping breaks docid resolution, the header, and virtual paths.
|
||||
- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
|
||||
yourself. A bare `qmd query "user sentence"` discards the context only you
|
||||
have. You expand the query; the model just ranks.
|
||||
- **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
|
||||
faster and often better.
|
||||
- **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and
|
||||
`qmd embed` change local state and can be expensive.
|
||||
- **Model-backed commands can be environment-sensitive.** If `qmd query`,
|
||||
`qmd vsearch`, or reranking fails because local models/GPU are unavailable,
|
||||
use `qmd search` and stronger lexical/structured terms.
|
||||
- **Ambiguous user wording needs intent.** Add `intent:` rather than hoping query
|
||||
expansion guesses the right domain.
|
||||
- **Collection names matter.** Search `concepts` for synthesized wiki pages,
|
||||
`sources` for transcripts/raw source pages, and docs collections for code or
|
||||
project documentation.
|
||||
|
||||
30
src/ast.ts
@ -63,15 +63,22 @@ export function detectLanguage(filepath: string): SupportedLanguage | null {
|
||||
/**
|
||||
* Maps language to the npm package and wasm filename for the grammar.
|
||||
*/
|
||||
const GRAMMAR_MAP: Record<SupportedLanguage, { pkg: string; wasm: string }> = {
|
||||
typescript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm" },
|
||||
tsx: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-tsx.wasm" },
|
||||
javascript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm" },
|
||||
python: { pkg: "tree-sitter-python", wasm: "tree-sitter-python.wasm" },
|
||||
go: { pkg: "tree-sitter-go", wasm: "tree-sitter-go.wasm" },
|
||||
rust: { pkg: "tree-sitter-rust", wasm: "tree-sitter-rust.wasm" },
|
||||
const GRAMMAR_MAP: Record<SupportedLanguage, { pkg: string; wasm: string; version: string }> = {
|
||||
typescript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm", version: "0.23.2" },
|
||||
tsx: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-tsx.wasm", version: "0.23.2" },
|
||||
javascript: { pkg: "tree-sitter-typescript", wasm: "tree-sitter-typescript.wasm", version: "0.23.2" },
|
||||
python: { pkg: "tree-sitter-python", wasm: "tree-sitter-python.wasm", version: "0.23.4" },
|
||||
go: { pkg: "tree-sitter-go", wasm: "tree-sitter-go.wasm", version: "0.23.4" },
|
||||
rust: { pkg: "tree-sitter-rust", wasm: "tree-sitter-rust.wasm", version: "0.24.0" },
|
||||
};
|
||||
|
||||
export function formatGrammarLoadError(language: SupportedLanguage, err: unknown): string {
|
||||
const grammar = GRAMMAR_MAP[language];
|
||||
const detail = err instanceof Error ? err.message : String(err);
|
||||
return `${grammar.pkg}/${grammar.wasm} failed to load (${detail}); falling back to regex chunking. ` +
|
||||
`Repair a broken global install with: bun add ${grammar.pkg}@${grammar.version}`;
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
// Per-Language Query Definitions
|
||||
// =============================================================================
|
||||
@ -176,6 +183,9 @@ let initPromise: Promise<void> | null = null;
|
||||
/** Languages that have already failed to load — warn only once per process. */
|
||||
const failedLanguages = new Set<string>();
|
||||
|
||||
/** Last grammar load error by language, for status output. */
|
||||
const grammarLoadErrors = new Map<SupportedLanguage, string>();
|
||||
|
||||
/** Cached grammar load promises. */
|
||||
const grammarCache = new Map<string, Promise<LanguageType>>();
|
||||
|
||||
@ -228,7 +238,9 @@ async function loadGrammar(language: SupportedLanguage): Promise<LanguageType |
|
||||
} catch (err) {
|
||||
failedLanguages.add(language);
|
||||
grammarCache.delete(wasmKey);
|
||||
console.warn(`[qmd] Failed to load tree-sitter grammar for ${language}: ${err}`);
|
||||
const message = formatGrammarLoadError(language, err);
|
||||
grammarLoadErrors.set(language, message);
|
||||
console.warn(`[qmd] AST grammar unavailable for ${language}: ${message}`);
|
||||
return null;
|
||||
}
|
||||
}
|
||||
@ -345,7 +357,7 @@ export async function getASTStatus(): Promise<{
|
||||
getQuery(lang, grammar);
|
||||
languages.push({ language: lang, available: true });
|
||||
} else {
|
||||
languages.push({ language: lang, available: false, error: "grammar failed to load" });
|
||||
languages.push({ language: lang, available: false, error: grammarLoadErrors.get(lang) ?? "grammar failed to load" });
|
||||
}
|
||||
} catch (err) {
|
||||
languages.push({
|
||||
|
||||
@ -260,16 +260,18 @@ async function main() {
|
||||
const r = await benchmarkConfig(model, llama, docs, p, true);
|
||||
results.push(r);
|
||||
process.stdout.write(` ${r.medianMs.toFixed(0)}ms (${r.docsPerSec.toFixed(1)} docs/s)\n`);
|
||||
} catch (e: any) {
|
||||
process.stdout.write(` failed: ${e.message}\n`);
|
||||
} catch (e: unknown) {
|
||||
const message = e instanceof Error ? e.message : String(e);
|
||||
process.stdout.write(` failed: ${message}\n`);
|
||||
// Try without flash
|
||||
process.stdout.write(` [${p} ctx, no flash] running...`);
|
||||
try {
|
||||
const r = await benchmarkConfig(model, llama, docs, p, false);
|
||||
results.push(r);
|
||||
process.stdout.write(` ${r.medianMs.toFixed(0)}ms (${r.docsPerSec.toFixed(1)} docs/s)\n`);
|
||||
} catch (e2: any) {
|
||||
process.stdout.write(` failed: ${e2.message}\n`);
|
||||
} catch (e2: unknown) {
|
||||
const message = e2 instanceof Error ? e2.message : String(e2);
|
||||
process.stdout.write(` failed: ${message}\n`);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@ -22,6 +22,7 @@ import {
|
||||
type QMDStore,
|
||||
type SearchResult,
|
||||
type HybridQueryResult,
|
||||
type ExpandedQuery,
|
||||
} from "../index.js";
|
||||
import { scoreResults } from "./score.js";
|
||||
import type {
|
||||
@ -34,35 +35,130 @@ import type {
|
||||
|
||||
type Backend = {
|
||||
name: string;
|
||||
run: (store: QMDStore, query: string, limit: number, collection?: string) => Promise<string[]>;
|
||||
run: (store: QMDStore, query: BenchmarkQuery, limit: number, collection?: string) => Promise<string[]>;
|
||||
};
|
||||
|
||||
type ParsedStructuredQuery = {
|
||||
searches: ExpandedQuery[];
|
||||
intent?: string;
|
||||
};
|
||||
|
||||
function parseStructuredQuery(query: string): ParsedStructuredQuery | undefined {
|
||||
const lines = query.split("\n").map((line, idx) => ({
|
||||
trimmed: line.trim(),
|
||||
number: idx + 1,
|
||||
})).filter(line => line.trimmed.length > 0);
|
||||
|
||||
if (lines.length === 0) return undefined;
|
||||
|
||||
const prefixRe = /^(lex|vec|hyde):\s*/i;
|
||||
const intentRe = /^intent:\s*/i;
|
||||
const searches: ExpandedQuery[] = [];
|
||||
let intent: string | undefined;
|
||||
|
||||
for (const line of lines) {
|
||||
if (intentRe.test(line.trimmed)) {
|
||||
if (intent !== undefined) {
|
||||
throw new Error(`Line ${line.number}: only one intent: line is allowed per benchmark query.`);
|
||||
}
|
||||
intent = line.trimmed.replace(intentRe, "").trim();
|
||||
if (!intent) {
|
||||
throw new Error(`Line ${line.number}: intent: must include text.`);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
const match = line.trimmed.match(prefixRe);
|
||||
if (match) {
|
||||
const type = match[1]!.toLowerCase() as "lex" | "vec" | "hyde";
|
||||
const text = line.trimmed.slice(match[0].length).trim();
|
||||
if (!text) {
|
||||
throw new Error(`Line ${line.number} (${type}:) must include text.`);
|
||||
}
|
||||
searches.push({ type, query: text, line: line.number });
|
||||
continue;
|
||||
}
|
||||
|
||||
if (lines.length === 1) {
|
||||
return undefined;
|
||||
}
|
||||
|
||||
throw new Error(`Line ${line.number} is missing a lex:/vec:/hyde:/intent: prefix.`);
|
||||
}
|
||||
|
||||
if (intent && searches.length === 0) {
|
||||
throw new Error("intent: cannot appear alone. Add at least one lex:, vec:, or hyde: line.");
|
||||
}
|
||||
|
||||
return searches.length > 0 ? { searches, intent } : undefined;
|
||||
}
|
||||
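To make the parsing rules above easier to follow, here is a condensed standalone sketch of the same idea with a usage example. The names `parseQueryDocument`, `TypedSearch`, and `ParsedDoc` are hypothetical stand-ins for the function and types in the diff, and the single combined regex is a simplification, not the committed code.

```typescript
// Hypothetical, condensed version of the structured-query parsing rules.
type TypedSearch = { type: "lex" | "vec" | "hyde"; query: string; line: number };
type ParsedDoc = { searches: TypedSearch[]; intent?: string };

function parseQueryDocument(input: string): ParsedDoc | undefined {
  const lines = input
    .split("\n")
    .map((text, idx) => ({ text: text.trim(), number: idx + 1 }))
    .filter(l => l.text.length > 0);
  if (lines.length === 0) return undefined;

  const searches: TypedSearch[] = [];
  let intent: string | undefined;

  for (const l of lines) {
    const m = l.text.match(/^(lex|vec|hyde|intent):\s*(.*)$/i);
    if (!m) {
      // A single unprefixed line is a plain query, not a structured document.
      if (lines.length === 1) return undefined;
      throw new Error(`Line ${l.number} is missing a lex:/vec:/hyde:/intent: prefix.`);
    }
    const prefix = m[1]!.toLowerCase();
    const text = m[2]!.trim();
    if (!text) throw new Error(`Line ${l.number} (${prefix}:) must include text.`);
    if (prefix === "intent") {
      if (intent !== undefined) throw new Error(`Line ${l.number}: only one intent: line is allowed.`);
      intent = text;
    } else {
      searches.push({ type: prefix as TypedSearch["type"], query: text, line: l.number });
    }
  }

  if (intent && searches.length === 0) {
    throw new Error("intent: cannot appear alone.");
  }
  return searches.length > 0 ? { searches, intent } : undefined;
}

// A three-line structured query yields two typed searches plus an intent.
const structuredDoc = parseQueryDocument(
  "intent: metrics as instruments\nlex: cockpit OKR Goodhart\nvec: data informed product judgment",
);
// A bare sentence is left for the caller to treat as a plain query.
const plainDoc = parseQueryDocument("how do metrics mislead teams");
```

Returning `undefined` for a single unprefixed line lets each backend fall back to its plain-query path, which matches how the backends in this diff branch on `structured`.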
|
||||
function uniqueFiles(files: string[], limit: number): string[] {
|
||||
const seen = new Set<string>();
|
||||
const out: string[] = [];
|
||||
for (const file of files) {
|
||||
if (seen.has(file)) continue;
|
||||
seen.add(file);
|
||||
out.push(file);
|
||||
if (out.length >= limit) break;
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
const BACKENDS: Backend[] = [
|
||||
{
|
||||
name: "bm25",
|
||||
run: async (store, query, limit, collection) => {
|
||||
const results = await store.searchLex(query, { limit, collection });
|
||||
const structured = parseStructuredQuery(query.query);
|
||||
const lexQueries = structured?.searches.filter(q => q.type === "lex");
|
||||
if (structured) {
|
||||
const files: string[] = [];
|
||||
for (const lex of lexQueries ?? []) {
|
||||
const results = await store.searchLex(lex.query, { limit, collection });
|
||||
files.push(...results.map((r: SearchResult) => r.filepath));
|
||||
}
|
||||
return uniqueFiles(files, limit);
|
||||
}
|
||||
|
||||
const results = await store.searchLex(query.query, { limit, collection });
|
||||
return results.map((r: SearchResult) => r.filepath);
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "vector",
|
||||
run: async (store, query, limit, collection) => {
|
||||
const results = await store.searchVector(query, { limit, collection });
|
||||
const structured = parseStructuredQuery(query.query);
|
||||
const vectorQueries = structured?.searches.filter(q => q.type === "vec" || q.type === "hyde");
|
||||
if (structured) {
|
||||
const files: string[] = [];
|
||||
for (const vectorQuery of vectorQueries ?? []) {
|
||||
const results = await store.searchVector(vectorQuery.query, { limit, collection });
|
||||
files.push(...results.map((r: SearchResult) => r.filepath));
|
||||
}
|
||||
return uniqueFiles(files, limit);
|
||||
}
|
||||
|
||||
const results = await store.searchVector(query.query, { limit, collection });
|
||||
return results.map((r: SearchResult) => r.filepath);
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "hybrid",
|
||||
run: async (store, query, limit, collection) => {
|
||||
const results = await store.search({ query, limit, collection, rerank: false });
|
||||
const structured = parseStructuredQuery(query.query);
|
||||
const results = structured
|
||||
? await store.search({ queries: structured.searches, intent: structured.intent, limit, collection, rerank: false })
|
||||
: await store.search({ query: query.query, limit, collection, rerank: false });
|
||||
return results.map((r: HybridQueryResult) => r.file);
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "full",
|
||||
run: async (store, query, limit, collection) => {
|
||||
const results = await store.search({ query, limit, collection, rerank: true });
|
||||
const structured = parseStructuredQuery(query.query);
|
||||
const results = structured
|
||||
? await store.search({ queries: structured.searches, intent: structured.intent, limit, collection, rerank: true })
|
||||
: await store.search({ query: query.query, limit, collection, rerank: true });
|
||||
return results.map((r: HybridQueryResult) => r.file);
|
||||
},
|
||||
},
|
||||
@ -79,18 +175,23 @@ async function runQuery(
|
||||
|
||||
let resultFiles: string[];
|
||||
try {
|
||||
resultFiles = await backend.run(store, query.query, limit, collection);
|
||||
} catch (err: any) {
|
||||
resultFiles = await backend.run(store, query, limit, collection);
|
||||
} catch {
|
||||
// Backend may not be available (e.g., no embeddings for vector search)
|
||||
return {
|
||||
precision_at_k: 0,
|
||||
recall: 0,
|
||||
recall_at_1: 0,
|
||||
recall_at_3: 0,
|
||||
recall_at_5: 0,
|
||||
mrr: 0,
|
||||
f1: 0,
|
||||
hits_at_k: 0,
|
||||
total_expected: query.expected_files.length,
|
||||
latency_ms: Date.now() - start,
|
||||
top_files: [],
|
||||
matched_files: [],
|
||||
unmatched_expected_files: query.expected_files,
|
||||
};
|
||||
}
|
||||
|
||||
@ -111,14 +212,14 @@ function formatTable(results: QueryResult[]): string {
|
||||
const num = (n: number) => n.toFixed(2).padStart(5);
|
||||
|
||||
lines.push(
|
||||
`${pad("Query", 25)} ${pad("Backend", 8)} ${pad("P@k", 6)} ${pad("Recall", 7)} ${pad("MRR", 6)} ${pad("F1", 6)} ${pad("ms", 8)}`
|
||||
`${pad("Query", 25)} ${pad("Backend", 8)} ${pad("P@k", 6)} ${pad("R@1", 6)} ${pad("R@3", 6)} ${pad("R@5", 6)} ${pad("MRR", 6)} ${pad("F1", 6)} ${pad("ms", 8)}`
|
||||
);
|
||||
lines.push("-".repeat(70));
|
||||
lines.push("-".repeat(88));
|
||||
|
||||
for (const r of results) {
|
||||
for (const [backend, br] of Object.entries(r.backends)) {
|
||||
lines.push(
|
||||
`${pad(r.id, 25)} ${pad(backend, 8)} ${num(br.precision_at_k)} ${num(br.recall)} ${num(br.mrr)} ${num(br.f1)} ${String(Math.round(br.latency_ms)).padStart(7)}ms`
|
||||
`${pad(r.id, 25)} ${pad(backend, 8)} ${num(br.precision_at_k)} ${num(br.recall_at_1)} ${num(br.recall_at_3)} ${num(br.recall_at_5)} ${num(br.mrr)} ${num(br.f1)} ${String(Math.round(br.latency_ms)).padStart(7)}ms`
|
||||
);
|
||||
}
|
||||
lines.push("");
|
||||
@ -138,13 +239,16 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
|
||||
}
|
||||
}
|
||||
|
||||
for (const name of backendNames) {
|
||||
let totalP = 0, totalR = 0, totalMrr = 0, totalF1 = 0, totalLat = 0, count = 0;
|
||||
for (const name of Array.from(backendNames)) {
|
||||
let totalP = 0, totalR = 0, totalR1 = 0, totalR3 = 0, totalR5 = 0, totalMrr = 0, totalF1 = 0, totalLat = 0, count = 0;
|
||||
for (const r of results) {
|
||||
const br = r.backends[name];
|
||||
if (!br) continue;
|
||||
totalP += br.precision_at_k;
|
||||
totalR += br.recall;
|
||||
totalR1 += br.recall_at_1;
|
||||
totalR3 += br.recall_at_3;
|
||||
totalR5 += br.recall_at_5;
|
||||
totalMrr += br.mrr;
|
||||
totalF1 += br.f1;
|
||||
totalLat += br.latency_ms;
|
||||
@ -154,6 +258,9 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
|
||||
summary[name] = {
|
||||
avg_precision: totalP / count,
|
||||
avg_recall: totalR / count,
|
||||
avg_recall_at_1: totalR1 / count,
|
||||
avg_recall_at_3: totalR3 / count,
|
||||
avg_recall_at_5: totalR5 / count,
|
||||
avg_mrr: totalMrr / count,
|
||||
avg_f1: totalF1 / count,
|
||||
avg_latency_ms: totalLat / count,
|
||||
@ -166,7 +273,7 @@ function computeSummary(results: QueryResult[]): BenchmarkResult["summary"] {
|
||||
|
||||
export async function runBenchmark(
|
||||
fixturePath: string,
|
||||
options: { json?: boolean; collection?: string; backends?: string[] } = {},
|
||||
options: { json?: boolean; collection?: string; backends?: string[]; dbPath?: string; configPath?: string } = {},
|
||||
): Promise<BenchmarkResult> {
|
||||
// Load fixture
|
||||
const raw = readFileSync(resolve(fixturePath), "utf-8");
|
||||
@ -177,7 +284,10 @@ export async function runBenchmark(
|
||||
}
|
||||
|
||||
// Open store
|
||||
const store = await createStore({ dbPath: getDefaultDbPath() });
|
||||
const store = await createStore({
|
||||
dbPath: options.dbPath ?? getDefaultDbPath(),
|
||||
...(options.configPath ? { configPath: options.configPath } : {}),
|
||||
});
|
||||
|
||||
// Filter backends if requested
|
||||
const activeBackends = options.backends
|
||||
@ -232,7 +342,7 @@ export async function runBenchmark(
|
||||
const num = (n: number) => n.toFixed(3).padStart(6);
|
||||
for (const [name, s] of Object.entries(summary)) {
|
||||
console.log(
|
||||
` ${pad(name, 8)} P@k=${num(s.avg_precision)} Recall=${num(s.avg_recall)} MRR=${num(s.avg_mrr)} F1=${num(s.avg_f1)} Avg=${Math.round(s.avg_latency_ms)}ms`
|
||||
` ${pad(name, 8)} P@k=${num(s.avg_precision)} R@1=${num(s.avg_recall_at_1)} R@3=${num(s.avg_recall_at_3)} R@5=${num(s.avg_recall_at_5)} MRR=${num(s.avg_mrr)} F1=${num(s.avg_f1)} Avg=${Math.round(s.avg_latency_ms)}ms`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
@ -11,7 +11,7 @@
|
||||
*/
|
||||
export function normalizePath(p: string): string {
|
||||
if (p.startsWith("qmd://")) {
|
||||
// qmd://collection/path/to/file → path/to/file
|
||||
// qmd://collection/docs/readme.md → docs/readme.md
|
||||
const withoutScheme = p.slice("qmd://".length);
|
||||
const slashIdx = withoutScheme.indexOf("/");
|
||||
p = slashIdx >= 0 ? withoutScheme.slice(slashIdx + 1) : withoutScheme;
|
||||
@ -31,6 +31,30 @@ export function pathsMatch(result: string, expected: string): boolean {
|
||||
return false;
|
||||
}
|
||||
|
||||
type ScoreMetrics = {
|
||||
precision_at_k: number;
|
||||
recall: number;
|
||||
recall_at_1: number;
|
||||
recall_at_3: number;
|
||||
recall_at_5: number;
|
||||
mrr: number;
|
||||
f1: number;
|
||||
hits_at_k: number;
|
||||
matched_files: string[];
|
||||
unmatched_expected_files: string[];
|
||||
};
|
||||
|
||||
function hitsWithin(resultFiles: string[], expectedFiles: string[], k: number): number {
|
||||
const topKResults = resultFiles.slice(0, k);
|
||||
let hits = 0;
|
||||
for (const expected of expectedFiles) {
|
||||
if (topKResults.some(r => pathsMatch(r, expected))) {
|
||||
hits++;
|
||||
}
|
||||
}
|
||||
return hits;
|
||||
}
|
||||
|
||||
/**
|
||||
* Score a set of search results against expected files.
|
||||
*/
|
||||
@ -38,21 +62,18 @@ export function scoreResults(
|
||||
resultFiles: string[],
|
||||
expectedFiles: string[],
|
||||
topK: number,
|
||||
): { precision_at_k: number; recall: number; mrr: number; f1: number; hits_at_k: number } {
|
||||
): ScoreMetrics {
|
||||
// Count hits in top-k
|
||||
const topKResults = resultFiles.slice(0, topK);
|
||||
let hitsAtK = 0;
|
||||
for (const expected of expectedFiles) {
|
||||
if (topKResults.some(r => pathsMatch(r, expected))) {
|
||||
hitsAtK++;
|
||||
}
|
||||
}
|
||||
const hitsAtK = hitsWithin(resultFiles, expectedFiles, topK);
|
||||
|
||||
const matchedFiles: string[] = [];
|
||||
const unmatchedExpectedFiles: string[] = [];
|
||||
|
||||
// Count total hits anywhere
|
||||
let totalHits = 0;
|
||||
for (const expected of expectedFiles) {
|
||||
if (resultFiles.some(r => pathsMatch(r, expected))) {
|
||||
totalHits++;
|
||||
matchedFiles.push(expected);
|
||||
} else {
|
||||
unmatchedExpectedFiles.push(expected);
|
||||
}
|
||||
}
|
||||
|
||||
@ -67,10 +88,24 @@ export function scoreResults(
|
||||
|
||||
const denominator = Math.min(topK, expectedFiles.length);
|
||||
const precision_at_k = denominator > 0 ? hitsAtK / denominator : 0;
|
||||
const recall = expectedFiles.length > 0 ? totalHits / expectedFiles.length : 0;
|
||||
const recall = expectedFiles.length > 0 ? matchedFiles.length / expectedFiles.length : 0;
|
||||
const recall_at_1 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 1) / expectedFiles.length : 0;
|
||||
const recall_at_3 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 3) / expectedFiles.length : 0;
|
||||
const recall_at_5 = expectedFiles.length > 0 ? hitsWithin(resultFiles, expectedFiles, 5) / expectedFiles.length : 0;
|
||||
const f1 = precision_at_k + recall > 0
|
||||
? 2 * (precision_at_k * recall) / (precision_at_k + recall)
|
||||
: 0;
|
||||
|
||||
return { precision_at_k, recall, mrr, f1, hits_at_k: hitsAtK };
|
||||
return {
|
||||
precision_at_k,
|
||||
recall,
|
||||
recall_at_1,
|
||||
recall_at_3,
|
||||
recall_at_5,
|
||||
mrr,
|
||||
f1,
|
||||
hits_at_k: hitsAtK,
|
||||
matched_files: matchedFiles,
|
||||
unmatched_expected_files: unmatchedExpectedFiles,
|
||||
};
|
||||
}
|
||||
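A small worked example may help when reading the new R@1/R@3/R@5 columns. The sketch below is illustrative only: `recallAtK` and `reciprocalRank` are hypothetical helpers, and exact string equality stands in for the diff's `pathsMatch()`, whose matching rules are not fully shown here.

```typescript
// Illustrative only: exact string equality replaces pathsMatch().
function recallAtK(results: string[], expected: string[], k: number): number {
  if (expected.length === 0) return 0;
  const topK = results.slice(0, k);
  // Count expected files that appear somewhere in the first k results.
  const hits = expected.filter(e => topK.some(r => r === e)).length;
  return hits / expected.length;
}

// Reciprocal rank of the first relevant result: 1/rank, or 0 if none is found.
function reciprocalRank(results: string[], expected: string[]): number {
  const idx = results.findIndex(r => expected.some(e => e === r));
  return idx >= 0 ? 1 / (idx + 1) : 0;
}

const ranked = ["a.md", "x.md", "b.md", "y.md", "c.md"];
const wanted = ["a.md", "b.md", "c.md"];

const r1 = recallAtK(ranked, wanted, 1); // 1 of 3 expected files in the top 1
const r3 = recallAtK(ranked, wanted, 3); // 2 of 3 expected files in the top 3
const r5 = recallAtK(ranked, wanted, 5); // 3 of 3 expected files in the top 5
const rr = reciprocalRank(ranked, wanted); // first relevant result is at rank 1
```

Because the denominator is always the number of expected files, R@1 can never exceed `1 / expected.length` for a query with several expected files, so low R@1 values on multi-file queries are not necessarily a ranking failure.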
|
||||
@ -37,6 +37,12 @@ export interface BackendResult {
|
||||
precision_at_k: number;
|
||||
/** Fraction of expected files found anywhere in results */
|
||||
recall: number;
|
||||
/** Fraction of expected files found in the first result */
|
||||
recall_at_1: number;
|
||||
/** Fraction of expected files found in the top 3 results */
|
||||
recall_at_3: number;
|
||||
/** Fraction of expected files found in the top 5 results */
|
||||
recall_at_5: number;
|
||||
/** Reciprocal rank of first relevant result (1/rank, 0 if not found) */
|
||||
mrr: number;
|
||||
/** Harmonic mean of precision_at_k and recall */
|
||||
@ -49,6 +55,10 @@ export interface BackendResult {
|
||||
latency_ms: number;
|
||||
/** Top result file paths (for inspection) */
|
||||
top_files: string[];
|
||||
/** Expected files that were found anywhere in the returned result set */
|
||||
matched_files: string[];
|
||||
/** Expected files missing from the returned result set */
|
||||
unmatched_expected_files: string[];
|
||||
}
|
||||
|
||||
export interface QueryResult {
|
||||
@ -65,6 +75,9 @@ export interface BenchmarkResult {
|
||||
summary: Record<string, {
|
||||
avg_precision: number;
|
||||
avg_recall: number;
|
||||
avg_recall_at_1: number;
|
||||
avg_recall_at_3: number;
|
||||
avg_recall_at_5: number;
|
||||
avg_mrr: number;
|
||||
avg_f1: number;
|
||||
avg_latency_ms: number;
|
||||
|
||||
@ -185,8 +185,9 @@ export function searchResultsToMarkdown(
|
||||
if (opts.lineNumbers) {
|
||||
content = addLineNumbers(content);
|
||||
}
|
||||
const fileLine = `**file:** \`${row.displayPath}\`\n`;
|
||||
const contextLine = row.context ? `**context:** ${row.context}\n` : "";
|
||||
return `---\n# ${heading}\n\n**docid:** \`#${row.docid}\`\n${contextLine}\n${content}\n`;
|
||||
return `---\n# ${heading}\n\n${fileLine}**docid:** \`#${row.docid}\`\n${contextLine}\n${content}\n`;
|
||||
}).join("\n");
|
||||
}
|
||||
|
||||
|
||||
1688
src/cli/qmd.ts
File diff suppressed because it is too large
@ -6,8 +6,8 @@
|
||||
*/
|
||||
|
||||
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
|
||||
import { join, dirname } from "path";
|
||||
import { homedir } from "os";
|
||||
import { join, dirname, resolve } from "path";
|
||||
import { qmdHomedir } from "./paths.js";
|
||||
import YAML from "yaml";
|
||||
|
||||
// ============================================================================
|
||||
@ -101,9 +101,7 @@ export function setConfigSource(source?: { configPath?: string; config?: Collect
|
||||
export function setConfigIndexName(name: string): void {
|
||||
// Resolve relative paths to absolute paths and sanitize for use as filename
|
||||
if (name.includes('/')) {
|
||||
const { resolve } = require('path');
|
||||
const { cwd } = require('process');
|
||||
const absolutePath = resolve(cwd(), name);
|
||||
const absolutePath = resolve(process.cwd(), name);
|
||||
// Replace path separators with underscores to create a valid filename
|
||||
currentIndexName = absolutePath.replace(/\//g, '_').replace(/^_/, '');
|
||||
} else {
|
||||
@ -120,13 +118,41 @@ function getConfigDir(): string {
|
||||
if (process.env.XDG_CONFIG_HOME) {
|
||||
return join(process.env.XDG_CONFIG_HOME, "qmd");
|
||||
}
|
||||
return join(homedir(), ".config", "qmd");
|
||||
return join(qmdHomedir(), ".config", "qmd");
|
||||
}
|
||||
|
||||
function getConfigFilePath(): string {
|
||||
return join(getConfigDir(), `${currentIndexName}.yml`);
|
||||
}
|
||||
|
||||
/**
|
||||
* Find a project-local QMD config by walking upward from startDir.
|
||||
* The local config lives at .qmd/index.yaml or .qmd/index.yml and,
|
||||
* when used by the CLI, keeps both config and index DB writes inside
|
||||
* the project instead of the global ~/.config / ~/.cache locations.
|
||||
*/
|
||||
export function findLocalConfigPath(startDir: string = process.cwd()): string | undefined {
|
||||
let dir = resolve(startDir);
|
||||
|
||||
while (true) {
|
||||
const qmdDir = join(dir, ".qmd");
|
||||
const yamlPath = join(qmdDir, "index.yaml");
|
||||
if (existsSync(yamlPath)) return yamlPath;
|
||||
|
||||
const ymlPath = join(qmdDir, "index.yml");
|
||||
if (existsSync(ymlPath)) return ymlPath;
|
||||
|
||||
const parent = dirname(dir);
|
||||
if (parent === dir) return undefined;
|
||||
dir = parent;
|
||||
}
|
||||
}
|
||||
|
||||
/** Return the local SQLite index path paired with a local .qmd/index.yaml file. */
|
||||
export function getLocalDbPath(configPath: string): string {
|
||||
return join(dirname(configPath), "index.sqlite");
|
||||
}
|
||||
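The upward walk can be tried out without touching the disk. The sketch below is a simplified illustration under stated assumptions: it handles POSIX-style absolute paths only, takes the existence check as a parameter, and uses the hypothetical names `parentOf`, `findLocalConfig`, and `localDbPathFor` rather than the functions in the diff.

```typescript
// Hypothetical helper: parent of a POSIX-style absolute path ("/" is its own parent).
function parentOf(dir: string): string {
  const idx = dir.lastIndexOf("/");
  return idx <= 0 ? "/" : dir.slice(0, idx);
}

// Walk upward from startDir, returning the first .qmd/index.yaml or .qmd/index.yml found.
// The `exists` callback is an assumption that replaces a real filesystem check.
function findLocalConfig(
  startDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  let dir = startDir;
  while (true) {
    const base = dir === "/" ? "" : dir;
    for (const fileName of ["index.yaml", "index.yml"]) {
      const candidate = `${base}/.qmd/${fileName}`;
      if (exists(candidate)) return candidate;
    }
    const parent = parentOf(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}

// The index database sits next to the config file, inside the same .qmd directory.
function localDbPathFor(configPath: string): string {
  return `${parentOf(configPath)}/index.sqlite`;
}

const fakeFiles = new Set<string>(["/work/project/.qmd/index.yml"]);
const foundConfig = findLocalConfig("/work/project/src/cli", p => fakeFiles.has(p));
const pairedDb = foundConfig ? localDbPathFor(foundConfig) : undefined;
```

Checking `index.yaml` before `index.yml` in each directory mirrors the order in the diff, so a project that contains both files resolves to the `.yaml` one.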
|
||||
/**
|
||||
* Ensure config directory exists
|
||||
*/
|
||||
@ -161,7 +187,8 @@ export function loadConfig(): CollectionConfig {
|
||||
|
||||
try {
|
||||
const content = readFileSync(configPath, "utf-8");
|
||||
const config = YAML.parse(content) as CollectionConfig;
|
||||
const parsed = YAML.parse(content) as CollectionConfig | null | undefined;
|
||||
const config = parsed ?? { collections: {} };
|
||||
|
||||
// Ensure collections object exists
|
||||
if (!config.collections) {
|
||||
|
||||
25
src/db.ts
@ -11,10 +11,16 @@
|
||||
* SQLite build before creating any database instances.
|
||||
*/
|
||||
|
||||
export const isBun = typeof globalThis.Bun !== "undefined";
|
||||
export const isBun = "Bun" in globalThis;
|
||||
|
||||
let _Database: any;
|
||||
let _sqliteVecLoad: ((db: any) => void) | null;
|
||||
export type SQLiteValue = string | number | bigint | Buffer | Uint8Array | Float32Array | null;
|
||||
export type SQLiteParams = readonly SQLiteValue[];
|
||||
|
||||
type DatabaseConstructor = new (path: string) => Database;
|
||||
type LoadableSqliteDatabase = Pick<Database, "loadExtension">;
|
||||
|
||||
let _Database: DatabaseConstructor;
|
||||
let _sqliteVecLoad: ((db: LoadableSqliteDatabase) => void) | null;
|
||||
|
||||
if (isBun) {
|
||||
// Dynamic string prevents tsc from resolving bun:sqlite on Node.js builds
|
||||
@ -44,15 +50,15 @@ if (isBun) {
|
||||
const testDb = new BunDatabase(":memory:");
|
||||
testDb.loadExtension(vecPath);
|
||||
testDb.close();
|
||||
_sqliteVecLoad = (db: any) => db.loadExtension(vecPath);
|
||||
_sqliteVecLoad = (db: LoadableSqliteDatabase) => db.loadExtension(vecPath);
|
||||
} catch {
|
||||
// Vector search won't work, but BM25 and other operations are unaffected.
|
||||
_sqliteVecLoad = null;
|
||||
}
|
||||
} else {
|
||||
_Database = (await import("better-sqlite3")).default;
|
||||
_Database = (await import("better-sqlite3")).default as unknown as DatabaseConstructor;
|
||||
const sqliteVec = await import("sqlite-vec");
|
||||
_sqliteVecLoad = (db: any) => sqliteVec.load(db);
|
||||
_sqliteVecLoad = (db: LoadableSqliteDatabase) => sqliteVec.load(db as Parameters<typeof sqliteVec.load>[0]);
|
||||
}
|
||||
|
||||
/**
|
||||
@ -70,13 +76,14 @@ export interface Database {
|
||||
prepare(sql: string): Statement;
|
||||
transaction<T extends (...args: any[]) => any>(fn: T): T;
|
||||
loadExtension(path: string): void;
|
||||
transaction<T extends (...args: SQLiteValue[]) => unknown>(fn: T): T;
|
||||
close(): void;
|
||||
}
|
||||
|
||||
export interface Statement {
|
||||
run(...params: any[]): { changes: number; lastInsertRowid: number | bigint };
|
||||
get(...params: any[]): any;
|
||||
all(...params: any[]): any[];
|
||||
run(...params: SQLiteValue[]): { changes: number; lastInsertRowid: number | bigint };
|
||||
get<T = unknown>(...params: SQLiteValue[]): T | undefined;
|
||||
all<T = unknown>(...params: SQLiteValue[]): T[];
|
||||
}
|
||||
|
||||
/**
|
||||
|
||||
File diff suppressed because one or more lines are too long
10
src/index.ts
@ -23,7 +23,6 @@ import {
|
||||
structuredSearch,
|
||||
extractSnippet,
|
||||
addLineNumbers,
|
||||
DEFAULT_EMBED_MODEL,
|
||||
DEFAULT_MULTI_GET_MAX_BYTES,
|
||||
reindexCollection,
|
||||
generateEmbeddings,
|
||||
@ -159,6 +158,8 @@ export interface SearchOptions {
|
||||
collections?: string[];
|
||||
/** Max results (default: 10) */
|
||||
limit?: number;
|
||||
/** Max candidates to rerank (default: 40) */
|
||||
candidateLimit?: number;
|
||||
/** Minimum score threshold */
|
||||
minScore?: number;
|
||||
/** Include explain traces */
|
||||
@ -290,6 +291,8 @@ export interface QMDStore {
|
||||
embed(options?: {
|
||||
force?: boolean;
|
||||
model?: string;
|
||||
/** Restrict embedding to documents in one collection. */
|
||||
collection?: string;
|
||||
maxDocsPerBatch?: number;
|
||||
maxBatchBytes?: number;
|
||||
chunkStrategy?: ChunkStrategy;
|
||||
@ -400,6 +403,7 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
|
||||
minScore: opts.minScore,
|
||||
explain: opts.explain,
|
||||
intent: opts.intent,
|
||||
candidateLimit: opts.candidateLimit,
|
||||
skipRerank,
|
||||
chunkStrategy: opts.chunkStrategy,
|
||||
});
|
||||
@ -412,12 +416,13 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
|
||||
minScore: opts.minScore,
|
||||
explain: opts.explain,
|
||||
intent: opts.intent,
|
||||
candidateLimit: opts.candidateLimit,
|
||||
skipRerank,
|
||||
chunkStrategy: opts.chunkStrategy,
|
||||
});
|
||||
},
|
||||
searchLex: async (q, opts) => internal.searchFTS(q, opts?.limit, opts?.collection),
|
||||
searchVector: async (q, opts) => internal.searchVec(q, DEFAULT_EMBED_MODEL, opts?.limit, opts?.collection),
|
||||
searchVector: async (q, opts) => internal.searchVec(q, llm.embedModelName, opts?.limit, opts?.collection),
|
||||
expandQuery: async (q, opts) => internal.expandQuery(q, undefined, opts?.intent),
|
||||
get: async (pathOrDocid, opts) => internal.findDocument(pathOrDocid, opts),
|
||||
getDocumentBody: async (pathOrDocid, opts) => {
|
||||
@ -516,6 +521,7 @@ export async function createStore(options: StoreOptions): Promise<QMDStore> {
|
||||
return generateEmbeddings(internal, {
|
||||
force: embedOpts?.force,
|
||||
model: embedOpts?.model,
|
||||
collection: embedOpts?.collection,
|
||||
maxDocsPerBatch: embedOpts?.maxDocsPerBatch,
|
||||
maxBatchBytes: embedOpts?.maxBatchBytes,
|
||||
chunkStrategy: embedOpts?.chunkStrategy,
|
||||
|
||||
550
src/llm.ts
@ -5,16 +5,72 @@
|
||||
* local GGUF embeddings plus local text generation and reranking via node-llama-cpp.
|
||||
*/
|
||||
|
||||
import {
|
||||
getLlama,
|
||||
resolveModelFile,
|
||||
LlamaChatSession,
|
||||
LlamaLogLevel,
|
||||
type Llama,
|
||||
type LlamaModel,
|
||||
type LlamaEmbeddingContext,
|
||||
type Token as LlamaToken,
|
||||
import type {
|
||||
Llama,
|
||||
LlamaModel,
|
||||
LlamaEmbeddingContext,
|
||||
Token as LlamaToken,
|
||||
} from "node-llama-cpp";
|
||||
|
||||
type StdoutChunk = string | Uint8Array;
|
||||
type WriteCallback = (err?: Error | null) => void;
|
||||
|
||||
type NodeLlamaCppModule = {
|
||||
getLlama: (options: Record<string, unknown>) => Promise<Llama>;
|
||||
getLlamaGpuTypes?: (include?: "supported" | "allValid") => Promise<LlamaGpuMode[]>;
|
||||
resolveModelFile: (model: string, cacheDir: string) => Promise<string>;
|
||||
LlamaChatSession: new (options: { contextSequence: unknown }) => {
|
||||
prompt: (prompt: string, options?: Record<string, unknown>) => Promise<string>;
|
||||
};
|
||||
LlamaLogLevel: { error: unknown };
|
||||
};
|
||||
|
||||
let nodeLlamaCppImport: Promise<NodeLlamaCppModule> | null = null;
|
||||
async function loadNodeLlamaCpp(): Promise<NodeLlamaCppModule> {
|
||||
nodeLlamaCppImport ??= withNativeStdoutRedirectedToStderr(
|
||||
() => import("node-llama-cpp") as Promise<NodeLlamaCppModule>
|
||||
);
|
||||
return nodeLlamaCppImport;
|
||||
}
|
||||
|
||||
export function setNodeLlamaCppModuleForTest(module: NodeLlamaCppModule | null): void {
|
||||
nodeLlamaCppImport = module ? Promise.resolve(module) : null;
|
||||
failedGpuInitModes.clear();
|
||||
noGpuAccelerationWarningShown = false;
|
||||
cpuForcedPrebuiltFallbackWarningShown = false;
|
||||
}
|
||||
|
||||
type StdoutWrite = typeof process.stdout.write;
|
||||
let nativeStdoutRedirectDepth = 0;
|
||||
let originalStdoutWrite: StdoutWrite | null = null;
|
||||
|
||||
/**
|
||||
* Some node-llama-cpp native build/probe paths write library noise to stdout.
|
||||
* JSON APIs must reserve stdout for machine-readable payloads, so route that
|
||||
* noise to stderr while native llama initialization is in progress.
|
||||
*/
|
||||
export async function withNativeStdoutRedirectedToStderr<T>(fn: () => Promise<T>): Promise<T> {
|
||||
if (nativeStdoutRedirectDepth === 0) {
|
||||
originalStdoutWrite = process.stdout.write.bind(process.stdout) as StdoutWrite;
|
||||
process.stdout.write = ((chunk: StdoutChunk, encodingOrCallback?: BufferEncoding | WriteCallback, callback?: WriteCallback) => {
|
||||
if (typeof encodingOrCallback === "function") {
|
||||
return process.stderr.write(chunk, encodingOrCallback);
|
||||
}
|
||||
return process.stderr.write(chunk, encodingOrCallback, callback);
|
||||
}) as StdoutWrite;
|
||||
}
|
||||
nativeStdoutRedirectDepth++;
|
||||
try {
|
||||
return await fn();
|
||||
} finally {
|
||||
nativeStdoutRedirectDepth--;
|
||||
if (nativeStdoutRedirectDepth === 0 && originalStdoutWrite) {
|
||||
process.stdout.write = originalStdoutWrite;
|
||||
originalStdoutWrite = null;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
import { homedir } from "os";
|
||||
import { join } from "path";
|
||||
import { existsSync, mkdirSync, statSync, unlinkSync, readdirSync, readFileSync, writeFileSync, openSync, readSync, closeSync } from "fs";
|
||||
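Review note: the depth counter in `withNativeStdoutRedirectedToStderr` is what makes nested or overlapping calls safe. Only the outermost call patches `stdout.write`, and only the last call to finish restores it. The following is a minimal sketch of that pattern, not the code from this commit: `Sink` and `makeRedirector` are hypothetical names, and plain objects stand in for `process.stdout` and `process.stderr` so the behaviour can be checked in isolation.

```typescript
// Hypothetical stand-in for process.stdout / process.stderr.
type Sink = { write: (chunk: string) => boolean };

// Returns a wrapper that routes writes on `out` to `err` while any wrapped call is running.
function makeRedirector(out: Sink, err: Sink) {
  let depth = 0;
  let original: Sink["write"] | null = null;

  return async function withRedirect<T>(fn: () => Promise<T>): Promise<T> {
    if (depth === 0) {
      // Outermost call: remember the real writer and patch it.
      original = out.write;
      out.write = (chunk) => err.write(chunk);
    }
    depth++;
    try {
      return await fn();
    } finally {
      depth--;
      if (depth === 0 && original) {
        // Last call to finish: restore the real writer.
        out.write = original;
        original = null;
      }
    }
  };
}
```

Without the counter, an inner call that finished first would restore `stdout.write` while the outer call was still running, and the outer call's remaining output would leak back to stdout.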
@ -37,7 +93,7 @@ export function isQwen3EmbeddingModel(modelUri: string): boolean {
|
||||
* Uses Qwen3-Embedding instruct format when a Qwen embedding model is active.
|
||||
*/
|
||||
export function formatQueryForEmbedding(query: string, modelUri?: string): string {
|
||||
const uri = modelUri ?? process.env.QMD_EMBED_MODEL ?? DEFAULT_EMBED_MODEL;
|
||||
const uri = modelUri ?? resolveEmbedModel();
|
||||
if (isQwen3EmbeddingModel(uri)) {
|
||||
return `Instruct: Retrieve relevant documents for the given query\nQuery: ${query}`;
|
||||
}
|
||||
@ -50,7 +106,7 @@ export function formatQueryForEmbedding(query: string, modelUri?: string): strin
|
||||
* Qwen3-Embedding encodes documents as raw text without special prefixes.
|
||||
*/
|
||||
export function formatDocForEmbedding(text: string, title?: string, modelUri?: string): string {
|
||||
const uri = modelUri ?? process.env.QMD_EMBED_MODEL ?? DEFAULT_EMBED_MODEL;
|
||||
const uri = modelUri ?? resolveEmbedModel();
|
||||
if (isQwen3EmbeddingModel(uri)) {
|
||||
// Qwen3-Embedding: documents are raw text, no task prefix
|
||||
return title ? `${title}\n${text}` : text;
|
||||
@ -208,6 +264,32 @@ export const DEFAULT_EMBED_MODEL_URI = DEFAULT_EMBED_MODEL;
|
||||
export const DEFAULT_RERANK_MODEL_URI = DEFAULT_RERANK_MODEL;
|
||||
export const DEFAULT_GENERATE_MODEL_URI = DEFAULT_GENERATE_MODEL;
|
||||
|
||||
export type ModelResolutionConfig = {
|
||||
embed?: string;
|
||||
generate?: string;
|
||||
rerank?: string;
|
||||
};
|
||||
|
||||
export function resolveEmbedModel(config?: ModelResolutionConfig): string {
|
||||
return config?.embed || process.env.QMD_EMBED_MODEL || DEFAULT_EMBED_MODEL;
|
||||
}
|
||||
|
||||
export function resolveGenerateModel(config?: ModelResolutionConfig): string {
|
||||
return config?.generate || process.env.QMD_GENERATE_MODEL || DEFAULT_GENERATE_MODEL;
|
||||
}
|
||||
|
||||
export function resolveRerankModel(config?: ModelResolutionConfig): string {
|
||||
return config?.rerank || process.env.QMD_RERANK_MODEL || DEFAULT_RERANK_MODEL;
|
||||
}
|
||||
|
||||
export function resolveModels(config?: ModelResolutionConfig): Required<ModelResolutionConfig> {
|
||||
return {
|
||||
embed: resolveEmbedModel(config),
|
||||
generate: resolveGenerateModel(config),
|
||||
rerank: resolveRerankModel(config),
|
||||
};
|
||||
}
|
||||
|
||||
// Local model cache directory
|
||||
const MODEL_CACHE_DIR = process.env.XDG_CACHE_HOME
|
||||
? join(process.env.XDG_CACHE_HOME, "qmd", "models")
|
||||
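Review note: the `resolve*Model` helpers added in this hunk share one precedence rule: explicit config first, then the `QMD_*_MODEL` environment variable, then the built-in default. Because they use `||` rather than `??`, an empty string at any level falls through to the next one. The sketch below illustrates that rule under stated assumptions: `resolveModelsSketch`, the `Env` type, and the placeholder defaults are hypothetical, and the environment is passed in as a plain object instead of being read from `process.env`.

```typescript
type ModelResolutionConfig = { embed?: string; generate?: string; rerank?: string };
type Env = Record<string, string | undefined>;

// Placeholder defaults for illustration only; the real defaults are defined in src/llm.ts.
const SKETCH_DEFAULTS = {
  embed: "default-embed-model",
  generate: "default-generate-model",
  rerank: "default-rerank-model",
};

// Precedence: config value, then environment variable, then default.
// `||` treats "" as unset, so an empty config or env value falls through.
function resolveModelsSketch(
  config: ModelResolutionConfig = {},
  env: Env = {},
): Required<ModelResolutionConfig> {
  return {
    embed: config.embed || env["QMD_EMBED_MODEL"] || SKETCH_DEFAULTS.embed,
    generate: config.generate || env["QMD_GENERATE_MODEL"] || SKETCH_DEFAULTS.generate,
    rerank: config.rerank || env["QMD_RERANK_MODEL"] || SKETCH_DEFAULTS.rerank,
  };
}
```

Centralising the rule this way is what lets `formatQueryForEmbedding`, `formatDocForEmbedding`, and the `LlamaCpp` constructor agree on the active model instead of each repeating the fallback chain.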
@ -270,37 +352,106 @@ async function getRemoteEtag(ref: HfRef): Promise<string | null> {
|
||||
|
||||
const GGUF_MAGIC = Buffer.from("GGUF");
|
||||
|
||||
export type GgufFileInspection = {
|
||||
exists: boolean;
|
||||
valid: boolean;
|
||||
kind: "missing" | "gguf" | "html" | "invalid";
|
||||
sizeBytes?: number;
|
||||
magic?: string;
|
||||
details: string;
|
||||
};
|
||||
|
||||
function formatModelFileSize(sizeBytes: number): string {
|
||||
return `${(sizeBytes / 1024).toFixed(0)} KB`;
|
||||
}
|
||||
|
||||
function printableMagic(header: Buffer): string {
|
||||
const text = header.toString("utf-8");
|
||||
return /^[\x20-\x7e]{1,4}$/.test(text) ? text : `0x${header.toString("hex")}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Inspect a potential GGUF model file without mutating it.
|
||||
* Used by doctor for early diagnostics and by runtime validation before load.
|
||||
*/
|
||||
export function inspectGgufFile(filePath: string): GgufFileInspection {
|
||||
if (!existsSync(filePath)) {
|
||||
return { exists: false, valid: false, kind: "missing", details: "file does not exist" };
|
||||
}
|
||||
|
||||
let sizeBytes = 0;
|
||||
try {
|
||||
sizeBytes = statSync(filePath).size;
|
||||
const fd = openSync(filePath, "r");
|
||||
const sniff = Buffer.alloc(512);
|
||||
try {
|
||||
readSync(fd, sniff, 0, 512, 0);
|
||||
} finally {
|
||||
closeSync(fd);
|
||||
}
|
||||
|
||||
const header = sniff.subarray(0, 4);
|
||||
if (header.equals(GGUF_MAGIC)) {
|
||||
return {
|
||||
exists: true,
|
||||
valid: true,
|
||||
kind: "gguf",
|
||||
sizeBytes,
|
||||
magic: "GGUF",
|
||||
details: `valid GGUF (${formatModelFileSize(sizeBytes)})`,
|
||||
};
|
||||
}
|
||||
|
||||
const magic = printableMagic(header);
|
||||
const text = sniff.toString("utf-8").toLowerCase();
|
||||
const isHtml = text.includes("<!doctype") || text.includes("<html");
|
||||
if (isHtml) {
|
||||
return {
|
||||
exists: true,
|
||||
valid: false,
|
||||
kind: "html",
|
||||
sizeBytes,
|
||||
magic,
|
||||
details: `HTML page, not a GGUF model (${formatModelFileSize(sizeBytes)}); likely proxy/firewall/captive portal response`,
|
||||
};
|
||||
}
|
||||
|
||||
return {
|
||||
exists: true,
|
||||
valid: false,
|
||||
kind: "invalid",
|
||||
sizeBytes,
|
||||
magic,
|
||||
details: `not valid GGUF (expected magic "GGUF", got "${magic}", ${formatModelFileSize(sizeBytes)})`,
|
||||
};
|
||||
} catch (error) {
|
||||
return {
|
||||
exists: true,
|
||||
valid: false,
|
||||
kind: "invalid",
|
||||
sizeBytes,
|
||||
details: `cannot read model file: ${error instanceof Error ? error.message : String(error)}`,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate that a file is actually a GGUF model, not an HTML error page
|
||||
* from a proxy, firewall, or failed download.
|
||||
* Throws a descriptive error if the file is not valid GGUF.
|
||||
*/
|
||||
function validateGgufFile(filePath: string, modelUri: string): void {
|
||||
if (!existsSync(filePath)) return; // let downstream handle missing files
|
||||
|
||||
// Read header + sniff bytes in one go, then close immediately
|
||||
const fd = openSync(filePath, "r");
|
||||
const sniff = Buffer.alloc(512);
|
||||
try {
|
||||
readSync(fd, sniff, 0, 512, 0);
|
||||
} finally {
|
||||
closeSync(fd);
|
||||
}
|
||||
|
||||
const header = sniff.subarray(0, 4);
|
||||
if (header.equals(GGUF_MAGIC)) return; // valid GGUF
|
||||
|
||||
const text = sniff.toString("utf-8").toLowerCase();
|
||||
const isHtml = text.includes("<!doctype") || text.includes("<html");
|
||||
const got = header.toString("utf-8");
|
||||
const sizeKB = (statSync(filePath).size / 1024).toFixed(0);
|
||||
const inspection = inspectGgufFile(filePath);
|
||||
if (!inspection.exists || inspection.valid) return; // let downstream handle missing files
|
||||
|
||||
// Remove the bad file so the next attempt re-downloads
|
||||
unlinkSync(filePath);
|
||||
try {
|
||||
unlinkSync(filePath);
|
||||
} catch { /* best effort */ }
|
||||
|
||||
if (isHtml) {
|
||||
if (inspection.kind === "html") {
|
||||
throw new Error(
|
||||
`Downloaded model file is an HTML page, not a GGUF model (${sizeKB} KB).\n` +
|
||||
`Downloaded model file is an HTML page, not a GGUF model (${formatModelFileSize(inspection.sizeBytes ?? 0)}).\n` +
|
||||
`Something is intercepting the download from huggingface.co (a proxy, firewall, or captive portal).\n\n` +
|
||||
`Model: ${modelUri}\n` +
|
||||
`Path: ${filePath}\n\n` +
|
||||
@ -313,7 +464,7 @@ function validateGgufFile(filePath: string, modelUri: string): void {
|
||||
}
|
||||
|
||||
throw new Error(
|
||||
`Model file is not valid GGUF (expected magic "GGUF", got "${got}", file is ${sizeKB} KB).\n` +
|
||||
`Model file is not valid GGUF (expected magic "GGUF", got "${inspection.magic ?? "unknown"}", file is ${formatModelFileSize(inspection.sizeBytes ?? 0)}).\n` +
|
||||
`Model: ${modelUri}\n` +
|
||||
`Path: ${filePath}\n\n` +
|
||||
`The file has been removed. Run the command again to re-download.`
|
||||
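Review note: `inspectGgufFile` classifies a file from its first 512 bytes: a leading `GGUF` magic means a model, an HTML marker means an intercepted download, and anything else is invalid. The sketch below isolates that classification step so it can be exercised without touching the filesystem. `classifyHeader` and `SniffKind` are hypothetical names, and it takes a `Uint8Array` rather than the Node `Buffer` the commit uses.

```typescript
type SniffKind = "gguf" | "html" | "invalid";

// ASCII bytes for "GGUF".
const GGUF_MAGIC_BYTES = [0x47, 0x47, 0x55, 0x46];

// Classify the leading bytes of a candidate model file.
function classifyHeader(sniff: Uint8Array): SniffKind {
  const isGguf =
    sniff.length >= 4 && GGUF_MAGIC_BYTES.every((byte, i) => sniff[i] === byte);
  if (isGguf) return "gguf";

  // Decode byte-for-byte; this is enough to spot ASCII HTML markers.
  const text = Array.from(sniff, (b) => String.fromCharCode(b)).join("").toLowerCase();
  return text.includes("<!doctype") || text.includes("<html") ? "html" : "invalid";
}
```

Checking the magic before the HTML test matters: a valid model is never misreported even if its early bytes happen to contain markup-like text.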
@ -364,6 +515,7 @@ export async function pullModels(
|
||||
}
|
||||
}
|
||||
|
||||
const { resolveModelFile } = await loadNodeLlamaCpp();
|
||||
const path = await resolveModelFile(model, cacheDir);
|
||||
validateGgufFile(path, model);
|
||||
const sizeBytes = existsSync(path) ? statSync(path).size : 0;
|
||||
@ -460,9 +612,51 @@ export type LlamaCppConfig = {
|
||||
const DEFAULT_INACTIVITY_TIMEOUT_MS = 5 * 60 * 1000;
|
||||
const DEFAULT_EXPAND_CONTEXT_SIZE = 2048;
|
||||
|
||||
type LlamaGpuMode = "auto" | "metal" | "vulkan" | "cuda" | false;
|
||||
export type LlamaGpuMode = "auto" | "metal" | "vulkan" | "cuda" | false;
|
||||
|
||||
type ParallelismOptions = {
|
||||
gpu: string | false;
|
||||
platform?: NodeJS.Platform;
|
||||
computed: number;
|
||||
envValue?: string;
|
||||
};
|
||||
|
||||
export function resolveParallelismOverride(envValue = process.env.QMD_EMBED_PARALLELISM): number | undefined {
|
||||
const normalized = envValue?.trim() ?? "";
|
||||
if (!normalized) return undefined;
|
||||
|
||||
const parsed = Number(normalized);
|
||||
if (!Number.isInteger(parsed) || parsed < 1) {
|
||||
process.stderr.write(`QMD Warning: invalid QMD_EMBED_PARALLELISM="${envValue}", using automatic parallelism.\n`);
|
||||
return undefined;
|
||||
}
|
||||
|
||||
return Math.min(8, parsed);
|
||||
}
|
||||
|
||||
export function resolveSafeParallelism(options: ParallelismOptions): number {
|
||||
const override = resolveParallelismOverride(options.envValue);
|
||||
if (override !== undefined) return override;
|
||||
|
||||
// node-llama-cpp/llama.cpp CUDA on Windows is unstable with multiple
|
||||
// simultaneous contexts (ggml-cuda.cu:98 in #519). Vulkan and CPU do not
|
||||
// show the same failure mode, so only serialize Windows CUDA by default.
|
||||
if ((options.platform ?? process.platform) === "win32" && options.gpu === "cuda") {
|
||||
return 1;
|
||||
}
|
||||
|
||||
return Math.max(1, options.computed);
|
||||
}
|
||||
|
||||
export function resolveLlamaGpuMode(
|
||||
envValue = process.env.QMD_LLAMA_GPU,
|
||||
forceCpuValue = process.env.QMD_FORCE_CPU
|
||||
): LlamaGpuMode {
|
||||
const forceCpu = forceCpuValue?.trim().toLowerCase() ?? "";
|
||||
if (forceCpu && !["false", "off", "none", "disable", "disabled", "0"].includes(forceCpu)) {
|
||||
return false;
|
||||
}
|
||||
|
||||
export function resolveLlamaGpuMode(envValue = process.env.QMD_LLAMA_GPU): LlamaGpuMode {
|
||||
const normalized = envValue?.trim().toLowerCase() ?? "";
|
||||
if (!normalized) return "auto";
|
||||
if (["false", "off", "none", "disable", "disabled", "0"].includes(normalized)) return false;
|
||||
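Review note: the parallelism helpers in this hunk apply three rules in order: a valid `QMD_EMBED_PARALLELISM` override wins (clamped to 8), Windows with CUDA is serialized to 1, and otherwise the computed value is used with a floor of 1. The sketch below restates those rules as pure functions for illustration. `parseOverrideSketch` and `safeParallelismSketch` are hypothetical names; unlike the commit, the sketch never reads `process.env`, takes the platform explicitly, and omits the stderr warning for invalid values.

```typescript
// Parse an override string; undefined means "no usable override".
function parseOverrideSketch(envValue?: string): number | undefined {
  const normalized = envValue?.trim() ?? "";
  if (!normalized) return undefined;
  const parsed = Number(normalized);
  if (!Number.isInteger(parsed) || parsed < 1) return undefined;
  return Math.min(8, parsed); // clamp to the same upper bound as the diff
}

function safeParallelismSketch(opts: {
  gpu: string | false;
  platform: string;
  computed: number;
  envValue?: string;
}): number {
  const override = parseOverrideSketch(opts.envValue);
  if (override !== undefined) return override;

  // Windows + CUDA is serialized by default.
  if (opts.platform === "win32" && opts.gpu === "cuda") return 1;

  return Math.max(1, opts.computed);
}
```

Note that a valid override is applied before the Windows CUDA rule, so a user who sets the variable can still opt back into parallel contexts on that platform.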
@ -472,6 +666,23 @@ export function resolveLlamaGpuMode(envValue = process.env.QMD_LLAMA_GPU): Llama
|
||||
return "auto";
|
||||
}
|
||||
|
||||
async function disposeWithTimeout(resourceName: string, dispose: () => Promise<void>, timeoutMs = 1000): Promise<void> {
|
||||
const timeoutPromise = new Promise<"timeout">((resolve) => {
|
||||
setTimeout(() => resolve("timeout"), timeoutMs).unref();
|
||||
});
|
||||
|
||||
try {
|
||||
const result = await Promise.race([dispose(), timeoutPromise]);
|
||||
if (result === "timeout") {
|
||||
process.stderr.write(`QMD Warning: timed out disposing ${resourceName}; continuing shutdown.\n`);
|
||||
}
|
||||
} catch (error) {
|
||||
process.stderr.write(
|
||||
`QMD Warning: failed to dispose ${resourceName} (${error instanceof Error ? error.message : String(error)}); continuing shutdown.\n`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
function resolveExpandContextSize(configValue?: number): number {
|
||||
if (configValue !== undefined) {
|
||||
if (!Number.isInteger(configValue) || configValue <= 0) {
|
||||
@ -493,6 +704,14 @@ function resolveExpandContextSize(configValue?: number): number {
|
||||
return parsed;
|
||||
}
|
||||
|
||||
const failedGpuInitModes = new Set<LlamaGpuMode>();
|
||||
let noGpuAccelerationWarningShown = false;
|
||||
let cpuForcedPrebuiltFallbackWarningShown = false;
|
||||
|
||||
function isCpuModeRequested(): boolean {
|
||||
return resolveLlamaGpuMode() === false;
|
||||
}
|
||||
|
||||
export class LlamaCpp implements LLM {
|
||||
private readonly _ciMode = !!process.env.CI;
|
||||
private llama: Llama | null = null;
|
||||
@ -530,6 +749,15 @@ export class LlamaCpp implements LLM {
|
||||
this.embedApiKey = config.embedApiKey || process.env.QMD_EMBED_API_KEY || process.env.NVIDIA_API_KEY || process.env.OPENAI_API_KEY;
|
||||
this.generateModelUri = config.generateModel || process.env.QMD_GENERATE_MODEL || DEFAULT_GENERATE_MODEL;
|
||||
this.rerankModelUri = config.rerankModel || process.env.QMD_RERANK_MODEL || DEFAULT_RERANK_MODEL;
|
||||
// STRUCTURAL INVARIANT: the launcher (bin/qmd) sets GGML_METAL_NO_RESIDENCY=1
|
||||
// on darwin BEFORE the native binding loads, which prevents the libggml-metal
|
||||
// static destructor assertion at process exit (ggml-org/llama.cpp#22593).
|
||||
// See isDarwinMetalMitigationActive() for the runtime check exposed to
|
||||
// diagnostics. No constructor-time guard installation is needed.
|
||||
|
||||
this.embedModelUri = resolveEmbedModel({ embed: config.embedModel });
|
||||
this.generateModelUri = resolveGenerateModel({ generate: config.generateModel });
|
||||
this.rerankModelUri = resolveRerankModel({ rerank: config.rerankModel });
|
||||
this.modelCacheDir = config.modelCacheDir || MODEL_CACHE_DIR;
|
||||
this.expandContextSize = resolveExpandContextSize(config.expandContextSize);
|
||||
this.inactivityTimeoutMs = config.inactivityTimeoutMs ?? DEFAULT_INACTIVITY_TIMEOUT_MS;
|
||||
@ -542,6 +770,13 @@ export class LlamaCpp implements LLM {
|
||||
|
||||
get usesLocalEmbedding(): boolean {
|
||||
return isLocalEmbeddingModel(this.embedModelUri);
|
||||
|
||||
get generateModelName(): string {
|
||||
return this.generateModelUri;
|
||||
}
|
||||
|
||||
get rerankModelName(): string {
|
||||
return this.rerankModelUri;
|
||||
}
|
||||
|
||||
/**
|
||||
@ -649,33 +884,89 @@ export class LlamaCpp implements LLM {
|
||||
if (!this.llama) {
|
||||
const gpuMode = resolveLlamaGpuMode();
|
||||
|
||||
const loadLlama = async (gpu: LlamaGpuMode) =>
|
||||
await getLlama({
|
||||
build: allowBuild ? "autoAttempt" : "never",
|
||||
const { getLlama, getLlamaGpuTypes, LlamaLogLevel } = await loadNodeLlamaCpp();
|
||||
const loadLlama = async (gpu: LlamaGpuMode, sourceBuildAllowed = allowBuild, buildOverride?: "auto" | "never") =>
|
||||
await withNativeStdoutRedirectedToStderr(() => getLlama({
|
||||
// Prefer packaged prebuilt bindings before compiling llama.cpp locally.
|
||||
// node-llama-cpp documents gpu:"auto" as the best default: Metal on
|
||||
// Apple Silicon, CUDA when fully available, Vulkan where available,
|
||||
// then CPU. Use build:"auto" for normal loads and build:"never" for
|
||||
// diagnostic/probe paths that must not compile llama.cpp.
|
||||
build: buildOverride ?? (sourceBuildAllowed ? "auto" : "never"),
|
||||
logLevel: LlamaLogLevel.error,
|
||||
gpu,
|
||||
skipDownload: !allowBuild,
|
||||
});
|
||||
progressLogs: false,
|
||||
skipDownload: !sourceBuildAllowed,
|
||||
}));
|
||||
const loadCpuCompatibleLlama = async () => {
|
||||
try {
|
||||
return await loadLlama(false, false);
|
||||
} catch (err) {
|
||||
// Some platforms, notably Apple Silicon, ship a Metal prebuilt but no
|
||||
// CPU-only prebuilt. Do a fast no-build lookup for an actual CPU
|
||||
// binding first; if it does not exist, use the packaged auto/Metal
|
||||
// binding and disable model offloading via gpuLayers: 0.
|
||||
if (!cpuForcedPrebuiltFallbackWarningShown) {
|
||||
cpuForcedPrebuiltFallbackWarningShown = true;
|
||||
process.stderr.write(
|
||||
`QMD Warning: CPU-only llama.cpp prebuilt not available (${err instanceof Error ? err.message : String(err)}); using packaged backend with GPU offloading disabled.\n`
|
||||
);
|
||||
}
|
||||
return await loadLlama("auto", false);
|
||||
}
|
||||
};
|
||||
|
||||
let llama: Llama;
|
||||
if (gpuMode === false) {
|
||||
llama = await loadLlama(false);
|
||||
llama = await loadCpuCompatibleLlama();
|
||||
} else if (failedGpuInitModes.has(gpuMode)) {
|
||||
process.stderr.write(
|
||||
`QMD Warning: skipping previously failed GPU init${gpuMode === "auto" ? "" : ` for QMD_LLAMA_GPU=${gpuMode}`}, using CPU.\n`
|
||||
);
|
||||
llama = await loadCpuCompatibleLlama();
|
||||
} else {
|
||||
try {
|
||||
llama = await loadLlama(gpuMode);
|
||||
|
||||
// If node-llama-cpp auto-detection chose CPU, do one no-build pass
|
||||
// over all OS-valid packaged GPU backends. This preserves the
|
||||
// documented auto mode for Metal/CUDA/Vulkan while recovering on
|
||||
// systems where a packaged backend can load but detection is too
|
||||
// conservative. Never compile during these extra probes.
|
||||
if (gpuMode === "auto" && llama.gpu === false && getLlamaGpuTypes) {
|
||||
const candidates = (await getLlamaGpuTypes("allValid"))
|
||||
.filter((candidate): candidate is Exclude<LlamaGpuMode, "auto" | false> => candidate !== false && candidate !== "auto");
|
||||
for (const candidate of candidates) {
|
||||
if (failedGpuInitModes.has(candidate)) continue;
|
||||
try {
|
||||
const gpuLlama = await loadLlama(candidate, false, "never");
|
||||
if (gpuLlama.gpu !== false) {
|
||||
await disposeWithTimeout("CPU llama runtime", () => llama.dispose());
|
||||
llama = gpuLlama;
|
||||
break;
|
||||
}
|
||||
await disposeWithTimeout(`${candidate} probe runtime`, () => gpuLlama.dispose());
|
||||
} catch {
|
||||
failedGpuInitModes.add(candidate);
|
||||
}
|
||||
}
|
||||
}
|
||||
} catch (err) {
|
||||
// GPU backend (e.g. Vulkan on headless/driverless machines) can throw at init.
|
||||
// Fall back to CPU so qmd still works.
|
||||
// GPU backend (e.g. Vulkan/CUDA on headless/driverless machines) can throw at init.
|
||||
// Fall back to CPU so qmd still works, and cache the failure to avoid repeated
|
||||
// expensive native build/probe attempts in this process.
|
||||
failedGpuInitModes.add(gpuMode);
|
||||
process.stderr.write(
|
||||
`QMD Warning: GPU init failed${gpuMode === "auto" ? "" : ` for QMD_LLAMA_GPU=${gpuMode}`} (${err instanceof Error ? err.message : String(err)}), falling back to CPU.\n`
|
||||
);
|
||||
llama = await loadLlama(false);
|
||||
llama = await loadCpuCompatibleLlama();
|
||||
}
|
||||
}
|
||||
|
||||
if (llama.gpu === false) {
|
||||
if (llama.gpu === false && !noGpuAccelerationWarningShown) {
|
||||
noGpuAccelerationWarningShown = true;
|
||||
process.stderr.write(
|
||||
"QMD Warning: no GPU acceleration, running on CPU (slow). Run 'qmd status' for details.\n"
|
||||
"QMD Warning: no GPU acceleration, running on CPU (slow). Run 'qmd doctor' for device diagnostics.\n"
|
||||
);
|
||||
}
|
||||
this.llama = llama;
|
||||
@ -683,6 +974,17 @@ export class LlamaCpp implements LLM {
|
||||
return this.llama;
|
||||
}
|
||||
|
||||
private isCpuOffloadForced(): boolean {
|
||||
return isCpuModeRequested();
|
||||
}
|
||||
|
||||
private modelLoadOptions(modelPath: string): { modelPath: string; gpuLayers?: number } {
|
||||
return {
|
||||
modelPath,
|
||||
...(this.isCpuOffloadForced() ? { gpuLayers: 0 } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve a model URI to a local path, downloading if needed.
|
||||
* Validates the downloaded file is actually a GGUF model (not an HTML error page
|
||||
@ -691,6 +993,7 @@ export class LlamaCpp implements LLM {
|
||||
private async resolveModel(modelUri: string): Promise<string> {
|
||||
this.ensureModelCacheDir();
|
||||
// resolveModelFile handles HF URIs and downloads to the cache dir
|
||||
const { resolveModelFile } = await loadNodeLlamaCpp();
|
||||
const modelPath = await resolveModelFile(modelUri, this.modelCacheDir);
|
||||
validateGgufFile(modelPath, modelUri);
|
||||
return modelPath;
|
||||
@ -713,7 +1016,7 @@ export class LlamaCpp implements LLM {
|
||||
this.embedModelLoadPromise = (async () => {
|
||||
const llama = await this.ensureLlama();
|
||||
const modelPath = await this.resolveModel(this.embedModelUri);
|
||||
const model = await llama.loadModel({ modelPath });
|
||||
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
|
||||
this.embedModel = model;
|
||||
// Model loading counts as activity - ping to keep alive
|
||||
this.touchActivity();
|
||||
@ -739,21 +1042,23 @@ export class LlamaCpp implements LLM {
|
||||
private async computeParallelism(perContextMB: number): Promise<number> {
|
||||
const llama = await this.ensureLlama();
|
||||
|
||||
if (llama.gpu) {
|
||||
if (!this.isCpuOffloadForced() && llama.gpu) {
|
||||
try {
|
||||
const vram = await llama.getVramState();
|
||||
const freeMB = vram.free / (1024 * 1024);
|
||||
const maxByVram = Math.floor((freeMB * 0.25) / perContextMB);
|
||||
return Math.max(1, Math.min(8, maxByVram));
|
||||
const computed = Math.max(1, Math.min(8, maxByVram));
|
||||
return resolveSafeParallelism({ gpu: llama.gpu, computed });
|
||||
} catch {
|
||||
return 2;
|
||||
return resolveSafeParallelism({ gpu: llama.gpu, computed: 2 });
|
||||
}
|
||||
}
|
||||
|
||||
// CPU: split cores across contexts. At least 4 threads per context.
|
||||
const cores = llama.cpuMathCores || 4;
|
||||
const maxContexts = Math.floor(cores / 4);
|
||||
return Math.max(1, Math.min(4, maxContexts));
|
||||
const computed = Math.max(1, Math.min(4, maxContexts));
|
||||
return resolveSafeParallelism({ gpu: false, computed });
|
||||
}
|
||||
|
||||
/**
|
||||
@ -762,7 +1067,7 @@ export class LlamaCpp implements LLM {
|
||||
*/
|
||||
private async threadsPerContext(parallelism: number): Promise<number> {
|
||||
const llama = await this.ensureLlama();
|
||||
if (llama.gpu) return 0; // GPU: let the library decide
|
||||
if (!this.isCpuOffloadForced() && llama.gpu) return 0; // GPU: let the library decide
|
||||
const cores = llama.cpuMathCores || 4;
|
||||
return Math.max(1, Math.floor(cores / parallelism));
|
||||
}
|
||||
@ -830,7 +1135,7 @@ export class LlamaCpp implements LLM {
|
||||
this.generateModelLoadPromise = (async () => {
|
||||
const llama = await this.ensureLlama();
|
||||
const modelPath = await this.resolveModel(this.generateModelUri);
|
||||
const model = await llama.loadModel({ modelPath });
|
||||
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
|
||||
this.generateModel = model;
|
||||
return model;
|
||||
})();
|
||||
@ -862,7 +1167,7 @@ export class LlamaCpp implements LLM {
|
||||
this.rerankModelLoadPromise = (async () => {
|
||||
const llama = await this.ensureLlama();
|
||||
const modelPath = await this.resolveModel(this.rerankModelUri);
|
||||
const model = await llama.loadModel({ modelPath });
|
||||
const model = await llama.loadModel(this.modelLoadOptions(modelPath));
|
||||
this.rerankModel = model;
|
||||
// Model loading counts as activity - ping to keep alive
|
||||
this.touchActivity();
|
||||
@ -911,9 +1216,8 @@ export class LlamaCpp implements LLM {
|
||||
try {
|
||||
this.rerankContexts.push(await model.createRankingContext({
|
||||
contextSize: LlamaCpp.RERANK_CONTEXT_SIZE,
|
||||
flashAttention: true,
|
||||
...(threads > 0 ? { threads } : {}),
|
||||
} as any));
|
||||
}));
|
||||
} catch {
|
||||
if (this.rerankContexts.length === 0) {
|
||||
// Flash attention might not be supported — retry without it
|
||||
@ -1194,6 +1498,7 @@ export class LlamaCpp implements LLM {
|
||||
// Create fresh context -> sequence -> session for each call
|
||||
const context = await this.generateModel!.createContext();
|
||||
const sequence = context.getSequence();
|
||||
const { LlamaChatSession } = await loadNodeLlamaCpp();
|
||||
const session = new LlamaChatSession({ contextSequence: sequence });
|
||||
|
||||
const maxTokens = options.maxTokens ?? 150;
|
||||
@ -1208,7 +1513,7 @@ export class LlamaCpp implements LLM {
|
||||
temperature,
|
||||
topK: 20,
|
||||
topP: 0.8,
|
||||
onTextChunk: (text) => {
|
||||
onTextChunk: (text: string) => {
|
||||
result += text;
|
||||
},
|
||||
});
|
||||
@ -1274,6 +1579,7 @@ export class LlamaCpp implements LLM {
|
||||
contextSize: this.expandContextSize,
|
||||
});
|
||||
const sequence = genContext.getSequence();
|
||||
const { LlamaChatSession } = await loadNodeLlamaCpp();
|
||||
const session = new LlamaChatSession({ contextSequence: sequence });
|
||||
|
||||
try {
|
||||
@ -1452,17 +1758,18 @@ export class LlamaCpp implements LLM {
|
||||
cpuCores: number;
|
||||
}> {
|
||||
const llama = await this.ensureLlama(options.allowBuild ?? true);
|
||||
const gpuDevices = await llama.getGpuDeviceNames();
|
||||
const cpuForced = this.isCpuOffloadForced();
|
||||
const gpuDevices = cpuForced ? [] : await llama.getGpuDeviceNames();
|
||||
let vram: { total: number; used: number; free: number } | undefined;
|
||||
if (llama.gpu) {
|
||||
if (!cpuForced && llama.gpu) {
|
||||
try {
|
||||
const state = await llama.getVramState();
|
||||
vram = { total: state.total, used: state.used, free: state.free };
|
||||
} catch { /* no vram info */ }
|
||||
}
|
||||
return {
|
||||
gpu: llama.gpu,
|
||||
gpuOffloading: llama.supportsGpuOffloading,
|
||||
gpu: cpuForced ? false : llama.gpu,
|
||||
gpuOffloading: !cpuForced && llama.supportsGpuOffloading,
|
||||
gpuDevices,
|
||||
vram,
|
||||
cpuCores: llama.cpuMathCores,
|
||||
@ -1482,22 +1789,37 @@ export class LlamaCpp implements LLM {
|
||||
this.inactivityTimer = null;
|
||||
}
|
||||
|
||||
// Disposing llama cascades to models and contexts automatically
|
||||
// See: https://node-llama-cpp.withcat.ai/guide/objects-lifecycle
|
||||
// Note: llama.dispose() can hang indefinitely, so we use a timeout
|
||||
if (this.llama) {
|
||||
const disposePromise = this.llama.dispose();
|
||||
const timeoutPromise = new Promise<void>((resolve) => setTimeout(resolve, 1000));
|
||||
await Promise.race([disposePromise, timeoutPromise]);
|
||||
// Explicitly dispose in dependency order: contexts first, then models, then llama.
|
||||
// Relying only on llama.dispose() leaves Metal resource sets alive until process
|
||||
// finalization on Apple Silicon, where ggml_metal_device_free can abort after
|
||||
// otherwise-successful CLI output (#368).
|
||||
for (const ctx of this.embedContexts) {
|
||||
await disposeWithTimeout("embedding context", () => ctx.dispose());
|
||||
}
|
||||
this.embedContexts = [];
|
||||
|
||||
for (const ctx of this.rerankContexts) {
|
||||
await disposeWithTimeout("rerank context", () => ctx.dispose());
|
||||
}
|
||||
this.rerankContexts = [];
|
||||
|
||||
if (this.embedModel) {
|
||||
await disposeWithTimeout("embedding model", () => this.embedModel!.dispose());
|
||||
this.embedModel = null;
|
||||
}
|
||||
if (this.generateModel) {
|
||||
await disposeWithTimeout("generation model", () => this.generateModel!.dispose());
|
||||
this.generateModel = null;
|
||||
}
|
||||
if (this.rerankModel) {
|
||||
await disposeWithTimeout("rerank model", () => this.rerankModel!.dispose());
|
||||
this.rerankModel = null;
|
||||
}
|
||||
|
||||
// Clear references
|
||||
this.embedContexts = [];
|
||||
this.rerankContexts = [];
|
||||
this.embedModel = null;
|
||||
this.generateModel = null;
|
||||
this.rerankModel = null;
|
||||
this.llama = null;
|
||||
if (this.llama) {
|
||||
await disposeWithTimeout("llama runtime", () => this.llama!.dispose());
|
||||
this.llama = null;
|
||||
}
|
||||
|
||||
// Clear any in-flight load/create promises
|
||||
this.embedModelLoadPromise = null;
|
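Reviewer note: the hunk above calls `disposeWithTimeout(label, fn)`, but its definition falls outside this excerpt. The following is only a minimal sketch of what such a helper could look like, assuming it keeps the race-against-a-timer approach of the removed `Promise.race` code. The 1000 ms default, the return values, and the logging are assumptions made for illustration, not the project's actual implementation.

```typescript
// Hypothetical sketch: the real disposeWithTimeout is not shown in this diff.
// Assumes a 1000 ms budget, mirroring the Promise.race code the hunk removes.
type DisposeOutcome = "disposed" | "timeout" | "error";

async function disposeWithTimeout(
  label: string,
  dispose: () => Promise<void>,
  timeoutMs = 1000,
): Promise<DisposeOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  try {
    // Whichever settles first wins; a hung dispose() no longer blocks shutdown.
    const work = dispose().then(() => "disposed" as const);
    return await Promise.race([work, timeout]);
  } catch (err) {
    // A failing dispose should not prevent the remaining resources from being freed.
    console.error(`dispose of ${label} failed:`, err);
    return "error";
  } finally {
    // Clear the timer so it cannot keep the event loop alive after a fast dispose.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Giving each resource its own timeout means one hung context cannot consume the budget of the models and the llama runtime disposed after it.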
||||
@ -1752,6 +2074,66 @@ export function canUnloadLLM(): boolean {
|
||||
return defaultSessionManager.canUnload();
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
// Darwin Metal exit-crash mitigation
|
||||
// =============================================================================
|
||||
//
|
||||
// libggml-metal on macOS keeps allocated model memory wired via "residency
|
||||
// sets" with a 180-second keep_alive timer (added in ggml-org/llama.cpp#11427).
|
||||
// The process-static `std::vector<std::unique_ptr<ggml_metal_device>>`
|
||||
// destructor fires during libc `exit()` → `__cxa_finalize_ranges` and asserts
|
||||
// `[rsets->data count] == 0` — but the keep_alive hasn't expired, so the
|
||||
// assertion fails and `ggml_abort` dumps a multi-kilobyte stack trace to
|
||||
// stderr after the user-visible output. See ggml-org/llama.cpp#22593.
|
||||
//
|
||||
// No JS-side dispose call (`llama.dispose()`, `model.dispose()`, etc.) can
|
||||
// prevent it: the static destructor runs after every JS-reachable cleanup,
|
||||
// and `process.reallyExit` on Node calls libc `exit()` not `_exit()` (it
|
||||
// does NOT skip C++ static destructors — verified in
|
||||
// node/src/api/environment.cc).
|
||||
//
|
||||
// The actual fix is to disable residency sets via `GGML_METAL_NO_RESIDENCY=1`,
|
||||
// which we set from `bin/qmd` before Node loads the native binding. For QMD's
|
||||
// short-lived CLI workflow this has no measurable cost (subsequent calls
|
||||
// don't reuse the warm mapping). The functions below report whether that
|
||||
// mitigation is in effect — kept here, in the module that depends on the
|
||||
// underlying resource, so doctor can answer "is the protection active?"
|
||||
// without reaching into env handling directly.
|
||||
//
|
||||
// Setting `QMD_METAL_KEEP_RESIDENCY=1` opts back into residency sets (with
|
||||
// the visible-noise consequences). The legacy `QMD_DISABLE_DARWIN_SAFE_EXIT`
|
||||
// env var is accepted as a no-op alias for back-compat; it had no effect on
|
||||
// Node prior to this fix.
|
||||
|
||||
/**
|
||||
* Whether QMD's darwin Metal exit-crash mitigation is active in this process:
|
||||
* true → residency sets disabled, process exit completes silently
|
||||
* false → either non-darwin, or `QMD_METAL_KEEP_RESIDENCY=1` overrode it,
|
||||
* in which case the libggml-metal teardown assertion may fire
|
||||
*/
|
||||
export function isDarwinMetalMitigationActive(): boolean {
|
||||
if (process.platform !== "darwin") return false;
|
||||
if (process.env.QMD_METAL_KEEP_RESIDENCY === "1") return false;
|
||||
return process.env.GGML_METAL_NO_RESIDENCY === "1";
|
||||
}
|
||||
|
||||
/**
|
||||
* Compatibility shim: previous releases installed a `process.on('exit')` hook
|
||||
* that tried to skip the C++ static destructor by calling `process.reallyExit`.
|
||||
* That mechanism didn't work on Node (Environment::Exit still calls libc
|
||||
* `exit()`), so it was replaced by `GGML_METAL_NO_RESIDENCY=1` from bin/qmd.
|
||||
* Kept as a no-op for code paths that still call it; safe to remove once no
|
||||
* production launcher predates the residency-set fix.
|
||||
*/
|
||||
export function installDarwinExitGuard(): void {
|
||||
// Intentional no-op. See isDarwinMetalMitigationActive() for the real check.
|
||||
}
|
||||
|
||||
/** @deprecated Replaced by isDarwinMetalMitigationActive. */
|
||||
export function isDarwinExitGuardInstalled(): boolean {
|
||||
return isDarwinMetalMitigationActive();
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
// Singleton for default LlamaCpp instance
|
||||
// =============================================================================
|
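Reviewer note: `isDarwinMetalMitigationActive()` in the hunk above reads `process.platform` and `process.env` directly. A sketch of the same three checks with the inputs passed in may make the decision table easier to follow and to unit-test. The name `metalMitigationActive` and the `Env` type are hypothetical and not part of this diff.

```typescript
// Hypothetical testable variant of isDarwinMetalMitigationActive(): the same
// three checks as the function in this hunk, with platform and env passed in.
type Env = Record<string, string | undefined>;

function metalMitigationActive(platform: string, env: Env): boolean {
  // Residency sets only exist in libggml-metal, so other platforms report false.
  if (platform !== "darwin") return false;
  // The opt-out wins even if the launcher also set GGML_METAL_NO_RESIDENCY.
  if (env["QMD_METAL_KEEP_RESIDENCY"] === "1") return false;
  // Active only when the launcher actually exported the variable.
  return env["GGML_METAL_NO_RESIDENCY"] === "1";
}

console.log(metalMitigationActive("darwin", { GGML_METAL_NO_RESIDENCY: "1" }));
console.log(metalMitigationActive("linux", { GGML_METAL_NO_RESIDENCY: "1" }));
```

Under these assumptions the first call reports the mitigation as active and the second does not, because the platform check comes first.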
||||
@ -1759,7 +2141,9 @@ export function canUnloadLLM(): boolean {
|
||||
let defaultLlamaCpp: LlamaCpp | null = null;
|
||||
|
||||
/**
|
||||
* Get the default LlamaCpp instance (creates one if needed)
|
||||
* Get the default LlamaCpp instance (creates one if needed). The LlamaCpp
|
||||
* constructor installs the darwin exit guard, so any code path that obtains
|
||||
* the singleton is protected.
|
||||
*/
|
||||
export function getDefaultLlamaCpp(): LlamaCpp {
|
||||
if (!defaultLlamaCpp) {
|
||||
@ -1769,12 +2153,24 @@ export function getDefaultLlamaCpp(): LlamaCpp {
|
||||
}
|
||||
|
||||
/**
|
||||
* Set a custom default LlamaCpp instance (useful for testing)
|
||||
* Set a custom default LlamaCpp instance (useful for testing). Setting a
|
||||
* non-null instance also ensures the darwin exit guard is installed — keeps
|
||||
* the invariant intact for test doubles that didn't go through the real
|
||||
* constructor.
|
||||
*/
|
||||
export function setDefaultLlamaCpp(llm: LlamaCpp | null): void {
|
||||
if (llm !== null) installDarwinExitGuard();
|
||||
defaultLlamaCpp = llm;
|
||||
}
|
||||
|
||||
/**
|
||||
* Peek at the default LlamaCpp instance without instantiating one. Used by
|
||||
* doctor and lifecycle diagnostics.
|
||||
*/
|
||||
export function hasDefaultLlamaCpp(): boolean {
|
||||
return defaultLlamaCpp !== null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Dispose the default LlamaCpp instance if it exists.
|
||||
* Call this before process exit to prevent NAPI crashes.
|
||||
|
||||
@ -32,8 +32,6 @@ import {
|
||||
import { getConfigPath } from "../collections.js";
|
||||
import { enableProductionMode } from "../store.js";
|
||||
|
||||
enableProductionMode();
|
||||
|
||||
// =============================================================================
|
||||
// Types for structured content
|
||||
// =============================================================================
|
||||
@ -44,6 +42,7 @@ type SearchResultItem = {
|
||||
title: string;
|
||||
score: number;
|
||||
context: string | null;
|
||||
line: number; // Absolute line in source markdown
|
||||
snippet: string;
|
||||
};
|
||||
|
||||
@ -108,7 +107,6 @@ function getPackageVersion(): string {
|
||||
*/
|
||||
async function buildInstructions(store: QMDStore): Promise<string> {
|
||||
const status = await store.getStatus();
|
||||
const contexts = await store.listContexts();
|
||||
const globalCtx = await store.getGlobalContext();
|
||||
const lines: string[] = [];
|
||||
|
||||
@ -117,15 +115,13 @@ async function buildInstructions(store: QMDStore): Promise<string> {
|
||||
if (globalCtx) lines.push(`Context: ${globalCtx}`);
|
||||
|
||||
// --- What's searchable? ---
|
||||
// Emit names only — the per-collection doc counts and descriptions can run to ~1.5 KB
|
||||
// across a dozen collections, and the same info is available on demand via the `status` tool.
|
||||
if (status.collections.length > 0) {
|
||||
lines.push("");
|
||||
lines.push("Collections (scope with `collection` parameter):");
|
||||
for (const col of status.collections) {
|
||||
// Find root context for this collection
|
||||
const rootCtx = contexts.find(c => c.collection === col.name && (c.path === "" || c.path === "/"));
|
||||
const desc = rootCtx ? ` — ${rootCtx.context}` : "";
|
||||
lines.push(` - "${col.name}" (${col.documents} docs)${desc}`);
|
||||
}
|
||||
const names = status.collections.map(c => c.name).join(", ");
|
||||
lines.push(`Collections (scope with \`collections\` parameter): ${names}`);
|
||||
lines.push("Call the `status` tool for collection descriptions, paths, and per-collection doc counts.");
|
||||
}
|
||||
|
||||
// --- Capability gaps ---
|
||||
@ -155,7 +151,7 @@ async function buildInstructions(store: QMDStore): Promise<string> {
|
||||
// --- Retrieval workflow ---
|
||||
lines.push("");
|
||||
lines.push("Retrieval:");
|
||||
lines.push(" - `get` — single document by path or docid (#abc123). Supports line offset (`file.md:100`).");
|
||||
lines.push(" - `get` — single document by path or docid (#abc123). Supports a line-range suffix: `file.md:100` (from line 100) or `file.md:100:40` (40 lines from line 100).");
|
||||
lines.push(" - `multi_get` — batch retrieve by glob (`journals/2025-05*.md`) or comma-separated list.");
|
||||
|
||||
// --- Non-obvious things that prevent mistakes ---
|
||||
@ -244,6 +240,8 @@ async function createMcpServer(store: QMDStore): Promise<McpServer> {
|
||||
title: "Query",
|
||||
description: `Search the knowledge base using a query document — one or more typed sub-queries combined for best recall.
|
||||
|
||||
Each result includes a \`line\` field with the absolute 1-indexed line of the best match in the source markdown. To read more context around a hit, call \`get(file, fromLine = max(1, line - 20), maxLines = 80, lineNumbers = true)\`.
|
||||
|
||||
## Query Types
|
||||
|
||||
**lex** — BM25 keyword search. Fast, exact, no LLM needed.
|
||||
@ -333,6 +331,7 @@ Intent-aware lex (C++ performance, not sports):
|
||||
collections: effectiveCollections.length > 0 ? effectiveCollections : undefined,
|
||||
limit,
|
||||
minScore,
|
||||
candidateLimit,
|
||||
rerank,
|
||||
intent,
|
||||
});
|
||||
@ -343,13 +342,14 @@ Intent-aware lex (C++ performance, not sports):
|
||||
|| searches[0]?.query || "";
|
||||
|
||||
const filtered: SearchResultItem[] = results.map(r => {
|
||||
const { line, snippet } = extractSnippet(r.bestChunk, primaryQuery, 300, undefined, undefined, intent);
|
||||
const { line, snippet } = extractSnippet(r.body, primaryQuery, 300, r.bestChunkPos, r.bestChunk.length, intent);
|
||||
return {
|
||||
docid: `#${r.docid}`,
|
||||
file: r.displayPath,
|
||||
title: r.title,
|
||||
score: Math.round(r.score * 100) / 100,
|
||||
context: r.context,
|
||||
line,
|
||||
snippet: addLineNumbers(snippet, line),
|
||||
};
|
||||
});
|
||||
@ -372,21 +372,31 @@ Intent-aware lex (C++ performance, not sports):
|
||||
description: "Retrieve the full content of a document by its file path or docid. Use paths or docids (#abc123) from search results. Suggests similar files if not found.",
|
||||
annotations: { readOnlyHint: true, openWorldHint: false },
|
||||
inputSchema: {
|
||||
file: z.string().describe("File path or docid from search results (e.g., 'pages/meeting.md', '#abc123', or 'pages/meeting.md:100' to start at line 100)"),
|
||||
file: z.string().describe("File path or docid from search results. Supports a line-range suffix: 'pages/meeting.md:100' starts at line 100; 'pages/meeting.md:100:40' (or '#abc123:100:40') reads 40 lines from line 100."),
|
||||
fromLine: z.number().optional().describe("Start from this line number (1-indexed)"),
|
||||
maxLines: z.number().optional().describe("Maximum number of lines to return"),
|
||||
lineNumbers: z.boolean().optional().default(false).describe("Add line numbers to output (format: 'N: content')"),
|
||||
lineNumbers: z.boolean().optional().default(true).describe("Add line numbers to output (format: 'N: content'). On by default; set false for raw content."),
|
||||
},
|
||||
},
|
||||
async ({ file, fromLine, maxLines, lineNumbers }) => {
|
||||
// Support :line suffix in `file` (e.g. "foo.md:120") when fromLine isn't provided
|
||||
// Support :line and :from:count suffixes in `file` (e.g. "foo.md:120" or
|
||||
// "foo.md:120:40"). Explicit fromLine/maxLines args take precedence.
|
||||
let parsedFromLine = fromLine;
|
||||
let parsedMaxLines = maxLines;
|
||||
let lookup = file;
|
||||
const colonMatch = lookup.match(/:(\d+)$/);
|
||||
if (colonMatch && colonMatch[1] && parsedFromLine === undefined) {
|
||||
parsedFromLine = parseInt(colonMatch[1], 10);
|
||||
lookup = lookup.slice(0, -colonMatch[0].length);
|
||||
const rangeMatch = lookup.match(/:(\d+):(\d+)$/);
|
||||
if (rangeMatch) {
|
||||
if (parsedFromLine === undefined) parsedFromLine = parseInt(rangeMatch[1]!, 10);
|
||||
if (parsedMaxLines === undefined) parsedMaxLines = parseInt(rangeMatch[2]!, 10);
|
||||
lookup = lookup.slice(0, -rangeMatch[0].length);
|
||||
} else {
|
||||
const colonMatch = lookup.match(/:(\d+)$/);
|
||||
if (colonMatch && colonMatch[1] && parsedFromLine === undefined) {
|
||||
parsedFromLine = parseInt(colonMatch[1], 10);
|
||||
lookup = lookup.slice(0, -colonMatch[0].length);
|
||||
}
|
||||
}
|
||||
if (parsedFromLine !== undefined) parsedFromLine = Math.max(1, parsedFromLine);
|
||||
|
||||
const result = await store.get(lookup, { includeBody: false });
|
||||
|
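Reviewer note: the suffix handling added to the `get` handler above can be read as a small pure function. The sketch below follows the same precedence as the hunk (explicit `fromLine`/`maxLines` arguments win over the suffix, and `fromLine` is clamped to 1). The helper name `parseLineSuffix` and its return shape are hypothetical; the diff keeps this logic inline.

```typescript
// Sketch of the `:from` / `:from:count` suffix handling from the hunk above.
// parseLineSuffix and LineRange are illustrative names, not part of the diff.
type LineRange = {
  lookup: string;
  fromLine: number | undefined;
  maxLines: number | undefined;
};

function parseLineSuffix(file: string, fromLine?: number, maxLines?: number): LineRange {
  let lookup = file;
  let from = fromLine;
  let max = maxLines;

  const range = /:(\d+):(\d+)$/.exec(lookup);
  if (range) {
    // "foo.md:120:40": 40 lines starting at line 120, unless explicit args were given.
    if (from === undefined) from = parseInt(range[1]!, 10);
    if (max === undefined) max = parseInt(range[2]!, 10);
    lookup = lookup.slice(0, -range[0].length);
  } else {
    const single = /:(\d+)$/.exec(lookup);
    // As in the hunk, the single suffix is only consumed when fromLine is absent.
    if (single && from === undefined) {
      from = parseInt(single[1]!, 10);
      lookup = lookup.slice(0, -single[0].length);
    }
  }

  // Line numbers are 1-indexed, so clamp anything below 1.
  if (from !== undefined) from = Math.max(1, from);
  return { lookup, fromLine: from, maxLines: max };
}

console.log(parseLineSuffix("foo.md:120:40"));
```

Trying the two-number pattern first matters: a lone `/:(\d+)$/` match on `foo.md:120:40` would consume only `:40` and leave `foo.md:120` as the lookup path.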
||||
@ -401,7 +411,7 @@ Intent-aware lex (C++ performance, not sports):
|
||||
};
|
||||
}
|
||||
|
||||
const body = await store.getDocumentBody(result.filepath, { fromLine: parsedFromLine, maxLines }) ?? "";
|
||||
const body = await store.getDocumentBody(result.filepath, { fromLine: parsedFromLine, maxLines: parsedMaxLines }) ?? "";
|
||||
let text = body;
|
||||
if (lineNumbers) {
|
||||
const startLine = parsedFromLine || 1;
|
||||
@ -440,7 +450,7 @@ Intent-aware lex (C++ performance, not sports):
|
||||
pattern: z.string().describe("Glob pattern or comma-separated list of file paths"),
|
||||
maxLines: z.number().optional().describe("Maximum lines per file"),
|
||||
maxBytes: z.number().optional().default(10240).describe("Skip files larger than this (default: 10240 = 10KB)"),
|
||||
lineNumbers: z.boolean().optional().default(false).describe("Add line numbers to output (format: 'N: content')"),
|
||||
lineNumbers: z.boolean().optional().default(true).describe("Add line numbers to output (format: 'N: content'). On by default; set false for raw content."),
|
||||
},
|
||||
},
|
||||
async ({ pattern, maxLines, maxBytes, lineNumbers }) => {
|
||||
@ -540,10 +550,20 @@ Intent-aware lex (C++ performance, not sports):
|
||||
// Transport: stdio (default)
|
||||
// =============================================================================
|
||||
|
||||
export async function startMcpServer(): Promise<void> {
|
||||
export type McpStartupOptions = {
|
||||
dbPath?: string;
|
||||
};
|
||||
|
||||
export async function startMcpServer(options: McpStartupOptions = {}): Promise<void> {
|
||||
// Opt into production mode when the MCP server is actually started, not
|
||||
// when this module is merely imported for its exports. Importing the module
|
||||
// at the top level flipped the global production flag and broke test
|
||||
// isolation for downstream suites that expect the default (development)
|
||||
// database path behaviour.
|
||||
enableProductionMode();
|
||||
const configPath = getConfigPath();
|
||||
const store = await createStore({
|
||||
dbPath: getDefaultDbPath(),
|
||||
dbPath: options.dbPath ?? getDefaultDbPath(),
|
||||
...(existsSync(configPath) ? { configPath } : {}),
|
||||
});
|
||||
const server = await createMcpServer(store);
|
||||
@ -565,10 +585,17 @@ export type HttpServerHandle = {
|
||||
* Start MCP server over Streamable HTTP (JSON responses, no SSE).
|
||||
* Binds to localhost only. Returns a handle for shutdown and port discovery.
|
||||
*/
|
||||
export async function startMcpHttpServer(port: number, options?: { quiet?: boolean }): Promise<HttpServerHandle> {
|
||||
export async function startMcpHttpServer(
|
||||
port: number,
|
||||
options: ({ quiet?: boolean } & McpStartupOptions) = {},
|
||||
): Promise<HttpServerHandle> {
|
||||
// See startMcpServer() for the rationale — flip production mode here so the
|
||||
// HTTP transport resolves the real database path, without leaking state into
|
||||
// callers that only import this module for its exports (e.g. tests).
|
||||
enableProductionMode();
|
||||
const configPath = getConfigPath();
|
||||
const store = await createStore({
|
||||
dbPath: getDefaultDbPath(),
|
||||
dbPath: options.dbPath ?? getDefaultDbPath(),
|
||||
...(existsSync(configPath) ? { configPath } : {}),
|
||||
});
|
||||
|
||||
@ -608,9 +635,21 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
|
||||
return new Date().toISOString().slice(11, 23); // HH:mm:ss.SSS
|
||||
}
|
||||
|
||||
type JsonRpcLikeBody = {
|
||||
method?: unknown;
|
||||
params?: {
|
||||
name?: unknown;
|
||||
arguments?: Record<string, unknown>;
|
||||
};
|
||||
};
|
||||
type RestSearchInput = {
|
||||
type?: unknown;
|
||||
query?: unknown;
|
||||
};
|
||||
|
||||
/** Extract a human-readable label from a JSON-RPC body */
|
||||
function describeRequest(body: any): string {
|
||||
const method = body?.method ?? "unknown";
|
||||
function describeRequest(body: JsonRpcLikeBody): string {
|
||||
const method = typeof body.method === "string" ? body.method : "unknown";
|
||||
if (method === "tools/call") {
|
||||
const tool = body.params?.name ?? "?";
|
||||
const args = body.params?.arguments;
|
||||
@ -654,7 +693,7 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
|
||||
// REST endpoint: POST /query (alias: /search) — structured search without MCP protocol
|
||||
if ((pathname === "/query" || pathname === "/search") && nodeReq.method === "POST") {
|
||||
const rawBody = await collectBody(nodeReq);
|
||||
const params = JSON.parse(rawBody);
|
||||
const params = JSON.parse(rawBody) as Record<string, unknown>;
|
||||
|
||||
// Validate required fields
|
||||
if (!params.searches || !Array.isArray(params.searches)) {
|
||||
@ -664,35 +703,39 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
|
||||
}
|
||||
|
||||
// Map to internal format
|
||||
const queries: ExpandedQuery[] = params.searches.map((s: any) => ({
|
||||
const searches = params.searches as RestSearchInput[];
|
||||
const queries: ExpandedQuery[] = searches.map((s) => ({
|
||||
type: s.type as 'lex' | 'vec' | 'hyde',
|
||||
query: String(s.query || ""),
|
||||
}));
|
||||
|
||||
// Use default collections if none specified
|
||||
const effectiveCollections = params.collections ?? defaultCollectionNames;
|
||||
const effectiveCollections = Array.isArray(params.collections) ? params.collections.map(String) : defaultCollectionNames;
|
||||
|
||||
const results = await store.search({
|
||||
queries,
|
||||
collections: effectiveCollections.length > 0 ? effectiveCollections : undefined,
|
||||
limit: params.limit ?? 10,
|
||||
minScore: params.minScore ?? 0,
|
||||
intent: params.intent,
|
||||
limit: typeof params.limit === "number" ? params.limit : 10,
|
||||
minScore: typeof params.minScore === "number" ? params.minScore : 0,
|
||||
candidateLimit: typeof params.candidateLimit === "number" ? params.candidateLimit : undefined,
|
||||
intent: typeof params.intent === "string" ? params.intent : undefined,
|
||||
rerank: typeof params.rerank === "boolean" ? params.rerank : undefined,
|
||||
});
|
||||
|
||||
// Use first lex or vec query for snippet extraction
|
||||
const primaryQuery = params.searches.find((s: any) => s.type === 'lex')?.query
|
||||
|| params.searches.find((s: any) => s.type === 'vec')?.query
|
||||
|| params.searches[0]?.query || "";
|
||||
const primaryQuery = searches.find((s) => s.type === 'lex')?.query
|
||||
|| searches.find((s) => s.type === 'vec')?.query
|
||||
|| searches[0]?.query || "";
|
||||
|
||||
const formatted = results.map(r => {
|
||||
const { line, snippet } = extractSnippet(r.bestChunk, primaryQuery, 300);
|
||||
const { line, snippet } = extractSnippet(r.body, String(primaryQuery), 300, r.bestChunkPos, r.bestChunk.length, typeof params.intent === "string" ? params.intent : undefined);
|
||||
return {
|
||||
docid: `#${r.docid}`,
|
||||
file: r.displayPath,
|
||||
file: `qmd://${encodeQmdPath(r.displayPath)}`,
|
||||
title: r.title,
|
||||
score: Math.round(r.score * 100) / 100,
|
||||
context: r.context,
|
||||
line,
|
||||
snippet: addLineNumbers(snippet, line),
|
||||
};
|
||||
});
|
||||
|
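Reviewer note: the REST handler above now narrows each field of the untyped JSON body with `typeof` checks before passing it to `store.search`. A sketch of that narrowing as one helper follows, assuming the same defaults as the hunk (`limit` 10, `minScore` 0). The helper `parseSearchOptions` and the `SearchOptions` type are hypothetical and do not appear in the diff.

```typescript
// Hypothetical helper collecting the typeof checks from the /query handler.
type SearchOptions = {
  collections: string[];
  limit: number;
  minScore: number;
  candidateLimit: number | undefined;
  intent: string | undefined;
  rerank: boolean | undefined;
};

function parseSearchOptions(
  params: Record<string, unknown>,
  defaultCollections: string[],
): SearchOptions {
  const { collections, limit, minScore, candidateLimit, intent, rerank } = params;
  return {
    // Non-array values fall back to the defaults instead of throwing later.
    collections: Array.isArray(collections) ? collections.map(String) : defaultCollections,
    limit: typeof limit === "number" ? limit : 10,
    minScore: typeof minScore === "number" ? minScore : 0,
    candidateLimit: typeof candidateLimit === "number" ? candidateLimit : undefined,
    intent: typeof intent === "string" ? intent : undefined,
    rerank: typeof rerank === "boolean" ? rerank : undefined,
  };
}

console.log(parseSearchOptions(JSON.parse('{"limit":"5","rerank":false}'), ["notes"]));
```

With this shape a string `"5"` for `limit` is ignored and the default of 10 applies, which matches the behaviour of the `typeof params.limit === "number"` check in the hunk.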
||||
5
src/paths.ts
Normal file
@ -0,0 +1,5 @@
|
||||
import { homedir as osHomedir } from "node:os";
|
||||
|
||||
export function qmdHomedir(): string {
|
||||
return process.env.HOME || process.env.USERPROFILE || osHomedir() || "/tmp";
|
||||
}
|
||||
881
src/store.ts
File diff suppressed because it is too large
@ -1,8 +1,19 @@
|
||||
/**
|
||||
* Test preload file to ensure proper cleanup of native resources.
|
||||
*
|
||||
* Uses bun:test afterAll to properly dispose of llama.cpp Metal
|
||||
* resources before the process exits, avoiding GGML_ASSERT failures.
|
||||
* Uses bun:test afterAll to dispose of llama.cpp Metal resources before
|
||||
* the process exits — necessary on darwin to avoid the upstream rsets
|
||||
* destructor assertion (ggml-org/llama.cpp#22593, fix open as #22595).
|
||||
*
|
||||
* The runner-level mitigation `GGML_METAL_NO_RESIDENCY=1` must be set
|
||||
* BEFORE bun/node starts (libggml-metal reads it via libc getenv at
|
||||
* module load). Bun does not propagate `process.env` writes to libc
|
||||
* setenv, so setting it from here would be a no-op for the native
|
||||
* binding. The env var is injected by:
|
||||
* - bin/qmd for production CLI runs
|
||||
* - scripts/test-all.mjs for `npm test`
|
||||
* - package.json test:bun / test:unit scripts for direct invocation
|
||||
* See CLAUDE.md for invoking `bun test` manually on darwin.
|
||||
*/
|
||||
import { afterAll } from "bun:test";
|
||||
import { disposeDefaultLlamaCpp } from "./llm";
|
||||
|
||||
4
src/types/picomatch.d.ts
vendored
Normal file
@ -0,0 +1,4 @@
|
||||
declare module "picomatch" {
|
||||
export type Matcher = (input: string) => boolean;
|
||||
export default function picomatch(pattern: string | string[], options?: Record<string, unknown>): Matcher;
|
||||
}
|
||||
@ -13,10 +13,12 @@ ENV PATH="/root/.local/bin:$PATH"
|
||||
# Pre-install node and bun
|
||||
RUN mise use -g node@latest bun@latest
|
||||
|
||||
# Copy the packed tarball and install via both package managers
|
||||
COPY tobilu-qmd-*.tgz /tmp/
|
||||
RUN mise exec node@latest -- npm install -g /tmp/tobilu-qmd-*.tgz
|
||||
RUN mise exec bun@latest -- bun install -g /tmp/tobilu-qmd-*.tgz
|
||||
# Copy the packed tarball and install via both package managers. Keep a stable
|
||||
# tarball path for npm-exec/npx-style smoke scenarios.
|
||||
COPY tobilu-qmd-*.tgz /tmp/qmd-package.tgz
|
||||
RUN cp /tmp/qmd-package.tgz /tmp/tobilu-qmd.tgz
|
||||
RUN mise exec node@latest -- npm install -g /tmp/qmd-package.tgz
|
||||
RUN mise exec bun@latest -- bun install -g /tmp/qmd-package.tgz
|
||||
|
||||
# Copy test project (src + test + configs) and install deps
|
||||
COPY test-src/ /opt/qmd/
|
||||
|
||||
@ -6,7 +6,7 @@
|
||||
*/
|
||||
|
||||
import { describe, test, expect } from "vitest";
|
||||
import { detectLanguage, getASTBreakPoints, extractSymbols } from "../src/ast.js";
|
||||
import { detectLanguage, getASTBreakPoints, extractSymbols, formatGrammarLoadError } from "../src/ast.js";
|
||||
import type { SupportedLanguage } from "../src/ast.js";
|
||||
|
||||
// =============================================================================
|
||||
@ -315,6 +315,16 @@ describe("getASTBreakPoints - error handling", () => {
|
||||
// Should either return some partial break points or empty array — not throw
|
||||
expect(Array.isArray(points)).toBe(true);
|
||||
});
|
||||
|
||||
test("explains missing grammar packages with a repair command", () => {
|
||||
const msg = formatGrammarLoadError(
|
||||
"typescript",
|
||||
new Error("Cannot find module 'tree-sitter-typescript/tree-sitter-typescript.wasm'"),
|
||||
);
|
||||
expect(msg).toContain("tree-sitter-typescript");
|
||||
expect(msg).toContain("bun add tree-sitter-typescript@0.23.2");
|
||||
expect(msg).toContain("falling back to regex");
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
|
||||
@ -99,6 +99,20 @@ describe("scoreResults", () => {
|
||||
expect(result.mrr).toBeCloseTo(0.5); // 1/2
|
||||
});
|
||||
|
||||
test("reports recall@1/3/5 and matched documents", () => {
|
||||
const result = scoreResults(
|
||||
["x.md", "qmd://concepts/a.md", "docs/b.md", "docs/c.md", "docs/d.md"],
|
||||
["concepts/a.md", "b.md", "missing.md"],
|
||||
3,
|
||||
);
|
||||
|
||||
expect(result.recall_at_1).toBe(0);
|
||||
expect(result.recall_at_3).toBeCloseTo(2 / 3);
|
||||
expect(result.recall_at_5).toBeCloseTo(2 / 3);
|
||||
expect(result.matched_files).toEqual(["concepts/a.md", "b.md"]);
|
||||
expect(result.unmatched_expected_files).toEqual(["missing.md"]);
|
||||
});
|
||||
|
||||
test("empty results", () => {
|
||||
const result = scoreResults([], ["a.md"], 1);
|
||||
expect(result.precision_at_k).toBe(0);
|
||||
|
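Reviewer note: the new `recall@1/3/5` test above implies that result paths are matched against expected files loosely (`qmd://concepts/a.md` counts for `concepts/a.md`, and `docs/b.md` counts for `b.md`). The actual matching rule in `scoreResults` is not visible in this diff, so the sketch below assumes a `qmd://` prefix strip plus a path-suffix comparison. `normalizePath`, `pathMatches`, and `recallAtK` are hypothetical names.

```typescript
// Hypothetical recall@k sketch; the real scoreResults matching rule is not
// shown in this diff and may differ.
function normalizePath(p: string): string {
  return p.replace(/^qmd:\/\//, "");
}

function pathMatches(result: string, expected: string): boolean {
  const r = normalizePath(result);
  const e = normalizePath(expected);
  // Equal paths, or one is a suffix of the other on a "/" boundary.
  return r === e || r.endsWith("/" + e) || e.endsWith("/" + r);
}

function recallAtK(results: string[], expected: string[], k: number): number {
  if (expected.length === 0) return 0;
  const top = results.slice(0, k);
  // Count expected files that appear anywhere in the top-k results.
  const found = expected.filter((e) => top.some((r) => pathMatches(r, e)));
  return found.length / expected.length;
}

const results = ["x.md", "qmd://concepts/a.md", "docs/b.md", "docs/c.md", "docs/d.md"];
const expected = ["concepts/a.md", "b.md", "missing.md"];
console.log(recallAtK(results, expected, 1), recallAtK(results, expected, 3));
```

On the fixture from the test, this sketch finds no expected file in the top 1 and two of the three expected files in the top 3 and top 5, which agrees with the values the test asserts.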
||||
263
test/bin-wrapper.test.ts
Normal file
@ -0,0 +1,263 @@
|
||||
import { afterEach, describe, expect, test } from "vitest";
|
||||
import { chmodSync, copyFileSync, mkdtempSync, mkdirSync, readFileSync, realpathSync, rmSync, symlinkSync, writeFileSync } from "node:fs";
|
||||
import { tmpdir } from "node:os";
|
||||
import { dirname, join, relative } from "node:path";
|
||||
import { execFileSync, spawnSync } from "node:child_process";
|
||||
import { fileURLToPath } from "node:url";
|
||||
|
||||
const repoRoot = fileURLToPath(new URL("..", import.meta.url));
|
||||
const fixtures: string[] = [];
|
||||
|
||||
function makeTempFixture() {
|
||||
const root = mkdtempSync(join(tmpdir(), "qmd-bin-wrapper-"));
|
||||
fixtures.push(root);
|
||||
const capturePath = join(root, "capture.txt");
|
||||
const runtimeBin = join(root, "runtime-bin");
|
||||
mkdirSync(runtimeBin, { recursive: true });
|
||||
|
||||
for (const runtime of ["node", "bun"]) {
|
||||
const runtimePath = join(runtimeBin, runtime);
|
||||
if (runtime === "node") {
|
||||
writeFileSync(
|
||||
runtimePath,
|
||||
`#!/bin/sh
|
||||
if [ "$(basename "$1")" = "qmd" ]; then
|
||||
exec "${process.execPath}" "$@"
|
||||
else
|
||||
{
|
||||
printf '%s\\n' 'node'
|
||||
printf '%s\\n' "$1"
|
||||
shift
|
||||
printf '%s\\n' "$@"
|
||||
} > "$QMD_WRAPPER_CAPTURE"
|
||||
fi
|
||||
`,
|
||||
);
|
||||
} else {
|
||||
writeFileSync(
|
||||
runtimePath,
|
||||
`#!/bin/sh\n{\n printf '%s\\n' '${runtime}'\n printf '%s\\n' "$1"\n shift\n printf '%s\\n' "$@"\n} > "$QMD_WRAPPER_CAPTURE"\n`,
|
||||
);
|
||||
}
|
||||
chmodSync(runtimePath, 0o755);
|
||||
}
|
||||
|
||||
return { root, capturePath, runtimeBin };
|
||||
}
|
||||
|
||||
function makePackage(root: string, packagePath: string, lockfiles: string[] = [], options: { dist?: boolean; source?: boolean; tsx?: boolean; git?: boolean } = {}) {
|
||||
const packageRoot = join(root, packagePath);
|
||||
const includeDist = options.dist ?? true;
|
||||
mkdirSync(join(packageRoot, "bin"), { recursive: true });
|
||||
copyFileSync(join(repoRoot, "bin", "qmd"), join(packageRoot, "bin", "qmd"));
|
||||
chmodSync(join(packageRoot, "bin", "qmd"), 0o755);
|
||||
if (includeDist) {
|
||||
mkdirSync(join(packageRoot, "dist", "cli"), { recursive: true });
|
||||
writeFileSync(join(packageRoot, "dist", "cli", "qmd.js"), "// fixture\n");
|
||||
}
|
||||
if (options.source) {
|
||||
mkdirSync(join(packageRoot, "src", "cli"), { recursive: true });
|
||||
writeFileSync(join(packageRoot, "src", "cli", "qmd.ts"), "// source fixture\n");
|
||||
}
|
||||
if (options.tsx) {
|
||||
mkdirSync(join(packageRoot, "node_modules", "tsx", "dist"), { recursive: true });
|
||||
writeFileSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs"), "// tsx fixture\n");
|
||||
}
|
||||
if (options.git) {
|
||||
mkdirSync(join(packageRoot, ".git"), { recursive: true });
|
||||
}
|
||||
for (const lockfile of lockfiles) {
|
||||
writeFileSync(join(packageRoot, lockfile), "");
|
||||
}
|
||||
return packageRoot;
|
||||
}
|
||||
|
||||
function symlinkRelative(target: string, linkPath: string) {
|
||||
mkdirSync(dirname(linkPath), { recursive: true });
|
||||
symlinkSync(relative(dirname(linkPath), target), linkPath);
|
||||
}
|
||||
|
||||
function runWrapper(commandPath: string, runtimeBin: string, capturePath: string, env: Record<string, string> = {}) {
|
||||
rmSync(capturePath, { force: true });
|
||||
execFileSync(commandPath, ["--version"], {
|
||||
env: {
|
||||
...process.env,
|
||||
...env,
|
||||
PATH: `${runtimeBin}:${process.env.PATH ?? ""}`,
|
||||
QMD_WRAPPER_CAPTURE: capturePath,
|
||||
},
|
||||
stdio: ["ignore", "pipe", "pipe"],
|
||||
});
|
||||
const [runtime, scriptPath, ...args] = readFileSync(capturePath, "utf8").trimEnd().split("\n");
|
||||
return { runtime, scriptPath, args };
|
||||
}
|
||||
|
||||
afterEach(() => {
|
||||
for (const fixture of fixtures.splice(0)) {
|
||||
rmSync(fixture, { recursive: true, force: true });
|
||||
}
|
||||
});
|
||||
|
||||
describe("bin/qmd package wrapper", () => {
|
||||
test("direct package invocation resolves dist/cli/qmd.js from the package root", () => {
|
||||
const { root, runtimeBin, capturePath } = makeTempFixture();
|
||||
const packageRoot = makePackage(root, "node_modules/@tobilu/qmd");
|
||||
|
||||
const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);
|
||||
|
||||
expect(result.runtime).toBe("node");
|
||||
expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
|
||||
expect(result.args).toEqual(["--version"]);
|
||||
});
|
||||
|
||||
  test("npm/Homebrew global bin symlink resolves scoped package path", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
    const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);

    const result = runWrapper(globalBin, runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("multi-hop global bin symlink chain resolves to the real package root", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
    const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
    const shim = join(root, "opt", "homebrew", "Cellar", "qmd", "current", "bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), shim);
    symlinkRelative(shim, globalBin);

    const result = runWrapper(globalBin, runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("linuxbrew global bin symlink resolves lib/node_modules scoped package path", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "home/linuxbrew/.linuxbrew/lib/node_modules/@tobilu/qmd");
    const globalBin = join(root, "home", "linuxbrew", ".linuxbrew", "bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);

    const result = runWrapper(globalBin, runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("npx scoped package .bin symlink resolves @tobilu/qmd package path", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "npm/_npx/abc123/node_modules/@tobilu/qmd");
    const npxBin = join(root, "npm", "_npx", "abc123", "node_modules", ".bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), npxBin);

    const result = runWrapper(npxBin, runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("bun global symlink uses bun when package-local bun lockfile exists", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "home/user/.bun/install/global/node_modules/@tobilu/qmd", ["bun.lock"]);
    const bunBin = join(root, "home", "user", ".bun", "bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), bunBin);

    const result = runWrapper(bunBin, runtimeBin, capturePath);

    expect(result.runtime).toBe("bun");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("ambient BUN_INSTALL alone does not select bun for an npm-installed package", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "opt/homebrew/lib/node_modules/@tobilu/qmd");
    const globalBin = join(root, "opt", "homebrew", "bin", "qmd");
    symlinkRelative(join(packageRoot, "bin", "qmd"), globalBin);

    const result = runWrapper(globalBin, runtimeBin, capturePath, { BUN_INSTALL: join(root, ".bun") });

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("package-lock.json takes priority over bun lockfiles", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "node_modules/@tobilu/qmd", ["package-lock.json", "bun.lock"]);

    const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("packaged tree uses dist even if source files are present", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "node_modules/@tobilu/qmd", ["bun.lock"], { source: true });

    const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);

    expect(result.runtime).toBe("bun");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "dist", "cli", "qmd.js")));
  });

  test("prefers source with bun in a Bun checkout even when dist exists", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "qmd", ["bun.lock"], { source: true, git: true });

    const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);

    expect(result.runtime).toBe("bun");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "src", "cli", "qmd.ts")));
    expect(result.args).toEqual(["--version"]);
  });

  test("prefers source through tsx in a Node checkout even when dist exists", () => {
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "qmd", [], { source: true, tsx: true, git: true });

    const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs")));
    expect(result.args).toEqual([realpathSync(join(packageRoot, "src", "cli", "qmd.ts")), "--version"]);
  });

  test("source checkout with both bun.lock and package-lock.json prefers node+tsx", () => {
    // Mirrors the dist-mode "npm priority" rule: a working tree that has both
    // lockfiles (because the user ran `npm install` against a repo that also
    // ships bun.lock) installed native modules for Node's ABI, so source mode
    // must route through tsx to avoid better-sqlite3 / sqlite-vec mismatches.
    const { root, runtimeBin, capturePath } = makeTempFixture();
    const packageRoot = makePackage(root, "qmd", ["bun.lock", "package-lock.json"], { source: true, tsx: true, git: true });

    const result = runWrapper(join(packageRoot, "bin", "qmd"), runtimeBin, capturePath);

    expect(result.runtime).toBe("node");
    expect(result.scriptPath).toBe(realpathSync(join(packageRoot, "node_modules", "tsx", "dist", "cli.mjs")));
    expect(result.args).toEqual([realpathSync(join(packageRoot, "src", "cli", "qmd.ts")), "--version"]);
  });

  test("explains how to build when dist is missing and source cannot run", () => {
    const { root, runtimeBin } = makeTempFixture();
    const packageRoot = makePackage(root, "qmd", [], { dist: false });

    const result = spawnSync(join(packageRoot, "bin", "qmd"), ["--version"], {
      env: {
        ...process.env,
        PATH: `${runtimeBin}:${process.env.PATH ?? ""}`,
      },
      encoding: "utf8",
      stdio: ["ignore", "pipe", "pipe"],
    });

    expect(result.status).toBe(1);
    expect(result.stderr).toContain("qmd is not built");
    expect(result.stderr).toContain("bun install && bun run build");
    expect(result.stderr).toContain("npm install && npm run build");
    expect(result.stderr).toContain("qmd doctor");
  });
});
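Review note: the launcher tests above pin down a runtime-selection rule that may be easier to see in one place. The sketch below is a hypothetical TypeScript restatement, not the launcher itself (the real logic lives in `bin/qmd`). The name `pickRuntimeSketch` and the inclusion of `bun.lockb` are assumptions made for illustration.

```typescript
type Runtime = "node" | "bun";

// Hypothetical helper, not part of qmd. It picks a runtime from the lockfile
// names found in the resolved package root. The env argument is accepted only
// to show that ambient variables such as BUN_INSTALL are ignored.
function pickRuntimeSketch(
  lockfiles: readonly string[],
  _env: Record<string, string | undefined> = {},
): Runtime {
  const has = (name: string): boolean => lockfiles.includes(name);
  // An npm lockfile wins: native modules were installed for Node's ABI.
  if (has("package-lock.json")) return "node";
  // A package-local Bun lockfile selects bun (bun.lockb is an assumption here).
  if (has("bun.lock") || has("bun.lockb")) return "bun";
  // No lockfile evidence: default to node.
  return "node";
}
```

Under these assumptions, a tree with both lockfiles resolves to `node`, which matches the "npm priority" tests.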
128
test/cli-exit-lifecycle.test.ts
Normal file
@@ -0,0 +1,128 @@
import { describe, expect, test } from "vitest";
import { finishSuccessfulCliCommand } from "../src/cli/qmd.ts";
import { LlamaCpp, isDarwinMetalMitigationActive } from "../src/llm.ts";

describe("CLI successful-exit lifecycle", () => {
  test("exits 0 after successful output when post-output LLM cleanup fails", async () => {
    const exitCodes: number[] = [];
    const stderr: string[] = [];
    const flushed: string[] = [];

    await finishSuccessfulCliCommand({
      command: "query",
      format: "json",
      cleanup: async () => {
        throw new Error("ggml_metal_device_free abort simulation");
      },
      exit: (code) => {
        exitCodes.push(code);
      },
      stdout: { write: (chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { flushed.push(String(chunk)); cb?.(); return true; } },
      stderr: { write: (chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { stderr.push(String(chunk)); cb?.(); return true; } },
    });

    expect(exitCodes).toEqual([0]);
    expect(stderr.join("")).toContain("QMD Warning: cleanup after successful output failed");
    expect(flushed).toEqual([""]);
  });

  test("flushes stdout, runs cleanup, flushes stderr, then exits (when exit is provided)", async () => {
    // The legacy lifecycle order is preserved for callers that pass an
    // explicit `exit` function — primarily this test, which needs an
    // observable terminating step.
    const calls: string[] = [];

    await finishSuccessfulCliCommand({
      command: "query",
      format: "json",
      cleanup: async () => { calls.push("cleanup"); },
      exit: (code) => { calls.push(`exit:${code}`); },
      stdout: { write: (_chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stdout-flush"); cb?.(); return true; } },
      stderr: { write: (_chunk: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stderr-flush"); cb?.(); return true; } },
    });

    expect(calls).toEqual(["stdout-flush", "cleanup", "stderr-flush", "exit:0"]);
  });

  test("production path: sets process.exitCode=0 and returns instead of calling process.exit", async () => {
    // The real CLI does NOT pass `exit` — finishSuccessfulCliCommand should set
    // process.exitCode and return, letting Node's `beforeExit` fire so
    // node-llama-cpp's auto-dispose runs BEFORE libc's static destructor.
    // process.exit() skips `beforeExit`, which is what trips the libggml-metal
    // assertion (ggml-org/llama.cpp#22593) even with explicit dispose.
    const prevCode = process.exitCode;
    process.exitCode = 1; // poison the state to verify we set it
    try {
      const calls: string[] = [];
      await finishSuccessfulCliCommand({
        command: "query",
        format: "json",
        cleanup: async () => { calls.push("cleanup"); },
        stdout: { write: (_c: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stdout-flush"); cb?.(); return true; } },
        stderr: { write: (_c: string | Uint8Array, cb?: (error?: Error | null) => void) => { calls.push("stderr-flush"); cb?.(); return true; } },
      });

      expect(calls).toEqual(["stdout-flush", "cleanup", "stderr-flush"]);
      expect(process.exitCode).toBe(0);
    } finally {
      process.exitCode = prevCode;
    }
  });

  test("darwin Metal mitigation reflects launcher-exported env on darwin", () => {
    // The real mitigation lives in bin/qmd, which sets GGML_METAL_NO_RESIDENCY=1
    // before Node loads the llama.cpp native binding. The JS-side predicate
    // just reports whether that env was set (and not overridden by
    // QMD_METAL_KEEP_RESIDENCY). On non-darwin the function returns false.
    const expected =
      process.platform === "darwin" &&
      process.env.QMD_METAL_KEEP_RESIDENCY !== "1" &&
      process.env.GGML_METAL_NO_RESIDENCY === "1";
    expect(isDarwinMetalMitigationActive()).toBe(expected);
  });

  test("QMD_METAL_KEEP_RESIDENCY=1 disables the mitigation even when GGML_METAL_NO_RESIDENCY is set", () => {
    const prevKeep = process.env.QMD_METAL_KEEP_RESIDENCY;
    const prevNoRes = process.env.GGML_METAL_NO_RESIDENCY;
    try {
      process.env.QMD_METAL_KEEP_RESIDENCY = "1";
      process.env.GGML_METAL_NO_RESIDENCY = "1";
      expect(isDarwinMetalMitigationActive()).toBe(false);
    } finally {
      if (prevKeep === undefined) delete process.env.QMD_METAL_KEEP_RESIDENCY;
      else process.env.QMD_METAL_KEEP_RESIDENCY = prevKeep;
      if (prevNoRes === undefined) delete process.env.GGML_METAL_NO_RESIDENCY;
      else process.env.GGML_METAL_NO_RESIDENCY = prevNoRes;
    }
  });

  test("disposes Llama resources in dependency order before CLI exit", async () => {
    const calls: string[] = [];
    const llm = new LlamaCpp({ inactivityTimeoutMs: 0 });
    const disposable = (name: string) => ({
      dispose: async () => {
        calls.push(name);
      },
    });

    Object.assign(llm as unknown as Record<string, unknown>, {
      embedContexts: [disposable("embed-context")],
      rerankContexts: [disposable("rerank-context")],
      embedModel: disposable("embed-model"),
      generateModel: disposable("generate-model"),
      rerankModel: disposable("rerank-model"),
      llama: disposable("llama"),
    });

    await llm.dispose();

    expect(calls).toEqual([
      "embed-context",
      "rerank-context",
      "embed-model",
      "generate-model",
      "rerank-model",
      "llama",
    ]);
  });
});
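Review note: the lifecycle these tests describe (flush stdout, run cleanup, flush stderr, then either call `exit` or set `process.exitCode`) can be sketched roughly as follows. This is an illustrative reconstruction from the assertions only. `finishSketch`, `Sink`, and `flush` are hypothetical names, and the real `finishSuccessfulCliCommand` in `src/cli/qmd.ts` may differ in detail.

```typescript
interface Sink {
  write(chunk: string, cb?: (error?: Error | null) => void): boolean;
}

interface FinishOptions {
  cleanup: () => Promise<void>;
  stdout: Sink;
  stderr: Sink;
  exit?: (code: number) => void;
}

// An empty write whose callback fires once earlier writes have been handed off.
function flush(sink: Sink): Promise<void> {
  return new Promise((resolve) => {
    sink.write("", () => resolve());
  });
}

async function finishSketch(opts: FinishOptions): Promise<void> {
  await flush(opts.stdout);
  try {
    await opts.cleanup();
  } catch (error) {
    // A cleanup failure after successful output is downgraded to a warning.
    const message = error instanceof Error ? error.message : String(error);
    opts.stderr.write(`QMD Warning: cleanup after successful output failed: ${message}\n`);
  }
  await flush(opts.stderr);
  if (opts.exit) {
    opts.exit(0);
  } else {
    // No explicit exit: let the event loop drain so `beforeExit` handlers run.
    process.exitCode = 0;
  }
}
```

The design point the tests appear to protect is the final branch: setting `process.exitCode` instead of calling `process.exit()` leaves room for `beforeExit` handlers to run.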
20
test/cli-lazy-llm-import.test.ts
Normal file
@@ -0,0 +1,20 @@
import { describe, expect, test } from "vitest";
import { readFileSync } from "fs";
import { join } from "path";

describe("LLM module loading", () => {
  test("node-llama-cpp is only dynamically imported by LLM operations", () => {
    const source = readFileSync(join(process.cwd(), "src", "llm.ts"), "utf-8");

    expect(source).not.toMatch(/import\s+(?!type\b)[\s\S]*?from\s+["']node-llama-cpp["']/);
    expect(source).toContain('import("node-llama-cpp")');
  });

  test("importing the CLI for lightweight commands succeeds", async () => {
    const mod = await import("../src/cli/qmd.ts");
    expect(mod).toMatchObject({
      buildEditorUri: expect.any(Function),
      termLink: expect.any(Function),
    });
  });
});
851
test/cli.test.ts
File diff suppressed because it is too large
@@ -6,15 +6,19 @@
 */

import { describe, test, expect, beforeEach, afterEach } from "vitest";
import { mkdtemp, rm, writeFile } from "fs/promises";
import { tmpdir } from "os";
import { join } from "path";
import { homedir } from "os";
import { getConfigPath, setConfigIndexName } from "../src/collections.js";
import { qmdHomedir } from "../src/paths.js";
import { getConfigPath, loadConfig, setConfigIndexName } from "../src/collections.js";

// Save/restore env vars around each test
let savedEnv: Record<string, string | undefined>;

beforeEach(() => {
  savedEnv = {
    HOME: process.env.HOME,
    USERPROFILE: process.env.USERPROFILE,
    QMD_CONFIG_DIR: process.env.QMD_CONFIG_DIR,
    XDG_CONFIG_HOME: process.env.XDG_CONFIG_HOME,
  };
@@ -38,7 +42,16 @@ describe("getConfigDir via getConfigPath", () => {
  test("defaults to ~/.config/qmd when no env vars are set", () => {
    delete process.env.QMD_CONFIG_DIR;
    delete process.env.XDG_CONFIG_HOME;
    expect(getConfigPath()).toBe(join(homedir(), ".config", "qmd", "index.yml"));
    expect(getConfigPath()).toBe(join(qmdHomedir(), ".config", "qmd", "index.yml"));
  });

  test("uses the same USERPROFILE fallback as default DB path when HOME is unset", () => {
    delete process.env.HOME;
    delete process.env.QMD_CONFIG_DIR;
    delete process.env.XDG_CONFIG_HOME;
    process.env.USERPROFILE = "/Users/windows-user";

    expect(getConfigPath()).toBe(join("/Users/windows-user", ".config", "qmd", "index.yml"));
  });

  test("QMD_CONFIG_DIR takes highest priority", () => {
@@ -71,4 +84,15 @@ describe("getConfigDir via getConfigPath", () => {
    setConfigIndexName("myindex");
    expect(getConfigPath()).toBe(join("/xdg/config", "qmd", "myindex.yml"));
  });

  test("loadConfig treats an empty YAML file as an empty config", async () => {
    const dir = await mkdtemp(join(tmpdir(), "qmd-empty-config-"));
    try {
      process.env.QMD_CONFIG_DIR = dir;
      await writeFile(join(dir, "index.yml"), "");
      expect(loadConfig()).toEqual({ collections: {} });
    } finally {
      await rm(dir, { recursive: true, force: true });
    }
  });
});

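Review note: taken together, the config-path tests in this file suggest a precedence of `QMD_CONFIG_DIR`, then `XDG_CONFIG_HOME`, then a home-directory default with a `USERPROFILE` fallback. A minimal sketch under that reading follows. `configPathSketch` is a hypothetical name, and the empty-string fallback for a missing home directory is an assumption; the real `getConfigPath()` and `qmdHomedir()` may resolve the home directory differently.

```typescript
import { join } from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical restatement of the precedence the tests assert.
function configPathSketch(env: Env, indexName = "index"): string {
  const file = `${indexName}.yml`;
  // 1. An explicit config directory is used as-is.
  if (env.QMD_CONFIG_DIR) return join(env.QMD_CONFIG_DIR, file);
  // 2. XDG config home gets a "qmd" subdirectory.
  if (env.XDG_CONFIG_HOME) return join(env.XDG_CONFIG_HOME, "qmd", file);
  // 3. Otherwise fall back to ~/.config/qmd, using USERPROFILE when HOME is unset.
  const home = env.HOME ?? env.USERPROFILE ?? "";
  return join(home, ".config", "qmd", file);
}
```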
27
test/esm-ambiguous-module.test.ts
Normal file
@@ -0,0 +1,27 @@
import { describe, expect, test } from "vitest";
import { execFileSync } from "child_process";
import { mkdtempSync } from "fs";
import { tmpdir } from "os";
import { dirname, join, resolve } from "path";
import { fileURLToPath } from "url";

const repoRoot = resolve(dirname(fileURLToPath(import.meta.url)), "..");

describe("Node ESM entrypoints", () => {
  test("CLI --index path normalizes via setIndexName/setConfigIndexName under Node 22+", () => {
    execFileSync(process.execPath, ["scripts/build.mjs"], {
      cwd: repoRoot,
      encoding: "utf-8",
      stdio: "pipe",
    });

    const indexPath = join(mkdtempSync(join(tmpdir(), "qmd-index-")), "nested", "idx");
    const output = execFileSync(process.execPath, ["dist/cli/qmd.js", "--index", indexPath, "--version"], {
      cwd: repoRoot,
      encoding: "utf-8",
      stdio: "pipe",
    });

    expect(output).toContain("qmd ");
  }, 120_000);
});
285
test/llm.test.ts
@@ -12,6 +12,14 @@ import {
  getDefaultLlamaCpp,
  disposeDefaultLlamaCpp,
  resolveLlamaGpuMode,
  setNodeLlamaCppModuleForTest,
  withNativeStdoutRedirectedToStderr,
  resolveParallelismOverride,
  resolveSafeParallelism,
  resolveEmbedModel,
  resolveGenerateModel,
  resolveRerankModel,
  resolveModels,
  withLLMSession,
  canUnloadLLM,
  SessionReleasedError,
@@ -19,6 +27,63 @@ import {
  type ILLMSession,
} from "../src/llm.js";

describe("model name resolution", () => {
  function withModelEnv(env: Record<string, string | undefined>, fn: () => void): void {
    const previous = {
      QMD_EMBED_MODEL: process.env.QMD_EMBED_MODEL,
      QMD_GENERATE_MODEL: process.env.QMD_GENERATE_MODEL,
      QMD_RERANK_MODEL: process.env.QMD_RERANK_MODEL,
    };
    try {
      for (const [key, value] of Object.entries(env)) {
        if (value === undefined) delete process.env[key];
        else process.env[key] = value;
      }
      fn();
    } finally {
      for (const [key, value] of Object.entries(previous)) {
        if (value === undefined) delete process.env[key];
        else process.env[key] = value;
      }
    }
  }

  test("all model roles resolve config hints before env fallbacks", () => {
    withModelEnv({
      QMD_EMBED_MODEL: "env-embed",
      QMD_GENERATE_MODEL: "env-generate",
      QMD_RERANK_MODEL: "env-rerank",
    }, () => {
      const config = {
        embed: "config-embed",
        generate: "config-generate",
        rerank: "config-rerank",
      };
      expect(resolveEmbedModel(config)).toBe("config-embed");
      expect(resolveGenerateModel(config)).toBe("config-generate");
      expect(resolveRerankModel(config)).toBe("config-rerank");
      expect(resolveModels(config)).toEqual(config);
    });
  });

  test("LlamaCpp constructor uses the same resolver as status/embed/query helpers", () => {
    withModelEnv({
      QMD_EMBED_MODEL: "env-embed",
      QMD_GENERATE_MODEL: "env-generate",
      QMD_RERANK_MODEL: "env-rerank",
    }, () => {
      const llm = new LlamaCpp({
        embedModel: "config-embed",
        generateModel: "config-generate",
        rerankModel: "config-rerank",
      });
      expect(llm.embedModelName).toBe(resolveEmbedModel({ embed: "config-embed" }));
      expect(llm.generateModelName).toBe(resolveGenerateModel({ generate: "config-generate" }));
      expect(llm.rerankModelName).toBe(resolveRerankModel({ rerank: "config-rerank" }));
    });
  });
});

// =============================================================================
// Singleton Tests (no model loading required)
// =============================================================================
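Review note: the "config hints before env fallbacks" rule asserted in the model-resolution tests can be read as a three-step lookup. The sketch below is illustrative only. `resolveModelSketch` and the `PLACEHOLDER_DEFAULTS` values are hypothetical; the real defaults in `src/llm.ts` are not shown in this diff.

```typescript
type Role = "embed" | "generate" | "rerank";
type ModelHints = Partial<Record<Role, string>>;
type Env = Record<string, string | undefined>;

// Environment variable names as used in the tests.
const ENV_VARS: Record<Role, string> = {
  embed: "QMD_EMBED_MODEL",
  generate: "QMD_GENERATE_MODEL",
  rerank: "QMD_RERANK_MODEL",
};

// Placeholder defaults (assumption): stand-ins for the real built-in models.
const PLACEHOLDER_DEFAULTS: Record<Role, string> = {
  embed: "default-embed",
  generate: "default-generate",
  rerank: "default-rerank",
};

// Config hint first, then the environment, then the built-in default.
function resolveModelSketch(role: Role, hints: ModelHints, env: Env): string {
  return hints[role] ?? env[ENV_VARS[role]] ?? PLACEHOLDER_DEFAULTS[role];
}
```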
@@ -75,6 +140,29 @@ describe("QMD_LLAMA_GPU resolution", () => {
    expect(resolveLlamaGpuMode(" cuda ")).toBe("cuda");
  });

  test("QMD_FORCE_CPU disables GPU before QMD_LLAMA_GPU auto-detection", () => {
    const prevForceCpu = process.env.QMD_FORCE_CPU;
    process.env.QMD_FORCE_CPU = "1";
    try {
      expect(resolveLlamaGpuMode(undefined)).toBe(false);
      expect(resolveLlamaGpuMode("cuda")).toBe(false);
    } finally {
      if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
      else process.env.QMD_FORCE_CPU = prevForceCpu;
    }
  });

  test("QMD_FORCE_CPU ignores false-ish values", () => {
    const prevForceCpu = process.env.QMD_FORCE_CPU;
    process.env.QMD_FORCE_CPU = "0";
    try {
      expect(resolveLlamaGpuMode(undefined)).toBe("auto");
    } finally {
      if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
      else process.env.QMD_FORCE_CPU = prevForceCpu;
    }
  });

  test("warns and falls back to auto for unsupported values", () => {
    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    try {
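Review note: read together, the `QMD_LLAMA_GPU` and `QMD_FORCE_CPU` tests imply a resolution order in which the CPU override is checked first. A rough sketch follows. `gpuModeSketch` is a hypothetical name, and three details are assumptions not confirmed by this diff: the accepted backend list, treating only `"1"` as a truthy `QMD_FORCE_CPU`, and mapping `"false"` to `false`.

```typescript
type GpuMode = "auto" | "metal" | "cuda" | "vulkan" | false;
type Env = Record<string, string | undefined>;
type Warn = (message: string) => void;

// Assumed set of explicit backends; the real list may differ.
const BACKENDS = ["metal", "cuda", "vulkan"] as const;

const warnToStderr: Warn = (message) => {
  process.stderr.write(`${message}\n`);
};

function gpuModeSketch(raw: string | undefined, env: Env = {}, warn: Warn = warnToStderr): GpuMode {
  // The CPU override is consulted before anything else.
  if (env.QMD_FORCE_CPU === "1") return false;
  const value = raw?.trim().toLowerCase();
  if (value === undefined || value === "" || value === "auto") return "auto";
  if (value === "false") return false;
  const backend = BACKENDS.find((candidate) => candidate === value);
  if (backend) return backend;
  // Unsupported values produce a warning and fall back to auto-detection.
  warn(`unsupported QMD_LLAMA_GPU value "${raw}", falling back to auto`);
  return "auto";
}
```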
@@ -87,6 +175,201 @@ describe("QMD_LLAMA_GPU resolution", () => {
  });
});

describe("native llama stdout containment", () => {
  test("redirects native stdout noise to stderr while JSON callers are initializing llama", async () => {
    const stdoutSpy = vi.spyOn(process.stdout, "write").mockReturnValue(true);
    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    try {
      await withNativeStdoutRedirectedToStderr(async () => {
        process.stdout.write("cmake build spam\n");
        return "ok";
      });

      expect(stdoutSpy).not.toHaveBeenCalled();
      expect(stderrSpy).toHaveBeenCalledWith("cmake build spam\n", undefined, undefined);
    } finally {
      stdoutSpy.mockRestore();
      stderrSpy.mockRestore();
    }
  });

  test("keeps native GPU failure noise off stdout and caches failed GPU init", async () => {
    const prevGpu = process.env.QMD_LLAMA_GPU;
    const prevForceCpu = process.env.QMD_FORCE_CPU;
    process.env.QMD_LLAMA_GPU = "cuda";
    delete process.env.QMD_FORCE_CPU;

    const calls: unknown[] = [];
    const fakeLlama = { gpu: false, cpuMathCores: 4 };
    setNodeLlamaCppModuleForTest({
      LlamaLogLevel: { error: "error" },
      resolveModelFile: vi.fn(),
      LlamaChatSession: vi.fn() as any,
      getLlama: vi.fn(async (options: Record<string, unknown>) => {
        calls.push(options.gpu);
        if (options.gpu === "cuda") {
          process.stdout.write("cmake build spam\n");
          throw new Error("CUDA unavailable");
        }
        return fakeLlama as any;
      }),
    });

    const stdoutSpy = vi.spyOn(process.stdout, "write").mockReturnValue(true);
    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    try {
      const first = new LlamaCpp();
      const second = new LlamaCpp();

      await (first as any).ensureLlama();
      await (second as any).ensureLlama();

      expect(stdoutSpy).not.toHaveBeenCalled();
      expect(stderrSpy).toHaveBeenCalledWith("cmake build spam\n", undefined, undefined);
      expect(calls).toEqual(["cuda", false, false]);
      expect(String(stderrSpy.mock.calls.map(call => call[0]).join(""))).toContain("skipping previously failed GPU init");
    } finally {
      stdoutSpy.mockRestore();
      stderrSpy.mockRestore();
      setNodeLlamaCppModuleForTest(null);
      if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
      else process.env.QMD_LLAMA_GPU = prevGpu;
      if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
      else process.env.QMD_FORCE_CPU = prevForceCpu;
    }
  });

  test("warns about CPU fallback only once per process", async () => {
    const prevGpu = process.env.QMD_LLAMA_GPU;
    const prevForceCpu = process.env.QMD_FORCE_CPU;
    process.env.QMD_LLAMA_GPU = "false";
    delete process.env.QMD_FORCE_CPU;

    setNodeLlamaCppModuleForTest({
      LlamaLogLevel: { error: "error" },
      resolveModelFile: vi.fn(),
      LlamaChatSession: vi.fn() as any,
      getLlama: vi.fn(async () => ({ gpu: false, cpuMathCores: 4 }) as any),
    });

    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    try {
      const first = new LlamaCpp();
      const second = new LlamaCpp();

      await (first as any).ensureLlama();
      await (second as any).ensureLlama();

      const stderr = String(stderrSpy.mock.calls.map(call => call[0]).join(""));
      expect(stderr.match(/no GPU acceleration/g)?.length).toBe(1);
      expect(stderr).toContain("qmd doctor");
      expect(stderr).not.toContain("QMD_STATUS_DEVICE_PROBE");
    } finally {
      stderrSpy.mockRestore();
      setNodeLlamaCppModuleForTest(null);
      if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
      else process.env.QMD_LLAMA_GPU = prevGpu;
      if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
      else process.env.QMD_FORCE_CPU = prevForceCpu;
    }
  });

  test("embeds hello world with QMD_FORCE_CPU=1 without throwing", async () => {
    const prevGpu = process.env.QMD_LLAMA_GPU;
    const prevForceCpu = process.env.QMD_FORCE_CPU;
    process.env.QMD_FORCE_CPU = "1";
    process.env.QMD_LLAMA_GPU = "metal";

    const getEmbeddingFor = vi.fn(async (text: string) => ({
      vector: new Float32Array([0.1, 0.2, 0.3]),
      text,
    }));
    const createEmbeddingContext = vi.fn(async () => ({
      getEmbeddingFor,
      dispose: vi.fn(async () => {}),
    }));
    const loadModel = vi.fn(async () => ({
      trainContextSize: 2048,
      tokenize: (text: string) => Array.from(text),
      detokenize: (tokens: string[]) => tokens.join(""),
      createEmbeddingContext,
      dispose: vi.fn(async () => {}),
    }));
    const getLlama = vi.fn(async (options: Record<string, unknown>) => ({
      gpu: false,
      cpuMathCores: 4,
      loadModel,
      dispose: vi.fn(async () => {}),
    }) as any);

    setNodeLlamaCppModuleForTest({
      LlamaLogLevel: { error: "error" },
      resolveModelFile: vi.fn(async () => "/tmp/nonexistent-model.gguf"),
      LlamaChatSession: vi.fn() as any,
      getLlama,
    });

    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    const llm = new LlamaCpp();
    try {
      const result = await llm.embed("hello world");
      expect(result).toEqual({
        embedding: [0.10000000149011612, 0.20000000298023224, 0.30000001192092896],
        model: llm.embedModelName,
      });
      expect(getLlama).toHaveBeenCalledWith(expect.objectContaining({ gpu: false, build: "never" }));
      expect(loadModel).toHaveBeenCalledWith(expect.objectContaining({ gpuLayers: 0 }));
      expect(getEmbeddingFor).toHaveBeenCalledWith("hello world");
    } finally {
      await llm.dispose();
      stderrSpy.mockRestore();
      setNodeLlamaCppModuleForTest(null);
      if (prevGpu === undefined) delete process.env.QMD_LLAMA_GPU;
      else process.env.QMD_LLAMA_GPU = prevGpu;
      if (prevForceCpu === undefined) delete process.env.QMD_FORCE_CPU;
      else process.env.QMD_FORCE_CPU = prevForceCpu;
    }
  });
});

describe("LLM context parallelism safety", () => {
  test("defaults Windows CUDA to one context to avoid ggml-cuda.cu:98 crashes", () => {
    expect(resolveSafeParallelism({
      gpu: "cuda",
      platform: "win32",
      computed: 8,
      envValue: undefined,
    })).toBe(1);
  });

  test("keeps non-Windows and non-CUDA backends on computed parallelism", () => {
    expect(resolveSafeParallelism({ gpu: "cuda", platform: "linux", computed: 8 })).toBe(8);
    expect(resolveSafeParallelism({ gpu: "vulkan", platform: "win32", computed: 8 })).toBe(8);
    expect(resolveSafeParallelism({ gpu: false, platform: "win32", computed: 4 })).toBe(4);
  });

  test("QMD_EMBED_PARALLELISM overrides the Windows CUDA safety default", () => {
    expect(resolveSafeParallelism({
      gpu: "cuda",
      platform: "win32",
      computed: 8,
      envValue: "2",
    })).toBe(2);
  });

  test("QMD_EMBED_PARALLELISM clamps invalid values and warns", () => {
    const stderrSpy = vi.spyOn(process.stderr, "write").mockReturnValue(true);
    try {
      expect(resolveParallelismOverride("0")).toBeUndefined();
      expect(resolveParallelismOverride("bad")).toBeUndefined();
      expect(stderrSpy).toHaveBeenCalledTimes(2);
      expect(String(stderrSpy.mock.calls[0]?.[0] || "")).toContain("QMD_EMBED_PARALLELISM");
    } finally {
      stderrSpy.mockRestore();
    }
  });
});

describe("LlamaCpp expand context size config", () => {
  const defaultExpandContextSize = 2048;

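Review note: the parallelism tests in this hunk describe a small decision table: a valid `QMD_EMBED_PARALLELISM` override wins, Windows with CUDA otherwise drops to one context, and everything else keeps the computed value. A sketch of that table, reconstructed from the assertions only, is below. `overrideSketch` and `safeParallelismSketch` are hypothetical names, and the warning text is an assumption.

```typescript
interface ParallelismInput {
  gpu: string | false;
  platform: string;
  computed: number;
  envValue?: string | undefined;
}

type Warn = (message: string) => void;

const toStderr: Warn = (message) => {
  process.stderr.write(`${message}\n`);
};

// Returns a positive integer override, or undefined (with a warning) if invalid.
function overrideSketch(raw: string | undefined, warn: Warn = toStderr): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 1) {
    warn(`QMD_EMBED_PARALLELISM=${raw} is not a positive integer; ignoring it`);
    return undefined;
  }
  return parsed;
}

function safeParallelismSketch(input: ParallelismInput, warn: Warn = toStderr): number {
  const override = overrideSketch(input.envValue, warn);
  if (override !== undefined) return override;
  // Safety default: a single context on Windows with CUDA.
  if (input.platform === "win32" && input.gpu === "cuda") return 1;
  return input.computed;
}
```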
@@ -820,7 +1103,7 @@ describe.skipIf(!!process.env.CI)("LlamaCpp Integration", () => {
      for (const doc of result.results) {
        console.log(` ${doc.file}: ${doc.score.toFixed(4)}`);
      }
    });
    }, 30000);
  });

  describe("expandQuery", () => {

98
test/local-config.test.ts
Normal file
@@ -0,0 +1,98 @@
import { existsSync, mkdtempSync, mkdirSync, writeFileSync, rmSync, realpathSync } from "node:fs";
import { execFileSync } from "node:child_process";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { afterEach, describe, expect, test } from "vitest";
import { findLocalConfigPath, getLocalDbPath } from "../src/collections.js";

function cliCommandArgs(command: string): { bin: string; args: string[] } {
  const cliPath = join(process.cwd(), "src/cli/qmd.ts");
  if (process.versions.bun) {
    return { bin: process.execPath, args: [cliPath, command] };
  }
  return {
    bin: process.execPath,
    args: [join(process.cwd(), "node_modules/tsx/dist/cli.mjs"), cliPath, command],
  };
}

const roots: string[] = [];

function tempProject(): string {
  const root = mkdtempSync(join(tmpdir(), "qmd-local-config-"));
  roots.push(root);
  return root;
}

afterEach(() => {
  for (const root of roots.splice(0)) {
    rmSync(root, { recursive: true, force: true });
  }
});

describe("local .qmd project config", () => {
  test("finds .qmd/index.yaml from nested working directories", () => {
    const root = tempProject();
    const configPath = join(root, ".qmd", "index.yaml");
    mkdirSync(join(root, ".qmd"), { recursive: true });
    writeFileSync(configPath, "collections: {}\n");
    const nested = join(root, "wiki", "Shopify");
    mkdirSync(nested, { recursive: true });

    expect(findLocalConfigPath(nested)).toBe(configPath);
  });

  test("prefers index.yaml over index.yml when both exist", () => {
    const root = tempProject();
    mkdirSync(join(root, ".qmd"), { recursive: true });
    const yaml = join(root, ".qmd", "index.yaml");
    const yml = join(root, ".qmd", "index.yml");
    writeFileSync(yaml, "collections: {}\n");
    writeFileSync(yml, "collections: {}\n");

    expect(findLocalConfigPath(root)).toBe(yaml);
  });

  test("uses .qmd/index.sqlite next to the local config", () => {
    const root = tempProject();
    mkdirSync(join(root, ".qmd"), { recursive: true });
    const configPath = join(root, ".qmd", "index.yaml");
    writeFileSync(configPath, "collections: {}\n");

    expect(getLocalDbPath(configPath)).toBe(join(root, ".qmd", "index.sqlite"));
  });

test("CLI uses local .qmd config and index instead of global cache", () => {
|
||||
const root = tempProject();
|
||||
mkdirSync(join(root, ".qmd"), { recursive: true });
|
||||
mkdirSync(join(root, "docs"), { recursive: true });
|
||||
writeFileSync(join(root, "docs", "a.md"), "# A\n\nLocal test document.\n");
|
||||
writeFileSync(join(root, ".qmd", "index.yaml"), `collections:\n docs:\n path: ${JSON.stringify(join(root, "docs"))}\n pattern: "**/*.md"\n context:\n /: Local test docs\nmodels:\n embed: local-embed-model\n rerank: local-rerank-model\n generate: local-generate-model\n`);
|
||||
|
||||
const home = join(root, "home");
|
||||
const { bin, args } = cliCommandArgs("status");
|
||||
const output = execFileSync(bin, args, {
|
||||
cwd: root,
|
||||
encoding: "utf-8",
|
||||
env: {
|
||||
...process.env,
|
||||
HOME: home,
|
||||
XDG_CONFIG_HOME: join(home, ".config"),
|
||||
XDG_CACHE_HOME: join(home, ".cache"),
|
||||
QMD_EMBED_MODEL: "env-embed-model",
|
||||
QMD_RERANK_MODEL: "env-rerank-model",
|
||||
QMD_GENERATE_MODEL: "env-generate-model",
|
||||
},
|
||||
});
|
||||
|
||||
const localIndex = join(root, ".qmd", "index.sqlite");
|
||||
expect(output).toContain(`Index: ${realpathSync(localIndex)}`);
|
||||
expect(output).toContain("docs (qmd://docs/)");
|
||||
expect(output).toContain("Embedding: local-embed-model");
|
||||
expect(output).toContain("Reranking: local-rerank-model");
|
||||
expect(output).toContain("Generation: local-generate-model");
|
||||
expect(output).not.toContain("env-embed-model");
|
||||
expect(existsSync(localIndex)).toBe(true);
|
||||
expect(existsSync(join(home, ".cache", "qmd", "index.sqlite"))).toBe(false);
|
||||
});
|
||||
});
|
||||
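The tests above pin down three behaviours: the config is discovered from nested working directories, `index.yaml` wins over `index.yml`, and the index database sits next to the config. A minimal sketch of a lookup with those properties follows. The names `findConfigUpward` and `dbPathFor` are hypothetical and are not the project's `findLocalConfigPath` or `getLocalDbPath`, whose actual implementation is not shown in this diff.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper (assumption, not the project's code): walk from
// `startDir` toward the filesystem root and return the first
// .qmd/index.yaml (preferred) or .qmd/index.yml found, or null.
function findConfigUpward(startDir: string): string | null {
  let dir = startDir;
  while (true) {
    for (const name of ["index.yaml", "index.yml"]) {
      const candidate = join(dir, ".qmd", name);
      if (existsSync(candidate)) return candidate;
    }
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the root without a match
    dir = parent;
  }
}

// Hypothetical helper: the index database lives beside the config file.
function dbPathFor(configPath: string): string {
  return join(dirname(configPath), "index.sqlite");
}

// Usage: both config spellings exist, and the lookup starts two levels down.
const demoRoot = mkdtempSync(join(tmpdir(), "qmd-sketch-"));
mkdirSync(join(demoRoot, ".qmd"), { recursive: true });
mkdirSync(join(demoRoot, "wiki", "nested"), { recursive: true });
writeFileSync(join(demoRoot, ".qmd", "index.yaml"), "collections: {}\n");
writeFileSync(join(demoRoot, ".qmd", "index.yml"), "collections: {}\n");
const demoFound = findConfigUpward(join(demoRoot, "wiki", "nested"));
rmSync(demoRoot, { recursive: true, force: true });
```

Checking the candidates in a fixed order inside each directory is what gives `index.yaml` precedence; the nearest ancestor wins because the walk stops at the first hit.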
@ -80,6 +80,7 @@ function initTestDatabase(db: Database): void {
|
||||
seq INTEGER NOT NULL DEFAULT 0,
|
||||
pos INTEGER NOT NULL DEFAULT 0,
|
||||
model TEXT NOT NULL,
|
||||
embed_fingerprint TEXT NOT NULL DEFAULT '',
|
||||
embedded_at TEXT NOT NULL,
|
||||
PRIMARY KEY (hash, seq)
|
||||
)
|
||||
@ -186,7 +187,7 @@ function seedTestData(db: Database): void {
|
||||
for (let i = 0; i < 768; i++) embedding[i] = Math.random();
|
||||
|
||||
for (const doc of docs.slice(0, 4)) { // Skip large file for embeddings
|
||||
db.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, embedded_at) VALUES (?, 0, 0, 'embeddinggemma', ?)`).run(doc.hash, now);
|
||||
db.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, embed_fingerprint, embedded_at) VALUES (?, 0, 0, ?, ?, ?)`).run(doc.hash, DEFAULT_EMBED_MODEL, getEmbeddingFingerprint(DEFAULT_EMBED_MODEL), now);
|
||||
db.prepare(`INSERT INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${doc.hash}_0`, embedding);
|
||||
}
|
||||
}
|
||||
@ -211,6 +212,7 @@ import {
|
||||
findDocuments,
|
||||
getStatus,
|
||||
DEFAULT_EMBED_MODEL,
|
||||
getEmbeddingFingerprint,
|
||||
DEFAULT_QUERY_MODEL,
|
||||
DEFAULT_RERANK_MODEL,
|
||||
DEFAULT_MULTI_GET_MAX_BYTES,
|
||||
@ -887,6 +889,33 @@ describe("MCP Server", () => {
|
||||
expect(typeof col.documents).toBe("number");
|
||||
}
|
||||
});
|
||||
|
||||
test("REST /query and /search file field uses qmd:// URI prefix (#576)", () => {
|
||||
// Regression test: the HTTP REST endpoint was returning r.displayPath (e.g.
|
||||
// "docs/readme.md") instead of "qmd://docs/readme.md", while the CLI and MCP
|
||||
// resource URIs always use the qmd:// scheme. This simulates the fix: the REST
|
||||
// handler now applies encodeQmdPath and prepends "qmd://".
|
||||
const results = searchFTS(testDb, "readme", 5);
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
|
||||
// Simulate what the fixed REST handler produces for each result
|
||||
const restResponseItems = results.map(r => ({
|
||||
docid: `#${r.docid}`,
|
||||
file: `qmd://${r.displayPath.split('/').map(s => encodeURIComponent(s)).join('/')}`,
|
||||
title: r.title,
|
||||
score: Math.round(r.score * 100) / 100,
|
||||
}));
|
||||
|
||||
// Every file field must start with qmd://
|
||||
for (const item of restResponseItems) {
|
||||
expect(item.file).toMatch(/^qmd:\/\//);
|
||||
}
|
||||
|
||||
// Spot-check the readme result
|
||||
const readmeItem = restResponseItems.find(item => item.file.includes("readme"));
|
||||
expect(readmeItem).toBeDefined();
|
||||
expect(readmeItem!.file).toBe("qmd://docs/readme.md");
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
@ -913,6 +942,22 @@ describe.skipIf(!!process.env.CI)("MCP HTTP Transport", () => {
|
||||
initTestDatabase(db);
|
||||
seedTestData(db);
|
||||
|
||||
// 300 pad lines (37 chars each = 11100 chars) puts the marker past the
|
||||
// first chunk boundary at CHUNK_SIZE_CHARS = 3600.
|
||||
{
|
||||
const padLine = "Pad line for chunk boundary coverage\n";
|
||||
const absLineFixtureBody =
|
||||
padLine.repeat(300) +
|
||||
"UNIQUE_KEYWORD_XYZ marker\n" +
|
||||
padLine.repeat(20);
|
||||
const fixtureHash = "hash-abslines";
|
||||
const now = new Date().toISOString();
|
||||
db.prepare(`INSERT OR IGNORE INTO content (hash, doc, created_at) VALUES (?, ?, ?)`)
|
||||
.run(fixtureHash, absLineFixtureBody, now);
|
||||
db.prepare(`INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active) VALUES ('docs', ?, ?, ?, ?, ?, 1)`)
|
||||
.run("absolute-line-fixture.md", "Absolute Line Fixture", fixtureHash, now, now);
|
||||
}
|
||||
|
||||
// Sync config into SQLite
|
||||
const httpTestConfig: CollectionConfig = {
|
||||
collections: {
|
||||
@ -1074,4 +1119,29 @@ describe.skipIf(!!process.env.CI)("MCP HTTP Transport", () => {
|
||||
expect(json.result).toBeDefined();
|
||||
expect(json.result.content.length).toBeGreaterThan(0);
|
||||
});
|
||||
|
||||
test("POST /mcp tools/call query returns absolute source-file line numbers, not chunk-local", async () => {
|
||||
await mcpRequest({
|
||||
jsonrpc: "2.0", id: 1, method: "initialize",
|
||||
params: { protocolVersion: "2025-03-26", capabilities: {}, clientInfo: { name: "test", version: "1.0" } },
|
||||
});
|
||||
|
||||
const { status, json } = await mcpRequest({
|
||||
jsonrpc: "2.0", id: 5, method: "tools/call",
|
||||
params: {
|
||||
name: "query",
|
||||
arguments: {
|
||||
searches: [{ type: "lex", query: "UNIQUE_KEYWORD_XYZ" }],
|
||||
rerank: false,
|
||||
},
|
||||
},
|
||||
});
|
||||
expect(status).toBe(200);
|
||||
const results = json.result.structuredContent.results;
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
const hit = results.find((r: any) => r.file === "docs/absolute-line-fixture.md");
|
||||
expect(hit).toBeDefined();
|
||||
expect(hit.line).toBe(301);
|
||||
expect(hit.snippet).toMatch(/^\d+: @@ -3\d\d,/);
|
||||
});
|
||||
});
|
||||
|
||||
71
test/package.test.ts
Normal file
@ -0,0 +1,71 @@
|
||||
import { describe, expect, test } from "vitest";
|
||||
import { readFileSync } from "node:fs";
|
||||
import { join } from "node:path";
|
||||
|
||||
const root = new URL("..", import.meta.url);
|
||||
const pkg = JSON.parse(readFileSync(new URL("package.json", root), "utf8"));
|
||||
|
||||
describe("package test task", () => {
|
||||
test("runs typecheck, unit tests, and package smoke checks", () => {
|
||||
expect(pkg.scripts.test).toContain("scripts/test-all.mjs");
|
||||
|
||||
expect(pkg.scripts["test:types"]).toContain("tsconfig.build.json --noEmit");
|
||||
expect(pkg.scripts["test:unit"]).toContain("vitest.mjs");
|
||||
expect(pkg.scripts["test:unit"]).toContain("bun test");
|
||||
expect(pkg.scripts["test:unit"]).toContain("CI=true");
|
||||
|
||||
expect(pkg.scripts["test:package"]).toContain("scripts/package-smoke.mjs");
|
||||
|
||||
const testAllScript = readFileSync(new URL("scripts/test-all.mjs", root), "utf8");
|
||||
expect(testAllScript).toContain("TypeScript build typecheck");
|
||||
expect(testAllScript).toContain("Vitest suite under Node");
|
||||
expect(testAllScript).toContain("Bun test suite");
|
||||
expect(testAllScript).toContain("Package smoke");
|
||||
|
||||
const packageSmokeScript = readFileSync(new URL("scripts/package-smoke.mjs", root), "utf8");
|
||||
expect(packageSmokeScript).toContain("scripts/build.mjs");
|
||||
expect(packageSmokeScript).toContain("scripts/check-package-grammars.mjs");
|
||||
expect(packageSmokeScript).toContain("compiled CLI under Node");
|
||||
expect(packageSmokeScript).toContain("compiled CLI under Bun");
|
||||
expect(packageSmokeScript).toContain("package wrapper");
|
||||
});
|
||||
});
|
||||
|
||||
describe("package grammar distribution", () => {
|
||||
test("installs AST grammar wasm packages as required runtime dependencies", () => {
|
||||
for (const dep of ["tree-sitter-typescript", "tree-sitter-python", "tree-sitter-go", "tree-sitter-rust"]) {
|
||||
expect(pkg.dependencies, `${dep} should be a required dependency`).toHaveProperty(dep);
|
||||
expect(pkg.optionalDependencies ?? {}, `${dep} should not be optional`).not.toHaveProperty(dep);
|
||||
}
|
||||
});
|
||||
|
||||
test("documents a packaging smoke check for grammar wasm availability", () => {
|
||||
expect(pkg.scripts, "package.json scripts").toHaveProperty("smoke:package-grammars");
|
||||
expect(String(pkg.scripts["smoke:package-grammars"])).toContain("check-package-grammars");
|
||||
|
||||
expect(pkg.files, "published package files").toContain("scripts/build.mjs");
|
||||
expect(pkg.files, "published package files").toContain("scripts/check-package-grammars.mjs");
|
||||
expect(pkg.files, "published package files").toContain("scripts/package-smoke.mjs");
|
||||
expect(pkg.files, "published package files").toContain("scripts/test-all.mjs");
|
||||
expect(pkg.files, "published package files").toContain("skills/");
|
||||
const qmdSkill = readFileSync(new URL("skills/qmd/SKILL.md", root), "utf8");
|
||||
expect(qmdSkill).toContain("# QMD - Query Markdown Documents");
|
||||
expect(qmdSkill).toContain("## How search works");
|
||||
expect(qmdSkill).toContain("## MCP Tool: `query`");
|
||||
expect(qmdSkill).not.toContain("This file is a discovery stub");
|
||||
|
||||
const firstSixtyLines = qmdSkill.split(/\r?\n/).slice(0, 60).join("\n");
|
||||
expect(firstSixtyLines).toContain("Search for candidate documents");
|
||||
expect(firstSixtyLines).toContain("qmd search");
|
||||
expect(firstSixtyLines).toContain('qmd multi-get "#abc123,#def432"');
|
||||
expect(firstSixtyLines).toContain("Retrieved:");
|
||||
expect(firstSixtyLines).toContain("qmd query");
|
||||
// The skill must teach structured, self-authored queries near the top.
|
||||
expect(firstSixtyLines).toContain("Default to structured");
|
||||
|
||||
const scriptPath = join(root.pathname, "scripts", "check-package-grammars.mjs");
|
||||
const script = readFileSync(scriptPath, "utf8");
|
||||
expect(script).toContain("tree-sitter-typescript/tree-sitter-typescript.wasm");
|
||||
expect(script).toContain("tree-sitter-typescript/tree-sitter-tsx.wasm");
|
||||
});
|
||||
});
|
||||
414
test/path-fidelity.test.ts
Normal file
@ -0,0 +1,414 @@
|
||||
/**
|
||||
* Path Fidelity Tests
|
||||
*
|
||||
* Verifies that QMD stores literal filesystem paths (not handelized slugs) so
|
||||
* that paths with special characters — spaces, #, &, @, [], (), etc. — round-
|
||||
* trip correctly through index → search → get → full-path.
|
||||
*
|
||||
* This covers the five breakage points found before the literal-path fix:
|
||||
* 1. search --json `file` field shows handelized slug instead of real path
|
||||
* 2. `qmd get --full-path` silently falls back (resolveVirtualPath built
|
||||
* a non-existent path from the slug, existsSync returned false)
|
||||
* 3. `qmd get <actual-fs-path>` returns "Document not found"
|
||||
* 4. `qmd ls` shows handelized slugs
|
||||
* 5. `toVirtualPath(db, absPath)` returns null
|
||||
*
|
||||
* Also covers backward-compat migration: an index created with the old
|
||||
* handelize-at-index-time code can be updated with `qmd update` and the paths
|
||||
* are renamed to their literal forms in-place.
|
||||
*/
|
||||
|
||||
import { describe, test, expect, beforeAll, afterAll } from "vitest";
|
||||
import { mkdir, mkdtemp, rm, writeFile } from "fs/promises";
|
||||
import { existsSync, realpathSync } from "fs";
|
||||
import { tmpdir } from "os";
|
||||
import { join } from "path";
|
||||
import { spawn } from "child_process";
|
||||
import { fileURLToPath } from "url";
|
||||
import { dirname } from "path";
|
||||
import YAML from "yaml";
|
||||
import { openDatabase } from "../src/db.js";
|
||||
import type { Database } from "../src/db.js";
|
||||
import {
|
||||
createStore,
|
||||
toVirtualPath,
|
||||
insertDocument,
|
||||
insertContent,
|
||||
hashContent,
|
||||
handelize,
|
||||
normalizePathSeparators,
|
||||
syncConfigToDb,
|
||||
} from "../src/store.js";
|
||||
import type { CollectionConfig } from "../src/collections.js";
|
||||
|
||||
const thisDir = dirname(fileURLToPath(import.meta.url));
|
||||
const projectRoot = join(thisDir, "..");
|
||||
const qmdScript = join(projectRoot, "src", "cli", "qmd.ts");
|
||||
const isBunRuntime = typeof (globalThis as { Bun?: unknown }).Bun !== "undefined";
|
||||
const tsxCli = join(projectRoot, "node_modules", "tsx", "dist", "cli.mjs");
|
||||
|
||||
async function runQmd(
|
||||
args: string[],
|
||||
opts: { cwd: string; dbPath: string; configDir: string; env?: Record<string, string> }
|
||||
): Promise<{ stdout: string; stderr: string; exitCode: number }> {
|
||||
const runner = isBunRuntime
|
||||
? { command: process.execPath, args: [qmdScript, ...args] }
|
||||
: { command: process.execPath, args: [tsxCli, qmdScript, ...args] };
|
||||
|
||||
const proc = spawn(runner.command, runner.args, {
|
||||
cwd: opts.cwd,
|
||||
env: {
|
||||
...process.env,
|
||||
INDEX_PATH: opts.dbPath,
|
||||
QMD_CONFIG_DIR: opts.configDir,
|
||||
PWD: opts.cwd,
|
||||
QMD_DOCTOR_DEVICE_PROBE: "0",
|
||||
...(opts.env ?? {}),
|
||||
},
|
||||
stdio: ["ignore", "pipe", "pipe"],
|
||||
});
|
||||
|
||||
let stdout = "";
|
||||
let stderr = "";
|
||||
proc.stdout?.on("data", (c: Buffer) => { stdout += c.toString(); });
|
||||
proc.stderr?.on("data", (c: Buffer) => { stderr += c.toString(); });
|
||||
const exitCode = await new Promise<number>((res, rej) => {
|
||||
proc.once("error", rej);
|
||||
proc.on("close", (code) => res(code ?? 1));
|
||||
});
|
||||
return { stdout, stderr, exitCode };
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Test environment setup
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
let testDir: string;
|
||||
|
||||
// Files with names that previously broke due to handelize() at index time.
|
||||
const crazyFiles: Array<{ name: string; content: string }> = [
|
||||
{
|
||||
name: "# Meeting - 234232 3432 __ 5.md",
|
||||
content: "# Meeting - 234232 3432 // 5\n\nSome meeting content with searchterm-alpha.\n",
|
||||
},
|
||||
{
|
||||
name: "Budget & Revenue (Q4) [2024].md",
|
||||
content: "# Budget & Revenue Q4 2024\n\nFinancial overview searchterm-beta.\n",
|
||||
},
|
||||
{
|
||||
name: "normal-file.md",
|
||||
content: "# Normal File\n\nPlain filename, should always work.\n",
|
||||
},
|
||||
];
|
||||
|
||||
const crazySubFiles: Array<{ name: string; content: string }> = [
|
||||
{
|
||||
name: "Notes #42 - foo@bar.md",
|
||||
content: "# Notes #42\n\nSubdir file with searchterm-gamma.\n",
|
||||
},
|
||||
];
|
||||
|
||||
beforeAll(async () => {
|
||||
testDir = await mkdtemp(join(tmpdir(), "qmd-path-fidelity-"));
|
||||
});
|
||||
|
||||
afterAll(async () => {
|
||||
await rm(testDir, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
// Helper: create a fresh isolated test environment with a corpus of crazy filenames.
|
||||
async function createCrazyCollection(prefix: string): Promise<{
|
||||
collectionDir: string;
|
||||
dbPath: string;
|
||||
configDir: string;
|
||||
}> {
|
||||
const envDir = join(testDir, prefix);
|
||||
const collectionDir = join(envDir, "corpus");
|
||||
const dbPath = join(envDir, "test.sqlite");
|
||||
const configDir = join(envDir, "config");
|
||||
|
||||
await mkdir(collectionDir, { recursive: true });
|
||||
await mkdir(join(collectionDir, "subdir"), { recursive: true });
|
||||
await mkdir(configDir, { recursive: true });
|
||||
|
||||
// Resolve symlinks so the path matches what getRealPath() stores in the DB.
|
||||
// On macOS /tmp is a symlink to /private/tmp; without this normalisation
|
||||
// toVirtualPath() and --full-path resolution fail.
|
||||
const realCollectionDir = realpathSync(collectionDir);
|
||||
|
||||
for (const f of crazyFiles) {
|
||||
await writeFile(join(collectionDir, f.name), f.content);
|
||||
}
|
||||
for (const f of crazySubFiles) {
|
||||
await writeFile(join(collectionDir, "subdir", f.name), f.content);
|
||||
}
|
||||
|
||||
// Write empty YAML config — `collection add` will populate it
|
||||
await writeFile(join(configDir, "index.yml"), "collections: {}\n");
|
||||
|
||||
return { collectionDir: realCollectionDir, dbPath, configDir };
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Unit tests: store-level path storage
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("Path fidelity — store level", () => {
|
||||
test("reindexCollection stores literal relative paths, not handelized slugs", async () => {
|
||||
const { collectionDir, dbPath, configDir } = await createCrazyCollection("store-unit");
|
||||
|
||||
// Run `collection add` to index
|
||||
const add = await runQmd(
|
||||
["collection", "add", collectionDir, "--name", "crazytest"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(add.exitCode, `collection add failed: ${add.stderr}`).toBe(0);
|
||||
|
||||
// Inspect the DB directly
|
||||
const db = openDatabase(dbPath);
|
||||
const rows = db.prepare(
|
||||
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
|
||||
).all() as { path: string }[];
|
||||
db.close();
|
||||
|
||||
const paths = rows.map((r) => r.path);
|
||||
|
||||
// Must contain literal filenames, not handelized slugs
|
||||
expect(paths).toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
expect(paths).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
expect(paths).toContain("normal-file.md");
|
||||
expect(paths).toContain("subdir/Notes #42 - foo@bar.md");
|
||||
|
||||
// Must NOT contain handelized versions
|
||||
expect(paths).not.toContain("Meeting-234232-3432-5.md");
|
||||
expect(paths).not.toContain("Budget-Revenue-Q4-2024.md");
|
||||
expect(paths).not.toContain("subdir/Notes-42-foo-bar.md");
|
||||
});
|
||||
|
||||
test("toVirtualPath returns non-null for crazy-named files", async () => {
|
||||
const { collectionDir, dbPath, configDir } = await createCrazyCollection("store-to-virtual");
|
||||
const add = await runQmd(
|
||||
["collection", "add", collectionDir, "--name", "crazytest"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(add.exitCode).toBe(0);
|
||||
|
||||
const rawDb = openDatabase(dbPath);
|
||||
const result = toVirtualPath(rawDb, join(collectionDir, "Budget & Revenue (Q4) [2024].md"));
|
||||
rawDb.close();
|
||||
|
||||
expect(result).not.toBeNull();
|
||||
expect(result).toBe(`qmd://crazytest/Budget & Revenue (Q4) [2024].md`);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// CLI integration tests — the five original breakage points
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("Path fidelity — CLI integration", () => {
|
||||
let collectionDir: string;
|
||||
let dbPath: string;
|
||||
let configDir: string;
|
||||
|
||||
// Index once for the whole describe block (read-only tests share it)
|
||||
beforeAll(async () => {
|
||||
({ collectionDir, dbPath, configDir } = await createCrazyCollection("cli-shared"));
|
||||
const add = await runQmd(
|
||||
["collection", "add", collectionDir, "--name", "crazytest"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(add.exitCode, `collection add failed: ${add.stderr}`).toBe(0);
|
||||
});
|
||||
|
||||
test("(1) search --json file field contains literal path, not handelized slug", async () => {
|
||||
const { stdout, exitCode } = await runQmd(
|
||||
["search", "searchterm-alpha", "--json"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode).toBe(0);
|
||||
|
||||
const results = JSON.parse(stdout) as Array<{ file: string }>;
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
|
||||
const meetingResult = results.find((r) => r.file.includes("Meeting"));
|
||||
expect(meetingResult).toBeDefined();
|
||||
// Must contain the literal filename fragment
|
||||
expect(meetingResult!.file).toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
// Must not contain the handelized version
|
||||
expect(meetingResult!.file).not.toContain("Meeting-234232-3432-5.md");
|
||||
});
|
||||
|
||||
test("(2) get --full-path resolves to real filesystem path for crazy-named file", async () => {
|
||||
const virtualPath = `qmd://crazytest/Budget & Revenue (Q4) [2024].md`;
|
||||
const { stdout, exitCode } = await runQmd(
|
||||
["get", virtualPath, "--full-path"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode, `get failed: ${stdout}`).toBe(0);
|
||||
|
||||
const header = stdout.split("\n")[0]!;
|
||||
// Should show a real filesystem path, not a qmd:// virtual path
|
||||
expect(header).not.toMatch(/^qmd:\/\//);
|
||||
// Should include the literal filename
|
||||
expect(header).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
// The resolved filesystem path should exist — strip the trailing docid (#abc123)
|
||||
const fsPath = header.trim().replace(/\s+#[a-f0-9]{6}$/, "");
|
||||
// Path may be absolute or relative-to-collectionDir; resolve against collectionDir
|
||||
const absPath = fsPath.startsWith("/") ? fsPath : join(collectionDir, fsPath.replace(/^\.\//, ""));
|
||||
expect(existsSync(absPath), `resolved path does not exist: ${absPath}`).toBe(true);
|
||||
});
|
||||
test("(3) get <actual-fs-path> finds the document", async () => {
|
||||
const fsPath = join(collectionDir, "Budget & Revenue (Q4) [2024].md");
|
||||
const { stdout, exitCode, stderr } = await runQmd(
|
||||
["get", fsPath],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode, `get by fs path failed: ${stderr}`).toBe(0);
|
||||
// Header should contain the document identifier
|
||||
expect(stdout).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
});
|
||||
|
||||
test("(3b) get <actual-fs-path> finds subdir file with crazy name", async () => {
|
||||
const fsPath = join(collectionDir, "subdir", "Notes #42 - foo@bar.md");
|
||||
const { stdout, exitCode, stderr } = await runQmd(
|
||||
["get", fsPath],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode, `get subdir file failed: ${stderr}`).toBe(0);
|
||||
expect(stdout).toContain("Notes #42 - foo@bar.md");
|
||||
});
|
||||
|
||||
test("(4) ls shows literal paths, not handelized slugs", async () => {
|
||||
const { stdout, exitCode } = await runQmd(
|
||||
["ls", "crazytest"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode).toBe(0);
|
||||
|
||||
// Literal paths must appear
|
||||
expect(stdout).toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
expect(stdout).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
expect(stdout).toContain("Notes #42 - foo@bar.md");
|
||||
|
||||
// Handelized slugs must NOT appear
|
||||
expect(stdout).not.toContain("Meeting-234232-3432-5.md");
|
||||
expect(stdout).not.toContain("Budget-Revenue-Q4-2024.md");
|
||||
expect(stdout).not.toContain("Notes-42-foo-bar.md");
|
||||
});
|
||||
|
||||
test("(5) search --json returns docid that can be fetched back", async () => {
|
||||
const { stdout: searchOut, exitCode: searchExit } = await runQmd(
|
||||
["search", "searchterm-beta", "--json"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(searchExit).toBe(0);
|
||||
|
||||
const results = JSON.parse(searchOut) as Array<{ docid: string; file: string }>;
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
|
||||
const hit = results[0]!;
|
||||
expect(hit.docid).toMatch(/^#[a-f0-9]{6}$/);
|
||||
|
||||
// Fetch by docid — must work
|
||||
const { stdout: getOut, exitCode: getExit } = await runQmd(
|
||||
["get", hit.docid],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(getExit, `get by docid failed`).toBe(0);
|
||||
expect(getOut).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
});
|
||||
|
||||
test("normal filenames are still stored correctly (regression)", async () => {
|
||||
const { stdout, exitCode } = await runQmd(
|
||||
["search", "Plain filename", "--json"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(exitCode).toBe(0);
|
||||
const results = JSON.parse(stdout) as Array<{ file: string }>;
|
||||
const hit = results.find((r) => r.file.includes("normal-file"));
|
||||
expect(hit).toBeDefined();
|
||||
expect(hit!.file).toContain("normal-file.md");
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Migration test: old handelized DB upgraded by `qmd update`
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("Path fidelity — migration from handalized index", () => {
|
||||
test("qmd update migrates handelized paths to literal paths in existing index", async () => {
|
||||
const { collectionDir, dbPath, configDir } = await createCrazyCollection("migration");
|
||||
|
||||
// Manually build an old-style DB using handelize() (simulates pre-fix index)
|
||||
const store = createStore(dbPath);
|
||||
const now = new Date().toISOString();
|
||||
// Write and sync a config that points at the collection so `qmd update` knows where it is
|
||||
const migrationYaml = `collections:\n crazytest:\n path: "${collectionDir}"\n mask: "**/*.md"\n`;
|
||||
await writeFile(join(configDir, "index.yml"), migrationYaml);
|
||||
const config = YAML.parse(migrationYaml) as CollectionConfig;
|
||||
syncConfigToDb(store.db, config);
|
||||
|
||||
// Insert documents with handelized paths (old behavior)
|
||||
for (const f of crazyFiles) {
|
||||
const relPath = normalizePathSeparators(f.name);
|
||||
const handelized = handelize(relPath);
|
||||
const hash = await hashContent(f.content);
|
||||
insertContent(store.db, hash, f.content, now);
|
||||
insertDocument(store.db, "crazytest", handelized, `Title ${f.name}`, hash, now, now);
|
||||
}
|
||||
const subFile = crazySubFiles[0]!;
|
||||
const subRel = `subdir/${subFile.name}`;
|
||||
const subHandelized = handelize(subRel);
|
||||
const subHash = await hashContent(subFile.content);
|
||||
insertContent(store.db, subHash, subFile.content, now);
|
||||
insertDocument(store.db, "crazytest", subHandelized, "Sub title", subHash, now, now);
|
||||
store.close();
|
||||
|
||||
// Verify the old DB has handelized paths
|
||||
const dbBefore = openDatabase(dbPath);
|
||||
const pathsBefore = (dbBefore.prepare(
|
||||
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
|
||||
).all() as { path: string }[]).map((r) => r.path);
|
||||
dbBefore.close();
|
||||
|
||||
expect(pathsBefore).toContain("Meeting-234232-3432-5.md");
|
||||
expect(pathsBefore).toContain("Budget-Revenue-Q4-2024.md");
|
||||
expect(pathsBefore).not.toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
|
||||
// Run `qmd update` with the new code — should migrate paths in-place
|
||||
const update = await runQmd(
|
||||
["update"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(update.exitCode, `qmd update failed: ${update.stderr}`).toBe(0);
|
||||
|
||||
// Verify the DB now has literal paths
|
||||
const dbAfter = openDatabase(dbPath);
|
||||
const pathsAfter = (dbAfter.prepare(
|
||||
"SELECT path FROM documents WHERE active = 1 ORDER BY path"
|
||||
).all() as { path: string }[]).map((r) => r.path);
|
||||
dbAfter.close();
|
||||
|
||||
expect(pathsAfter).toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
expect(pathsAfter).toContain("Budget & Revenue (Q4) [2024].md");
|
||||
expect(pathsAfter).toContain("normal-file.md");
|
||||
expect(pathsAfter).toContain("subdir/Notes #42 - foo@bar.md");
|
||||
|
||||
// Handelized slugs must be gone
|
||||
expect(pathsAfter).not.toContain("Meeting-234232-3432-5.md");
|
||||
expect(pathsAfter).not.toContain("Budget-Revenue-Q4-2024.md");
|
||||
|
||||
// Search must work after migration
|
||||
const { stdout: searchOut, exitCode: searchExit } = await runQmd(
|
||||
["search", "searchterm-alpha", "--json"],
|
||||
{ cwd: collectionDir, dbPath, configDir }
|
||||
);
|
||||
expect(searchExit).toBe(0);
|
||||
const results = JSON.parse(searchOut) as Array<{ file: string }>;
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
const meetingResult = results.find((r) => r.file.includes("Meeting"));
|
||||
expect(meetingResult).toBeDefined();
|
||||
expect(meetingResult!.file).toContain("# Meeting - 234232 3432 __ 5.md");
|
||||
});
|
||||
});
|
||||
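The path-fidelity tests above rest on the observation that a slug is a lossy encoding of a filename, so it cannot be mapped back to the file on disk. The sketch below illustrates this with a hypothetical slugifier. `slugifySegment` and `slugifyPath` are assumptions written for this example; they reproduce the slugs the tests expect, but the project's `handelize` may differ in its details.

```typescript
// Hypothetical slugifier for illustration only (not the project's `handelize`):
// collapse every run of non-alphanumeric characters in the stem to "-",
// trim leading and trailing dashes, and keep the file extension as-is.
function slugifySegment(segment: string): string {
  const dot = segment.lastIndexOf(".");
  const stem = dot > 0 ? segment.slice(0, dot) : segment;
  const ext = dot > 0 ? segment.slice(dot) : "";
  const slug = stem.replace(/[^A-Za-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  return slug + ext;
}

// Slugify each path segment separately so directory separators survive.
function slugifyPath(relPath: string): string {
  return relPath.split("/").map(slugifySegment).join("/");
}

// Two distinct filenames collapse to the same slug, so the original name
// cannot be recovered from the stored value.
const slugA = slugifyPath("Budget & Revenue (Q4) [2024].md");
const slugB = slugifyPath("Budget - Revenue Q4 2024.md");
console.log(slugA);           // Budget-Revenue-Q4-2024.md
console.log(slugA === slugB); // true
```

Storing the literal relative path avoids this collision and lets `--full-path` resolve to a file that actually exists.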
100
test/sdk.test.ts
@ -614,6 +614,20 @@ describe("search (unified API)", () => {
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
});
|
||||
|
||||
test("search() forwards candidateLimit to structured search", async () => {
|
||||
const results = await store.search({
|
||||
queries: [
|
||||
{ type: "lex", query: "authentication" },
|
||||
{ type: "lex", query: "meeting" },
|
||||
],
|
||||
limit: 5,
|
||||
candidateLimit: 1,
|
||||
rerank: false,
|
||||
});
|
||||
|
||||
expect(results).toHaveLength(1);
|
||||
});
|
||||
|
||||
// Tests below use search({ query: ... }) which triggers LLM query expansion
|
||||
describe.skipIf(!!process.env.CI)("with LLM query expansion", () => {
|
||||
test("search() with query and rerank:false returns results", async () => {
|
||||
@ -982,6 +996,92 @@ describe("embed", () => {
|
||||
}
|
||||
});
|
||||
|
||||
test("store.embed scopes pending documents to the requested collection", async () => {
|
||||
const store = await createStore({
|
||||
dbPath: freshDbPath(),
|
||||
config: {
|
||||
collections: {
|
||||
docs: { path: docsDir, pattern: "**/*.md" },
|
||||
notes: { path: notesDir, pattern: "**/*.md" },
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
const fakeLlm = createFakeEmbedLlm();
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.internal.llm = fakeLlm as any;
|
||||
|
||||
try {
|
||||
await store.update();
|
||||
const result = await store.embed({ collection: "docs" });
|
||||
|
||||
const vectorCounts = store.internal.db.prepare(`
|
||||
SELECT d.collection, COUNT(DISTINCT v.hash) AS count
|
||||
FROM documents d
|
||||
LEFT JOIN content_vectors v ON v.hash = d.hash AND v.seq = 0
|
||||
WHERE d.active = 1
|
||||
GROUP BY d.collection
|
||||
ORDER BY d.collection
|
||||
`).all() as Array<{ collection: string; count: number }>;
|
||||
|
||||
expect(result.docsProcessed).toBe(3);
|
||||
expect(result.chunksEmbedded).toBe(3);
|
||||
expect(vectorCounts).toEqual([
|
||||
{ collection: "docs", count: 3 },
|
||||
{ collection: "notes", count: 0 },
|
||||
]);
|
||||
} finally {
|
||||
setDefaultLlamaCpp(null);
|
||||
await store.close();
|
||||
}
|
||||
});
|
||||
|
||||
test("store.embed with force only clears the requested collection", async () => {
|
||||
const store = await createStore({
|
||||
dbPath: freshDbPath(),
|
||||
config: {
|
||||
collections: {
|
||||
docs: { path: docsDir, pattern: "**/*.md" },
|
||||
notes: { path: notesDir, pattern: "**/*.md" },
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
const fakeLlm = createFakeEmbedLlm();
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.internal.llm = fakeLlm as any;
|
||||
|
||||
const vectorCounts = () => store.internal.db.prepare(`
|
||||
SELECT d.collection, COUNT(DISTINCT v.hash) AS count
|
||||
FROM documents d
|
||||
LEFT JOIN content_vectors v ON v.hash = d.hash AND v.seq = 0
|
||||
WHERE d.active = 1
|
||||
GROUP BY d.collection
|
||||
ORDER BY d.collection
|
||||
`).all() as Array<{ collection: string; count: number }>;
|
||||
|
||||
try {
|
||||
await store.update();
|
||||
await store.embed();
|
||||
expect(vectorCounts()).toEqual([
|
||||
{ collection: "docs", count: 3 },
|
||||
{ collection: "notes", count: 3 },
|
||||
]);
|
||||
|
||||
const result = await store.embed({ force: true, collection: "docs" });
|
||||
|
||||
expect(result.docsProcessed).toBe(3);
|
||||
expect(result.chunksEmbedded).toBe(3);
|
||||
expect(vectorCounts()).toEqual([
|
||||
{ collection: "docs", count: 3 },
|
||||
{ collection: "notes", count: 3 },
|
||||
]);
|
||||
} finally {
|
||||
setDefaultLlamaCpp(null);
|
||||
await store.close();
|
||||
}
|
||||
});
|
||||
|
||||
test("store.embed rejects invalid batch limits", async () => {
|
||||
const store = await createStore({
|
||||
dbPath: freshDbPath(),
|
||||
|
||||
@ -1,17 +1,28 @@
|
||||
#!/usr/bin/env bash
|
||||
# Build a container image with qmd installed via npm and bun, then run smoke tests.
|
||||
# Works with docker or podman (whichever is available).
|
||||
# Build a clean container image from the current checkout package and exercise
|
||||
# install/runtime scenarios under npm, npx, and Bun. Supports optional qmd embed
|
||||
# and GPU probes, but keeps those expensive/device-specific checks opt-in.
|
||||
#
|
||||
# Usage:
|
||||
# test/smoke-install.sh # build + run all smoke tests
|
||||
# test/smoke-install.sh --build # build image only
|
||||
# test/smoke-install.sh --shell # drop into container shell
|
||||
# test/smoke-install.sh -- CMD... # run arbitrary command in container
|
||||
# test/smoke-install.sh # build + run default smoke scenarios
|
||||
# test/smoke-install.sh --build # build image only
|
||||
# test/smoke-install.sh --shell # drop into container shell
|
||||
# test/smoke-install.sh --scenario node # run one scenario (node|npx|bun|all)
|
||||
# test/smoke-install.sh --with-embed # also run tiny qmd embed smoke tests
|
||||
# test/smoke-install.sh --with-gpu # also probe GPU in doctor/embed scenarios
|
||||
# QMD_SMOKE_GPU_BACKEND=cuda|vulkan|auto # backend for --with-gpu (default: auto)
|
||||
# test/smoke-install.sh --no-build # reuse existing image
|
||||
# test/smoke-install.sh -- CMD... # run arbitrary command in container
|
||||
#
|
||||
# GPU notes:
|
||||
# Docker uses: --gpus all
|
||||
# Podman uses: --device nvidia.com/gpu=all
|
||||
# If your podman setup uses a different CDI device name, override with:
|
||||
# QMD_SMOKE_GPU_ARGS='--device nvidia.com/gpu=all' test/smoke-install.sh --with-gpu
|
||||
set -euo pipefail
|
||||
|
||||
cd "$(dirname "$0")/.."
|
||||
|
||||
# Pick container runtime
|
||||
if command -v podman &>/dev/null; then
|
||||
CTR=podman
|
||||
elif command -v docker &>/dev/null; then
|
||||
@ -21,10 +32,50 @@ else
|
||||
exit 1
|
||||
fi
|
||||
|
||||
IMAGE=qmd-smoke
|
||||
IMAGE=${QMD_SMOKE_IMAGE:-qmd-smoke}
|
||||
SCENARIO=all
|
||||
DO_BUILD=1
|
||||
WITH_EMBED=0
|
||||
WITH_GPU=0
|
||||
GPU_BACKEND=${QMD_SMOKE_GPU_BACKEND:-auto}
|
||||
declare -a ARBITRARY_CMD=()
|
||||
|
||||
usage() {
|
||||
sed -n '2,20p' "$0" | sed 's/^# \{0,1\}//'
|
||||
}
|
||||
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case "$1" in
|
||||
--build) DO_BUILD=1; BUILD_ONLY=1; shift ;;
|
||||
--no-build) DO_BUILD=0; shift ;;
|
||||
--shell) SHELL_ONLY=1; shift ;;
|
||||
--scenario) SCENARIO="${2:-}"; shift 2 ;;
|
||||
--with-embed) WITH_EMBED=1; shift ;;
|
||||
--with-gpu) WITH_GPU=1; shift ;;
|
||||
--help|-h) usage; exit 0 ;;
|
||||
--) shift; ARBITRARY_CMD=("$@"); break ;;
|
||||
*) echo "Unknown argument: $1" >&2; usage >&2; exit 1 ;;
|
||||
esac
|
||||
done
|
||||
|
||||
BUILD_ONLY=${BUILD_ONLY:-0}
|
||||
SHELL_ONLY=${SHELL_ONLY:-0}
|
||||
|
||||
gpu_args() {
|
||||
if [[ $WITH_GPU -ne 1 ]]; then return 0; fi
|
||||
if [[ -n "${QMD_SMOKE_GPU_ARGS:-}" ]]; then
|
||||
# shellcheck disable=SC2206
|
||||
echo ${QMD_SMOKE_GPU_ARGS}
|
||||
return 0
|
||||
fi
|
||||
case "$CTR" in
|
||||
docker) echo "--gpus all" ;;
|
||||
podman) echo "--device nvidia.com/gpu=all" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
build_image() {
|
||||
echo "==> Building TypeScript..."
|
||||
echo "==> Building TypeScript package..."
|
||||
npm run build --silent
|
||||
|
||||
echo "==> Packing tarball..."
|
||||
@ -32,32 +83,35 @@ build_image() {
|
||||
TARBALL=$(npm pack --pack-destination test/ 2>/dev/null | tail -1)
|
||||
echo " $TARBALL"
|
||||
|
||||
# Copy project files into build context so vitest/bun tests can run inside
|
||||
echo "==> Preparing container test project..."
|
||||
rm -rf test/test-src
|
||||
mkdir -p test/test-src/src test/test-src/test
|
||||
cp src/*.ts test/test-src/src/
|
||||
mkdir -p test/test-src/test
|
||||
cp -r src test/test-src/
|
||||
cp -r dist test/test-src/
|
||||
cp test/*.test.ts test/test-src/test/
|
||||
cp -r test/*.test.ts test/test-src/test/
|
||||
cp package.json tsconfig.json tsconfig.build.json test/test-src/
|
||||
|
||||
echo "==> Building container image ($CTR)..."
|
||||
echo "==> Building container image ($CTR): $IMAGE"
|
||||
$CTR build -f test/Containerfile -t "$IMAGE" test/
|
||||
|
||||
# Clean up
|
||||
rm -f test/tobilu-qmd-*.tgz
|
||||
rm -rf test/test-src
|
||||
echo "==> Image ready: $IMAGE"
|
||||
}
|
||||
|
||||
run() {
|
||||
$CTR run --rm "$IMAGE" bash -c "$*"
|
||||
local args=()
|
||||
# Intentionally word-split GPU args: container CLIs expect separate flags.
|
||||
# shellcheck disable=SC2206
|
||||
args=( $(gpu_args) )
|
||||
$CTR run --rm "${args[@]}" "$IMAGE" bash -lc "$*"
|
||||
}
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
|
||||
ok() { printf " %-50s OK\n" "$1"; PASS=$((PASS + 1)); }
|
||||
fail() { printf " %-50s FAIL\n" "$1"; FAIL=$((FAIL + 1)); echo "$2" | sed 's/^/ /'; }
|
||||
ok() { printf " %-58s OK\n" "$1"; PASS=$((PASS + 1)); }
|
||||
fail() { printf " %-58s FAIL\n" "$1"; FAIL=$((FAIL + 1)); echo "$2" | sed 's/^/ /'; }
|
||||
|
||||
smoke_test() {
|
||||
local label="$1"; shift
|
||||
@ -73,97 +127,136 @@ smoke_test_output() {
|
||||
local label="$1"; local expect="$2"; shift 2
|
||||
local out
|
||||
out=$(run "$@" 2>&1) || true
|
||||
if echo "$out" | grep -q "$expect"; then
|
||||
if grep -q "$expect" <<<"$out"; then
|
||||
ok "$label"
|
||||
else
|
||||
fail "$label" "$out"
|
||||
fi
|
||||
}
|
||||
|
||||
run_smoke_tests() {
|
||||
# ------------------------------------------------------------------
|
||||
# Node (npm-installed qmd)
|
||||
# ------------------------------------------------------------------
|
||||
fixture_setup='rm -rf /tmp/qmd-fixture /tmp/qmd-cache /tmp/qmd-config /tmp/qmd-models; mkdir -p /tmp/qmd-fixture; printf "# Smoke Doc\n\nGPU and CPU embedding smoke test.\n" > /tmp/qmd-fixture/doc.md; export XDG_CACHE_HOME=/tmp/qmd-cache QMD_CONFIG_DIR=/tmp/qmd-config'
|
||||
|
||||
gpu_env() {
|
||||
case "$GPU_BACKEND" in
|
||||
auto|"") echo "" ;;
|
||||
cuda|vulkan|metal) echo "QMD_LLAMA_GPU=$GPU_BACKEND" ;;
|
||||
*) echo "Unsupported QMD_SMOKE_GPU_BACKEND=$GPU_BACKEND" >&2; exit 1 ;;
|
||||
esac
|
||||
}
|
||||
|
||||
run_doctor_smoke() {
|
||||
local label="$1" bin="$2" extra_env="${3:-}"
|
||||
smoke_test_output "$label doctor" "QMD Doctor" \
|
||||
"$fixture_setup; $extra_env $bin doctor"
|
||||
}
|
||||
|
||||
run_collection_smoke() {
|
||||
local label="$1" bin="$2" extra_env="${3:-}"
|
||||
smoke_test "$label collection add/list/status" \
|
||||
"$fixture_setup; cd /tmp/qmd-fixture; $extra_env $bin collection add . --name smoke; $extra_env $bin collection list; $extra_env $bin status"
|
||||
}
|
||||
|
||||
run_embed_smoke() {
|
||||
local label="$1" bin="$2" extra_env="${3:-}"
|
||||
[[ $WITH_EMBED -eq 1 ]] || return 0
|
||||
smoke_test "$label qmd embed tiny fixture" \
|
||||
"$fixture_setup; cd /tmp/qmd-fixture; $extra_env $bin collection add . --name smoke; $extra_env $bin embed --max-docs-per-batch 1 --max-batch-mb 1; $extra_env $bin doctor"
|
||||
}
|
||||
|
||||
run_runtime_matrix() {
|
||||
local label="$1" bin="$2" path_env="$3"
|
||||
smoke_test_output "$label qmd help" "Usage:" "$path_env; $bin"
|
||||
run_doctor_smoke "$label auto" "$path_env; $bin"
|
||||
run_doctor_smoke "$label force-cpu" "$path_env; $bin" "QMD_FORCE_CPU=1"
|
||||
run_collection_smoke "$label" "$path_env; $bin" "QMD_FORCE_CPU=1"
|
||||
run_embed_smoke "$label force-cpu" "$path_env; $bin" "QMD_FORCE_CPU=1"
|
||||
run_embed_smoke "$label auto" "$path_env; $bin"
|
||||
if [[ $WITH_GPU -eq 1 ]]; then
|
||||
local ge
|
||||
ge=$(gpu_env)
|
||||
run_doctor_smoke "$label gpu-$GPU_BACKEND" "$path_env; $bin" "$ge"
|
||||
run_embed_smoke "$label gpu-$GPU_BACKEND" "$path_env; $bin" "$ge"
|
||||
fi
|
||||
}
|
||||
|
||||
run_node_scenario() {
|
||||
local NODE_BIN='$(mise where node@latest)/bin'
|
||||
echo "=== Node (npm install) ==="
|
||||
|
||||
smoke_test_output "qmd shows help" "Usage:" \
|
||||
"export PATH=$NODE_BIN:\$PATH; qmd"
|
||||
|
||||
smoke_test "qmd collection list" \
|
||||
"export PATH=$NODE_BIN:\$PATH; qmd collection list"
|
||||
|
||||
smoke_test "qmd status" \
|
||||
"export PATH=$NODE_BIN:\$PATH; qmd status"
|
||||
|
||||
smoke_test "sqlite-vec loads" \
|
||||
"export PATH=$NODE_BIN:\$PATH;
|
||||
NPM_GLOBAL=\$(npm root -g);
|
||||
node -e \"
|
||||
const {openDatabase, loadSqliteVec} = await import('\$NPM_GLOBAL/@tobilu/qmd/dist/db.js');
|
||||
local bin='qmd'
|
||||
echo "=== Node: npm install -g packed tarball ==="
|
||||
run_runtime_matrix "node" "$bin" "export PATH=$NODE_BIN:\$PATH"
|
||||
smoke_test "node sqlite-vec loads" \
|
||||
"export PATH=$NODE_BIN:\$PATH; NPM_GLOBAL=\$(npm root -g); node -e \"
|
||||
const {openDatabase, loadSqliteVec} = await import('\\$NPM_GLOBAL/@tobilu/qmd/dist/db.js');
|
||||
const db = openDatabase(':memory:');
|
||||
loadSqliteVec(db);
|
||||
const r = db.prepare('SELECT vec_version() as v').get();
|
||||
console.log('sqlite-vec', r.v);
|
||||
if (!r.v) process.exit(1);
|
||||
\""
|
||||
|
||||
smoke_test "vitest (node)" \
|
||||
smoke_test "node vitest store subset" \
|
||||
"export PATH=$NODE_BIN:\$PATH; cd /opt/qmd && npx vitest run --reporter=verbose test/store.test.ts 2>&1 | tail -5"
|
||||
}
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Bun (bun-installed qmd)
|
||||
# ------------------------------------------------------------------
|
||||
run_npx_scenario() {
|
||||
local NODE_BIN='$(mise where node@latest)/bin'
|
||||
local bin='npm exec --yes --package /tmp/tobilu-qmd.tgz -- qmd'
|
||||
echo "=== Node: npm exec/npx-style packed tarball ==="
|
||||
run_runtime_matrix "npx-style" "$bin" "export PATH=$NODE_BIN:\$PATH"
|
||||
}
|
||||
|
||||
run_bun_scenario() {
|
||||
local NODE_BIN='$(mise where node@latest)/bin'
|
||||
local BUN_BIN='$(mise where bun@latest)/bin'
|
||||
echo ""
|
||||
echo "=== Bun (bun install) ==="
|
||||
|
||||
smoke_test_output "qmd shows help" "Usage:" \
|
||||
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd"
|
||||
|
||||
smoke_test "qmd collection list" \
|
||||
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd collection list"
|
||||
|
||||
smoke_test "qmd status" \
|
||||
"export PATH=$BUN_BIN:$NODE_BIN:\$PATH; \$HOME/.bun/bin/qmd status"
|
||||
|
||||
smoke_test "sqlite-vec loads (bun)" \
|
||||
local bin='$HOME/.bun/bin/qmd'
|
||||
echo "=== Bun: bun install -g packed tarball ==="
|
||||
run_runtime_matrix "bun" "$bin" "export PATH=$BUN_BIN:$NODE_BIN:\$PATH"
|
||||
smoke_test "bun sqlite-vec loads" \
|
||||
"export PATH=$BUN_BIN:\$PATH; bun -e \"
|
||||
const {openDatabase, loadSqliteVec} = await import('\$HOME/.bun/install/global/node_modules/@tobilu/qmd/dist/db.js');
|
||||
const {openDatabase, loadSqliteVec} = await import('\\$HOME/.bun/install/global/node_modules/@tobilu/qmd/dist/db.js');
|
||||
const db = openDatabase(':memory:');
|
||||
loadSqliteVec(db);
|
||||
const r = db.prepare('SELECT vec_version() as v').get();
|
||||
console.log('sqlite-vec', r.v);
|
||||
if (!r.v) process.exit(1);
|
||||
\""
|
||||
|
||||
smoke_test "bun test store" \
|
||||
smoke_test "bun test store subset" \
|
||||
"export PATH=$BUN_BIN:\$PATH; cd /opt/qmd && bun test --preload ./src/test-preload.ts --timeout 30000 test/store.test.ts 2>&1 | tail -10"
|
||||
}
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
run_smoke_tests() {
|
||||
case "$SCENARIO" in
|
||||
node) run_node_scenario ;;
|
||||
npx) run_npx_scenario ;;
|
||||
bun) run_bun_scenario ;;
|
||||
all) run_node_scenario; echo; run_npx_scenario; echo; run_bun_scenario ;;
|
||||
*) echo "Unknown scenario: $SCENARIO" >&2; exit 1 ;;
|
||||
esac
|
||||
echo ""
|
||||
echo "=== Results: $PASS passed, $FAIL failed ==="
|
||||
[[ $FAIL -eq 0 ]]
|
||||
}
|
||||
|
||||
# Parse arguments
|
||||
case "${1:-}" in
|
||||
--build)
|
||||
build_image
|
||||
;;
|
||||
--shell)
|
||||
build_image
|
||||
echo "==> Dropping into container shell..."
|
||||
$CTR run --rm -it "$IMAGE" bash
|
||||
;;
|
||||
--)
|
||||
shift
|
||||
run "$@"
|
||||
;;
|
||||
*)
|
||||
build_image
|
||||
echo ""
|
||||
echo "==> Running smoke tests..."
|
||||
run_smoke_tests
|
||||
;;
|
||||
esac
|
||||
if [[ $DO_BUILD -eq 1 ]]; then
|
||||
build_image
|
||||
fi
|
||||
|
||||
if [[ ${#ARBITRARY_CMD[@]} -gt 0 ]]; then
|
||||
run "${ARBITRARY_CMD[*]}"
|
||||
exit $?
|
||||
fi
|
||||
|
||||
if [[ $BUILD_ONLY -eq 1 ]]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
if [[ $SHELL_ONLY -eq 1 ]]; then
|
||||
echo "==> Dropping into container shell..."
|
||||
# shellcheck disable=SC2206
|
||||
gpu=( $(gpu_args) )
|
||||
$CTR run --rm -it "${gpu[@]}" "$IMAGE" bash
|
||||
exit $?
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "==> Running smoke tests..."
|
||||
run_smoke_tests
|
||||
|
||||
@ -9,7 +9,7 @@
|
||||
import { describe, test, expect, beforeAll, afterAll, beforeEach, afterEach, vi } from "vitest";
|
||||
import { openDatabase, loadSqliteVec } from "../src/db.js";
|
||||
import type { Database } from "../src/db.js";
|
||||
import { unlink, mkdtemp, rmdir, writeFile } from "node:fs/promises";
|
||||
import { unlink, mkdtemp, rmdir, writeFile, rm, mkdir, rename } from "node:fs/promises";
|
||||
import { tmpdir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
import YAML from "yaml";
|
||||
@ -26,6 +26,7 @@ import {
|
||||
extractTitle,
|
||||
formatQueryForEmbedding,
|
||||
formatDocForEmbedding,
|
||||
getEmbeddingFingerprint,
|
||||
chunkDocument,
|
||||
chunkDocumentByTokens,
|
||||
chunkDocumentAsync,
|
||||
@ -46,13 +47,22 @@ import {
|
||||
normalizeDocid,
|
||||
isDocid,
|
||||
syncConfigToDb,
|
||||
reindexCollection,
|
||||
STRONG_SIGNAL_MIN_SCORE,
|
||||
STRONG_SIGNAL_MIN_GAP,
|
||||
insertContent,
|
||||
insertDocument,
|
||||
generateEmbeddings,
|
||||
getHybridRrfWeights,
|
||||
_resetProductionModeForTesting,
|
||||
hybridQuery,
|
||||
structuredSearch,
|
||||
vectorSearchQuery,
|
||||
type Store,
|
||||
type DocumentResult,
|
||||
type SearchResult,
|
||||
type RankedResult,
|
||||
type RankedListMeta,
|
||||
} from "../src/store.js";
|
||||
import type { CollectionConfig } from "../src/collections.js";
|
||||
|
||||
@ -156,18 +166,18 @@ async function insertTestDocument(
|
||||
const hash = opts.hash || await hashContent(body);
|
||||
|
||||
// Insert content (with OR IGNORE for deduplication)
|
||||
db.prepare(`
|
||||
INSERT OR IGNORE INTO content (hash, doc, created_at)
|
||||
VALUES (?, ?, ?)
|
||||
`).run(hash, body, now);
|
||||
insertContent(db, hash, body, now);
|
||||
|
||||
// Insert document
|
||||
const result = db.prepare(`
|
||||
INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?)
|
||||
`).run(collectionName, path, title, hash, now, now, active);
|
||||
insertDocument(db, collectionName, path, title, hash, now, now);
|
||||
const row = db.prepare(`
|
||||
SELECT id FROM documents WHERE collection = ? AND path = ?
|
||||
`).get(collectionName, path) as { id: number } | undefined;
|
||||
|
||||
return Number(result.lastInsertRowid);
|
||||
if (active === 0 && row) {
|
||||
db.prepare(`UPDATE documents SET active = 0 WHERE id = ?`).run(row.id);
|
||||
}
|
||||
|
||||
return row?.id ?? 0;
|
||||
}
|
||||
|
||||
/** Sync YAML config file to SQLite store_collections in the current test store */
|
||||
@ -277,7 +287,9 @@ afterAll(async () => {
|
||||
|
||||
describe("Store Creation", () => {
|
||||
test("createStore throws without explicit path in test mode", () => {
|
||||
// In test mode, createStore without path should throw to prevent accidental writes
|
||||
// In test mode, createStore without path should throw to prevent accidental writes.
|
||||
// Other tests may enable production mode in the same Bun process, so reset first.
|
||||
_resetProductionModeForTesting();
|
||||
const originalIndexPath = process.env.INDEX_PATH;
|
||||
delete process.env.INDEX_PATH;
|
||||
|
||||
@ -300,19 +312,127 @@ describe("Store Creation", () => {
|
||||
|
||||
// Check tables exist
|
||||
const tables = store.db.prepare(`
|
||||
SELECT name FROM sqlite_master WHERE type='table' ORDER BY name
|
||||
SELECT name FROM sqlite_master
|
||||
WHERE type='table'
|
||||
ORDER BY name
|
||||
`).all() as { name: string }[];
|
||||
|
||||
const tableNames = tables.map(t => t.name);
|
||||
expect(tableNames).toContain("documents");
|
||||
expect(tableNames).toContain("documents_fts");
|
||||
expect(tableNames).toContain("content_vectors");
|
||||
expect(tableNames).toContain("content");
|
||||
expect(tableNames).toContain("llm_cache");
|
||||
// Note: path_contexts table removed in favor of YAML-based context storage
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("createStore defers content_vectors embed_fingerprint migration until embedding health needs it", async () => {
|
||||
const dbPath = join(testDir, `legacy-${Date.now()}-${Math.random().toString(36).slice(2)}.sqlite`);
|
||||
const model = "hf:test/embed-model.gguf";
|
||||
const legacyDb = openDatabase(dbPath);
|
||||
legacyDb.exec(`
|
||||
CREATE TABLE content (
|
||||
hash TEXT PRIMARY KEY,
|
||||
doc TEXT NOT NULL,
|
||||
created_at TEXT NOT NULL
|
||||
);
|
||||
CREATE TABLE documents (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
collection TEXT NOT NULL,
|
||||
path TEXT NOT NULL,
|
||||
title TEXT,
|
||||
hash TEXT NOT NULL,
|
||||
created_at TEXT NOT NULL,
|
||||
modified_at TEXT NOT NULL,
|
||||
active INTEGER NOT NULL DEFAULT 1,
|
||||
FOREIGN KEY (hash) REFERENCES content(hash) ON DELETE CASCADE,
|
||||
UNIQUE(collection, path)
|
||||
);
|
||||
CREATE TABLE content_vectors (
|
||||
hash TEXT NOT NULL,
|
||||
seq INTEGER NOT NULL DEFAULT 0,
|
||||
pos INTEGER NOT NULL DEFAULT 0,
|
||||
model TEXT NOT NULL,
|
||||
total_chunks INTEGER NOT NULL DEFAULT 1,
|
||||
embedded_at TEXT NOT NULL,
|
||||
PRIMARY KEY (hash, seq)
|
||||
)
|
||||
`);
|
||||
const now = new Date().toISOString();
|
||||
legacyDb.prepare(`INSERT INTO content (hash, doc, created_at) VALUES (?, ?, ?)`).run("hash1", "# Legacy\nbody", now);
|
||||
legacyDb.prepare(`INSERT INTO documents (collection, path, title, hash, created_at, modified_at, active) VALUES (?, ?, ?, ?, ?, ?, 1)`).run("test", "legacy.md", "Legacy", "hash1", now, now);
|
||||
legacyDb.prepare(`INSERT INTO content_vectors (hash, seq, pos, model, total_chunks, embedded_at) VALUES (?, ?, ?, ?, ?, ?)`).run("hash1", 0, 0, model, 1, now);
|
||||
legacyDb.close();
|
||||
|
||||
const store = createStore(dbPath);
|
||||
let columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
|
||||
expect(columns.map(col => col.name)).not.toContain("embed_fingerprint");
|
||||
|
||||
expect(store.getHashesNeedingEmbedding(model)).toBe(1);
|
||||
|
||||
columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
|
||||
const migratedRow = store.db.prepare(`SELECT embed_fingerprint FROM content_vectors WHERE hash = ?`).get("hash1") as { embed_fingerprint: string };
|
||||
expect(columns.map(col => col.name)).toContain("embed_fingerprint");
|
||||
expect(migratedRow.embed_fingerprint).toBe("");
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("content_vectors column repair runs the full ALTER series and retries the failed operation", async () => {
|
||||
const dbPath = join(testDir, `legacy-no-seq-${Date.now()}-${Math.random().toString(36).slice(2)}.sqlite`);
|
||||
const model = "hf:test/embed-model.gguf";
|
||||
const legacyDb = openDatabase(dbPath);
|
||||
legacyDb.exec(`
|
||||
CREATE TABLE content (
|
||||
hash TEXT PRIMARY KEY,
|
||||
doc TEXT NOT NULL,
|
||||
created_at TEXT NOT NULL
|
||||
);
|
||||
CREATE TABLE documents (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
collection TEXT NOT NULL,
|
||||
path TEXT NOT NULL,
|
||||
title TEXT,
|
||||
hash TEXT NOT NULL,
|
||||
created_at TEXT NOT NULL,
|
||||
modified_at TEXT NOT NULL,
|
||||
active INTEGER NOT NULL DEFAULT 1,
|
||||
FOREIGN KEY (hash) REFERENCES content(hash) ON DELETE CASCADE,
|
||||
UNIQUE(collection, path)
|
||||
);
|
||||
CREATE TABLE content_vectors (
|
||||
hash TEXT NOT NULL,
|
||||
model TEXT NOT NULL,
|
||||
embed_fingerprint TEXT NOT NULL DEFAULT '',
|
||||
total_chunks INTEGER NOT NULL DEFAULT 1,
|
||||
embedded_at TEXT NOT NULL
|
||||
)
|
||||
`);
|
||||
legacyDb.close();
|
||||
|
||||
const store = createStore(dbPath);
|
||||
let columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
|
||||
expect(columns.map(col => col.name)).not.toContain("seq");
|
||||
expect(columns.map(col => col.name)).not.toContain("pos");
|
||||
|
||||
store.ensureVecTable(3);
|
||||
store.insertEmbedding("hash1", 1, 42, new Float32Array([1, 2, 3]), model, new Date().toISOString(), 2);
|
||||
|
||||
columns = store.db.prepare(`PRAGMA table_info(content_vectors)`).all() as { name: string }[];
|
||||
const columnNames = columns.map(col => col.name);
|
||||
expect(columnNames).toEqual(expect.arrayContaining(["seq", "pos", "model", "embed_fingerprint", "total_chunks", "embedded_at"]));
|
||||
expect(store.db.prepare(`SELECT seq, pos, model, total_chunks FROM content_vectors WHERE hash = ?`).get("hash1")).toEqual({
|
||||
seq: 1,
|
||||
pos: 42,
|
||||
model,
|
||||
total_chunks: 2,
|
||||
});
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("createStore sets WAL journal mode", async () => {
|
||||
const store = await createTestStore();
|
||||
const result = store.db.prepare("PRAGMA journal_mode").get() as { journal_mode: string };
|
||||
@ -1250,6 +1370,61 @@ describe("FTS Search", () => {
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("searchFTS finds CJK documents by exact and mixed queries", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "zh",
|
||||
title: "中文检索说明",
|
||||
body: "这里介绍 vector 数据库和关键词检索。",
|
||||
displayPath: "cjk/zh.md",
|
||||
});
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "ja",
|
||||
title: "日本語検索メモ",
|
||||
body: "この文書は検索品質とトークン化について説明します。",
|
||||
displayPath: "cjk/ja.md",
|
||||
});
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "ko",
|
||||
title: "한국어 검색 노트",
|
||||
body: "이 문서는 검색 품질과 토큰화 문제를 설명합니다.",
|
||||
displayPath: "cjk/ko.md",
|
||||
});
|
||||
|
||||
expect(store.searchFTS("关键词检索", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/zh.md`);
|
||||
expect(store.searchFTS("検索品質", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/ja.md`);
|
||||
expect(store.searchFTS("검색 품질", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/ko.md`);
|
||||
expect(store.searchFTS("vector 关键词", 10).map(r => r.displayPath)).toContain(`${collectionName}/cjk/zh.md`);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("searchFTS keeps English behavior while indexing CJK text", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "english",
|
||||
title: "Vector Search Notes",
|
||||
body: "The quick brown fox explains vector search and BM25 ranking.",
|
||||
displayPath: "english.md",
|
||||
});
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "zh",
|
||||
title: "中文检索说明",
|
||||
body: "这里介绍向量数据库和关键词检索。",
|
||||
displayPath: "zh.md",
|
||||
});
|
||||
|
||||
const foxResults = store.searchFTS("quick fox", 10);
|
||||
expect(foxResults.map(r => r.displayPath)).toContain(`${collectionName}/english.md`);
|
||||
expect(foxResults.map(r => r.displayPath)).not.toContain(`${collectionName}/zh.md`);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("searchFTS handles special characters in query", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
@ -1429,6 +1604,39 @@ describe("FTS Search", () => {
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("searchFTS matches dotted version strings like 2026.4.10 (#563)", async () => {
|
||||
// Regression test: porter unicode61 tokenizer splits on dots, so the index
|
||||
// stores "2026", "4", "10" as separate tokens. Before the fix, sanitizeFTS5Term
|
||||
// stripped the dots producing "2026410" which never matched anything.
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "release-notes",
|
||||
title: "Release Notes",
|
||||
body: "## Release 2026.4.10\n\nThis version introduces new features and bug fixes.",
|
||||
displayPath: "test/release-notes.md",
|
||||
});
|
||||
|
||||
// A document that does NOT contain the version string
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "other-doc",
|
||||
title: "Other Document",
|
||||
body: "Unrelated content about gardening and cooking.",
|
||||
displayPath: "test/other.md",
|
||||
});
|
||||
|
||||
const results = store.searchFTS("2026.4.10", 10);
|
||||
expect(results.length).toBeGreaterThan(0);
|
||||
expect(results.map(r => r.displayPath)).toContain(`${collectionName}/test/release-notes.md`);
|
||||
|
||||
// Partial version should also work
|
||||
const partial = store.searchFTS("2026.4", 10);
|
||||
expect(partial.map(r => r.displayPath)).toContain(`${collectionName}/test/release-notes.md`);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
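The dotted-version regression test above turns on how the query term is split before it reaches FTS5. The following is a minimal sketch of one way to keep `2026.4.10` searchable: split the term on punctuation the way the tokenizer would, then query the pieces as an FTS5 phrase instead of deleting the punctuation. The helper names `splitLikeTokenizer` and `toFtsPhrase` are hypothetical and are not qmd's actual `sanitizeFTS5Term`; the separator set is also an assumption.

```typescript
// Hypothetical sketch, not qmd's sanitizeFTS5Term.
// Assumption: only this ASCII punctuation is treated as a separator;
// every other character (including CJK text) is kept as-is.
function splitLikeTokenizer(term: string): string[] {
  return term.split(/[.,;:!?\/\\_-]+/).filter(function (part) {
    return part.length > 0;
  });
}

// Build an FTS5 phrase so the sub-tokens must appear consecutively.
function toFtsPhrase(term: string): string {
  const parts = splitLikeTokenizer(term);
  if (parts.length === 0) return "";
  if (parts.length === 1) return parts[0] as string;
  return '"' + parts.join(" ") + '"';
}

// Stripping the dots yields a token the index never stored.
console.log("2026.4.10".replace(/\./g, "")); // 2026410
// Splitting keeps the same tokens the index holds.
console.log(toFtsPhrase("2026.4.10")); // "2026 4 10"
console.log(toFtsPhrase("2026.4")); // "2026 4"
```

A phrase query is used here so that `2026.4` matches only where `2026` is immediately followed by `4`, which is why the partial-version assertion in the test can still succeed.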
@ -1647,6 +1855,21 @@ describe("Document Retrieval", () => {
|
||||
expect(body).toBeNull();
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("getDocumentBody clamps negative fromLine to top of document", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection({ pwd: "/path" });
|
||||
await insertTestDocument(store.db, collectionName, {
|
||||
name: "mydoc",
|
||||
displayPath: "mydoc.md",
|
||||
body: "Line 1\nLine 2\nLine 3\nLine 4\nLine 5",
|
||||
});
|
||||
|
||||
const body = store.getDocumentBody({ filepath: "/path/mydoc.md" }, -19, 80);
|
||||
expect(body).toBe("Line 1\nLine 2\nLine 3\nLine 4\nLine 5");
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
});
|
||||
|
||||
describe("findDocuments (multi-get)", () => {
|
||||
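The `getDocumentBody` test above expects a negative `fromLine` to behave as if the caller had asked for the top of the document. A minimal sketch of that clamping follows. The function `sliceLines` is hypothetical and not part of qmd, and it assumes `fromLine` is 1-based and that `maxLines` is counted from the clamped start.

```typescript
// Hypothetical sketch; `sliceLines` is illustrative, not qmd's getDocumentBody.
// Assumptions: fromLine is 1-based; maxLines counts from the clamped start line.
function sliceLines(body: string, fromLine: number, maxLines: number): string {
  const lines = body.split("\n");
  const start = Math.max(1, fromLine) - 1; // clamp to line 1, then convert to 0-based
  const count = Math.max(0, maxLines);
  return lines.slice(start, start + count).join("\n");
}

const doc = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5";
console.log(sliceLines(doc, -19, 80) === doc); // true
console.log(sliceLines(doc, 2, 2)); // Line 2, then Line 3
```

The clamp matters because a negative start handed directly to `Array.prototype.slice` counts from the end of the array rather than from the beginning.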
@ -1903,6 +2126,26 @@ describe("Snippet Extraction", () => {
|
||||
expect(linesAfter).toBe(2); // Fourth, Fifth
|
||||
});
|
||||
|
||||
test("extractSnippet with leading blank/frontmatter lines reports 1 before, not 0", () => {
|
||||
// Regression: a user looked at `@@ -2,4 @@ (1 before, 72 after)` and
|
||||
// suspected "1 before" was wrong because the match appeared to be the
|
||||
// topmost visible line. The math takes "before" from the absolute file
|
||||
// line, not from the visible portion of the snippet — so when the
|
||||
// snippet starts at line 2, "1 before" is the correct count. Lock that
|
||||
// in with a 77-line document whose match sits on line 3.
|
||||
const otherLines = Array.from({ length: 72 }, (_, i) => `body line ${i + 6}`).join("\n");
|
||||
const body = `---\ntitle: Notes\n# Heading with keyword\nIntro paragraph.\nMore intro lines.\n${otherLines}`;
|
||||
|
||||
const { line, linesBefore, snippetLines, linesAfter, snippet } =
|
||||
extractSnippet(body, "keyword", 500);
|
||||
|
||||
expect(line).toBe(3); // match is on line 3
|
||||
expect(linesBefore).toBe(1); // exactly one line above the 4-line snippet window
|
||||
expect(snippetLines).toBe(4); // lines 2..5 form the snippet
|
||||
expect(linesAfter).toBe(72); // remaining body
|
||||
expect(snippet).toContain("@@ -2,4 @@ (1 before, 72 after)");
|
||||
});
|
||||
|
||||
test("extractSnippet at document end shows 0 after", () => {
|
||||
const body = "First\nSecond\nThird\nFourth\nFifth keyword";
|
||||
const { linesBefore, linesAfter, snippetLines, line } = extractSnippet(body, "keyword", 500);
|
||||
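The frontmatter regression test above relies on the "before" and "after" counts being measured against absolute file lines, not against what is visible in the snippet. The arithmetic can be sketched as below. The function `snippetHeader` is a hypothetical helper for illustration and is not taken from qmd's `extractSnippet`.

```typescript
// Hypothetical sketch of the header arithmetic; not qmd's extractSnippet.
// startLine is the 1-based file line where the snippet window begins.
function snippetHeader(totalLines: number, startLine: number, snippetLines: number): string {
  const linesBefore = startLine - 1; // file lines above the window
  const lastLine = startLine + snippetLines - 1; // last file line inside the window
  const linesAfter = totalLines - lastLine; // file lines below the window
  return "@@ -" + startLine + "," + snippetLines + " @@ (" +
    linesBefore + " before, " + linesAfter + " after)";
}

// 77-line document, window covering lines 2..5 (match on line 3).
console.log(snippetHeader(77, 2, 4)); // @@ -2,4 @@ (1 before, 72 after)
```

With the window starting on line 2, exactly one file line sits above it, so "1 before" is the expected count even when the match looks like the topmost visible line.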
@ -1935,6 +2178,33 @@ describe("Snippet Extraction", () => {
|
||||
expect(line).toBe(51); // "Target keyword" is line 51
|
||||
expect(linesBefore).toBeGreaterThan(40); // Many lines before
|
||||
});
|
||||
|
||||
test("extractSnippet anchors on chunkPos when lexical scoring finds no match", () => {
|
||||
// The snippet tokenizer does not strip FTS5 syntax, so a quoted-phrase query
|
||||
// tokenises into terms with embedded quotes that never appear in body text.
|
||||
// bestScore stays at 0 even though the reranker correctly identified a chunk;
|
||||
// the fallback should anchor on chunkPos rather than defaulting to line 1.
|
||||
const padLine = "Lorem ipsum dolor sit amet\n";
|
||||
const padding = padLine.repeat(100);
|
||||
const body = padding + "chunk content here\nmore chunk content\n" + padding;
|
||||
const chunkPos = padding.length;
|
||||
|
||||
const { line } = extractSnippet(body, '"unrelated quoted phrase"', 200, chunkPos);
|
||||
|
||||
expect(line).toBeGreaterThan(50);
|
||||
expect(line).toBeLessThan(110);
|
||||
});
|
||||
|
||||
test("extractSnippet with chunkPos=0 falls back to full-body scan when chunk has no match", () => {
|
||||
// chunkPos=0 may be the chunk selector's bestIdx=0 default rather than a real
|
||||
// first-chunk hit, so the fallback must consider matches outside chunk 0.
|
||||
const padding = "Lorem ipsum dolor sit amet\n".repeat(200);
|
||||
const body = padding + "TARGET_KEYWORD line content\ntail line\n";
|
||||
|
||||
const { line } = extractSnippet(body, "TARGET_KEYWORD", 200, 0);
|
||||
|
||||
expect(line).toBe(201);
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
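The `chunkPos` fallback tests above need a character offset translated into a line number so the snippet can anchor there when lexical scoring finds nothing. A minimal sketch of that translation follows. The name `lineAtOffset` is hypothetical, and the sketch assumes `chunkPos` is a character offset into the body with lines separated by `\n`.

```typescript
// Hypothetical sketch; `lineAtOffset` is illustrative, not qmd's implementation.
// Assumption: offset is a character index into body, and lines end with "\n".
function lineAtOffset(body: string, offset: number): number {
  const clamped = Math.min(Math.max(0, offset), body.length);
  let line = 1;
  for (let i = 0; i < clamped; i++) {
    if (body.charCodeAt(i) === 10) line++; // 10 is "\n"
  }
  return line;
}

// Same shape as the test fixture: 100 padding lines, the chunk, 100 more lines.
const padLine = "Lorem ipsum dolor sit amet\n";
let padding = "";
for (let i = 0; i < 100; i++) padding += padLine;
const body = padding + "chunk content here\nmore chunk content\n" + padding;

console.log(lineAtOffset(body, padding.length)); // 101
```

Line 101 falls inside the 50..110 range the test asserts, whereas defaulting to line 1 would not.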
@ -1988,6 +2258,38 @@ describe("Reciprocal Rank Fusion", () => {
|
||||
expect(fused[0]!.file).toBe("doc1");
|
||||
});
|
||||
|
||||
test("hybrid RRF weights boost original vector evidence over expansion-only hits", () => {
|
||||
const originalFtsOnly = makeResult("original-fts-only.md", 0.95);
|
||||
const expansionOnly = makeResult("lex-expansion-only.md", 0.95);
|
||||
const originalVector = makeResult("original-vector.md", 0.95);
|
||||
|
||||
// Mirrors hybridQuery's common list order when a lex expansion exists:
|
||||
// original FTS, lex expansion FTS, original vector.
|
||||
const rankedLists = [
|
||||
[originalFtsOnly],
|
||||
[expansionOnly],
|
||||
[originalVector],
|
||||
];
|
||||
const rankedListMeta: RankedListMeta[] = [
|
||||
{ source: "fts", queryType: "original", query: "user query" },
|
||||
{ source: "fts", queryType: "lex", query: "lex expansion" },
|
||||
{ source: "vec", queryType: "original", query: "user query" },
|
||||
];
|
||||
|
||||
const positionBasedWeights = rankedLists.map((_, i) => i < 2 ? 2.0 : 1.0);
|
||||
const buggyOrder = reciprocalRankFusion(rankedLists, positionBasedWeights);
|
||||
|
||||
expect(buggyOrder.findIndex(r => r.file === "lex-expansion-only.md"))
|
||||
.toBeLessThan(buggyOrder.findIndex(r => r.file === "original-vector.md"));
|
||||
|
||||
const semanticWeights = getHybridRrfWeights(rankedListMeta);
|
||||
const fixedOrder = reciprocalRankFusion(rankedLists, semanticWeights);
|
||||
|
||||
expect(semanticWeights).toEqual([2.0, 1.0, 2.0]);
|
||||
expect(fixedOrder.findIndex(r => r.file === "original-vector.md"))
|
||||
.toBeLessThan(fixedOrder.findIndex(r => r.file === "lex-expansion-only.md"));
|
||||
});
|
||||
|
||||
test("RRF adds top-rank bonus", () => {
|
||||
// doc1 is #1 in list1, doc2 is #2 in list1
|
||||
const list1 = [makeResult("doc1", 0.9), makeResult("doc2", 0.8)];
|
||||
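Reviewer note on the hybrid RRF weighting test in this hunk: the regression it pins down can be reproduced with a small weighted-RRF sketch. Everything below is illustrative and not the project's implementation. `sketchWeights` and `sketchRrf` are hypothetical names, the rule "original-query lists get 2.0, everything else 1.0" is inferred only from the expected `[2.0, 1.0, 2.0]` in the test, `k = 60` is an assumed constant, and the top-rank bonus that the real `reciprocalRankFusion` applies is left out.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
type ListMeta = { source: string; queryType: string };
type Ranked = { file: string };

// Assumption: a list built from the user's original query gets weight 2.0,
// any expansion list gets 1.0, regardless of its position in the array.
function sketchWeights(meta: ListMeta[]): number[] {
  return meta.map((m) => (m.queryType === "original" ? 2.0 : 1.0));
}

// Plain weighted RRF: score(d) = sum over lists of weight / (k + rank + 1).
function sketchRrf(lists: Ranked[][], weights: number[], k = 60): Ranked[] {
  const scores = new Map<string, number>();
  lists.forEach((list, i) => {
    const w = weights[i] ?? 1.0;
    list.forEach((r, rank) => {
      scores.set(r.file, (scores.get(r.file) ?? 0) + w / (k + rank + 1));
    });
  });
  // Array.prototype.sort is stable, so ties keep first-seen order.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([file]) => ({ file }));
}

const lists: Ranked[][] = [[{ file: "fts.md" }], [{ file: "lex.md" }], [{ file: "vec.md" }]];
const meta: ListMeta[] = [
  { source: "fts", queryType: "original" },
  { source: "fts", queryType: "lex" },
  { source: "vec", queryType: "original" },
];

// Position-based weights favor the second list just because of its index.
const byPosition = sketchRrf(lists, [2.0, 2.0, 1.0]).map((r) => r.file);
// Metadata-based weights favor the original vector list instead.
const byMeta = sketchRrf(lists, sketchWeights(meta)).map((r) => r.file);
```

With position-based weights the lex expansion outranks the original vector hit. With metadata-derived weights the order flips, which is the behavior the test asserts.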
@ -2020,6 +2322,65 @@ describe("Reciprocal Rank Fusion", () => {
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
// Reindex Collection Tests
|
||||
// =============================================================================
|
||||
|
||||
describe("Reindex Collection", () => {
|
||||
test("preserves document id and embeddings when file path changes only by case", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = "docs";
|
||||
const collectionPath = join(testDir, `case-rename-${Date.now()}-${Math.random().toString(36).slice(2)}`);
|
||||
await mkdir(collectionPath, { recursive: true });
|
||||
|
||||
const originalPath = join(collectionPath, "README.md");
|
||||
const renamedPath = join(collectionPath, "readme.md");
|
||||
const body = "# Case Rename\n\nContent that should keep the same embedding.";
|
||||
await writeFile(originalPath, body);
|
||||
|
||||
const firstResult = await reindexCollection(store, collectionPath, "**/*.md", collectionName);
|
||||
expect(firstResult.indexed).toBe(1);
|
||||
|
||||
const before = store.db.prepare(`
|
||||
SELECT id, path, hash FROM documents
|
||||
WHERE collection = ? AND active = 1
|
||||
`).get(collectionName) as { id: number; path: string; hash: string };
|
||||
expect(before.path).toBe("README.md");
|
||||
|
||||
store.db.prepare(`
|
||||
INSERT INTO content_vectors (hash, seq, pos, model, embedded_at)
|
||||
VALUES (?, 0, 0, 'test-model', ?)
|
||||
`).run(before.hash, new Date().toISOString());
|
||||
|
||||
await rename(originalPath, renamedPath);
|
||||
|
||||
const secondResult = await reindexCollection(store, collectionPath, "**/*.md", collectionName);
|
||||
expect(secondResult.indexed).toBe(0);
|
||||
expect(secondResult.unchanged).toBe(1);
|
||||
expect(secondResult.removed).toBe(0);
|
||||
|
||||
const afterRows = store.db.prepare(`
|
||||
SELECT id, path, hash, active FROM documents
|
||||
WHERE collection = ?
|
||||
ORDER BY id
|
||||
`).all(collectionName) as { id: number; path: string; hash: string; active: number }[];
|
||||
expect(afterRows).toHaveLength(1);
|
||||
expect(afterRows[0]).toMatchObject({ id: before.id, path: "readme.md", hash: before.hash, active: 1 });
|
||||
|
||||
const vectorCount = store.db.prepare(`
|
||||
SELECT COUNT(*) AS count FROM content_vectors WHERE hash = ?
|
||||
`).get(before.hash) as { count: number };
|
||||
expect(vectorCount.count).toBe(1);
|
||||
|
||||
const ftsRows = store.db.prepare(`
|
||||
SELECT rowid, filepath FROM documents_fts WHERE rowid = ?
|
||||
`).all(before.id) as { rowid: number; filepath: string }[];
|
||||
expect(ftsRows).toEqual([{ rowid: before.id, filepath: "docs/readme.md" }]);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
// Index Status Tests
|
||||
// =============================================================================
|
||||
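Reviewer note on the case-only rename test in this hunk: one way to get the asserted outcome (`indexed: 0`, `unchanged: 1`, `removed: 0`, same document id) is to match indexed rows to on-disk files by a case-folded path and compare content hashes. The sketch below is an assumption about the approach, not the code under test. `sketchReconcile` and its types are hypothetical, and case folding is only safe where two paths differing by case cannot coexist.

```typescript
// Hypothetical row shapes for illustration only.
type Indexed = { id: number; path: string; hash: string };
type OnDisk = { path: string; hash: string };

function sketchReconcile(indexed: Indexed[], disk: OnDisk[]) {
  // Index existing rows by lower-cased path so "README.md" and "readme.md" collide.
  const byFoldedPath = new Map<string, Indexed>();
  for (const doc of indexed) byFoldedPath.set(doc.path.toLowerCase(), doc);

  const stats = { indexed: 0, unchanged: 0, removed: 0 };
  const seen = new Set<number>();
  for (const file of disk) {
    const match = byFoldedPath.get(file.path.toLowerCase());
    if (match && match.hash === file.hash) {
      match.path = file.path; // keep id and hash, adopt the on-disk casing
      seen.add(match.id);
      stats.unchanged++;
    } else {
      stats.indexed++; // new or changed content would be (re)indexed
      if (match) seen.add(match.id);
    }
  }
  // Rows with no matching file on disk count as removed.
  stats.removed = indexed.filter((d) => !seen.has(d.id)).length;
  return stats;
}

const rows: Indexed[] = [{ id: 1, path: "README.md", hash: "h1" }];
const stats = sketchReconcile(rows, [{ path: "readme.md", hash: "h1" }]);
```

Because the row is updated in place, anything keyed by the document id or content hash (embeddings, FTS rows) stays attached to it.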
@ -2082,6 +2443,43 @@ describe("Index Status", () => {
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("embedding health is scoped to the active embed model", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
const activeModel = "hf:active/embed-model.gguf";
|
||||
const staleModel = "hf:stale/embed-model.gguf";
|
||||
const now = new Date().toISOString();
|
||||
|
||||
store.llm = { embedModelName: activeModel } as any;
|
||||
store.ensureVecTable(3);
|
||||
await insertTestDocument(store.db, collectionName, { name: "doc1", hash: "hash1" });
|
||||
store.insertEmbedding("hash1", 0, 0, new Float32Array([1, 2, 3]), staleModel, now, 1);
|
||||
|
||||
expect(store.getHashesNeedingEmbedding()).toBe(1);
|
||||
expect(store.getStatus().needsEmbedding).toBe(1);
|
||||
expect(store.getIndexHealth().needsEmbedding).toBe(1);
|
||||
expect(store.getHashesNeedingEmbedding(staleModel)).toBe(0);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("embedding health treats stale fingerprints as needing re-embedding", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
const model = "hf:test/embed-model.gguf";
|
||||
const now = new Date().toISOString();
|
||||
|
||||
store.llm = { embedModelName: model } as any;
|
||||
store.ensureVecTable(3);
|
||||
await insertTestDocument(store.db, collectionName, { name: "doc1", hash: "hash1" });
|
||||
store.insertEmbedding("hash1", 0, 0, new Float32Array([1, 2, 3]), model, now, 1, "stale1");
|
||||
|
||||
expect(getEmbeddingFingerprint(model)).toMatch(/^[a-f0-9]{6}$/);
|
||||
expect(store.getHashesNeedingEmbedding()).toBe(1);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("getIndexHealth returns health info", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection();
|
||||
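Reviewer note on the two embedding-health tests in this hunk: both reduce to the rule that a content hash only counts as embedded if it has a vector row for the requested model and, when a fingerprint is tracked, the current fingerprint. The sketch below models that rule over plain arrays. `sketchHashesNeedingEmbedding` and `VectorRow` are hypothetical, and how the real store computes or stores fingerprints is not shown in this diff.

```typescript
// Hypothetical in-memory stand-in for rows in content_vectors.
type VectorRow = { hash: string; model: string; fingerprint?: string };

function sketchHashesNeedingEmbedding(
  activeHashes: string[],
  vectors: VectorRow[],
  model: string,
  fingerprint?: string,
): number {
  // A hash is "current" only if some row matches the model and,
  // when a fingerprint is supplied, that fingerprint as well.
  const current = new Set(
    vectors
      .filter((v) => v.model === model && (fingerprint === undefined || v.fingerprint === fingerprint))
      .map((v) => v.hash),
  );
  return activeHashes.filter((h) => !current.has(h)).length;
}

const staleRows: VectorRow[] = [{ hash: "hash1", model: "hf:stale/embed-model.gguf" }];
const needsActive = sketchHashesNeedingEmbedding(["hash1"], staleRows, "hf:active/embed-model.gguf");
const needsStale = sketchHashesNeedingEmbedding(["hash1"], staleRows, "hf:stale/embed-model.gguf");

const oldPrint: VectorRow[] = [{ hash: "hash1", model: "m", fingerprint: "stale1" }];
const needsRefingerprint = sketchHashesNeedingEmbedding(["hash1"], oldPrint, "m", "abc123");
```

This mirrors the assertions above: a vector from a stale model does not satisfy the active model, and a vector with an outdated fingerprint is treated as missing.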
@ -2256,6 +2654,33 @@ describe("Vector Table", () => {
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
|
||||
test("insertEmbedding is idempotent for an existing vec0 hash_seq (#598)", async () => {
|
||||
const store = await createTestStore();
|
||||
store.ensureVecTable(2);
|
||||
|
||||
const hash = "existinghashseq";
|
||||
const first = new Float32Array([0.1, 0.2]);
|
||||
const second = new Float32Array([0.3, 0.4]);
|
||||
const now = new Date().toISOString();
|
||||
|
||||
store.db.prepare(`INSERT INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${hash}_0`, first);
|
||||
|
||||
// Reproduces sqlite-vec's broken conflict handling: vec0 does not honor OR REPLACE.
|
||||
expect(() => {
|
||||
store.db.prepare(`INSERT OR REPLACE INTO vectors_vec (hash_seq, embedding) VALUES (?, ?)`).run(`${hash}_0`, second);
|
||||
}).toThrow(/UNIQUE constraint failed/i);
|
||||
|
||||
// QMD must therefore use DELETE + INSERT when upserting the vector row.
|
||||
expect(() => store.insertEmbedding(hash, 0, 0, second, "test-model", now)).not.toThrow();
|
||||
|
||||
const vectorCount = store.db.prepare(`SELECT COUNT(*) AS count FROM vectors_vec WHERE hash_seq = ?`).get(`${hash}_0`) as { count: number };
|
||||
const metadataCount = store.db.prepare(`SELECT COUNT(*) AS count FROM content_vectors WHERE hash = ? AND seq = 0`).get(hash) as { count: number };
|
||||
expect(vectorCount.count).toBe(1);
|
||||
expect(metadataCount.count).toBe(1);
|
||||
|
||||
await cleanupTestDb(store);
|
||||
});
|
||||
});
|
||||
|
||||
// =============================================================================
|
||||
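Reviewer note on the `insertEmbedding` idempotency test (#598) in this hunk: the comments there say the vector table rejects a duplicate key even under `INSERT OR REPLACE`, so the upsert has to be a delete followed by an insert. The sketch below shows only that pattern against an in-memory stand-in. `StrictTable` and `sketchUpsert` are hypothetical, no SQLite is involved, and in a real database the two statements would normally run inside one transaction.

```typescript
// Hypothetical stand-in for a table whose insert always fails on a duplicate key.
class StrictTable<V> {
  private rows = new Map<string, V>();

  insert(key: string, value: V): void {
    if (this.rows.has(key)) throw new Error("UNIQUE constraint failed");
    this.rows.set(key, value);
  }

  delete(key: string): void {
    this.rows.delete(key); // deleting a missing key is a no-op
  }

  get(key: string): V | undefined {
    return this.rows.get(key);
  }

  get size(): number {
    return this.rows.size;
  }
}

// Upsert as DELETE + INSERT so repeated calls with the same key never throw.
function sketchUpsert<V>(table: StrictTable<V>, key: string, value: V): void {
  table.delete(key);
  table.insert(key, value);
}

const vectorsTable = new StrictTable<number[]>();
vectorsTable.insert("existinghashseq_0", [0.1, 0.2]);

let plainInsertThrew = false;
try {
  vectorsTable.insert("existinghashseq_0", [0.3, 0.4]);
} catch {
  plainInsertThrew = true;
}

sketchUpsert(vectorsTable, "existinghashseq_0", [0.3, 0.4]);
```

After the upsert there is still exactly one row for the key, holding the second vector.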
@ -2263,6 +2688,47 @@ describe("Vector Table", () => {
|
||||
// =============================================================================
|
||||
|
||||
describe("Integration", () => {
|
||||
test("reindexCollection soft-deletes removed files and preserves inactive content (#585)", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionDir = await mkdtemp(join(testDir, "orphan-regression-"));
|
||||
const collectionName = "orphan-regression";
|
||||
|
||||
try {
|
||||
for (let i = 1; i <= 5; i++) {
|
||||
await writeFile(join(collectionDir, `doc-${i}.md`), `# Doc ${i}\n\nUnique body ${i}`);
|
||||
}
|
||||
|
||||
await createTestCollection({ pwd: collectionDir, glob: "**/*.md", name: collectionName });
|
||||
|
||||
const initial = await reindexCollection(store, collectionDir, "**/*.md", collectionName);
|
||||
expect(initial.indexed).toBe(5);
|
||||
expect(initial.removed).toBe(0);
|
||||
|
||||
await rm(join(collectionDir, "doc-3.md"));
|
||||
await rm(join(collectionDir, "doc-4.md"));
|
||||
await rm(join(collectionDir, "doc-5.md"));
|
||||
|
||||
const afterDelete = await reindexCollection(store, collectionDir, "**/*.md", collectionName);
|
||||
expect(afterDelete.removed).toBe(3);
|
||||
|
||||
const counts = store.db.prepare(`
|
||||
SELECT
|
||||
SUM(CASE WHEN active = 1 THEN 1 ELSE 0 END) AS active,
|
||||
SUM(CASE WHEN active = 0 THEN 1 ELSE 0 END) AS inactive,
|
||||
COUNT(*) AS total
|
||||
FROM documents
|
||||
WHERE collection = ?
|
||||
`).get(collectionName) as { active: number; inactive: number; total: number };
|
||||
const contentCount = store.db.prepare(`SELECT COUNT(*) AS count FROM content`).get() as { count: number };
|
||||
|
||||
expect(counts).toEqual({ active: 2, inactive: 3, total: 5 });
|
||||
expect(contentCount.count).toBe(5);
|
||||
} finally {
|
||||
await rm(collectionDir, { recursive: true, force: true });
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("full document lifecycle: create, search, retrieve", async () => {
|
||||
const store = await createTestStore();
|
||||
const collectionName = await createTestCollection({ pwd: "/test/notes", glob: "**/*.md" });
|
||||
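Reviewer note on the soft-delete regression test (#585) in this hunk: the asserted counts (2 active, 3 inactive, 5 total, 5 content rows) follow if a reindex only flips an `active` flag for files that disappeared and never deletes the row or its content. A minimal sketch of that behavior, with the hypothetical names `DocRow` and `sketchSoftDelete` and an in-memory array instead of the real `documents` table:

```typescript
// Hypothetical row shape; the real table has more columns.
type DocRow = { path: string; hash: string; active: boolean };

// Mark rows inactive when their file is gone; return how many were deactivated.
function sketchSoftDelete(rows: DocRow[], onDisk: Set<string>): number {
  let removed = 0;
  for (const row of rows) {
    if (row.active && !onDisk.has(row.path)) {
      row.active = false; // soft delete: the row and its content hash stay
      removed++;
    }
  }
  return removed;
}

const docs: DocRow[] = [1, 2, 3, 4, 5].map((i) => ({
  path: `doc-${i}.md`,
  hash: `hash-${i}`,
  active: true,
}));

// doc-3, doc-4 and doc-5 were deleted from disk.
const removedCount = sketchSoftDelete(docs, new Set(["doc-1.md", "doc-2.md"]));
const activeCount = docs.filter((d) => d.active).length;
const inactiveCount = docs.filter((d) => !d.active).length;
```

Running the function a second time with the same disk state returns 0, since already-inactive rows are skipped.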
@ -2802,6 +3268,219 @@ describe("Embedding batching", () => {
|
||||
}
|
||||
});
|
||||
|
||||
test("generateEmbeddings uses the active llm embed model when no explicit model is passed", async () => {
|
||||
const store = await createTestStore();
|
||||
const db = store.db;
|
||||
const fakeLlm = createFakeEmbedLlm();
|
||||
const model = "hf:env/embed-model.gguf";
|
||||
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.llm = { ...fakeLlm, embedModelName: model } as any;
|
||||
|
||||
try {
|
||||
await insertTestDocument(db, "docs", { name: "one", body: "# One\n\nAlpha" });
|
||||
|
||||
const result = await generateEmbeddings(store);
|
||||
|
||||
expect(result.chunksEmbedded).toBe(1);
|
||||
expect(fakeLlm.embedCalls[0]?.options?.model).toBe(model);
|
||||
expect(fakeLlm.embedBatchModelCalls).toEqual([{ model }]);
|
||||
expect(db.prepare(`SELECT DISTINCT model FROM content_vectors`).all()).toEqual([{ model }]);
|
||||
} finally {
|
||||
setDefaultLlamaCpp(null);
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("generateEmbeddings does not mark a partially embedded multi-chunk document complete", async () => {
|
||||
const store = await createTestStore();
|
||||
const db = store.db;
|
||||
let embedCalls = 0;
|
||||
const fakeLlm = {
|
||||
async embed(_text: string, _options?: { model?: string }) {
|
||||
embedCalls++;
|
||||
return embedCalls === 1
|
||||
? { embedding: [0.1, 0.2, 0.3], model: "fake-embed" }
|
||||
: null;
|
||||
},
|
||||
async embedBatch(texts: string[], _options?: { model?: string }) {
|
||||
return texts.map((_text, index) => index === 0
|
||||
? { embedding: [1, 2, 3], model: "fake-embed" }
|
||||
: null
|
||||
);
|
||||
},
|
||||
};
|
||||
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.llm = fakeLlm as any;
|
||||
|
||||
try {
|
||||
await insertTestDocument(db, "docs", {
|
||||
name: "long-doc",
|
||||
body: "# Long doc\n\n" + "partial embedding regression ".repeat(260),
|
||||
});
|
||||
|
||||
const result = await generateEmbeddings(store);
|
||||
|
||||
expect(result.errors).toBeGreaterThan(0);
|
||||
expect(result.failures?.[0]?.attempts).toBe(3);
|
||||
expect(db.prepare(`SELECT COUNT(*) as count FROM content_vectors`).get()).toEqual({ count: 0 });
|
||||
expect(db.prepare(`SELECT COUNT(*) as count FROM vectors_vec`).get()).toEqual({ count: 0 });
|
||||
expect(store.getHashesNeedingEmbedding()).toBe(1);
|
||||
expect(store.getStatus().needsEmbedding).toBe(1);
|
||||
} finally {
|
||||
setDefaultLlamaCpp(null);
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("generateEmbeddings clears chunk errors after successful retry", async () => {
|
||||
const store = await createTestStore();
|
||||
const db = store.db;
|
||||
const fakeLlm = {
|
||||
async embed(_text: string, _options?: { model?: string }) {
|
||||
return { embedding: [0.1, 0.2, 0.3], model: "fake-embed" };
|
||||
},
|
||||
async embedBatch(texts: string[], _options?: { model?: string }) {
|
||||
return texts.map((_text, index) => index === 0
|
||||
? { embedding: [1, 2, 3], model: "fake-embed" }
|
||||
: null
|
||||
);
|
||||
},
|
||||
};
|
||||
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.llm = fakeLlm as any;
|
||||
|
||||
try {
|
||||
await insertTestDocument(db, "docs", {
|
||||
name: "retry-doc",
|
||||
body: "# Retry doc\n\n" + "transient embedding failure ".repeat(260),
|
||||
});
|
||||
|
||||
const result = await generateEmbeddings(store);
|
||||
|
||||
expect(result.errors).toBe(0);
|
||||
expect(result.failures).toEqual([]);
|
||||
expect(db.prepare(`SELECT COUNT(*) as count FROM content_vectors`).get()).toEqual({ count: result.chunksEmbedded });
|
||||
expect(store.getHashesNeedingEmbedding()).toBe(0);
|
||||
} finally {
|
||||
setDefaultLlamaCpp(null);
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("generateEmbeddings opens a long-lived LLM session for embed runs", async () => {
|
||||
const store = await createTestStore();
|
||||
const fakeLlm = createFakeEmbedLlm();
|
||||
const sessionSpy = vi.spyOn(llmModule, "withLLMSessionForLlm");
|
||||
|
||||
setDefaultLlamaCpp(createFakeTokenizer() as any);
|
||||
store.llm = fakeLlm as any;
|
||||
|
||||
try {
|
||||
await insertTestDocument(store.db, "docs", { name: "one", body: "# One\n\nAlpha" });
|
||||
|
||||
await generateEmbeddings(store);
|
||||
|
||||
expect(sessionSpy).toHaveBeenCalledWith(
|
||||
fakeLlm,
|
||||
expect.any(Function),
|
||||
expect.objectContaining({ maxDuration: 30 * 60 * 1000, name: "generateEmbeddings" }),
|
||||
);
|
||||
} finally {
|
||||
sessionSpy.mockRestore();
|
||||
setDefaultLlamaCpp(null);
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("vectorSearchQuery uses the active llm embed model for vector lookups", async () => {
|
||||
const store = await createTestStore();
|
||||
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
|
||||
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
|
||||
|
||||
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
|
||||
store.llm = { embedModelName: model } as any;
|
||||
store.searchVec = searchVecSpy as any;
|
||||
store.expandQuery = vi.fn(async () => []) as any;
|
||||
|
||||
try {
|
||||
await vectorSearchQuery(store, "custom query", { limit: 7, minScore: 0 });
|
||||
|
||||
expect(searchVecSpy).toHaveBeenCalledTimes(1);
|
||||
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("custom query");
|
||||
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
|
||||
expect(searchVecSpy.mock.calls[0]?.[2]).toBe(7);
|
||||
} finally {
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("hybridQuery uses the active llm embed model for precomputed vector lookups", async () => {
|
||||
const store = await createTestStore();
|
||||
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
|
||||
const embedBatchSpy = vi.fn(async (texts: string[]) => texts.map(() => ({
|
||||
embedding: [1, 2, 3],
|
||||
model,
|
||||
})));
|
||||
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
|
||||
|
||||
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
|
||||
store.llm = {
|
||||
embedModelName: model,
|
||||
embedBatch: embedBatchSpy,
|
||||
} as any;
|
||||
store.searchVec = searchVecSpy as any;
|
||||
store.searchFTS = vi.fn(() => []) as any;
|
||||
store.expandQuery = vi.fn(async () => []) as any;
|
||||
|
||||
try {
|
||||
await hybridQuery(store, "hybrid query", { limit: 5, minScore: 0, skipRerank: true });
|
||||
|
||||
expect(embedBatchSpy).toHaveBeenCalledTimes(1);
|
||||
expect(searchVecSpy).toHaveBeenCalledTimes(1);
|
||||
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("hybrid query");
|
||||
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
|
||||
expect(searchVecSpy.mock.calls[0]?.[5]).toEqual([1, 2, 3]);
|
||||
} finally {
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("structuredSearch uses the active llm embed model for precomputed vector lookups", async () => {
|
||||
const store = await createTestStore();
|
||||
const model = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf";
|
||||
const embedBatchSpy = vi.fn(async (texts: string[]) => texts.map(() => ({
|
||||
embedding: [1, 2, 3],
|
||||
model,
|
||||
})));
|
||||
const searchVecSpy = vi.fn(async () => [] as SearchResult[]) as any;
|
||||
|
||||
store.db.exec(`CREATE TABLE vectors_vec (hash_seq TEXT PRIMARY KEY, embedding BLOB)`);
|
||||
store.llm = {
|
||||
embedModelName: model,
|
||||
embedBatch: embedBatchSpy,
|
||||
} as any;
|
||||
store.searchVec = searchVecSpy as any;
|
||||
|
||||
try {
|
||||
await structuredSearch(store, [{ type: "vec", query: "structured query" }], {
|
||||
limit: 5,
|
||||
minScore: 0,
|
||||
skipRerank: true,
|
||||
});
|
||||
|
||||
expect(embedBatchSpy).toHaveBeenCalledTimes(1);
|
||||
expect(searchVecSpy).toHaveBeenCalledTimes(1);
|
||||
expect(searchVecSpy.mock.calls[0]?.[0]).toBe("structured query");
|
||||
expect(searchVecSpy.mock.calls[0]?.[1]).toBe(model);
|
||||
expect(searchVecSpy.mock.calls[0]?.[5]).toEqual([1, 2, 3]);
|
||||
} finally {
|
||||
await cleanupTestDb(store);
|
||||
}
|
||||
});
|
||||
|
||||
test("generateEmbeddings rejects invalid batch limits", async () => {
|
||||
const store = await createTestStore();
|
||||
|
||||
|
||||
@ -361,17 +361,73 @@ describe("lex query syntax", () => {
|
||||
expect(validateSemanticQuery("what is the CAP theorem")).toBeNull();
|
||||
});
|
||||
|
||||
- test("rejects negation syntax", () => {
+ test("rejects negation at start of query", () => {
|
||||
expect(validateSemanticQuery("-redis connection pooling")).toContain("Negation");
|
||||
});
|
||||
|
||||
test("rejects negation after space", () => {
|
||||
expect(validateSemanticQuery("performance -sports")).toContain("Negation");
|
||||
});
|
||||
|
||||
test("rejects negated quoted phrase", () => {
|
||||
expect(validateSemanticQuery('-"exact phrase"')).toContain("Negation");
|
||||
});
|
||||
|
||||
test("rejects multiple negations", () => {
|
||||
expect(validateSemanticQuery("error handling -java -python")).toContain("Negation");
|
||||
});
|
||||
|
||||
test("rejects negation after leading whitespace", () => {
|
||||
expect(validateSemanticQuery(" -term at start")).toContain("Negation");
|
||||
});
|
||||
|
||||
test("rejects negation after tab", () => {
|
||||
expect(validateSemanticQuery("foo\t-bar")).toContain("Negation");
|
||||
});
|
||||
|
||||
test("accepts hyphenated compound words", () => {
|
||||
expect(validateSemanticQuery("long-lived server shared across clients")).toBeNull();
|
||||
expect(validateSemanticQuery("real-time voice processing pipeline")).toBeNull();
|
||||
expect(validateSemanticQuery("how does the rate-limiter handle burst traffic")).toBeNull();
|
||||
expect(validateSemanticQuery("self-hosted deployment options")).toBeNull();
|
||||
expect(validateSemanticQuery("multi-client session architecture")).toBeNull();
|
||||
expect(validateSemanticQuery("cross-platform compatibility")).toBeNull();
|
||||
expect(validateSemanticQuery("non-blocking I/O model")).toBeNull();
|
||||
expect(validateSemanticQuery("in-memory caching strategy")).toBeNull();
|
||||
expect(validateSemanticQuery("write-ahead log for crash recovery")).toBeNull();
|
||||
expect(validateSemanticQuery("copy-on-write semantics")).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts multiple hyphens in a phrase", () => {
|
||||
expect(validateSemanticQuery("state-of-the-art embedding models")).toBeNull();
|
||||
expect(validateSemanticQuery("end-to-end testing")).toBeNull();
|
||||
expect(validateSemanticQuery("man-in-the-middle attack prevention")).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts multiple hyphenated words in one query", () => {
|
||||
expect(validateSemanticQuery("built-in vs add-on features")).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts short hyphenated terms", () => {
|
||||
expect(validateSemanticQuery("A-B testing for ML models")).toBeNull();
|
||||
expect(validateSemanticQuery("e-commerce platform")).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts bare hyphen without word character", () => {
|
||||
expect(validateSemanticQuery("-")).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts hyde-style hypothetical answers", () => {
|
||||
expect(validateSemanticQuery(
|
||||
"The CAP theorem states that a distributed system cannot simultaneously provide consistency, availability, and partition tolerance."
|
||||
)).toBeNull();
|
||||
});
|
||||
|
||||
test("accepts hyde with hyphenated words", () => {
|
||||
expect(validateSemanticQuery(
|
||||
"HTTP transport runs a single long-lived daemon shared across all clients, avoiding per-session model re-loading."
|
||||
)).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
describe("validateLexQuery", () => {
|
||||
|
||||
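Reviewer note on the `validateSemanticQuery` cases in this hunk: the accepted and rejected inputs are consistent with treating `-` as negation only when it begins a token and is followed by a word character or a quote. The sketch below encodes that rule with a single regular expression. It is an assumption about the rule, not the function under test: `sketchValidateSemanticQuery` is a hypothetical name and the error text is invented, apart from containing the word "Negation" that the tests look for.

```typescript
// Assumption: "-" is negation only at the start of the string or after
// whitespace, and only when followed by a word character or a double quote.
const NEGATION = /(^|\s)-["\w]/;

function sketchValidateSemanticQuery(query: string): string | null {
  return NEGATION.test(query)
    ? "Negation (-term) is not supported in semantic queries" // illustrative message
    : null;
}

const rejected = [
  "-redis connection pooling",
  "performance -sports",
  '-"exact phrase"',
  "error handling -java -python",
  " -term at start",
  "foo\t-bar",
].map(sketchValidateSemanticQuery);

const accepted = [
  "long-lived server shared across clients",
  "state-of-the-art embedding models",
  "built-in vs add-on features",
  "A-B testing for ML models",
  "non-blocking I/O model",
  "-",
].map(sketchValidateSemanticQuery);
```

Hyphens inside compound words are preceded by a letter rather than whitespace, so they never match, and a bare `-` has nothing after it to match.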
@ -4,7 +4,7 @@
|
||||
"noEmit": false,
|
||||
"outDir": "dist",
|
||||
"declaration": true,
|
||||
- "noImplicitAny": false
+ "noImplicitAny": true
|
||||
},
|
||||
"include": ["src/**/*.ts"],
|
||||
"exclude": ["src/**/*.test.ts", "src/test-preload.ts", "src/bench-*.ts"]
|
||||
|
||||