398 Commits

Author SHA1 Message Date
Tobias Lütke
828823d20a
fix: restore toLowerCase() in handelize + align tests with post-#501 behavior
- Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it)
- Update dimension-mismatch test to expect throw instead of silent rebuild
  (matches new behavior from #501)
- Fix one stale test expectation for preserved dots in filenames
2026-04-05 16:56:06 -04:00
Antonio Mello
ef062e1b54
fix(multi-get): support brace expansion patterns in glob matching (#424)
Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md`
were incorrectly parsed as comma-separated file lists instead of being
passed to the glob matcher (picomatch). This happened because the
comma-detection heuristic only checked for `*` and `?` but not `{`.

Also adds `collection/path` matching in `matchFilesByGlob` so patterns
like `my-collection/{file1,file2}.md` work — previously the glob only
matched against `qmd://collection/path` (virtual) and `path` (relative
to collection root), missing the `collection/path` form.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:45:33 -04:00
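The heuristic change described in ef062e1b54 might look roughly like the sketch below. It is an illustration only, not the project's code: `hasGlobSyntax` and `splitMultiGetArg` are hypothetical names, and the actual matching is left to the glob matcher (picomatch, per the commit message).

```typescript
// Hypothetical helper: true when the argument contains glob syntax.
// Per the commit, the old check looked only for `*` and `?`; `{` is the addition.
function hasGlobSyntax(arg: string): boolean {
  return /[*?{]/.test(arg);
}

// Hypothetical helper: decide whether an argument is a single glob pattern
// or a comma-separated list of literal file names.
function splitMultiGetArg(arg: string): { kind: "glob" | "list"; items: string[] } {
  if (hasGlobSyntax(arg)) {
    // Commas inside braces belong to the pattern, so the argument is not split.
    return { kind: "glob", items: [arg] };
  }
  const items = arg
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return { kind: "list", items };
}
```

With this check, `collection/{a,b}.md` reaches the matcher as one pattern, while `doc1.md, doc2.md` is still treated as a two-item list.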
John R. Enders
09a4d19b31
fix(store): error on embedding dimension mismatch instead of silent rebuild (#501)
When switching to an embedding model with different dimensions,
ensureVecTableInternal() silently drops the vector table and all
embeddings are lost. Users only discover this when semantic search
returns empty results.

Throw an error instead, telling users to run 'qmd embed -f' to
explicitly re-embed. This is safe because embed -f calls
clearAllEmbeddings() which drops the table before ensureVecTable
is reached.

Related to #497

Co-authored-by: JohnRichardEnders <john@telli.com>
2026-04-05 16:45:24 -04:00
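The behavior change in 09a4d19b31 can be pictured with a small guard like the one below. This is a sketch under stated assumptions: `checkVecTableDimensions` and the error class are hypothetical stand-ins, not the body of `ensureVecTableInternal()`; only the `qmd embed -f` hint comes from the commit message.

```typescript
// Hypothetical error type for the sketch.
class EmbeddingDimensionMismatchError extends Error {
  constructor(existing: number, requested: number) {
    super(
      `Embedding dimension mismatch: index has ${existing}, model produces ${requested}. ` +
        `Run 'qmd embed -f' to re-embed.`
    );
    this.name = "EmbeddingDimensionMismatchError";
  }
}

// existingDims is null when no vector table exists yet.
function checkVecTableDimensions(
  existingDims: number | null,
  requestedDims: number
): "create" | "reuse" {
  if (existingDims === null) return "create";
  if (existingDims !== requestedDims) {
    // Previously this case dropped the table silently; now it fails loudly.
    throw new EmbeddingDimensionMismatchError(existingDims, requestedDims);
  }
  return "reuse";
}
```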
John R. Enders
54550a3366
fix(llm): set explicit embed context size, default 2048, configurable via env var (#500)
Without an explicit contextSize, node-llama-cpp defaults to "auto" which
allocates the model's full training context (often 32k). For embedding
chunks that are typically ~900 tokens this wastes ~3.5 GB of KV cache
per context on Apple Silicon unified memory.

Default to 2048 (matching the rerank context pattern) and allow override
via QMD_EMBED_CONTEXT_SIZE for users with larger chunks.

Addresses #329, related to #297

Co-authored-by: JohnRichardEnders <john@telli.com>
2026-04-05 16:45:12 -04:00
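Resolving the context size described in 54550a3366 could be sketched as follows. The default of 2048 and the variable name `QMD_EMBED_CONTEXT_SIZE` come from the commit message; the function name and the fallback for unparseable values are assumptions.

```typescript
const DEFAULT_EMBED_CONTEXT_SIZE = 2048;

// env is passed in so the sketch stays runtime-agnostic; under Node it would be process.env.
function resolveEmbedContextSize(env: Record<string, string | undefined>): number {
  const raw = env["QMD_EMBED_CONTEXT_SIZE"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_EMBED_CONTEXT_SIZE;
  const parsed = parseInt(raw, 10);
  // Assumption: unparseable or non-positive values fall back to the default.
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_EMBED_CONTEXT_SIZE;
}
```

The resolved number would then be passed as the explicit `contextSize` the commit mentions, so the library does not fall back to "auto".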
LJY
698b44fe87
Fix qmd embed model selection (#494)
2026-04-05 16:45:04 -04:00
Oliver Ratzesberger
021236378b
fix(mcp): read version from package.json instead of hardcoding (#431)
2026-04-05 16:44:35 -04:00
Matt Van Horn
1ad3388132
fix(store): preserve underscores in BM25 search terms (#404)
sanitizeFTS5Term stripped all non-letter/non-number characters including
underscores, causing snake_case identifiers like `my_variable` to become
`myvariable` and silently fail BM25 matches.

Add underscore to the preserved character set in the Unicode regex.
Export the function and add unit tests covering snake_case, contractions,
punctuation stripping, and unicode.

Fixes #305

Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:44:14 -04:00
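The underscore fix in 1ad3388132 amounts to adding `_` to the preserved character set of a Unicode-aware regex. The sketch below shows the idea only; `sanitizeTermSketch` is a hypothetical name, and it does not reproduce whatever `sanitizeFTS5Term` does for contractions, which the commit message does not spell out.

```typescript
// Keep Unicode letters (\p{L}), numbers (\p{N}), and underscores; drop everything else.
// Built with the RegExp constructor so the pattern is compiled at runtime.
const NON_TERM_CHARS = new RegExp("[^\\p{L}\\p{N}_]", "gu");

function sanitizeTermSketch(term: string): string {
  return term.replace(NON_TERM_CHARS, "");
}
```

Without the `_` in the character class, `my_variable` would collapse to `myvariable`, which is the failure the commit describes.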
Oliver Ratzesberger
959b719038
fix(mcp): include collection name in status output (#416)
The MCP status tool shows collection paths but not names,
making it impossible for agents to discover valid collection
filter values. The CLI 'qmd status' already shows names.

Add col.name prefix to each collection line in the status
tool response.
2026-04-05 16:43:25 -04:00
Ahmed El Gabri
209b797c6a
fix(nix): correct CLI entry point path in wrapper (#413)
The wrapper points at src/qmd.ts which no longer exists after the CLI
was moved to src/cli/qmd.ts.
2026-04-05 16:43:18 -04:00
Tobias Lütke
1fb2e2819e
Merge origin/main into feat/ast-aware-chunking
Resolve conflicts: combine AST chunking args (filepath, chunkStrategy)
with abort signal parameter from #458.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 20:00:49 -04:00
Tobias Lütke
dd27f499c7
Merge pull request #463 from goldsr09/fix/hyphenated-lex-queries
Fix hyphenated tokens in FTS5 lex queries
2026-03-28 19:58:22 -04:00
Tobias Lütke
ea653304af
Merge pull request #458 from ccc-fff/fix/embed-infinite-loop
fix: prevent qmd embed from running indefinitely
2026-03-28 19:58:17 -04:00
Tobias Lütke
616776ebdd
Merge pull request #453 from builderjarvis/fix/rerank-context-size
fix: increase RERANK_CONTEXT_SIZE default 2048→4096, configurable via env var, fix template overhead underestimate
2026-03-28 19:58:11 -04:00
Tobias Lütke
6a45150f5a
Merge pull request #456 from antonio-mello-ai/fix/insert-embedding-vec0-replace
fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding
2026-03-28 19:58:06 -04:00
Tobias Lütke
827ad839f4
Merge origin/main into fix/fts5-collection-filter-performance
Resolve conflict: use CTE approach from #455 with updated BM25
weights (1.5, 4.0, 1.0) from #462.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 19:57:33 -04:00
Tobias Lütke
08566ec316
Merge pull request #462 from goldsr09/fix/bm25-field-weights
Fix BM25 field weights to include all 3 FTS columns
2026-03-28 19:56:04 -04:00
Tobias Lütke
5bef789ad3
Merge pull request #478 from zestyboy/feat/no-rerank-option
Add rerank parameter to MCP query tool
2026-03-28 19:55:00 -04:00
Tobias Lütke
f73386ae02
Merge pull request #457 from antonio-mello-ai/fix/model-cache-xdg-cache-home
fix: respect XDG_CACHE_HOME for model cache directory
2026-03-28 19:54:54 -04:00
Tobias Lütke
8d343b9da1
Update handelize tests for case/dot preservation (#475)
PR #475 changed handelize() to preserve original case and dots,
but the tests still expected lowercase output. Update assertions
to match the new behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 19:54:18 -04:00
Tobias Lütke
2ec3360a8b
Merge pull request #475 from alexei-led/fix/handelize-dots-and-case
fix: preserve dots and original case in handelize()
2026-03-28 19:44:40 -04:00
Tobias Lütke
49ebf6e8ff
Merge pull request #479 from surma-dump/surma/fix-flake
Fix flake
2026-03-28 19:43:15 -04:00
Surma
cf9991cfa7
Fix flake
2026-03-27 23:12:20 +00:00
Niven
792992ef65
Add rerank parameter to MCP query tool
The MCP query tool always ran LLM reranking, even for lex-only queries.
On CPU-only infrastructure (e.g. Railway), the reranker adds 60-120s
per query. The SDK and CLI already support skipping reranking, but the
MCP server did not expose this option.

Add a `rerank` boolean parameter (default: true) to the MCP query
tool's input schema, forwarded to store.search() as the existing
`rerank` option.

Fixes #477

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:10:31 -07:00
Alexei Ledenev
72f2dd1fe5
fix: preserve original filename case in handelize (MEMORY.md not memory.md)
2026-03-27 16:38:04 +03:00
Alexei Ledenev
ddecde78da
fix: preserve dots in filenames during handelize
The handelize() regex replaced all non-letter/non-number chars with
dashes, including dots in the filename stem. This mangled session
filenames like "topic-1773595309.753009.md" to "topic-1773595309-753009.md",
breaking memory_get path resolution (file not found on disk).

Fix: add dot to the preserved character class in the filename regex.
After deploying, run qmd-reindex.sh to rebuild indexes with correct paths.
2026-03-27 16:37:59 +03:00
Ryan
7b9bd01226
fix: handle hyphenated tokens in FTS5 lex queries
Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped
of hyphens and concatenated (e.g., "multiagent") which missed matches.
Now they're split into FTS5 phrase queries ("multi agent") so the porter
tokenizer matches them correctly.
2026-03-24 20:13:52 -04:00
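The rewrite described in 7b9bd01226 can be sketched as below. `toFTS5Term` is a hypothetical helper, not the project's function; it shows only the hyphen handling and ignores any other query syntax.

```typescript
// Hypothetical helper: turn one user-supplied term into an FTS5 query fragment.
function toFTS5Term(term: string): string {
  const parts = term.split("-").filter((p) => p.length > 0);
  if (parts.length <= 1) {
    // No hyphen (or only stray hyphens): pass the bare token through.
    return parts[0] ?? "";
  }
  // FTS5 treats a double-quoted string as a phrase, so the parts must
  // appear next to each other and in order.
  return `"${parts.join(" ")}"`;
}
```

For example, `multi-agent` becomes the phrase `"multi agent"` rather than the single token `multiagent`.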
Ryan
fa214db367
fix: correct BM25 field weights to include all 3 FTS columns
The bm25() call only had 2 weights for 3 columns (filepath, title, body),
giving body an implicit weight of 0. Add proper weights: filepath=1.5,
title=4.0, body=1.0 so title matches are boosted and body content is scored.
2026-03-24 20:12:45 -04:00
Fred
70db2f5226
fix: prevent qmd embed from running indefinitely
After the session's max duration timer fires (30 min), the embedding loop
continued iterating over all remaining chunks. Each embed call threw
SessionReleasedError, was caught, incremented errors, and the loop moved
to the next chunk — burning 100% CPU for days with zero useful output.

Three targeted fixes:

1. Check session.isValid before each batch iteration in the embedding loop,
   breaking early when the session has been aborted.

2. Pass the session's AbortSignal to chunkDocumentByTokens so tokenization
   also respects session expiry instead of running unbounded.

3. Add an error-rate circuit breaker: if >80% of processed chunks fail,
   abort early rather than grinding through the remaining work.

Fixes #440

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 22:38:57 +01:00
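Fixes 1 and 3 from 70db2f5226 can be sketched in a simplified loop. Several things here are assumptions made for illustration: the loop is synchronous and per chunk rather than per batch, `SessionLike` is a stand-in type, and `MIN_SAMPLE` is a hypothetical guard that the commit message does not mention. Only the 80% threshold is taken from the commit.

```typescript
interface SessionLike {
  isValid: boolean; // false once the session has expired or been aborted
}

interface EmbedRunResult {
  embedded: number;
  errors: number;
  stoppedEarly: boolean;
}

const MAX_ERROR_RATE = 0.8; // from the commit message
const MIN_SAMPLE = 10;      // hypothetical: avoid tripping on one early failure

function embedChunks(
  chunks: string[],
  embedOne: (chunk: string) => void, // synchronous stand-in; throws on failure
  session: SessionLike
): EmbedRunResult {
  let embedded = 0;
  let errors = 0;
  for (const chunk of chunks) {
    // Fix 1: stop as soon as the session is no longer valid.
    if (!session.isValid) return { embedded, errors, stoppedEarly: true };
    try {
      embedOne(chunk);
      embedded++;
    } catch {
      errors++;
    }
    const processed = embedded + errors;
    // Fix 3: circuit breaker on the error rate of processed chunks.
    if (processed >= MIN_SAMPLE && errors / processed > MAX_ERROR_RATE) {
      return { embedded, errors, stoppedEarly: true };
    }
  }
  return { embedded, errors, stoppedEarly: false };
}
```

In this sketch a run where every call fails stops after `MIN_SAMPLE` chunks instead of iterating over the whole backlog.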
Antonio
902e14650e
fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding
sqlite-vec's vec0 virtual tables silently ignore the OR REPLACE conflict
clause. When a crash interrupts embedding mid-way, chunks that were
inserted into vectors_vec but not content_vectors get re-selected by
getHashesForEmbedding, causing a UNIQUE constraint error on re-embed.

Two changes:
1. Insert content_vectors first so getHashesForEmbedding won't re-select
   the hash if a crash occurs between the two inserts.
2. Use DELETE + INSERT for vectors_vec instead of INSERT OR REPLACE.

Fixes #445

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:11:31 -03:00
Antonio
840a614223
fix: respect XDG_CACHE_HOME for model cache directory
MODEL_CACHE_DIR was hardcoded to ~/.cache/qmd/models/, ignoring the
XDG_CACHE_HOME environment variable. This was inconsistent with the rest
of the codebase (store.ts, cli/qmd.ts) which already respects XDG paths.

Fixes #425

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:08:21 -03:00
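The lookup described in 840a614223 might be sketched like this. The `qmd/models` suffix and the `~/.cache` fallback follow the commit message; the function name and its parameters are hypothetical, and paths are joined with `/` for brevity.

```typescript
// env and homeDir are passed in so the sketch does not depend on a specific runtime.
function resolveModelCacheDir(
  env: Record<string, string | undefined>,
  homeDir: string
): string {
  const xdg = env["XDG_CACHE_HOME"];
  // An unset or empty XDG_CACHE_HOME falls back to ~/.cache, the XDG default.
  const cacheRoot = xdg !== undefined && xdg !== "" ? xdg : `${homeDir}/.cache`;
  return `${cacheRoot}/qmd/models`;
}
```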
Mike Bannister
bc80e72a06
chore: update bun.lock after dependency install
2026-03-23 11:49:25 -04:00
Mike Bannister
939d15652c
fix: use CTE in searchFTS to prevent query planner regression with collection filter
When searchFTS combines FTS5 MATCH with a collection filter (d.collection = ?)
in the same WHERE clause, SQLite's query planner abandons the FTS5 index and
falls back to a full scan. This turns an 8ms query into a 17+ second query on
large collections (16K+ documents).

The fix wraps the FTS5 query in a CTE so it runs first with proper index usage,
then filters by collection on the materialized results.

Benchmarks on a 16,258-document collection:
  Before: qmd search "knowctl" -c <collection> → 19.8s
  After:  qmd search "knowctl" -c <collection> → 0.4s

The CTE fetches limit*10 candidates from the FTS index to ensure enough results
survive collection filtering. Without a collection filter, the query plan was
already optimal, so no CTE overhead is added in that case.
2026-03-23 11:35:22 -04:00
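The query shape described in 939d15652c could look roughly like the sketch below. The table and column names (`documents_fts`, `documents`, `id`, `collection`) and the join on `rowid` are hypothetical, since the commit message does not show the schema. The BM25 weights are the ones named in the later merge commit 827ad839f4, and the `limit * 10` over-fetch is from this commit.

```typescript
interface SearchQuery {
  sql: string;
  params: Array<string | number>;
}

// Hypothetical query builder; schema names are assumptions, not the project's schema.
function buildSearchQuery(match: string, limit: number, collection?: string): SearchQuery {
  if (collection === undefined) {
    // No collection filter: plain FTS5 query, no CTE overhead.
    return {
      sql: `SELECT rowid, bm25(documents_fts, 1.5, 4.0, 1.0) AS score
FROM documents_fts
WHERE documents_fts MATCH ?
ORDER BY score
LIMIT ?`,
      params: [match, limit],
    };
  }
  // With a filter: run the FTS5 match first inside a CTE, over-fetching
  // limit * 10 candidates, then filter by collection on that small result set.
  return {
    sql: `WITH fts_matches AS (
  SELECT rowid, bm25(documents_fts, 1.5, 4.0, 1.0) AS score
  FROM documents_fts
  WHERE documents_fts MATCH ?
  ORDER BY score
  LIMIT ?
)
SELECT d.id, m.score
FROM fts_matches AS m
JOIN documents AS d ON d.id = m.rowid
WHERE d.collection = ?
ORDER BY m.score
LIMIT ?`,
    params: [match, limit * 10, collection, limit],
  };
}
```

Because `bm25()` returns lower values for better matches, both queries sort ascending by `score`.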
James Risberg
244ddf5ecb
feat: AST-aware chunking for code files via tree-sitter
Add opt-in AST-aware chunk boundary detection for code files using
web-tree-sitter. When enabled with `--chunk-strategy auto`, code files
(.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class,
and import boundaries instead of arbitrary text positions. Default
behavior (`regex`) is unchanged — no surprises on upgrade.

In testing on QMD's own codebase, AST mode split 42% fewer function
bodies across chunk boundaries compared to regex-only chunking.

Usage:
  qmd embed --chunk-strategy auto
  qmd query "search terms" --chunk-strategy auto

What's included:
- Language detection from file extension with support for TypeScript,
  JavaScript (including arrow functions and function expressions),
  Python, Go, and Rust
- Per-language tree-sitter queries with scored break points aligned to
  the existing markdown scale (class=100, function=90, type=80, import=60)
- AST break points merged with regex break points — highest score wins
  at each position, so embedded markdown (comments, docstrings) still
  benefits from regex patterns
- Refactored chunking core: chunkDocumentWithBreakPoints() extracted,
  mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST
- ChunkStrategy type ("auto" | "regex") threaded through
  generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK
- getASTStatus() health check wired into `qmd status`
- Parse failures log a warning and fall back to regex — never crash

Hardening:
- Grammar packages are optionalDependencies with pinned versions to
  prevent ABI breaks from semver drift
- web-tree-sitter is a direct dependency (pinned)
- Errors are logged (not silently swallowed) for debuggability
- Tested on both Node.js and Bun (Bun is actually faster)

Testing:
- 26 unit tests (test/ast.test.ts) — all 4 languages, error handling
- 7 integration tests (test/store.test.ts) — merge, equivalence, bypass
- Standalone test-ast-chunking.mjs with 63 synthetic tests and a
  real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code)
- Validated end-to-end with qmd embed + qmd query on QMD's own codebase
- Zero markdown regressions across all test paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 01:22:39 -04:00
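The "highest score wins at each position" rule from 244ddf5ecb can be illustrated with a short merge function. The `BreakPoint` shape and the name `mergeBreakPointsSketch` are assumptions made for this sketch; the commit's own `mergeBreakPoints()` may differ in signature and detail.

```typescript
interface BreakPoint {
  pos: number;   // character offset in the document
  score: number; // higher means a better place to split
}

// Hypothetical sketch: merge two break-point lists, keeping the best score per position.
function mergeBreakPointsSketch(
  regexPoints: BreakPoint[],
  astPoints: BreakPoint[]
): BreakPoint[] {
  const best = new Map<number, number>();
  for (const bp of [...regexPoints, ...astPoints]) {
    const current = best.get(bp.pos);
    if (current === undefined || bp.score > current) best.set(bp.pos, bp.score);
  }
  // Return the merged points ordered by position.
  return Array.from(best, ([pos, score]) => ({ pos, score })).sort((a, b) => a.pos - b.pos);
}
```

A regex break point at offset 10 with score 20 would be replaced by an AST function boundary at the same offset with score 90, while break points at other offsets are kept as they are.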
Jarvis
783359f55c
fix: increase RERANK_CONTEXT_SIZE default 2048→4096, make configurable via QMD_RERANK_CONTEXT_SIZE env var, fix RERANK_TEMPLATE_OVERHEAD underestimate 200→512
Default 2048 was too small for longer documents (session transcripts, CJK
text, large markdown files). After truncation the Qwen3 reranker template
adds more overhead than the original 200-token estimate, causing node-llama-cpp
to throw 'input lengths exceed context size'.

Fixes: tobi/qmd#91 tobi/qmd#290 tobi/qmd#291 tobi/qmd#314
2026-03-21 20:59:11 -07:00
Tobias Lütke
2b8f329d7e
Merge pull request #370 from mvanhorn/osc/231-no-rerank-cli-flag
feat(cli): add --no-rerank flag to skip reranking in qmd query
2026-03-14 08:10:38 -04:00
Tobias Lütke
4721e07975
Merge pull request #371 from oysteinkrog/fix/wsl-drvfs-path-detection
Fix QMD when running under WSL (Windows Subsystem for Linux)
2026-03-14 08:10:25 -04:00
Tobias Lütke
95dc295433
Merge pull request #377 from serhii12/fix/sqlite-vec-macos-bun-error-handling
fix(db): add macOS Homebrew SQLite support for Bun and restore actionable errors
2026-03-14 08:08:50 -04:00
Tobias Lütke
43660c468e
Merge pull request #382 from rymalia/fix/zod-version-pin
fix: pin zod to exact 4.2.1 to prevent tsc build failure
2026-03-14 08:08:20 -04:00
Tobias Lütke
5f6821629b
Merge pull request #385 from rymalia/fix/launcher-lockfile-priority
fix: prioritize package-lock.json in launcher to prevent Bun false positive
2026-03-14 08:08:03 -04:00
Tobias Lütke
f35b4e19e0
Merge pull request #393 from lskun/fix/embed-context-overflow
fix: truncate oversized text before embedding to prevent GGML crash
2026-03-14 08:07:47 -04:00
Tobias Lütke
a13a84fb28
Merge pull request #396 from Mic92/qmd-fix
sync stale bun.lock, guard against future lockfile drift
2026-03-14 08:07:25 -04:00
Tobias Lütke
5b48bcb6c1
Merge pull request #389 from sonwr/fix-issue-380-cleanup-no-sqlite-vec
fix: skip cleanup when sqlite-vec is unavailable
2026-03-14 08:07:11 -04:00
Tobias Lütke
7ab1497ebb
Merge pull request #395 from ProgramCaiCai/fix/embed-batching-memory
Bound qmd embed memory usage with default batched processing
2026-03-14 08:06:51 -04:00
Tobias Lütke
398eadf15b
Merge pull request #399 from shreyaskarnik/feat/onnx-conversion
Add ONNX conversion script for Transformers.js deployment
2026-03-14 08:05:10 -04:00
Shreyas Karnik
df8d625c00
fix: map quantize_type to valid Transformers.js dtype values
--quantize none now emits dtype: "fp32" in the README instead of
dtype: "none", matching Transformers.js documented values (fp32,
fp16, q8, q4).
2026-03-13 12:57:19 -07:00
Shreyas Karnik
b05d8863ca
fix: quantization paths, missing imports, and hardcoded metadata
- Add missing subprocess import (NameError on any quantize path)
- Replace broken optimum-cli quantize calls with direct onnxruntime:
  Q4 uses MatMulNBitsQuantizer, Q8 uses quantize_dynamic
- Add onnxconverter-common to deps for FP16 (was silently swallowed)
- Make FP16 fail loudly on missing dep instead of silently uploading FP32
- README and transformers_js_config now reflect actual quantize_type
  instead of always hardcoding Q4
- Remove dead _convert_fp16_external function
2026-03-13 12:45:48 -07:00
Shreyas Karnik
e1ce37c989
fix: handle 2GB protobuf limit, add validation, fix input feeds
- Use no_post_process=True for ONNX export to avoid protobuf serialize error
- Add --validate and --validate-only flags for inference verification
- Fix position_ids in validation feed (required by Qwen3 ONNX export)
- Use optimum-cli for quantization to handle external data format
- Fix optimum dependency to optimum[onnxruntime]

Tested: export + validation passes on CPU, KV cache present (56 tensors).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 12:30:26 -07:00
Shreyas Karnik
2df95ac9ba
feat: add ONNX conversion script for Transformers.js deployment
Add convert_onnx.py that mirrors convert_gguf.py's structure:
- Loads base Qwen3 model, merges SFT + GRPO adapters
- Exports to ONNX via Optimum (text-generation-with-past task)
- Supports Q4 (MatMulNBits), Q8, FP16, and FP32 output
- Uploads to separate HF repo (e.g. tobil/qmd-query-expansion-1.7B-ONNX)
- Writes Transformers.js compatibility config
- Includes model card with usage example

Usage:
    uv run convert_onnx.py --size 1.7B
    uv run convert_onnx.py --size 1.7B --quantize q4 --no-upload

Also adds `just convert-onnx` and `just convert-gguf` tasks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 11:50:03 -07:00
Jörg Thalheim
8c4b4b335d
sync stale bun.lock, guard against future lockfile drift
bun.lock still resolved better-sqlite3 to 11.x after package.json was
bumped to ^12.4.5 in v2.0.0. This breaks sandboxed builds (e.g. Nix
with bun2nix) where network access is unavailable to resolve the
mismatch.

CI and the publish workflow now use --frozen-lockfile so drift is caught
immediately. The release script also validates lockfile consistency
before tagging.

Closes #386
2026-03-13 13:34:17 +01:00
programcaicai
809aa36172
fix: bound memory usage during embed
2026-03-13 17:39:17 +08:00