ai-workspace-services/qmd

Author	SHA1	Message	Date
Oliver Ratzesberger	959b719038	fix(mcp): include collection name in status output (#416) The MCP status tool shows collection paths but not names, making it impossible for agents to discover valid collection filter values. The CLI 'qmd status' already shows names. Add col.name prefix to each collection line in the status tool response.	2026-04-05 16:43:25 -04:00
Ahmed El Gabri	209b797c6a	fix(nix): correct CLI entry point path in wrapper (#413) The wrapper points at src/qmd.ts which no longer exists after the CLI was moved to src/cli/qmd.ts.	2026-04-05 16:43:18 -04:00
Tobias Lütke	1fb2e2819e	Merge origin/main into feat/ast-aware-chunking Resolve conflicts: combine AST chunking args (filepath, chunkStrategy) with abort signal parameter from #458. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 20:00:49 -04:00
Tobias Lütke	dd27f499c7	Merge pull request #463 from goldsr09/fix/hyphenated-lex-queries Fix hyphenated tokens in FTS5 lex queries	2026-03-28 19:58:22 -04:00
Tobias Lütke	ea653304af	Merge pull request #458 from ccc-fff/fix/embed-infinite-loop fix: prevent qmd embed from running indefinitely	2026-03-28 19:58:17 -04:00
Tobias Lütke	616776ebdd	Merge pull request #453 from builderjarvis/fix/rerank-context-size fix: increase RERANK_CONTEXT_SIZE default 2048→4096, configurable via env var, fix template overhead underestimate	2026-03-28 19:58:11 -04:00
Tobias Lütke	6a45150f5a	Merge pull request #456 from antonio-mello-ai/fix/insert-embedding-vec0-replace fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding	2026-03-28 19:58:06 -04:00
Tobias Lütke	827ad839f4	Merge origin/main into fix/fts5-collection-filter-performance Resolve conflict: use CTE approach from #455 with updated BM25 weights (1.5, 4.0, 1.0) from #462. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:57:33 -04:00
Tobias Lütke	08566ec316	Merge pull request #462 from goldsr09/fix/bm25-field-weights Fix BM25 field weights to include all 3 FTS columns	2026-03-28 19:56:04 -04:00
Tobias Lütke	5bef789ad3	Merge pull request #478 from zestyboy/feat/no-rerank-option Add rerank parameter to MCP query tool	2026-03-28 19:55:00 -04:00
Tobias Lütke	f73386ae02	Merge pull request #457 from antonio-mello-ai/fix/model-cache-xdg-cache-home fix: respect XDG_CACHE_HOME for model cache directory	2026-03-28 19:54:54 -04:00
Tobias Lütke	8d343b9da1	Update handelize tests for case/dot preservation (#475) PR #475 changed handelize() to preserve original case and dots, but the tests still expected lowercase output. Update assertions to match the new behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:54:18 -04:00
Tobias Lütke	2ec3360a8b	Merge pull request #475 from alexei-led/fix/handelize-dots-and-case fix: preserve dots and original case in handelize()	2026-03-28 19:44:40 -04:00
Tobias Lütke	49ebf6e8ff	Merge pull request #479 from surma-dump/surma/fix-flake Fix flake	2026-03-28 19:43:15 -04:00
Surma	cf9991cfa7	Fix flake	2026-03-27 23:12:20 +00:00
Niven	792992ef65	Add rerank parameter to MCP query tool The MCP query tool always ran LLM reranking, even for lex-only queries. On CPU-only infrastructure (e.g. Railway), the reranker adds 60-120s per query. The SDK and CLI already support skipping reranking, but the MCP server did not expose this option. Add a `rerank` boolean parameter (default: true) to the MCP query tool's input schema, forwarded to store.search() as the existing `rerank` option. Fixes #477 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 13:10:31 -07:00
Alexei Ledenev	72f2dd1fe5	fix: preserve original filename case in handelize (MEMORY.md not memory.md)	2026-03-27 16:38:04 +03:00
Alexei Ledenev	ddecde78da	fix: preserve dots in filenames during handelize The handelize() regex replaced all non-letter/non-number chars with dashes, including dots in the filename stem. This mangled session filenames like "topic-1773595309.753009.md" to "topic-1773595309-753009.md", breaking memory_get path resolution (file not found on disk). Fix: add dot to the preserved character class in the filename regex. After deploying, run qmd-reindex.sh to rebuild indexes with correct paths.	2026-03-27 16:37:59 +03:00
Ryan	7b9bd01226	fix: handle hyphenated tokens in FTS5 lex queries Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped of hyphens and concatenated (e.g., "multiagent") which missed matches. Now they're split into FTS5 phrase queries ("multi agent") so the porter tokenizer matches them correctly.	2026-03-24 20:13:52 -04:00
Ryan	fa214db367	fix: correct BM25 field weights to include all 3 FTS columns The bm25() call only had 2 weights for 3 columns (filepath, title, body), giving body an implicit weight of 0. Add proper weights: filepath=1.5, title=4.0, body=1.0 so title matches are boosted and body content is scored.	2026-03-24 20:12:45 -04:00
Fred	70db2f5226	fix: prevent qmd embed from running indefinitely After the session's max duration timer fires (30 min), the embedding loop continued iterating over all remaining chunks. Each embed call threw SessionReleasedError, was caught, incremented errors, and the loop moved to the next chunk — burning 100% CPU for days with zero useful output. Three targeted fixes: 1. Check session.isValid before each batch iteration in the embedding loop, breaking early when the session has been aborted. 2. Pass the session's AbortSignal to chunkDocumentByTokens so tokenization also respects session expiry instead of running unbounded. 3. Add an error-rate circuit breaker: if >80% of processed chunks fail, abort early rather than grinding through the remaining work. Fixes #440 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 22:38:57 +01:00
Antonio	902e14650e	fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding sqlite-vec's vec0 virtual tables silently ignore the OR REPLACE conflict clause. When a crash interrupts embedding mid-way, chunks that were inserted into vectors_vec but not content_vectors get re-selected by getHashesForEmbedding, causing a UNIQUE constraint error on re-embed. Two changes: 1. Insert content_vectors first so getHashesForEmbedding won't re-select the hash if a crash occurs between the two inserts. 2. Use DELETE + INSERT for vectors_vec instead of INSERT OR REPLACE. Fixes #445 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:11:31 -03:00
Antonio	840a614223	fix: respect XDG_CACHE_HOME for model cache directory MODEL_CACHE_DIR was hardcoded to ~/.cache/qmd/models/, ignoring the XDG_CACHE_HOME environment variable. This was inconsistent with the rest of the codebase (store.ts, cli/qmd.ts) which already respects XDG paths. Fixes #425 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:08:21 -03:00
Mike Bannister	bc80e72a06	chore: update bun.lock after dependency install	2026-03-23 11:49:25 -04:00
Mike Bannister	939d15652c	fix: use CTE in searchFTS to prevent query planner regression with collection filter When searchFTS combines FTS5 MATCH with a collection filter (d.collection = ?) in the same WHERE clause, SQLite's query planner abandons the FTS5 index and falls back to a full scan. This turns an 8ms query into a 17+ second query on large collections (16K+ documents). The fix wraps the FTS5 query in a CTE so it runs first with proper index usage, then filters by collection on the materialized results. Benchmarks on a 16,258-document collection: Before: qmd search "knowctl" -c <collection> → 19.8s After: qmd search "knowctl" -c <collection> → 0.4s The CTE fetches limit*10 candidates from the FTS index to ensure enough results survive collection filtering. Without a collection filter, the query plan was already optimal, so no CTE overhead is added in that case.	2026-03-23 11:35:22 -04:00
James Risberg	244ddf5ecb	feat: AST-aware chunking for code files via tree-sitter Add opt-in AST-aware chunk boundary detection for code files using web-tree-sitter. When enabled with `--chunk-strategy auto`, code files (.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class, and import boundaries instead of arbitrary text positions. Default behavior (`regex`) is unchanged — no surprises on upgrade. In testing on QMD's own codebase, AST mode split 42% fewer function bodies across chunk boundaries compared to regex-only chunking. Usage: qmd embed --chunk-strategy auto qmd query "search terms" --chunk-strategy auto What's included: - Language detection from file extension with support for TypeScript, JavaScript (including arrow functions and function expressions), Python, Go, and Rust - Per-language tree-sitter queries with scored break points aligned to the existing markdown scale (class=100, function=90, type=80, import=60) - AST break points merged with regex break points — highest score wins at each position, so embedded markdown (comments, docstrings) still benefits from regex patterns - Refactored chunking core: chunkDocumentWithBreakPoints() extracted, mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST - ChunkStrategy type ("auto" | "regex") threaded through generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK - getASTStatus() health check wired into `qmd status` - Parse failures log a warning and fall back to regex — never crash Hardening: - Grammar packages are optionalDependencies with pinned versions to prevent ABI breaks from semver drift - web-tree-sitter is a direct dependency (pinned) - Errors are logged (not silently swallowed) for debuggability - Tested on both Node.js and Bun (Bun is actually faster) Testing: - 26 unit tests (test/ast.test.ts) — all 4 languages, error handling - 7 integration tests (test/store.test.ts) — merge, equivalence, bypass - Standalone test-ast-chunking.mjs with 63 synthetic tests and a real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code) - Validated end-to-end with qmd embed + qmd query on QMD's own codebase - Zero markdown regressions across all test paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 01:22:39 -04:00
Jarvis	783359f55c	fix: increase RERANK_CONTEXT_SIZE default 2048→4096, make configurable via QMD_RERANK_CONTEXT_SIZE env var, fix RERANK_TEMPLATE_OVERHEAD underestimate 200→512 Default 2048 was too small for longer documents (session transcripts, CJK text, large markdown files). After truncation the Qwen3 reranker template adds more overhead than the original 200-token estimate, causing node-llama-cpp to throw 'input lengths exceed context size'. Fixes: tobi/qmd#91 tobi/qmd#290 tobi/qmd#291 tobi/qmd#314	2026-03-21 20:59:11 -07:00
Tobias Lütke	2b8f329d7e	Merge pull request #370 from mvanhorn/osc/231-no-rerank-cli-flag feat(cli): add --no-rerank flag to skip reranking in qmd query	2026-03-14 08:10:38 -04:00
Tobias Lütke	4721e07975	Merge pull request #371 from oysteinkrog/fix/wsl-drvfs-path-detection Fix QMD when running under WSL (Windows Subsystem for Linux)	2026-03-14 08:10:25 -04:00
Tobias Lütke	95dc295433	Merge pull request #377 from serhii12/fix/sqlite-vec-macos-bun-error-handling fix(db): add macOS Homebrew SQLite support for Bun and restore actionable errors	2026-03-14 08:08:50 -04:00
Tobias Lütke	43660c468e	Merge pull request #382 from rymalia/fix/zod-version-pin fix: pin zod to exact 4.2.1 to prevent tsc build failure	2026-03-14 08:08:20 -04:00
Tobias Lütke	5f6821629b	Merge pull request #385 from rymalia/fix/launcher-lockfile-priority fix: prioritize package-lock.json in launcher to prevent Bun false positive	2026-03-14 08:08:03 -04:00
Tobias Lütke	f35b4e19e0	Merge pull request #393 from lskun/fix/embed-context-overflow fix: truncate oversized text before embedding to prevent GGML crash	2026-03-14 08:07:47 -04:00
Tobias Lütke	a13a84fb28	Merge pull request #396 from Mic92/qmd-fix sync stale bun.lock, guard against future lockfile drift	2026-03-14 08:07:25 -04:00
Tobias Lütke	5b48bcb6c1	Merge pull request #389 from sonwr/fix-issue-380-cleanup-no-sqlite-vec fix: skip cleanup when sqlite-vec is unavailable	2026-03-14 08:07:11 -04:00
Tobias Lütke	7ab1497ebb	Merge pull request #395 from ProgramCaiCai/fix/embed-batching-memory Bound qmd embed memory usage with default batched processing	2026-03-14 08:06:51 -04:00
Tobias Lütke	398eadf15b	Merge pull request #399 from shreyaskarnik/feat/onnx-conversion Add ONNX conversion script for Transformers.js deployment	2026-03-14 08:05:10 -04:00
Shreyas Karnik	df8d625c00	fix: map quantize_type to valid Transformers.js dtype values --quantize none now emits dtype: "fp32" in the README instead of dtype: "none", matching Transformers.js documented values (fp32, fp16, q8, q4).	2026-03-13 12:57:19 -07:00
Shreyas Karnik	b05d8863ca	fix: quantization paths, missing imports, and hardcoded metadata - Add missing subprocess import (NameError on any quantize path) - Replace broken optimum-cli quantize calls with direct onnxruntime: Q4 uses MatMulNBitsQuantizer, Q8 uses quantize_dynamic - Add onnxconverter-common to deps for FP16 (was silently swallowed) - Make FP16 fail loudly on missing dep instead of silently uploading FP32 - README and transformers_js_config now reflect actual quantize_type instead of always hardcoding Q4 - Remove dead _convert_fp16_external function	2026-03-13 12:45:48 -07:00
Shreyas Karnik	e1ce37c989	fix: handle 2GB protobuf limit, add validation, fix input feeds - Use no_post_process=True for ONNX export to avoid protobuf serialize error - Add --validate and --validate-only flags for inference verification - Fix position_ids in validation feed (required by Qwen3 ONNX export) - Use optimum-cli for quantization to handle external data format - Fix optimum dependency to optimum[onnxruntime] Tested: export + validation passes on CPU, KV cache present (56 tensors). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 12:30:26 -07:00
Shreyas Karnik	2df95ac9ba	feat: add ONNX conversion script for Transformers.js deployment Add convert_onnx.py that mirrors convert_gguf.py's structure: - Loads base Qwen3 model, merges SFT + GRPO adapters - Exports to ONNX via Optimum (text-generation-with-past task) - Supports Q4 (MatMulNBits), Q8, FP16, and FP32 output - Uploads to separate HF repo (e.g. tobil/qmd-query-expansion-1.7B-ONNX) - Writes Transformers.js compatibility config - Includes model card with usage example Usage: uv run convert_onnx.py --size 1.7B uv run convert_onnx.py --size 1.7B --quantize q4 --no-upload Also adds `just convert-onnx` and `just convert-gguf` tasks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 11:50:03 -07:00
Jörg Thalheim	8c4b4b335d	sync stale bun.lock, guard against future lockfile drift bun.lock still resolved better-sqlite3 to 11.x after package.json was bumped to ^12.4.5 in v2.0.0. This breaks sandboxed builds (e.g. Nix with bun2nix) where network access is unavailable to resolve the mismatch. CI and the publish workflow now use --frozen-lockfile so drift is caught immediately. The release script also validates lockfile consistency before tagging. Closes #386	2026-03-13 13:34:17 +01:00
programcaicai	809aa36172	fix: bound memory usage during embed	2026-03-13 17:39:17 +08:00
edy	9718d3767c	fix: truncate oversized text before embedding to prevent GGML crash When a chunk exceeds the embedding model's context window (trainContextSize), node-llama-cpp's getEmbeddingFor() triggers a native SIGABRT in GGML/Metal, crashing the entire process. Fix: Add truncateToContextSize() guard in embed() and embedBatch() that uses the model's own tokenizer to check token count before calling getEmbeddingFor(). Oversized text is truncated to (trainContextSize - 4) tokens with a warning, preserving partial embedding coverage instead of crashing. Fixes #303	2026-03-13 09:35:17 +08:00
sonwr	7df09e8235	fix: skip vector cleanup when sqlite-vec is unavailable	2026-03-12 13:51:20 +00:00
Ryan Malia	28903d8eba	fix: prioritize package-lock.json in launcher to prevent Bun false positive The bin/qmd wrapper checks for bun.lock to select the runtime, but since bun.lock is committed to the repo, source builds using npm install are incorrectly routed to Bun — causing native module ABI mismatches (#381) and sqlite-vec crashes (#380). Add package-lock.json as a higher-priority signal: if it exists, npm installed the dependencies and Node should be used. Also fix cleanupOrphanedVectors() to use the existing isSqliteVecAvailable() guard instead of checking sqlite_master, which can report the virtual table even when the vec0 module isn't loaded. Fixes #381, fixes #380 Continuation of #362 (runtime detection false positives)	2026-03-12 01:46:38 -07:00
Ryan Malia	e988c5d286	fix: pin zod to exact 4.2.1 to fix tsc build failure (#379) The caret range ^4.2.1 allows npm to resolve zod 4.3.x, which has breaking type changes against @modelcontextprotocol/sdk. Source builds fail with TypeScript errors. Pinning to exact 4.2.1 resolves this. See: https://github.com/tobi/qmd/issues/379	2026-03-11 21:43:09 -07:00
serhii12	52ebafce11	fix(db): add macOS Homebrew SQLite support for Bun and restore actionable errors On macOS, bun:sqlite uses Apple's system SQLite which is compiled with SQLITE_OMIT_LOAD_EXTENSION, preventing sqlite-vec from loading. The v2.0 refactor also silently swallowed extension loading failures, losing the actionable error messages that existed pre-2.0. - Call Database.setCustomSQLite() on macOS to use Homebrew's SQLite - Eagerly validate extension loading at init, not at first query - Throw with platform-specific fix instructions in loadSqliteVec() - Log warning in store.ts instead of silently catching Fixes #363	2026-03-11 12:55:20 -03:00
Øystein Krog	26d4ebfa56	fix: skip Git Bash path detection on WSL On WSL, paths like /c/work/... are valid drvfs mount points, not Git Bash drive-letter shortcuts. The existing code in isAbsolutePath() and resolve() detected /c/ as a Windows C: path, converting drvfs paths to C:/work/... which broke indexing entirely. Fix: detect WSL via WSL_DISTRO_NAME or WSL_INTEROP environment variables and skip the Git Bash /c/ -> C: branch on WSL. Native Linux path handling continues as before.	2026-03-11 12:25:28 +01:00
Matt Van Horn	11b3f17fba	feat(cli): add --no-rerank flag to skip reranking in qmd query Exposes the existing skipRerank option as a --no-rerank CLI flag for qmd query. On CPU-only machines, reranking takes 120s+ for 20 chunks - this flag lets users get RRF-fused results without the reranking penalty. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 23:41:43 -07:00

391 Commits