ai-workspace-services/qmd

Author	SHA1	Message	Date
JohnRichardEnders	ce0cd64409	feat(mcp): pass YAML config path to createStore for model resolution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:09 -04:00
JohnRichardEnders	c8d49d26da	feat(cli): configure LlamaCpp singleton from YAML models config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:09 -04:00
JohnRichardEnders	33d42a2a04	feat(sdk): pass YAML models config to LlamaCpp in createStore Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
JohnRichardEnders	50ce17bbfa	feat(llm): resolve models as config > env > default Separate hardcoded default from env var in DEFAULT_EMBED_MODEL so the constructor can resolve: config param > env var > hardcoded default. Also add env var support for QMD_GENERATE_MODEL and QMD_RERANK_MODEL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
JohnRichardEnders	14b384d9d8	feat(config): add ModelsConfig type to CollectionConfig Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
Tobias Lütke	c940ce19d0	Merge pull request #508 from danmackinlay/dm/issue-507-osc8-editor-links feat: Add clickable OSC 8 editor links in CLI search output	2026-04-05 17:59:47 -04:00
dan mackinlay	1bada2eba6	Add explicit TTY link output tests	2026-04-05 17:58:09 -04:00
dan mackinlay	06f5642252	Fix stale ls test expectation	2026-04-05 17:56:26 -04:00
dan mackinlay	08b410696f	Document clickable TTY link output in README	2026-04-05 17:56:26 -04:00
dan mackinlay	636631225e	Add clickable OSC8 editor links for CLI search results	2026-04-05 17:56:26 -04:00
Tobias Lütke	4909aea28d	Merge pull request #485 from jamesrisberg/chore/migrate-ast-tests-to-vitest chore: migrate AST chunking tests to vitest	2026-04-05 17:21:42 -04:00
James Risberg	33fae1c4f5	chore: migrate AST chunking tests to vitest Replace standalone test-ast-chunking.mjs (823 lines, custom check() harness, invisible to CI) with proper vitest integration tests. All unique assertions preserved; duplicates already in ast.test.ts dropped. Performance benchmarks and real-collection scanner removed (dev tools, not regression tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:19:59 -04:00
Tobias Lütke	60c2c64c3a	Merge pull request #470 from jmilinovich/feat/bench-command feat(cli): add `qmd bench` for search quality benchmarks	2026-04-05 17:19:52 -04:00
John R Milinovich	b7a5a86a9b	feat(cli): add `qmd bench` command for search quality benchmarks Adds a benchmark harness that measures search quality across backends. Given a fixture file with queries and expected results, it runs each query through BM25, vector, hybrid (no rerank), and full pipeline, then reports precision@k, recall, MRR, F1, and latency. This is primarily a regression testing tool — users create fixtures for their own vaults to catch quality regressions after config or index changes. Ships with an example fixture against the eval-docs test collection to demonstrate the format. New files: src/bench/bench.ts — main runner src/bench/score.ts — precision, recall, MRR, F1, path matching src/bench/types.ts — fixture and result types src/bench/fixtures/ — example fixture test/bench-score.test.ts — unit tests for scoring (16 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:17:59 -04:00
Tobias Lütke	76a2f0fb31	Merge pull request #506 from danmackinlay/fix-505-json-line-output feat: Include line in --json search output # Conflicts: # CHANGELOG.md	2026-04-05 17:16:05 -04:00
Tobias Lütke	6db34d7278	fix(llm): catch GPU init failures and fall back to CPU Adds QMD_LLAMA_GPU env var (set to false/off/none to force CPU) and wraps getLlama() in try/catch so Vulkan/CUDA init failures on headless or driverless machines fall back gracefully instead of crashing the node process with an uncatchable C++ terminate().	2026-04-05 17:12:53 -04:00
Tobias Lütke	9c9de94bd8	fix(handelize): restore lowercase + convert dots to dashes - Restore .toLowerCase() in handelize (was dropped, both test files expected it inconsistently) - Convert dots to dashes in filename body (e.g. v2.0 -> v2-0), keeping only the extension dot. Tobi confirmed this is the intended behavior. - Align both test/store.test.ts and test/store.helpers.unit.test.ts to match (they had diverged, one expected case-preserved, one lowercase) - Adjust 'ensureVecTable recreates' test to expect throw behavior (matches #501 dimension-mismatch fix)	2026-04-05 17:12:53 -04:00
Surma	2de225c9e7	Test nix flake builds in CI (#487) * Test nix flake builds in CI * Update outdated bun.lock file * fix: restore toLowerCase() in handelize and update tests * Fix flake to use proper FODs --------- Co-authored-by: Tobias Lütke <tobi@shopify.com>	2026-04-05 16:59:27 -04:00
Tobias Lütke	828823d20a	fix: restore toLowerCase() in handelize + align tests with post-#501 behavior - Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it) - Update dimension-mismatch test to expect throw instead of silent rebuild (matches new behavior from #501) - Fix one stale test expectation for preserved dots in filenames	2026-04-05 16:56:06 -04:00
Antonio Mello	ef062e1b54	fix(multi-get): support brace expansion patterns in glob matching (#424) Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md` were incorrectly parsed as comma-separated file lists instead of being passed to the glob matcher (picomatch). This happened because the comma-detection heuristic only checked for `*` and `?` but not `{`. Also adds `collection/path` matching in `matchFilesByGlob` so patterns like `my-collection/{file1,file2}.md` work — previously the glob only matched against `qmd://collection/path` (virtual) and `path` (relative to collection root), missing the `collection/path` form. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:45:33 -04:00
John R. Enders	09a4d19b31	fix(store): error on embedding dimension mismatch instead of silent rebuild (#501) When switching to an embedding model with different dimensions, ensureVecTableInternal() silently drops the vector table and all embeddings are lost. Users only discover this when semantic search returns empty results. Throw an error instead, telling users to run 'qmd embed -f' to explicitly re-embed. This is safe because embed -f calls clearAllEmbeddings() which drops the table before ensureVecTable is reached. Related to #497 Co-authored-by: JohnRichardEnders <john@telli.com>	2026-04-05 16:45:24 -04:00
John R. Enders	54550a3366	fix(llm): set explicit embed context size, default 2048, configurable via env var (#500) Without an explicit contextSize, node-llama-cpp defaults to "auto" which allocates the model's full training context (often 32k). For embedding chunks that are typically ~900 tokens this wastes ~3.5 GB of KV cache per context on Apple Silicon unified memory. Default to 2048 (matching the rerank context pattern) and allow override via QMD_EMBED_CONTEXT_SIZE for users with larger chunks. Addresses #329, related to #297 Co-authored-by: JohnRichardEnders <john@telli.com>	2026-04-05 16:45:12 -04:00
LJY	698b44fe87	Fix qmd embed model selection (#494)	2026-04-05 16:45:04 -04:00
Oliver Ratzesberger	021236378b	fix(mcp): read version from package.json instead of hardcoding (#431)	2026-04-05 16:44:35 -04:00
Matt Van Horn	1ad3388132	fix(store): preserve underscores in BM25 search terms (#404) sanitizeFTS5Term stripped all non-letter/non-number characters including underscores, causing snake_case identifiers like `my_variable` to become `myvariable` and silently fail BM25 matches. Add underscore to the preserved character set in the Unicode regex. Export the function and add unit tests covering snake_case, contractions, punctuation stripping, and unicode. Fixes #305 Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:44:14 -04:00
Oliver Ratzesberger	959b719038	fix(mcp): include collection name in status output (#416) The MCP status tool shows collection paths but not names, making it impossible for agents to discover valid collection filter values. The CLI 'qmd status' already shows names. Add col.name prefix to each collection line in the status tool response.	2026-04-05 16:43:25 -04:00
Ahmed El Gabri	209b797c6a	fix(nix): correct CLI entry point path in wrapper (#413) The wrapper points at src/qmd.ts which no longer exists after the CLI was moved to src/cli/qmd.ts.	2026-04-05 16:43:18 -04:00
dan mackinlay	c22d00829b	Add line to JSON search output	2026-04-05 10:08:57 +00:00
Tobias Lütke	1fb2e2819e	Merge origin/main into feat/ast-aware-chunking Resolve conflicts: combine AST chunking args (filepath, chunkStrategy) with abort signal parameter from #458. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 20:00:49 -04:00
Tobias Lütke	dd27f499c7	Merge pull request #463 from goldsr09/fix/hyphenated-lex-queries Fix hyphenated tokens in FTS5 lex queries	2026-03-28 19:58:22 -04:00
Tobias Lütke	ea653304af	Merge pull request #458 from ccc-fff/fix/embed-infinite-loop fix: prevent qmd embed from running indefinitely	2026-03-28 19:58:17 -04:00
Tobias Lütke	616776ebdd	Merge pull request #453 from builderjarvis/fix/rerank-context-size fix: increase RERANK_CONTEXT_SIZE default 2048→4096, configurable via env var, fix template overhead underestimate	2026-03-28 19:58:11 -04:00
Tobias Lütke	6a45150f5a	Merge pull request #456 from antonio-mello-ai/fix/insert-embedding-vec0-replace fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding	2026-03-28 19:58:06 -04:00
Tobias Lütke	827ad839f4	Merge origin/main into fix/fts5-collection-filter-performance Resolve conflict: use CTE approach from #455 with updated BM25 weights (1.5, 4.0, 1.0) from #462. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:57:33 -04:00
Tobias Lütke	08566ec316	Merge pull request #462 from goldsr09/fix/bm25-field-weights Fix BM25 field weights to include all 3 FTS columns	2026-03-28 19:56:04 -04:00
Tobias Lütke	5bef789ad3	Merge pull request #478 from zestyboy/feat/no-rerank-option Add rerank parameter to MCP query tool	2026-03-28 19:55:00 -04:00
Tobias Lütke	f73386ae02	Merge pull request #457 from antonio-mello-ai/fix/model-cache-xdg-cache-home fix: respect XDG_CACHE_HOME for model cache directory	2026-03-28 19:54:54 -04:00
Tobias Lütke	8d343b9da1	Update handelize tests for case/dot preservation (#475) PR #475 changed handelize() to preserve original case and dots, but the tests still expected lowercase output. Update assertions to match the new behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:54:18 -04:00
Tobias Lütke	2ec3360a8b	Merge pull request #475 from alexei-led/fix/handelize-dots-and-case fix: preserve dots and original case in handelize()	2026-03-28 19:44:40 -04:00
Tobias Lütke	49ebf6e8ff	Merge pull request #479 from surma-dump/surma/fix-flake Fix flake	2026-03-28 19:43:15 -04:00
Surma	cf9991cfa7	Fix flake	2026-03-27 23:12:20 +00:00
Niven	792992ef65	Add rerank parameter to MCP query tool The MCP query tool always ran LLM reranking, even for lex-only queries. On CPU-only infrastructure (e.g. Railway), the reranker adds 60-120s per query. The SDK and CLI already support skipping reranking, but the MCP server did not expose this option. Add a `rerank` boolean parameter (default: true) to the MCP query tool's input schema, forwarded to store.search() as the existing `rerank` option. Fixes #477 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 13:10:31 -07:00
Alexei Ledenev	72f2dd1fe5	fix: preserve original filename case in handelize (MEMORY.md not memory.md)	2026-03-27 16:38:04 +03:00
Alexei Ledenev	ddecde78da	fix: preserve dots in filenames during handelize The handelize() regex replaced all non-letter/non-number chars with dashes, including dots in the filename stem. This mangled session filenames like "topic-1773595309.753009.md" to "topic-1773595309-753009.md", breaking memory_get path resolution (file not found on disk). Fix: add dot to the preserved character class in the filename regex. After deploying, run qmd-reindex.sh to rebuild indexes with correct paths.	2026-03-27 16:37:59 +03:00
Ryan	7b9bd01226	fix: handle hyphenated tokens in FTS5 lex queries Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped of hyphens and concatenated (e.g., "multiagent") which missed matches. Now they're split into FTS5 phrase queries ("multi agent") so the porter tokenizer matches them correctly.	2026-03-24 20:13:52 -04:00
Ryan	fa214db367	fix: correct BM25 field weights to include all 3 FTS columns The bm25() call only had 2 weights for 3 columns (filepath, title, body), giving body an implicit weight of 0. Add proper weights: filepath=1.5, title=4.0, body=1.0 so title matches are boosted and body content is scored.	2026-03-24 20:12:45 -04:00
Fred	70db2f5226	fix: prevent qmd embed from running indefinitely After the session's max duration timer fires (30 min), the embedding loop continued iterating over all remaining chunks. Each embed call threw SessionReleasedError, was caught, incremented errors, and the loop moved to the next chunk — burning 100% CPU for days with zero useful output. Three targeted fixes: 1. Check session.isValid before each batch iteration in the embedding loop, breaking early when the session has been aborted. 2. Pass the session's AbortSignal to chunkDocumentByTokens so tokenization also respects session expiry instead of running unbounded. 3. Add an error-rate circuit breaker: if >80% of processed chunks fail, abort early rather than grinding through the remaining work. Fixes #440 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 22:38:57 +01:00
Antonio	902e14650e	fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding sqlite-vec's vec0 virtual tables silently ignore the OR REPLACE conflict clause. When a crash interrupts embedding mid-way, chunks that were inserted into vectors_vec but not content_vectors get re-selected by getHashesForEmbedding, causing a UNIQUE constraint error on re-embed. Two changes: 1. Insert content_vectors first so getHashesForEmbedding won't re-select the hash if a crash occurs between the two inserts. 2. Use DELETE + INSERT for vectors_vec instead of INSERT OR REPLACE. Fixes #445 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:11:31 -03:00
Antonio	840a614223	fix: respect XDG_CACHE_HOME for model cache directory MODEL_CACHE_DIR was hardcoded to ~/.cache/qmd/models/, ignoring the XDG_CACHE_HOME environment variable. This was inconsistent with the rest of the codebase (store.ts, cli/qmd.ts) which already respects XDG paths. Fixes #425 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:08:21 -03:00
Mike Bannister	bc80e72a06	chore: update bun.lock after dependency install	2026-03-23 11:49:25 -04:00
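
Commit 50ce17bbfa above describes resolving each model as config > env var > hardcoded default. The sketch below is one way that precedence could look, not qmd's actual code. `QMD_GENERATE_MODEL` and `QMD_RERANK_MODEL` come from the commit message; the `QMD_EMBED_MODEL` name, the `resolveModel` helper, the `ModelsConfig` shape, and the placeholder default strings are all assumptions made for illustration.

```typescript
// Hypothetical sketch of "config > env > default" model resolution.
// Only QMD_GENERATE_MODEL and QMD_RERANK_MODEL are named in the commit log;
// everything else here is an illustrative assumption.
type ModelRole = "embed" | "generate" | "rerank";
type ModelsConfig = Partial<Record<ModelRole, string>>;

const ENV_VARS: Record<ModelRole, string> = {
  embed: "QMD_EMBED_MODEL", // assumed name, not confirmed by the log
  generate: "QMD_GENERATE_MODEL",
  rerank: "QMD_RERANK_MODEL",
};

// Placeholder defaults; the real default model identifiers are not in the log.
const HARDCODED_DEFAULTS: Record<ModelRole, string> = {
  embed: "placeholder-embed-model",
  generate: "placeholder-generate-model",
  rerank: "placeholder-rerank-model",
};

// The environment is passed in explicitly so the function stays pure and testable.
function resolveModel(
  role: ModelRole,
  config: ModelsConfig | undefined,
  env: Record<string, string | undefined>,
): string {
  // `||` rather than `??` so that empty strings also fall through to the next source.
  return config?.[role] || env[ENV_VARS[role]] || HARDCODED_DEFAULTS[role];
}
```

Keeping the hardcoded default separate from the env lookup is what lets an explicit config value override an env var, which is the separation the commit message calls out.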


417 Commits
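
Commit b7a5a86a9b reports precision@k, recall, MRR, and F1 for each benchmark query. As a rough sketch of what those metrics mean for a single query, the function below scores a ranked result list against a set of expected paths. It is not the code in `src/bench/score.ts`: the `scoreQuery` name, the `QueryScores` shape, and the use of exact string equality for path matching are assumptions.

```typescript
// Hypothetical per-query scoring sketch; not taken from qmd's src/bench/score.ts.
interface QueryScores {
  precisionAtK: number;   // relevant results in the top k, divided by k
  recall: number;         // relevant results in the top k, divided by expected count
  reciprocalRank: number; // 1 / rank of the first relevant result, 0 if none
  f1: number;             // harmonic mean of precision@k and recall
}

function scoreQuery(results: string[], expected: string[], k: number): QueryScores {
  const expectedSet = new Set(expected);
  // Deduplicate hits so a path returned twice is not counted twice.
  const hits = new Set(results.slice(0, k).filter((p) => expectedSet.has(p))).size;

  const precisionAtK = k > 0 ? hits / k : 0;
  const recall = expectedSet.size > 0 ? hits / expectedSet.size : 0;

  const firstHit = results.findIndex((p) => expectedSet.has(p));
  const reciprocalRank = firstHit === -1 ? 0 : 1 / (firstHit + 1);

  const sum = precisionAtK + recall;
  const f1 = sum > 0 ? (2 * precisionAtK * recall) / sum : 0;

  return { precisionAtK, recall, reciprocalRank, f1 };
}

// MRR is the mean of the per-query reciprocal ranks.
function meanReciprocalRank(scores: QueryScores[]): number {
  if (scores.length === 0) return 0;
  return scores.reduce((acc, s) => acc + s.reciprocalRank, 0) / scores.length;
}
```

Precision@k here divides by k even when fewer than k results come back; a harness could reasonably divide by the number of returned results instead, and the commit log does not say which convention qmd uses.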