ai-workspace-services/qmd

Author	SHA1	Message	Date
Tobi Lutke	9bafd3d0e9	docs: complete changelog for all PRs merged since v2.0.1 Cover ~25 community PRs including embedding stability fixes, BM25 field weight and hyphenation fixes, reranker context sizing, launcher reliability, XDG compliance, and the --no-rerank flag.	2026-04-05 18:15:09 -04:00
Tobi Lutke	cc32c9958d	fix: approve native build scripts and update vitest Add pnpm.onlyBuiltDependencies to whitelist packages that need install/postinstall scripts (better-sqlite3, esbuild, node-llama-cpp, tree-sitter-*). Without this, pnpm silently skips native compilation causing all tests that touch SQLite to fail. Also bumps vitest from ^3.0.0 to ^3.2.4.	2026-04-05 18:13:31 -04:00
Tobias Lütke	54fc7b01a9	Merge pull request #502 from JohnRichardEnders/feat/yaml-model-config feat: support model configuration in index.yml	2026-04-05 18:02:29 -04:00
Tobias Lütke	4f11517fb4	docs: add changelog entry for YAML model config	2026-04-05 18:02:26 -04:00
JohnRichardEnders	8644fa99d1	fix(store): thread embed model URI to format functions for correct prompt detection When the embed model is configured via YAML (not env var), formatDocForEmbedding and formatQueryForEmbedding callers in store.ts would fall back to the default model, producing the wrong prompt format. This adds a public embedModelName getter on LlamaCpp and threads it through all five call sites. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:34 -04:00
JohnRichardEnders	ce0cd64409	feat(mcp): pass YAML config path to createStore for model resolution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:09 -04:00
JohnRichardEnders	c8d49d26da	feat(cli): configure LlamaCpp singleton from YAML models config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:09 -04:00
JohnRichardEnders	33d42a2a04	feat(sdk): pass YAML models config to LlamaCpp in createStore Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
JohnRichardEnders	50ce17bbfa	feat(llm): resolve models as config > env > default Separate hardcoded default from env var in DEFAULT_EMBED_MODEL so the constructor can resolve: config param > env var > hardcoded default. Also add env var support for QMD_GENERATE_MODEL and QMD_RERANK_MODEL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
JohnRichardEnders	14b384d9d8	feat(config): add ModelsConfig type to CollectionConfig Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
Tobias Lütke	c940ce19d0	Merge pull request #508 from danmackinlay/dm/issue-507-osc8-editor-links feat: Add clickable OSC 8 editor links in CLI search output	2026-04-05 17:59:47 -04:00
dan mackinlay	1bada2eba6	Add explicit TTY link output tests	2026-04-05 17:58:09 -04:00
dan mackinlay	06f5642252	Fix stale ls test expectation	2026-04-05 17:56:26 -04:00
dan mackinlay	08b410696f	Document clickable TTY link output in README	2026-04-05 17:56:26 -04:00
dan mackinlay	636631225e	Add clickable OSC8 editor links for CLI search results	2026-04-05 17:56:26 -04:00
Tobias Lütke	4909aea28d	Merge pull request #485 from jamesrisberg/chore/migrate-ast-tests-to-vitest chore: migrate AST chunking tests to vitest	2026-04-05 17:21:42 -04:00
James Risberg	33fae1c4f5	chore: migrate AST chunking tests to vitest Replace standalone test-ast-chunking.mjs (823 lines, custom check() harness, invisible to CI) with proper vitest integration tests. All unique assertions preserved; duplicates already in ast.test.ts dropped. Performance benchmarks and real-collection scanner removed (dev tools, not regression tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:19:59 -04:00
Tobias Lütke	60c2c64c3a	Merge pull request #470 from jmilinovich/feat/bench-command feat(cli): add `qmd bench` for search quality benchmarks	2026-04-05 17:19:52 -04:00
John R Milinovich	b7a5a86a9b	feat(cli): add `qmd bench` command for search quality benchmarks Adds a benchmark harness that measures search quality across backends. Given a fixture file with queries and expected results, it runs each query through BM25, vector, hybrid (no rerank), and full pipeline, then reports precision@k, recall, MRR, F1, and latency. This is primarily a regression testing tool — users create fixtures for their own vaults to catch quality regressions after config or index changes. Ships with an example fixture against the eval-docs test collection to demonstrate the format. New files: src/bench/bench.ts — main runner src/bench/score.ts — precision, recall, MRR, F1, path matching src/bench/types.ts — fixture and result types src/bench/fixtures/ — example fixture test/bench-score.test.ts — unit tests for scoring (16 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:17:59 -04:00
Tobias Lütke	76a2f0fb31	Merge pull request #506 from danmackinlay/fix-505-json-line-output feat: Include line in --json search output # Conflicts: # CHANGELOG.md	2026-04-05 17:16:05 -04:00
Tobias Lütke	6db34d7278	fix(llm): catch GPU init failures and fall back to CPU Adds QMD_LLAMA_GPU env var (set to false/off/none to force CPU) and wraps getLlama() in try/catch so Vulkan/CUDA init failures on headless or driverless machines fall back gracefully instead of crashing the node process with an uncatchable C++ terminate().	2026-04-05 17:12:53 -04:00
Tobias Lütke	9c9de94bd8	fix(handelize): restore lowercase + convert dots to dashes - Restore .toLowerCase() in handelize (was dropped, both test files expected it inconsistently) - Convert dots to dashes in filename body (e.g. v2.0 -> v2-0), keeping only the extension dot. Tobi confirmed this is the intended behavior. - Align both test/store.test.ts and test/store.helpers.unit.test.ts to match (they had diverged, one expected case-preserved, one lowercase) - Adjust 'ensureVecTable recreates' test to expect throw behavior (matches #501 dimension-mismatch fix)	2026-04-05 17:12:53 -04:00
Surma	2de225c9e7	Test nix flake builds in CI (#487) * Test nix flake builds in CI * Update outdated bun.lock file * fix: restore toLowerCase() in handelize and update tests * Fix flake to use proper FODs --------- Co-authored-by: Tobias Lütke <tobi@shopify.com>	2026-04-05 16:59:27 -04:00
Tobias Lütke	828823d20a	fix: restore toLowerCase() in handelize + align tests with post-#501 behavior - Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it) - Update dimension-mismatch test to expect throw instead of silent rebuild (matches new behavior from #501) - Fix one stale test expectation for preserved dots in filenames	2026-04-05 16:56:06 -04:00
Antonio Mello	ef062e1b54	fix(multi-get): support brace expansion patterns in glob matching (#424) Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md` were incorrectly parsed as comma-separated file lists instead of being passed to the glob matcher (picomatch). This happened because the comma-detection heuristic only checked for `*` and `?` but not `{`. Also adds `collection/path` matching in `matchFilesByGlob` so patterns like `my-collection/{file1,file2}.md` work — previously the glob only matched against `qmd://collection/path` (virtual) and `path` (relative to collection root), missing the `collection/path` form. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:45:33 -04:00
John R. Enders	09a4d19b31	fix(store): error on embedding dimension mismatch instead of silent rebuild (#501) When switching to an embedding model with different dimensions, ensureVecTableInternal() silently drops the vector table and all embeddings are lost. Users only discover this when semantic search returns empty results. Throw an error instead, telling users to run 'qmd embed -f' to explicitly re-embed. This is safe because embed -f calls clearAllEmbeddings() which drops the table before ensureVecTable is reached. Related to #497 Co-authored-by: JohnRichardEnders <john@telli.com>	2026-04-05 16:45:24 -04:00
John R. Enders	54550a3366	fix(llm): set explicit embed context size, default 2048, configurable via env var (#500) Without an explicit contextSize, node-llama-cpp defaults to "auto" which allocates the model's full training context (often 32k). For embedding chunks that are typically ~900 tokens this wastes ~3.5 GB of KV cache per context on Apple Silicon unified memory. Default to 2048 (matching the rerank context pattern) and allow override via QMD_EMBED_CONTEXT_SIZE for users with larger chunks. Addresses #329, related to #297 Co-authored-by: JohnRichardEnders <john@telli.com>	2026-04-05 16:45:12 -04:00
LJY	698b44fe87	Fix qmd embed model selection (#494)	2026-04-05 16:45:04 -04:00
Oliver Ratzesberger	021236378b	fix(mcp): read version from package.json instead of hardcoding (#431)	2026-04-05 16:44:35 -04:00
Matt Van Horn	1ad3388132	fix(store): preserve underscores in BM25 search terms (#404) sanitizeFTS5Term stripped all non-letter/non-number characters including underscores, causing snake_case identifiers like `my_variable` to become `myvariable` and silently fail BM25 matches. Add underscore to the preserved character set in the Unicode regex. Export the function and add unit tests covering snake_case, contractions, punctuation stripping, and unicode. Fixes #305 Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:44:14 -04:00
Oliver Ratzesberger	959b719038	fix(mcp): include collection name in status output (#416) The MCP status tool shows collection paths but not names, making it impossible for agents to discover valid collection filter values. The CLI 'qmd status' already shows names. Add col.name prefix to each collection line in the status tool response.	2026-04-05 16:43:25 -04:00
Ahmed El Gabri	209b797c6a	fix(nix): correct CLI entry point path in wrapper (#413) The wrapper points at src/qmd.ts which no longer exists after the CLI was moved to src/cli/qmd.ts.	2026-04-05 16:43:18 -04:00
dan mackinlay	c22d00829b	Add line to JSON search output	2026-04-05 10:08:57 +00:00
Tobias Lütke	1fb2e2819e	Merge origin/main into feat/ast-aware-chunking Resolve conflicts: combine AST chunking args (filepath, chunkStrategy) with abort signal parameter from #458. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 20:00:49 -04:00
Tobias Lütke	dd27f499c7	Merge pull request #463 from goldsr09/fix/hyphenated-lex-queries Fix hyphenated tokens in FTS5 lex queries	2026-03-28 19:58:22 -04:00
Tobias Lütke	ea653304af	Merge pull request #458 from ccc-fff/fix/embed-infinite-loop fix: prevent qmd embed from running indefinitely	2026-03-28 19:58:17 -04:00
Tobias Lütke	616776ebdd	Merge pull request #453 from builderjarvis/fix/rerank-context-size fix: increase RERANK_CONTEXT_SIZE default 2048→4096, configurable via env var, fix template overhead underestimate	2026-03-28 19:58:11 -04:00
Tobias Lütke	6a45150f5a	Merge pull request #456 from antonio-mello-ai/fix/insert-embedding-vec0-replace fix(embed): handle vec0 OR REPLACE limitation in insertEmbedding	2026-03-28 19:58:06 -04:00
Tobias Lütke	827ad839f4	Merge origin/main into fix/fts5-collection-filter-performance Resolve conflict: use CTE approach from #455 with updated BM25 weights (1.5, 4.0, 1.0) from #462. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:57:33 -04:00
Tobias Lütke	08566ec316	Merge pull request #462 from goldsr09/fix/bm25-field-weights Fix BM25 field weights to include all 3 FTS columns	2026-03-28 19:56:04 -04:00
Tobias Lütke	5bef789ad3	Merge pull request #478 from zestyboy/feat/no-rerank-option Add rerank parameter to MCP query tool	2026-03-28 19:55:00 -04:00
Tobias Lütke	f73386ae02	Merge pull request #457 from antonio-mello-ai/fix/model-cache-xdg-cache-home fix: respect XDG_CACHE_HOME for model cache directory	2026-03-28 19:54:54 -04:00
Tobias Lütke	8d343b9da1	Update handelize tests for case/dot preservation (#475) PR #475 changed handelize() to preserve original case and dots, but the tests still expected lowercase output. Update assertions to match the new behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:54:18 -04:00
Tobias Lütke	2ec3360a8b	Merge pull request #475 from alexei-led/fix/handelize-dots-and-case fix: preserve dots and original case in handelize()	2026-03-28 19:44:40 -04:00
Tobias Lütke	49ebf6e8ff	Merge pull request #479 from surma-dump/surma/fix-flake Fix flake	2026-03-28 19:43:15 -04:00
Surma	cf9991cfa7	Fix flake	2026-03-27 23:12:20 +00:00
Niven	792992ef65	Add rerank parameter to MCP query tool The MCP query tool always ran LLM reranking, even for lex-only queries. On CPU-only infrastructure (e.g. Railway), the reranker adds 60-120s per query. The SDK and CLI already support skipping reranking, but the MCP server did not expose this option. Add a `rerank` boolean parameter (default: true) to the MCP query tool's input schema, forwarded to store.search() as the existing `rerank` option. Fixes #477 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 13:10:31 -07:00
Alexei Ledenev	72f2dd1fe5	fix: preserve original filename case in handelize (MEMORY.md not memory.md)	2026-03-27 16:38:04 +03:00
Alexei Ledenev	ddecde78da	fix: preserve dots in filenames during handelize The handelize() regex replaced all non-letter/non-number chars with dashes, including dots in the filename stem. This mangled session filenames like "topic-1773595309.753009.md" to "topic-1773595309-753009.md", breaking memory_get path resolution (file not found on disk). Fix: add dot to the preserved character class in the filename regex. After deploying, run qmd-reindex.sh to rebuild indexes with correct paths.	2026-03-27 16:37:59 +03:00
Ryan	7b9bd01226	fix: handle hyphenated tokens in FTS5 lex queries Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped of hyphens and concatenated (e.g., "multiagent") which missed matches. Now they're split into FTS5 phrase queries ("multi agent") so the porter tokenizer matches them correctly.	2026-03-24 20:13:52 -04:00
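
Illustration for commit 7b9bd01226 (hyphenated tokens in FTS5 lex queries). This is a minimal sketch of the idea the message describes, splitting a hyphenated term into a quoted FTS5 phrase instead of concatenating its parts. It is not the repository's code: the function names hyphenatedToPhrase and buildLexQuery are hypothetical, and the sketch assumes whitespace-separated input with no embedded double quotes.

```typescript
// Hypothetical helpers for illustration only; not taken from the qmd source.

// Turn one token into an FTS5 term. A hyphenated token such as "multi-agent"
// becomes the quoted phrase "multi agent"; a plain token passes through.
function hyphenatedToPhrase(token: string): string {
  const parts = token.split("-").filter((p) => p.length > 0);
  if (parts.length < 2) {
    // Zero parts (token was only hyphens) or one part (no real hyphenation).
    return parts[0] ?? "";
  }
  return `"${parts.join(" ")}"`;
}

// Build a query string from whitespace-separated user input.
// Assumption: tokens contain no double quotes, so no escaping is done here.
function buildLexQuery(input: string): string {
  return input
    .split(/\s+/)
    .filter((t) => t.length > 0)
    .map(hyphenatedToPhrase)
    .filter((t) => t.length > 0)
    .join(" ");
}
```

With this sketch, the input `multi-agent DEC-0054 gpt-4` yields `"multi agent" "DEC 0054" "gpt 4"`, so each hyphenated term is matched as adjacent tokens rather than as the single concatenated word `multiagent`.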


422 Commits
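
Illustration for commit 50ce17bbfa (resolve models as config > env > default). The sketch below shows one way such a precedence chain can be written. It is an assumption-laden example, not the project's implementation: resolveModel, ResolvedModel, and ModelSource are hypothetical names, and treating blank strings as unset is a choice made here rather than something the commit message states.

```typescript
// Hypothetical sketch of a "config > env > default" lookup; not the qmd source.

type ModelSource = "config" | "env" | "default";

interface ResolvedModel {
  uri: string;
  source: ModelSource;
}

// Values are passed in explicitly so the function stays pure and testable.
// Assumption: empty or whitespace-only values count as "not set".
function resolveModel(
  configValue: string | undefined,
  envValue: string | undefined,
  fallback: string,
): ResolvedModel {
  const fromConfig = configValue?.trim();
  if (fromConfig) {
    return { uri: fromConfig, source: "config" };
  }
  const fromEnv = envValue?.trim();
  if (fromEnv) {
    return { uri: fromEnv, source: "env" };
  }
  return { uri: fallback, source: "default" };
}
```

A caller might pass the YAML value, the value of an environment variable such as QMD_RERANK_MODEL, and a hardcoded default, in that order. Returning the source alongside the URI is an extra added here so that logs can show which level supplied the model.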