ai-workspace-services/qmd

Author	SHA1	Message	Date
Haitao Pan	47bd3ded44	feat(pg): add switchable PostgreSQL backend + OpenClaw/Hermes memory bridge Add an optional PostgreSQL backend (QMD_BACKEND=pg) alongside the unchanged default SQLite path. PG store uses pgvector (HNSW) for vectors and pg_jieba + pg_trgm for full-text/Chinese tokenization, with a namespace column isolating multi-agent memory (openclaw/hermes). - src/pg/: config, db-pg, schema bootstrap, memory store - MCP memory_add/memory_search/memory_get tools; qmd pg status + memory CLI - connection via QMD_PG_URL/DATABASE_URL/qmd config, stunnel TLS 5443 - tests: pg-config (unit) + pg-memory integration (gated on QMD_PG_URL) + pg-compose - docs/plan: plan, usage, test report, changelog; track docs/*/.md SQLite path: zero regression (typecheck clean, 249 passed / 6 skipped). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 19:13:04 +08:00
Haitao Pan	77024f7904	feat: add NVIDIA embedding API support and QMD remote sync	2026-06-12 07:32:43 +08:00
Haitao Pan	e3711767c6	fix: disable local qmd models by default	2026-05-23 11:04:48 +08:00
Haitao Pan	7c17c8bcce	feat: default to NVIDIA embeddings	2026-05-09 16:50:04 +08:00
Haitao Pan	fbad5791e3	feat: support NVIDIA embedding API	2026-05-09 16:44:47 +08:00
Haitao Pan	49fc83ebe2	Default embeddings to external API	2026-05-07 16:19:18 +08:00
Tobias Lütke	e8de7cab02	fix(cli): make status device probe opt-in	2026-04-21 21:45:52 -04:00
Tobi Lütke	cfd640ed34	fix(test): resolve LLM test timeouts by disabling file parallelism Parallel test files each cold-load their own LLM model, competing for CPU and causing timeouts even at 120s. Sequential execution eliminates contention — tests that timed out at 30s now complete in 1-15s. Made-with: Cursor	2026-04-11 01:21:22 +00:00
Tobias Lütke	525b9970cd	Merge pull request #546 from junmo-kim/fix/handelize-preserve-case fix: preserve original case in handelize()	2026-04-10 20:48:24 -04:00
Tobias Lütke	3295294be3	Merge pull request #532 from kuishou68/fix-qmd-uri-index-query fix: include custom index in qmd:// links	2026-04-10 20:47:55 -04:00
Tobias Lütke	46c4dfdaac	Merge pull request #545 from kuishou68/fix-sqlite-vec-actionable-guidance fix(store): surface actionable sqlite-vec guidance	2026-04-10 20:47:16 -04:00
Bek	e4990e470e	Harden embedding overflow handling	2026-04-10 16:02:46 -04:00
Kim Junmo	bb5becaf81	Merge remote-tracking branch 'origin/main' into fix/handelize-preserve-case # Conflicts: # CHANGELOG.md	2026-04-09 18:27:16 +09:00
kuishou68	0adbdeb337	fix(store): surface actionable sqlite-vec guidance	2026-04-09 10:13:40 +08:00
Tobias Lütke	171e9e3e65	Merge pull request #530 from kuishou68/fix-status-no-build-probe	2026-04-08 21:19:56 -04:00
Kim Junmo	fee576bf98	fix: migrate legacy lowercase paths on reindex When qmd update runs against an index created before case-preservation, documents may exist under lowercase paths (e.g. "skill.md" for a file actually named "SKILL.md"). Add findOrMigrateLegacyDocument() that: - Falls back to a lowercase lookup when the canonical path is not found - Renames the document path in-place via UPDATE OR IGNORE - Manually rebuilds the FTS entry (FTS5 INSERT OR REPLACE does not reliably update existing rows via triggers) - Handles UNIQUE conflicts gracefully (returns null on conflict) Embeddings are keyed by content hash, so the rename preserves all existing vectors — no re-embedding required. Both the CLI indexer and the library reindexer share the same helper, eliminating the duplication that a previous review flagged. Includes integration tests for: successful migration, already-lowercase no-op, and UNIQUE conflict handling.	2026-04-09 08:25:00 +09:00
Kim Junmo	9fb9de4fd2	fix: preserve original case in handelize() The blanket .toLowerCase() in handelize() drops filename casing, which breaks path resolution on case-sensitive filesystems (Linux). Files like README.md, CHANGELOG.md, and SKILL.md become unreachable when the index stores them as readme.md, changelog.md, skill.md. Since FTS5 already performs case-insensitive matching via the unicode61 tokenizer, lowercasing the stored path provides no search benefit — it only corrupts the metadata used to locate files on disk. Remove .toLowerCase() and update all affected test expectations.	2026-04-09 07:59:22 +09:00
Jeff Gardner	1ecb5c9f96	Fix QMD_LLAMA_GPU backend override handling	2026-04-07 18:49:22 +02:00
cocoon	8404cc3bb1	fix(uri): include index in custom qmd links	2026-04-07 23:26:19 +08:00
cocoon	26e3d0c077	fix(status): avoid build attempts during device probe	2026-04-07 23:18:58 +08:00
Tobi Lutke	66e70c028e	fix(test): reset _productionMode in getDefaultDbPath test Bun runs all test files in a single process, so module-level state leaks between files. The getDefaultDbPath test now resets the _productionMode flag before asserting it throws, fixing the flaky failure on Bun (ubuntu-latest) in CI.	2026-04-05 18:39:51 -04:00
Tobi Lutke	32e504c883	fix(test): remove duplicate path/handelize tests from store.test.ts These tests are already in store.helpers.unit.test.ts. The duplicates in store.test.ts failed in CI because _productionMode module state leaked from earlier tests in the same bun process, causing getDefaultDbPath to return a path instead of throwing.	2026-04-05 18:31:17 -04:00
JohnRichardEnders	50ce17bbfa	feat(llm): resolve models as config > env > default Separate hardcoded default from env var in DEFAULT_EMBED_MODEL so the constructor can resolve: config param > env var > hardcoded default. Also add env var support for QMD_GENERATE_MODEL and QMD_RERANK_MODEL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:00:08 -04:00
dan mackinlay	1bada2eba6	Add explicit TTY link output tests	2026-04-05 17:58:09 -04:00
dan mackinlay	06f5642252	Fix stale ls test expectation	2026-04-05 17:56:26 -04:00
dan mackinlay	636631225e	Add clickable OSC8 editor links for CLI search results	2026-04-05 17:56:26 -04:00
James Risberg	33fae1c4f5	chore: migrate AST chunking tests to vitest Replace standalone test-ast-chunking.mjs (823 lines, custom check() harness, invisible to CI) with proper vitest integration tests. All unique assertions preserved; duplicates already in ast.test.ts dropped. Performance benchmarks and real-collection scanner removed (dev tools, not regression tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:19:59 -04:00
John R Milinovich	b7a5a86a9b	feat(cli): add `qmd bench` command for search quality benchmarks Adds a benchmark harness that measures search quality across backends. Given a fixture file with queries and expected results, it runs each query through BM25, vector, hybrid (no rerank), and full pipeline, then reports precision@k, recall, MRR, F1, and latency. This is primarily a regression testing tool — users create fixtures for their own vaults to catch quality regressions after config or index changes. Ships with an example fixture against the eval-docs test collection to demonstrate the format. New files: src/bench/bench.ts — main runner src/bench/score.ts — precision, recall, MRR, F1, path matching src/bench/types.ts — fixture and result types src/bench/fixtures/ — example fixture test/bench-score.test.ts — unit tests for scoring (16 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 17:17:59 -04:00
Tobias Lütke	76a2f0fb31	Merge pull request #506 from danmackinlay/fix-505-json-line-output feat: Include line in --json search output # Conflicts: # CHANGELOG.md	2026-04-05 17:16:05 -04:00
Tobias Lütke	9c9de94bd8	fix(handelize): restore lowercase + convert dots to dashes - Restore .toLowerCase() in handelize (was dropped, both test files expected it inconsistently) - Convert dots to dashes in filename body (e.g. v2.0 -> v2-0), keeping only the extension dot. Tobi confirmed this is the intended behavior. - Align both test/store.test.ts and test/store.helpers.unit.test.ts to match (they had diverged, one expected case-preserved, one lowercase) - Adjust 'ensureVecTable recreates' test to expect throw behavior (matches #501 dimension-mismatch fix)	2026-04-05 17:12:53 -04:00
Surma	2de225c9e7	Test nix flake builds in CI (#487) * Test nix flake builds in CI * Update outdated bun.lock file * fix: restore toLowerCase() in handelize and update tests * Fix flake to use proper FODs --------- Co-authored-by: Tobias Lütke <tobi@shopify.com>	2026-04-05 16:59:27 -04:00
Tobias Lütke	828823d20a	fix: restore toLowerCase() in handelize + align tests with post-#501 behavior - Restore .toLowerCase() in handelize (was dropped somewhere, tests expect it) - Update dimension-mismatch test to expect throw instead of silent rebuild (matches new behavior from #501) - Fix one stale test expectation for preserved dots in filenames	2026-04-05 16:56:06 -04:00
Antonio Mello	ef062e1b54	fix(multi-get): support brace expansion patterns in glob matching (#424) Brace expansion patterns like `{doc1,doc2}.md` or `collection/{a,b}.md` were incorrectly parsed as comma-separated file lists instead of being passed to the glob matcher (picomatch). This happened because the comma-detection heuristic only checked for `*` and `?` but not `{`. Also adds `collection/path` matching in `matchFilesByGlob` so patterns like `my-collection/{file1,file2}.md` work — previously the glob only matched against `qmd://collection/path` (virtual) and `path` (relative to collection root), missing the `collection/path` form. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:45:33 -04:00
LJY	698b44fe87	Fix qmd embed model selection (#494)	2026-04-05 16:45:04 -04:00
Matt Van Horn	1ad3388132	fix(store): preserve underscores in BM25 search terms (#404) sanitizeFTS5Term stripped all non-letter/non-number characters including underscores, causing snake_case identifiers like `my_variable` to become `myvariable` and silently fail BM25 matches. Add underscore to the preserved character set in the Unicode regex. Export the function and add unit tests covering snake_case, contractions, punctuation stripping, and unicode. Fixes #305 Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:44:14 -04:00
dan mackinlay	c22d00829b	Add line to JSON search output	2026-04-05 10:08:57 +00:00
Tobias Lütke	1fb2e2819e	Merge origin/main into feat/ast-aware-chunking Resolve conflicts: combine AST chunking args (filepath, chunkStrategy) with abort signal parameter from #458. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 20:00:49 -04:00
Tobias Lütke	dd27f499c7	Merge pull request #463 from goldsr09/fix/hyphenated-lex-queries Fix hyphenated tokens in FTS5 lex queries	2026-03-28 19:58:22 -04:00
Tobias Lütke	08566ec316	Merge pull request #462 from goldsr09/fix/bm25-field-weights Fix BM25 field weights to include all 3 FTS columns	2026-03-28 19:56:04 -04:00
Tobias Lütke	8d343b9da1	Update handelize tests for case/dot preservation (#475) PR #475 changed handelize() to preserve original case and dots, but the tests still expected lowercase output. Update assertions to match the new behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 19:54:18 -04:00
Ryan	7b9bd01226	fix: handle hyphenated tokens in FTS5 lex queries Hyphenated terms like multi-agent, DEC-0054, gpt-4 were being stripped of hyphens and concatenated (e.g., "multiagent") which missed matches. Now they're split into FTS5 phrase queries ("multi agent") so the porter tokenizer matches them correctly.	2026-03-24 20:13:52 -04:00
Ryan	fa214db367	fix: correct BM25 field weights to include all 3 FTS columns The bm25() call only had 2 weights for 3 columns (filepath, title, body), giving body an implicit weight of 0. Add proper weights: filepath=1.5, title=4.0, body=1.0 so title matches are boosted and body content is scored.	2026-03-24 20:12:45 -04:00
James Risberg	244ddf5ecb	feat: AST-aware chunking for code files via tree-sitter Add opt-in AST-aware chunk boundary detection for code files using web-tree-sitter. When enabled with `--chunk-strategy auto`, code files (.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class, and import boundaries instead of arbitrary text positions. Default behavior (`regex`) is unchanged — no surprises on upgrade. In testing on QMD's own codebase, AST mode split 42% fewer function bodies across chunk boundaries compared to regex-only chunking. Usage: qmd embed --chunk-strategy auto qmd query "search terms" --chunk-strategy auto What's included: - Language detection from file extension with support for TypeScript, JavaScript (including arrow functions and function expressions), Python, Go, and Rust - Per-language tree-sitter queries with scored break points aligned to the existing markdown scale (class=100, function=90, type=80, import=60) - AST break points merged with regex break points — highest score wins at each position, so embedded markdown (comments, docstrings) still benefits from regex patterns - Refactored chunking core: chunkDocumentWithBreakPoints() extracted, mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST - ChunkStrategy type ("auto" | "regex") threaded through generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK - getASTStatus() health check wired into `qmd status` - Parse failures log a warning and fall back to regex — never crash Hardening: - Grammar packages are optionalDependencies with pinned versions to prevent ABI breaks from semver drift - web-tree-sitter is a direct dependency (pinned) - Errors are logged (not silently swallowed) for debuggability - Tested on both Node.js and Bun (Bun is actually faster) Testing: - 26 unit tests (test/ast.test.ts) — all 4 languages, error handling - 7 integration tests (test/store.test.ts) — merge, equivalence, bypass - Standalone test-ast-chunking.mjs with 63 synthetic tests and a real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code) - Validated end-to-end with qmd embed + qmd query on QMD's own codebase - Zero markdown regressions across all test paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 01:22:39 -04:00
Tobias Lütke	5f6821629b	Merge pull request #385 from rymalia/fix/launcher-lockfile-priority fix: prioritize package-lock.json in launcher to prevent Bun false positive	2026-03-14 08:08:03 -04:00
Tobias Lütke	5b48bcb6c1	Merge pull request #389 from sonwr/fix-issue-380-cleanup-no-sqlite-vec fix: skip cleanup when sqlite-vec is unavailable	2026-03-14 08:07:11 -04:00
programcaicai	809aa36172	fix: bound memory usage during embed	2026-03-13 17:39:17 +08:00
sonwr	7df09e8235	fix: skip vector cleanup when sqlite-vec is unavailable	2026-03-12 13:51:20 +00:00
Ryan Malia	28903d8eba	fix: prioritize package-lock.json in launcher to prevent Bun false positive The bin/qmd wrapper checks for bun.lock to select the runtime, but since bun.lock is committed to the repo, source builds using npm install are incorrectly routed to Bun — causing native module ABI mismatches (#381) and sqlite-vec crashes (#380). Add package-lock.json as a higher-priority signal: if it exists, npm installed the dependencies and Node should be used. Also fix cleanupOrphanedVectors() to use the existing isSqliteVecAvailable() guard instead of checking sqlite_master, which can report the virtual table even when the vec0 module isn't loaded. Fixes #381, fixes #380 Continuation of #362 (runtime detection false positives)	2026-03-12 01:46:38 -07:00
nkkko	b16d77146a	feat(skill): install packaged qmd skill	2026-03-10 23:18:15 +01:00
Tobi Lutke	55f16460d0	fix(ci): guard LLM calls in CI and increase test timeouts Add _ciMode flag to LlamaCpp that throws immediately on embedBatch, generate, expandQuery, and rerank when CI=true — prevents silent 30s timeouts. Skip MCP HTTP Transport tests in CI (they instantiate a real LlamaCpp). Bump vitest/bun test timeouts to 60s for slower CI runners.	2026-03-10 13:28:37 -04:00
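
Commit 50ce17bbfa above describes resolving model names in the order config param > env var > hardcoded default. A minimal sketch of that precedence might look like the following; `resolveModel`, the `Env` type, and the placeholder model names are hypothetical and are not taken from the repository. Only the environment variable name `QMD_RERANK_MODEL` comes from the commit message.

```typescript
// Hypothetical sketch of "config param > env var > hardcoded default".
// None of these names are confirmed to exist in the qmd source.
type Env = Record<string, string | undefined>;

function resolveModel(
  configValue: string | undefined, // value passed explicitly, e.g. to a constructor
  envVar: string,                  // name of the environment variable to consult
  hardcodedDefault: string,        // fallback used when nothing else is set
  env: Env,                        // injected environment, e.g. process.env under Node
): string {
  // Assumption of this sketch: blank strings count as "not set".
  const fromConfig = configValue?.trim();
  if (fromConfig) return fromConfig;

  const fromEnv = env[envVar]?.trim();
  if (fromEnv) return fromEnv;

  return hardcodedDefault;
}

// Usage with placeholder model names:
const exampleEnv: Env = { QMD_RERANK_MODEL: "rerank-from-env" };
const chosen = resolveModel(undefined, "QMD_RERANK_MODEL", "rerank-default", exampleEnv);
console.log(chosen);
```

Passing the environment in as a parameter, rather than reading a global, keeps the precedence easy to unit-test; whether the actual constructor does this is not stated in the log.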

86 Commits