ai-workspace-services/qmd

Author	SHA1	Message	Date
Antonio	840a614223	fix: respect XDG_CACHE_HOME for model cache directory MODEL_CACHE_DIR was hardcoded to ~/.cache/qmd/models/, ignoring the XDG_CACHE_HOME environment variable. This was inconsistent with the rest of the codebase (store.ts, cli/qmd.ts) which already respects XDG paths. Fixes #425 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:08:21 -03:00
Mike Bannister	bc80e72a06	chore: update bun.lock after dependency install	2026-03-23 11:49:25 -04:00
Mike Bannister	939d15652c	fix: use CTE in searchFTS to prevent query planner regression with collection filter When searchFTS combines FTS5 MATCH with a collection filter (d.collection = ?) in the same WHERE clause, SQLite's query planner abandons the FTS5 index and falls back to a full scan. This turns an 8ms query into a 17+ second query on large collections (16K+ documents). The fix wraps the FTS5 query in a CTE so it runs first with proper index usage, then filters by collection on the materialized results. Benchmarks on a 16,258-document collection: Before: qmd search "knowctl" -c <collection> → 19.8s After: qmd search "knowctl" -c <collection> → 0.4s The CTE fetches limit*10 candidates from the FTS index to ensure enough results survive collection filtering. Without a collection filter, the query plan was already optimal, so no CTE overhead is added in that case.	2026-03-23 11:35:22 -04:00
James Risberg	244ddf5ecb	feat: AST-aware chunking for code files via tree-sitter Add opt-in AST-aware chunk boundary detection for code files using web-tree-sitter. When enabled with `--chunk-strategy auto`, code files (.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class, and import boundaries instead of arbitrary text positions. Default behavior (`regex`) is unchanged — no surprises on upgrade. In testing on QMD's own codebase, AST mode split 42% fewer function bodies across chunk boundaries compared to regex-only chunking. Usage: qmd embed --chunk-strategy auto qmd query "search terms" --chunk-strategy auto What's included: - Language detection from file extension with support for TypeScript, JavaScript (including arrow functions and function expressions), Python, Go, and Rust - Per-language tree-sitter queries with scored break points aligned to the existing markdown scale (class=100, function=90, type=80, import=60) - AST break points merged with regex break points — highest score wins at each position, so embedded markdown (comments, docstrings) still benefits from regex patterns - Refactored chunking core: chunkDocumentWithBreakPoints() extracted, mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST - ChunkStrategy type ("auto" | "regex") threaded through generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK - getASTStatus() health check wired into `qmd status` - Parse failures log a warning and fall back to regex — never crash Hardening: - Grammar packages are optionalDependencies with pinned versions to prevent ABI breaks from semver drift - web-tree-sitter is a direct dependency (pinned) - Errors are logged (not silently swallowed) for debuggability - Tested on both Node.js and Bun (Bun is actually faster) Testing: - 26 unit tests (test/ast.test.ts) — all 4 languages, error handling - 7 integration tests (test/store.test.ts) — merge, equivalence, bypass - Standalone test-ast-chunking.mjs with 63 synthetic tests and a real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code) - Validated end-to-end with qmd embed + qmd query on QMD's own codebase - Zero markdown regressions across all test paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 01:22:39 -04:00
Jarvis	783359f55c	fix: increase RERANK_CONTEXT_SIZE default 2048→4096, make configurable via QMD_RERANK_CONTEXT_SIZE env var, fix RERANK_TEMPLATE_OVERHEAD underestimate 200→512 Default 2048 was too small for longer documents (session transcripts, CJK text, large markdown files). After truncation the Qwen3 reranker template adds more overhead than the original 200-token estimate, causing node-llama-cpp to throw 'input lengths exceed context size'. Fixes: tobi/qmd#91 tobi/qmd#290 tobi/qmd#291 tobi/qmd#314	2026-03-21 20:59:11 -07:00
Tobias Lütke	2b8f329d7e	Merge pull request #370 from mvanhorn/osc/231-no-rerank-cli-flag feat(cli): add --no-rerank flag to skip reranking in qmd query	2026-03-14 08:10:38 -04:00
Tobias Lütke	4721e07975	Merge pull request #371 from oysteinkrog/fix/wsl-drvfs-path-detection Fix QMD when running under WSL (Windows Subsystem for Linux)	2026-03-14 08:10:25 -04:00
Tobias Lütke	95dc295433	Merge pull request #377 from serhii12/fix/sqlite-vec-macos-bun-error-handling fix(db): add macOS Homebrew SQLite support for Bun and restore actionable errors	2026-03-14 08:08:50 -04:00
Tobias Lütke	43660c468e	Merge pull request #382 from rymalia/fix/zod-version-pin fix: pin zod to exact 4.2.1 to prevent tsc build failure	2026-03-14 08:08:20 -04:00
Tobias Lütke	5f6821629b	Merge pull request #385 from rymalia/fix/launcher-lockfile-priority fix: prioritize package-lock.json in launcher to prevent Bun false positive	2026-03-14 08:08:03 -04:00
Tobias Lütke	f35b4e19e0	Merge pull request #393 from lskun/fix/embed-context-overflow fix: truncate oversized text before embedding to prevent GGML crash	2026-03-14 08:07:47 -04:00
Tobias Lütke	a13a84fb28	Merge pull request #396 from Mic92/qmd-fix sync stale bun.lock, guard against future lockfile drift	2026-03-14 08:07:25 -04:00
Tobias Lütke	5b48bcb6c1	Merge pull request #389 from sonwr/fix-issue-380-cleanup-no-sqlite-vec fix: skip cleanup when sqlite-vec is unavailable	2026-03-14 08:07:11 -04:00
Tobias Lütke	7ab1497ebb	Merge pull request #395 from ProgramCaiCai/fix/embed-batching-memory Bound qmd embed memory usage with default batched processing	2026-03-14 08:06:51 -04:00
Tobias Lütke	398eadf15b	Merge pull request #399 from shreyaskarnik/feat/onnx-conversion Add ONNX conversion script for Transformers.js deployment	2026-03-14 08:05:10 -04:00
Shreyas Karnik	df8d625c00	fix: map quantize_type to valid Transformers.js dtype values --quantize none now emits dtype: "fp32" in the README instead of dtype: "none", matching Transformers.js documented values (fp32, fp16, q8, q4).	2026-03-13 12:57:19 -07:00
Shreyas Karnik	b05d8863ca	fix: quantization paths, missing imports, and hardcoded metadata - Add missing subprocess import (NameError on any quantize path) - Replace broken optimum-cli quantize calls with direct onnxruntime: Q4 uses MatMulNBitsQuantizer, Q8 uses quantize_dynamic - Add onnxconverter-common to deps for FP16 (was silently swallowed) - Make FP16 fail loudly on missing dep instead of silently uploading FP32 - README and transformers_js_config now reflect actual quantize_type instead of always hardcoding Q4 - Remove dead _convert_fp16_external function	2026-03-13 12:45:48 -07:00
Shreyas Karnik	e1ce37c989	fix: handle 2GB protobuf limit, add validation, fix input feeds - Use no_post_process=True for ONNX export to avoid protobuf serialize error - Add --validate and --validate-only flags for inference verification - Fix position_ids in validation feed (required by Qwen3 ONNX export) - Use optimum-cli for quantization to handle external data format - Fix optimum dependency to optimum[onnxruntime] Tested: export + validation passes on CPU, KV cache present (56 tensors). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 12:30:26 -07:00
Shreyas Karnik	2df95ac9ba	feat: add ONNX conversion script for Transformers.js deployment Add convert_onnx.py that mirrors convert_gguf.py's structure: - Loads base Qwen3 model, merges SFT + GRPO adapters - Exports to ONNX via Optimum (text-generation-with-past task) - Supports Q4 (MatMulNBits), Q8, FP16, and FP32 output - Uploads to separate HF repo (e.g. tobil/qmd-query-expansion-1.7B-ONNX) - Writes Transformers.js compatibility config - Includes model card with usage example Usage: uv run convert_onnx.py --size 1.7B uv run convert_onnx.py --size 1.7B --quantize q4 --no-upload Also adds `just convert-onnx` and `just convert-gguf` tasks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 11:50:03 -07:00
Jörg Thalheim	8c4b4b335d	sync stale bun.lock, guard against future lockfile drift bun.lock still resolved better-sqlite3 to 11.x after package.json was bumped to ^12.4.5 in v2.0.0. This breaks sandboxed builds (e.g. Nix with bun2nix) where network access is unavailable to resolve the mismatch. CI and the publish workflow now use --frozen-lockfile so drift is caught immediately. The release script also validates lockfile consistency before tagging. Closes #386	2026-03-13 13:34:17 +01:00
programcaicai	809aa36172	fix: bound memory usage during embed	2026-03-13 17:39:17 +08:00
edy	9718d3767c	fix: truncate oversized text before embedding to prevent GGML crash When a chunk exceeds the embedding model's context window (trainContextSize), node-llama-cpp's getEmbeddingFor() triggers a native SIGABRT in GGML/Metal, crashing the entire process. Fix: Add truncateToContextSize() guard in embed() and embedBatch() that uses the model's own tokenizer to check token count before calling getEmbeddingFor(). Oversized text is truncated to (trainContextSize - 4) tokens with a warning, preserving partial embedding coverage instead of crashing. Fixes #303	2026-03-13 09:35:17 +08:00
sonwr	7df09e8235	fix: skip vector cleanup when sqlite-vec is unavailable	2026-03-12 13:51:20 +00:00
Ryan Malia	28903d8eba	fix: prioritize package-lock.json in launcher to prevent Bun false positive The bin/qmd wrapper checks for bun.lock to select the runtime, but since bun.lock is committed to the repo, source builds using npm install are incorrectly routed to Bun — causing native module ABI mismatches (#381) and sqlite-vec crashes (#380). Add package-lock.json as a higher-priority signal: if it exists, npm installed the dependencies and Node should be used. Also fix cleanupOrphanedVectors() to use the existing isSqliteVecAvailable() guard instead of checking sqlite_master, which can report the virtual table even when the vec0 module isn't loaded. Fixes #381, fixes #380 Continuation of #362 (runtime detection false positives)	2026-03-12 01:46:38 -07:00
Ryan Malia	e988c5d286	fix: pin zod to exact 4.2.1 to fix tsc build failure (#379) The caret range ^4.2.1 allows npm to resolve zod 4.3.x, which has breaking type changes against @modelcontextprotocol/sdk. Source builds fail with TypeScript errors. Pinning to exact 4.2.1 resolves this. See: https://github.com/tobi/qmd/issues/379	2026-03-11 21:43:09 -07:00
serhii12	52ebafce11	fix(db): add macOS Homebrew SQLite support for Bun and restore actionable errors On macOS, bun:sqlite uses Apple's system SQLite which is compiled with SQLITE_OMIT_LOAD_EXTENSION, preventing sqlite-vec from loading. The v2.0 refactor also silently swallowed extension loading failures, losing the actionable error messages that existed pre-2.0. - Call Database.setCustomSQLite() on macOS to use Homebrew's SQLite - Eagerly validate extension loading at init, not at first query - Throw with platform-specific fix instructions in loadSqliteVec() - Log warning in store.ts instead of silently catching Fixes #363	2026-03-11 12:55:20 -03:00
Øystein Krog	26d4ebfa56	fix: skip Git Bash path detection on WSL On WSL, paths like /c/work/... are valid drvfs mount points, not Git Bash drive-letter shortcuts. The existing code in isAbsolutePath() and resolve() detected /c/ as a Windows C: path, converting drvfs paths to C:/work/... which broke indexing entirely. Fix: detect WSL via WSL_DISTRO_NAME or WSL_INTEROP environment variables and skip the Git Bash /c/ -> C: branch on WSL. Native Linux path handling continues as before.	2026-03-11 12:25:28 +01:00
Matt Van Horn	11b3f17fba	feat(cli): add --no-rerank flag to skip reranking in qmd query Exposes the existing skipRerank option as a --no-rerank CLI flag for qmd query. On CPU-only machines, reranking takes 120s+ for 20 chunks - this flag lets users get RRF-fused results without the reranking penalty. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 23:41:43 -07:00
Tobias Lütke	ae3604cb88	Merge pull request #362 from syedair/fix/launcher-bun-install-false-positive fix: remove $BUN_INSTALL check from launcher to prevent false Bun detection	2026-03-10 21:38:41 -04:00
Syed Humair	b0a14b18ad	fix: remove $BUN_INSTALL check from launcher to prevent false Bun detection When Bun is installed on the system but QMD was installed via npm, $BUN_INSTALL is always set (typically to ~/.bun), causing the launcher to incorrectly run QMD under Bun. This leads to ABI mismatches with native modules (better-sqlite3, sqlite-vec) that were compiled for Node, breaking vector operations with "no such module: vec0". Only check for bun.lock/bun.lockb files, which reliably indicate that QMD was actually installed with Bun. Fixes #361 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 05:24:44 +04:00
Tobi Lutke	21a5dcc853	release: v2.0.1	2026-03-10 20:59:27 -04:00
Tobi Lutke	1207fe7776	docs: write changelog for 2.0.1	2026-03-10 20:58:45 -04:00
Tobias Lütke	55c951b15e	Merge pull request #349 from byheaven/fix/qwen3-embedding-model-filename-case docs: fix Qwen3-Embedding GGUF filename case (404 on download)	2026-03-10 20:08:53 -04:00
Tobias Lütke	22ba426dc0	Merge pull request #352 from nibzard/fix-global-launcher-path Fix launcher path for global installs	2026-03-10 20:08:04 -04:00
Tobias Lütke	e710e9d2b9	Merge pull request #355 from nibzard/feat-skill-install-clean Add skill install command	2026-03-10 20:07:47 -04:00
nkkko	b16d77146a	feat(skill): install packaged qmd skill	2026-03-10 23:18:15 +01:00
nkkko	9f4c71c783	fix(cli): resolve symlinked global launcher path	2026-03-10 22:38:20 +01:00
Tobi Lutke	55f16460d0	fix(ci): guard LLM calls in CI and increase test timeouts Add _ciMode flag to LlamaCpp that throws immediately on embedBatch, generate, expandQuery, and rerank when CI=true — prevents silent 30s timeouts. Skip MCP HTTP Transport tests in CI (they instantiate a real LlamaCpp). Bump vitest/bun test timeouts to 60s for slower CI runners.	2026-03-10 13:28:37 -04:00
Tobi Lutke	ed0249fd6b	fix(test): increase timeout for SDK search tests that trigger LLM expansion These tests load the query expansion model on first call, which consistently exceeds the 30s timeout on CI runners.	2026-03-10 12:59:46 -04:00
Tobi Lutke	8478ddb666	release: v2.0.0	2026-03-10 11:53:25 -04:00
Tobi Lutke	a444c86382	docs: rewrite SDK section for 2.0, fix MCP tool names, add changelog - Expand SDK documentation from ~70 lines to comprehensive coverage: store creation modes, unified search(), retrieval, collections, context, indexing, types, and lifecycle - Fix MCP tools section: old names (qmd_search, qmd_deep_search) replaced with actual registered names (query, get, multi_get, status) - Write 2.0.0 changelog under [Unreleased]	2026-03-10 11:53:12 -04:00
Tobi Lutke	b252219add	fix(deps): bump better-sqlite3 to ^12.4.5, add runtime-aware bin wrapper - Bump better-sqlite3 from ^11 to ^12.4.5 for Node 25 support (prebuilds + V8 API compat). Closes #257. - Add bin/qmd shell wrapper that detects bun vs node install and execs with the matching runtime, preventing native module ABI mismatches when installed via bun. Closes #319.	2026-03-10 11:43:00 -04:00
Tobi Lutke	c68904fe08	refactor: move CLI and MCP to subdirectories, MCP consumes SDK Move frontends into src/cli/ and src/mcp/ to separate them from the core library. The MCP server is fully rewritten to import only from the SDK (src/index.ts) — zero direct store.ts/collections.ts/llm.ts access. - src/qmd.ts → src/cli/qmd.ts - src/formatter.ts → src/cli/formatter.ts - src/mcp.ts → src/mcp/server.ts (rewritten to use QMDStore SDK) - New src/maintenance.ts: Maintenance class for CLI housekeeping - SDK gains: getDocumentBody(), getDefaultCollectionNames(), extractSnippet/addLineNumbers/DEFAULT_MULTI_GET_MAX_BYTES exports, getDefaultDbPath re-export, InternalStore type export - package.json bin/scripts updated for new paths - All 692 tests pass	2026-03-10 11:39:55 -04:00
Tobi Lutke	839d774a06	feat: redesign SDK search API with unified search() and ExpandedQuery type Replace three separate search methods (query, search, structuredSearch) with a single search(options) that accepts either a query string (auto-expanded) or pre-expanded queries. Add searchLex/searchVector convenience methods and expandQuery for manual control. Unify StructuredSubSearch and ExpandedQuery into a single ExpandedQuery type with { type, query } used throughout the pipeline. Add skipRerank option to hybridQuery and structuredSearch for fast no-LLM searches. New SDK surface: - search({ query, intent, rerank, limit, ... }) - search({ queries: expanded }) - searchLex(query, opts) - searchVector(query, opts) - expandQuery(query, { intent })	2026-03-10 11:04:45 -04:00
YuBai	740b17b485	docs: fix Qwen3-Embedding GGUF filename case in README and llm.ts HuggingFace filenames are case-sensitive. The documented filename 'qwen3-embedding-0.6b-q8_0.gguf' (lowercase) returns 404. The correct filename is 'Qwen3-Embedding-0.6B-Q8_0.gguf' (original case from the HuggingFace repo). Co-Authored-By: Oz <oz-agent@warp.dev>	2026-03-10 18:54:36 +08:00
Tobi Lutke	032f26edca	release: v1.1.6	2026-03-09 17:23:14 -04:00
Tobi Lutke	0c83dc1593	docs: write changelog for 1.1.6	2026-03-09 17:22:54 -04:00
Tobi Lutke	040c6fa904	feat: add SDK/library mode for programmatic access Allow QMD to be used as a library (`import { createStore } from '@tobilu/qmd'`) in addition to CLI and MCP modes. The constructor requires explicit dbPath and either a configPath (YAML file) or inline config object — no defaults assumed, making it safe to embed in any application. - Add src/index.ts entry point with QMDStore interface exposing search, retrieval, collection/context management, and index health - Add setConfigSource() to collections.ts for inline config support (in-memory config with no file I/O) - Add main/types/exports fields to package.json - Add SDK documentation section to README - Add 56 unit tests covering constructor, collections, contexts, search, document retrieval, config isolation, YAML persistence, and lifecycle	2026-03-08 15:59:22 -04:00
Tobi Lutke	4fa11682db	fix: update Store type to match intent parameter signatures	2026-03-07 21:30:31 -04:00
Tobi Lutke	ba97c03b02	docs: credit Ilya Grigorik in 1.1.5 changelog	2026-03-07 21:26:32 -04:00
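
Commit 244ddf5ecb describes merging AST break points with regex break points so that the highest score wins at each position. The following is a minimal sketch of that rule, not the repository's mergeBreakPoints() itself. The `BreakPoint` shape, the `source` field, the function name `mergeBreakPointsSketch`, and the regex scores in the usage note are assumptions made for illustration; only the AST score scale (class=100, function=90, type=80, import=60) comes from the commit message.

```typescript
// Hypothetical break-point shape; the real type in QMD may differ.
interface BreakPoint {
  pos: number; // character offset where a chunk may be split
  score: number; // higher means a better place to split
  source: "ast" | "regex"; // illustrative field, not from the commit
}

// Merge two break-point lists, keeping the highest-scoring entry per position.
// Regex points are visited first and replaced only by a strictly higher score,
// so a tie at the same position keeps the regex entry (an arbitrary choice here).
function mergeBreakPointsSketch(
  ast: BreakPoint[],
  regex: BreakPoint[],
): BreakPoint[] {
  const best = new Map<number, BreakPoint>();
  for (const bp of [...regex, ...ast]) {
    const current = best.get(bp.pos);
    if (current === undefined || bp.score > current.score) {
      best.set(bp.pos, bp);
    }
  }
  // Return the surviving points in document order.
  return [...best.values()].sort((a, b) => a.pos - b.pos);
}
```

Under these assumptions, merging an AST function boundary at offset 120 (score 90) with regex points at offsets 120 (score 40) and 300 (score 50) yields two points: the AST point at 120 and the regex point at 300. A regex match inside a docstring still survives wherever the AST has nothing better to offer, which is the behavior the commit message describes for embedded markdown.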

469 Commits
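
Commit 840a614223 changes the model cache directory to respect XDG_CACHE_HOME instead of the hardcoded ~/.cache/qmd/models/. A minimal sketch of that lookup might look like the following. The function name `resolveModelCacheDir` and its `env` parameter are hypothetical, and the check that ignores relative paths follows the XDG Base Directory convention rather than anything stated in the commit.

```typescript
import { homedir } from "node:os";
import { isAbsolute, join } from "node:path";

// Hypothetical helper: resolve the model cache directory from the environment.
// Falls back to ~/.cache when XDG_CACHE_HOME is unset, empty, or relative.
function resolveModelCacheDir(
  env: Record<string, string | undefined> = process.env,
): string {
  const xdg = env.XDG_CACHE_HOME;
  const base =
    xdg !== undefined && xdg.length > 0 && isAbsolute(xdg)
      ? xdg
      : join(homedir(), ".cache");
  // The qmd/models suffix matches the path quoted in the commit message.
  return join(base, "qmd", "models");
}
```

Passing the environment as a parameter keeps the helper easy to test without mutating process.env; calling it with no argument reads the real environment.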