ai-workspace-services/qmd

Author	SHA1	Message	Date
Tobi Lutke	da9cf691fd	release: v1.1.5	2026-03-07 20:16:32 -04:00
Tobi Lutke	66fb5b1d98	docs: write changelog for 1.1.5	2026-03-07 20:15:54 -04:00
Tobi Lutke	ad38c1f698	feat: add intent parameter for query disambiguation Add optional `intent` parameter that steers query expansion, reranking, chunk selection, and snippet extraction without searching on its own. When a query like "performance" is ambiguous (web-perf vs team health vs fitness), intent provides background context that disambiguates results across all pipeline stages: - expandQuery: includes intent in LLM prompt ("Query intent: {intent}") - rerank: prepends intent to rerank query for Qwen3-Reranker - chunk selection: intent terms scored at 0.5x weight vs query terms - snippet extraction: intent terms scored at 0.3x weight - strong-signal bypass: disabled when intent provided Available via CLI (--intent flag or intent: line in query documents), MCP (intent field on query tool), and programmatic API. Adapted from PR #180 (thanks @vyalamar).	2026-03-07 19:27:29 -04:00
Tobi Lutke	b838f74c8c	release: v1.1.2	2026-03-07 15:58:28 -04:00
Tobi Lutke	0ff9bec129	docs: write changelog for 1.1.2	2026-03-07 15:58:14 -04:00
Tobi Lutke	e3549dab1a	perf(rerank): cap parallelism, deduplicate chunks, cache by content - Cap rerank contexts at 4 to avoid VRAM exhaustion on high-core machines - Deduplicate identical chunk texts before sending to reranker - Cache rerank scores by chunk content instead of file path — same text from different files now shares a single reranker call - Add truncation cache to avoid re-tokenizing duplicate documents	2026-03-07 15:57:36 -04:00
Tobi Lutke	44d7145bfe	Merge pull request #242 from vyalamar/feat/query-explain-score-traces feat(query): add --explain score traces for hybrid retrieval	2026-03-07 14:35:26 -04:00
vyalamar	b068ad0dd6	feat(query): add --explain score traces for hybrid search	2026-03-07 14:35:10 -04:00
Tobias Lütke	7904ab9a9d	Merge pull request #273 from daocoding/feature/configurable-embed-model feat: add QMD_EMBED_MODEL env var for multilingual embedding support	2026-03-07 14:28:59 -04:00
Tobias Lütke	cb5d84ff07	Merge pull request #225 from ilepn/fix/sqlite-vec-windows-package-name fix(package.json): correct Windows sqlite-vec package name + add linux-arm64	2026-03-07 14:28:55 -04:00
Tobias Lütke	a4b641d8e3	Merge pull request #255 from pandysp/feat/expose-candidate-limit feat: expose candidateLimit as MCP tool parameter and CLI flag	2026-03-07 14:28:52 -04:00
Tobias Lütke	e3bc5ccdc3	Merge pull request #286 from joelev/fix/multi-session-http fix: support multiple concurrent HTTP clients	2026-03-07 14:28:50 -04:00
Tobias Lütke	8bd93366ad	Merge pull request #228 from amsminn/fix-empty-results-format fix(cli): prevent parser breakage on empty results across output formats	2026-03-07 14:25:16 -04:00
Tobias Lütke	0b3fb07a8f	Merge pull request #230 from Balneario-de-Cofrentes/fix/tty-progress-guard fix(cli): suppress progress bars when not TTY	2026-03-07 14:25:13 -04:00
Tobias Lütke	271feb7791	Merge pull request #253 from jimmynail/fix/skip-unreadable-files fix: skip unreadable files during indexing instead of crashing	2026-03-07 14:25:11 -04:00
Tobias Lütke	ee08997f23	Merge pull request #313 from 0xble/fix/expand-context-size-config fix(llm): make query expansion context size configurable	2026-03-07 14:25:04 -04:00
Tobias Lütke	a28163fb2c	Merge pull request #304 from sebkouba/feature/collection-ignore feat: add ignore patterns for collections	2026-03-07 14:25:02 -04:00
Tobias Lütke	e6b50cfca9	Merge pull request #308 from debugerman/fix/handelize-emoji-crash fix(store): handle emoji-only filenames in handelize (#302)	2026-03-07 14:24:59 -04:00
Tobias Lütke	72e96d16c0	Merge pull request #312 from 0xble/fix/empty-collection-deactivate fix(index): deactivate stale docs on empty collection updates	2026-03-07 14:24:57 -04:00
Tobias Lütke	f75c668e46	Merge pull request #310 from giladgd/nodeLlamaCppUseBuildAutoAttempt feat: use `build: "autoAttempt"` on `getLlama`	2026-03-07 14:24:52 -04:00
Tobias Lütke	6934c464db	Merge pull request #311 from gi11es/patch-1 Fix claude plugin setup syntax	2026-03-07 14:24:49 -04:00
Gilad S.	607ab7a402	fix: remove unused config	2026-03-07 07:48:13 +02:00
Brian Le	0dec1df047	fix(llm): make expansion context size configurable	2026-03-06 16:35:33 -05:00
Brian Le	49d5b4f450	fix(index): deactivate stale docs on empty collection updates	2026-03-06 16:29:52 -05:00
Tobi Lutke	2ae1baba2f	release: v1.1.1	2026-03-06 14:48:47 -04:00
Tobi Lutke	60721658c0	docs: write changelog for 1.1.1	2026-03-06 14:48:35 -04:00
Gilles Dubuc	7f8e33e0a9	Fix plugin install syntax	2026-03-06 12:14:16 +01:00
Gilles Dubuc	75589d77f3	Fix claude marketplace syntax	2026-03-06 12:12:14 +01:00
Gilad S.	3095041e0f	feat: use `build: "autoAttempt"` on `getLlama`	2026-03-06 07:02:50 +00:00
Ning	dc777e3be0	fix(store): handle emoji-only filenames in handelize (#302) Convert emoji codepoints to hex representation (e.g. 🐘 → 1f418) instead of crashing, so files like 🐘.md can be indexed without halting the entire update process. Fixes #302	2026-03-06 14:24:24 +08:00
Sebastian Kouba	fde542cd0d	feat: add ignore patterns for collections Add an optional 'ignore' field to collection config that accepts an array of glob patterns to exclude from indexing. This allows collections to skip specific subdirectories without needing separate collections. Example YAML config: personal: path: ~/personal_synced pattern: '*/.md' ignore: - 'Sessions/' - 'archive/' The ignore patterns are passed to fast-glob's ignore option alongside the existing hardcoded excludes (node_modules, .git, etc). Already-indexed files matching new ignore patterns are deactivated on the next update. Changes: - Add ignore?: string[] to Collection interface - Pass ignore patterns through to fast-glob in indexFiles() - Show ignore patterns in collection list/status output - 5 new CLI integration tests covering the feature	2026-03-05 19:17:44 +01:00
Joel Johnson	383a2e5cf1	fix: support multiple concurrent HTTP clients The HTTP MCP server creates a single Transport + McpServer pair at startup. Once the first client initializes, all subsequent clients are rejected with "Server already initialized" — making the HTTP mode unusable for reconnect, crash recovery, or multi-client scenarios. Replace the singleton with a per-session architecture: each initialize request creates its own McpServer + Transport pair, stored in a sessions Map keyed by session ID. The shared Store (SQLite) is stateless and safe for concurrent access. Key changes: - createSession() factory creates fresh McpServer + Transport per client - POST /mcp routes by mcp-session-id header to existing sessions - New initialize requests (no session header) create new sessions - Unknown session IDs return 404 per MCP spec - Missing session IDs return 400 - onsessioninitialized callback stores sessions at the right time - transport.onclose cleans up the sessions Map - Shutdown iterates all active sessions Tested with 3+ concurrent clients, session cleanup via DELETE, cross-session isolation, and rapid session creation. Fixes #195 Closes #163 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 22:22:49 -05:00
Big (daocoding)	b71649b12d	feat: add QMD_EMBED_MODEL env var for multilingual embedding support The default embeddinggemma-300M model is English-centric and produces poor embeddings for CJK (Chinese, Japanese, Korean) text. This change allows overriding the embedding model via the QMD_EMBED_MODEL environment variable. Changes: - DEFAULT_EMBED_MODEL now reads from QMD_EMBED_MODEL env var (fallback to embeddinggemma-300M for backward compatibility) - getDefaultLlamaCpp() passes QMD_EMBED_MODEL to LlamaCpp config when set - formatQueryForEmbedding() and formatDocForEmbedding() detect Qwen3-Embedding models and apply the correct prompt format (Qwen3 uses task-instruction format; embeddinggemma uses nomic-style prefix format) - store.ts: pass model URI to format functions so format selection is consistent between indexing and query time - README: document QMD_EMBED_MODEL with Qwen3-Embedding example Recommended multilingual model: QMD_EMBED_MODEL=hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf After changing the model, run: qmd embed -f	2026-03-01 12:41:09 -05:00
Tobias Lütke	40610c3aa6	Merge pull request #256 from rkbadhan/reward-design fix(reward): tighten entity detection, filler penalty, stricter diversity	2026-02-26 06:15:49 -05:00
rkbadhan	4511b9bd4d	fix(reward): tighten entity detection, add filler penalty, stricter diversity - Compound entity chaining now stops one level deep. Previously "TDS motorsports team history" would inflate the expected entity set with "team" and "history", causing false-positive entity-preservation penalties during GRPO. Now only {tds, motorsports} are detected. - Add INTERIOR_FILLER_WORDS penalty (-3/line): lex lines containing "overview" or "basics" absent from the original query are penalised. Targets template-generator noise, e.g. "ancient overview rome timeline". - Raise is_diverse threshold 2→3: requires 3 unique words between lex lines before they count as diverse. Reduces reward for near-duplicate pairs like "auth setup" / "auth configuration". - Broaden quoted-phrase bonus: was gated on named entities existing; now any multi-word query earns +3 for using quotes in lex lines. Better incentivises BM25-aware syntax like "memory leak" python. Fixes scoring noise identified while working on issue #247.	2026-02-24 19:46:23 +05:30
Andreas Spannagel	87bd968d7b	feat: expose candidateLimit as MCP tool parameter and CLI flag Reranking 40 chunks takes ~2 min on CPU (the default candidateLimit). The option already exists in hybridQuery()/structuredSearch() but was never surfaced to users. This adds: - `candidateLimit` param to the MCP `query` tool inputSchema - `candidateLimit` field to the REST /query endpoint - `--candidate-limit` / `-C` CLI flag for `qmd query` Default stays 40 (no behavior change). Users on CPU-only machines can lower it for a speed/recall tradeoff. Complements #231. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 14:13:49 +01:00
CHAEWAN KIM	b024693f5d	Merge branch 'main' into fix-empty-results-format	2026-02-23 22:36:21 -08:00
Kit	32cd83b470	fix: skip unreadable files during indexing instead of crashing On macOS with iCloud Drive (especially shared folders), some files may appear in the filesystem but return EAGAIN (error -11) when read via Node's readFileSync. This happens when iCloud has evicted the file content but the file metadata remains visible. Previously this crashed the entire update process. Now we catch the error and skip the file, allowing the remaining files to index successfully. Affects: iCloud Drive shared folders on macOS Error: 'Unknown system error -11: Unknown system error -11, read' Reproduces with: Node.js v25.x, readFileSync on evicted iCloud files	2026-02-23 14:40:09 -08:00
Tobi Lütke	d6f3688d91	Remove grpo command from default train entrypoint	2026-02-22 15:29:09 -05:00
Tobi Lütke	189916d6fb	Move GRPO training out of default finetune pipeline	2026-02-22 15:26:23 -05:00
Tobi Lütke	cbeeb1f89b	Add wall-clock checkpoints and full eval defaults	2026-02-22 15:02:02 -05:00
Tobi Lütke	5233e676d9	fix(rerank): truncate documents exceeding 2048-token context size node-llama-cpp throws a hard error when any document + query + template overhead exceeds the ranking context size. Truncate oversized documents using the rerank model's tokenizer before passing them to rankAll(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 12:41:59 -05:00
Tobi Lutke	1d7d167b29	finetune: strict Pydantic schema, one canonical data format Replace ad-hoc JSON parsing with a strict Pydantic model (TrainingExample with typed OutputPair). All data loading goes through load_examples() which fails loudly on invalid data. - Convert v3_structured.jsonl from "searches" to "output" format - Rewrite all consumer scripts (prepare, validate, score, analyze) to load through the Pydantic schema - Prepared train/val files are ephemeral build artifacts - Restore LFM2 and GEPA experiments under experiments/ - Add pydantic>=2.0 to dependencies	2026-02-22 13:39:00 -04:00
Tobi Lutke	3950055708	finetune: quoted phrases, negation, and entity preservation (#247) Training data: - Expand lex phrases/negation examples from 12 to 74 with intent field - Add 50 personal entity examples (meetings, emails, projects with names) Reward function: - Detect entities at position 0 (fixes "Bob asked about deploy") - Per-entity coverage penalty: -20 per entity absent from all lex+vec - Phrase quoting bonus: +3 when lex uses quotes for multi-word terms - Expanded stopwords to reduce false positive entity detection Eval queries: add 21 test queries for personal entities, quoted phrases, and negation/disambiguation scenarios.	2026-02-22 13:38:59 -04:00
Tobi Lutke	599935754b	finetune: remove orphaned files and abandoned experiments Remove one-off data generator/fix scripts, superseded data files (v2, v3 replaced by v3_structured), LFM2 experiment, GEPA directory, duplicate job scripts, and historical docs. Clean up Justfile. These are restored under experiments/ in a later commit.	2026-02-22 13:38:59 -04:00
Tobi Lutke	64ef25e1f6	Document query grammar and add skill helpers	2026-02-22 13:36:08 -04:00
Tobi Lutke	0e0feb6f2b	release: v1.1.0	2026-02-22 11:09:36 -04:00
Tobi Lutke	60564b34f8	chore: gitignore package-lock.json	2026-02-22 11:09:36 -04:00
Tobi Lutke	1765b6870d	docs: write changelog for 1.1.0 Query document format, lex phrase/negation syntax, standard node shebang, collection management commands, and formal SYNTAX.md spec.	2026-02-22 11:09:36 -04:00
Tobi Lutke	c7e8ea02a5	test: restructure container smoke tests for interactive use Replaces the inner test script with an outer driver that runs individual podman/docker commands against a pre-built image. Tests sqlite-vec loading and store unit tests under both node and bun runtimes. Supports --build (image only), --shell (interactive), and -- CMD (arbitrary command) for debugging install issues in isolation.	2026-02-22 11:09:36 -04:00
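
Commit dc777e3be0 above describes converting emoji codepoints to their hex representation (🐘 → 1f418) so that emoji-only filenames can be indexed. The sketch below illustrates only that conversion step. The function name `emojiToHex` is hypothetical, and the use of the Unicode `Extended_Pictographic` property to detect emoji is an assumption; the project's actual `handelize` logic is not shown in this log and may differ.

```typescript
// Hypothetical helper illustrating the emoji-to-hex idea from commit dc777e3be0.
// Not the project's handelize() implementation; the emoji test is an assumption.
function emojiToHex(text: string): string {
  // Array.from splits a string by code point, so surrogate pairs stay together.
  return Array.from(text)
    .map((ch) =>
      /\p{Extended_Pictographic}/u.test(ch)
        ? ch.codePointAt(0)!.toString(16) // e.g. U+1F418 becomes "1f418"
        : ch // leave every other character untouched
    )
    .join("");
}

console.log(emojiToHex("\u{1F418}.md")); // elephant emoji filename
console.log(emojiToHex("notes.md")); // plain ASCII passes through unchanged
```

Consecutive emoji would run their hex codes together in this sketch; a real implementation would likely want a separator between them.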

469 Commits