ai-workspace-services/qmd

Author	SHA1	Message	Date
Tobi Lutke	79a53f856e	docs(release): add dependency pinning policy and update check step Release process now checks for sqlite-vec, node-llama-cpp, and better-sqlite3 updates before cutting a release. All deps must be pinned to exact versions.	2026-04-05 18:17:13 -04:00
Tobi Lutke	ad38c1f698	feat: add intent parameter for query disambiguation Add optional `intent` parameter that steers query expansion, reranking, chunk selection, and snippet extraction without searching on its own. When a query like "performance" is ambiguous (web-perf vs team health vs fitness), intent provides background context that disambiguates results across all pipeline stages: - expandQuery: includes intent in LLM prompt ("Query intent: {intent}") - rerank: prepends intent to rerank query for Qwen3-Reranker - chunk selection: intent terms scored at 0.5x weight vs query terms - snippet extraction: intent terms scored at 0.3x weight - strong-signal bypass: disabled when intent provided Available via CLI (--intent flag or intent: line in query documents), MCP (intent field on query tool), and programmatic API. Adapted from PR #180 (thanks @vyalamar).	2026-03-07 19:27:29 -04:00
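The intent weighting described in ad38c1f698 can be pictured with a small sketch. It assumes a plain term-presence score, which the commit message does not confirm; `tokenize` and `scoreText` are hypothetical helpers, and only the 0.5x and 0.3x weights come from the message itself.

```typescript
// Illustrative only: tokenize() and scoreText() are hypothetical helpers.
// Only the 0.5x (chunk selection) and 0.3x (snippet extraction) weights
// come from the commit message; the scoring scheme itself is an assumption.
const CHUNK_INTENT_WEIGHT = 0.5;
const SNIPPET_INTENT_WEIGHT = 0.3;

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Each query term found in the text adds 1; each intent term adds intentWeight.
function scoreText(text: string, query: string, intent: string, intentWeight: number): number {
  const words = new Set(tokenize(text));
  let score = 0;
  for (const term of tokenize(query)) {
    if (words.has(term)) score += 1;
  }
  for (const term of tokenize(intent)) {
    if (words.has(term)) score += intentWeight;
  }
  return score;
}

const chunks = [
  "Team performance reviews and morale surveys",
  "Web performance: reduce page load latency",
];
const query = "performance";
const intent = "web page load latency";

// Both chunks match the query term; only the second also matches intent terms.
const chunkScores = chunks.map((c) => scoreText(c, query, intent, CHUNK_INTENT_WEIGHT));
const snippetScores = chunks.map((c) => scoreText(c, query, intent, SNIPPET_INTENT_WEIGHT));
```

Under these assumptions the intent never matches on its own strength (its terms count for less than a query term), but it breaks the tie between two chunks that both contain "performance".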
Tobi Lutke	64ef25e1f6	Document query grammar and add skill helpers	2026-02-22 13:36:08 -04:00
Tobi Lütke	4649069e62	feat: add expand: type, rename to query, document syntax BREAKING CHANGES: - MCP tool renamed: structured_search → query - HTTP endpoint renamed: /search → /query New features: - expand: type auto-expands via local LLM (max 1 per query) - docs/SYNTAX.md formal grammar for query documents - lex syntax: "phrase", -negation documented Query types: lex, vec, hyde, expand Default (no prefix) = expand (backwards compatible)	2026-02-18 22:22:50 -05:00
Tobi Lütke	de3a83a553	refactor: remove OR operator from lex queries Simplify to just: terms, "phrases", and -negation	2026-02-18 22:17:52 -05:00
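The lex syntax that remains after de3a83a553 (terms, "phrases", and -negation) could be tokenized roughly as sketched here. This is a guess at the shape, not the parser qmd ships: `LexToken` and `parseLex` are hypothetical names, and the formal grammar in docs/SYNTAX.md may differ in edge cases.

```typescript
// Hypothetical tokenizer for lex queries: bare terms, "quoted phrases",
// and a leading "-" for negation. Names and edge-case behavior are assumptions.
interface LexToken {
  text: string;
  phrase: boolean;  // true when the token was written in double quotes
  negated: boolean; // true when the token was prefixed with "-"
}

function parseLex(query: string): LexToken[] {
  const tokens: LexToken[] = [];
  // Optional "-", then either a quoted phrase or a run of non-space characters.
  const pattern = /(-?)(?:"([^"]+)"|(\S+))/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(query)) !== null) {
    const negated = match[1] === "-";
    const phrase = match[2] !== undefined;
    tokens.push({ text: phrase ? match[2] : match[3], phrase, negated });
  }
  return tokens;
}

const parsed = parseLex('performance -sports "machine learning" -"team health"');
```

Here `parsed` holds four tokens: a plain term, a negated term, a phrase, and a negated phrase.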
Tobi Lütke	77e4d8f378	refactor: remove single collection param, use collections array only BREAKING: collection param removed from structured_search. Use collections: ['name'] for single collection filter.	2026-02-18 22:16:15 -05:00
Tobi Lütke	efb39616e6	feat(lex): add query syntax for exact phrases, negation, and OR Lex queries now support: - "exact phrase" - quoted exact matching (no prefix) - -term or -"phrase" - exclude from results - term1 OR term2 - match either term Semantic queries (vec/hyde) validate and reject these operators with helpful error messages. Examples: performance -sports → matches "performance" excluding "sports" "machine learning" → exact phrase match auth OR authentication → matches either term	2026-02-18 22:14:09 -05:00
Tobi Lütke	d1ec31eab8	feat: add collections array filter + improve query writing docs - structured_search now accepts collections[] for OR filtering - Updated skill docs with detailed query writing guidance - lex: 2-5 keywords, include synonyms, exact names - vec: full natural language questions with context - hyde: 50-100 word hypothetical answer passages	2026-02-18 22:09:24 -05:00
Tobi Lütke	6d6bdff09c	docs: simplify skill documentation	2026-02-18 22:00:24 -05:00
Tobi Lütke	19284ddb80	refactor(mcp): remove deprecated search tools, keep only structured_search BREAKING CHANGE: MCP tools search, vector_search, deep_search removed. Use structured_search with lex/vec/hyde queries instead. - Remove search, vector_search, deep_search MCP tool registrations - Update MCP instructions to focus on structured_search - Update skill docs to reflect simplified API - Rename test describes to reflect they test store functions - CLI commands (qmd search, vsearch, query) unchanged for backwards compat	2026-02-18 21:50:25 -05:00
Tobi Lütke	bdec84a3e9	docs: use npm package name, clarify vec/hyde both use vector similarity	2026-02-18 19:34:08 -05:00
Tobi Lütke	0201710c2b	feat: add structured_search for LLM-provided query expansions - New MCP tool: structured_search - lets capable LLMs provide their own lex/vec/hyde query variations instead of using local expansion model - New REST endpoint: POST /search - same functionality without MCP protocol - Updated skill docs to prioritize structured_search for LLM callers - Added installation instructions for Claude Code, Desktop, and OpenClaw Pipeline: lex→FTS, vec/hyde→batch embed, RRF fusion (first query 2x weight), chunk + rerank, position-aware blending, dedup. This is the recommended endpoint for capable LLMs - they generate better query variations than the small local model, especially for domain-specific or nuanced queries.	2026-02-18 19:05:49 -05:00
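The "RRF fusion (first query 2x weight)" step in 0201710c2b can be sketched as weighted reciprocal rank fusion. The constant k = 60 is the value conventionally used for RRF and is an assumption here, as are the names `fuseRrf` and `rankByScore`; the commit message confirms only that the first query counts double.

```typescript
// Sketch of weighted reciprocal rank fusion (RRF).
// Assumptions: k = 60 (conventional RRF constant, not confirmed for qmd);
// fuseRrf() and rankByScore() are hypothetical names.
type RankedList = string[]; // document ids, best match first

function fuseRrf(lists: RankedList[], k = 60, firstListWeight = 2): Map<string, number> {
  const scores = new Map<string, number>();
  lists.forEach((list, listIndex) => {
    // The first query's result list is weighted 2x, all others 1x.
    const weight = listIndex === 0 ? firstListWeight : 1;
    list.forEach((docId, position) => {
      const rank = position + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + weight / (k + rank));
    });
  });
  return scores;
}

function rankByScore(scores: Map<string, number>): string[] {
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const fused = fuseRrf([
  ["a.md", "b.md"], // first query: counted at 2x
  ["b.md", "c.md"],
  ["c.md", "a.md"],
]);
const ranking = rankByScore(fused);
```

With these inputs `a.md` edges out `b.md` because its top rank comes from the double-weighted first list, while `c.md` trails since it never appears there.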
Tobi Lutke	63f3b68559	feat: show models in status, improve pre-push hook - Move model info from --help to `qmd status` with live HuggingFace links derived from actual configured URIs - Pre-push hook: handle non-interactive shells gracefully, resolve annotated tags correctly for CI checks	2026-02-16 09:08:28 -04:00
Tobi Lutke	7fb69a5ca2	feat: release skill with changelog-driven workflow and git hooks - Add /release skill with full process: hook install, changelog validation, git history review, preview, and release execution - Skill auto-populates [Unreleased] from git history when empty - Install hook script symlinks pre-push for tag validation - Register skills/ dir in .pi/settings.json for pi discovery	2026-02-16 08:46:10 -04:00
Tobi Lutke	09803a75b7	feat: compile to JS for npm, release system, full changelog - Add tsc build step (tsconfig.build.json) so npm package ships compiled JS instead of raw TypeScript requiring tsx at runtime - Update qmd wrapper and daemon spawn to use dist/qmd.js in production while keeping tsx for development - Add self-installing pre-push hook validating v* tag pushes: package.json version match, changelog entry, CI status - Add release.sh script that renames [Unreleased] to versioned entry, bumps package.json, commits, and tags - Add extract-changelog.sh for cumulative GitHub release notes - Update publish workflow with build step and GitHub release creation - Flesh out CHANGELOG.md with full history from 0.1.0 through 1.0.0 in Keep-a-Changelog format with PR/contributor attributions - Add release standards and changelog guidelines to CLAUDE.md	2026-02-16 08:42:32 -04:00
Ilya Grigorik	785bbcf319	MCP: Streamable HTTP, scoring fixes, tool improvements (#149) * feat: MCP HTTP transport with daemon lifecycle Add streaming HTTP transport as an alternative to stdio for the MCP server. A long-lived HTTP server avoids reloading 3 GGUF models (~2GB) on every client connection, reducing warm query latency from ~16s (CLI) to ~10s. New CLI surface: qmd mcp --http [--port N] # foreground, default port 3000 qmd mcp --http --daemon # background, PID in ~/.cache/qmd/mcp.pid qmd mcp stop # stop daemon via PID file qmd status # now shows MCP daemon liveness Server implementation (mcp.ts): - Extract createMcpServer(store) shared by stdio and HTTP transports - HTTP transport uses WebStandardStreamableHTTPServerTransport with JSON responses (stateless, no SSE) - /health endpoint with uptime, /mcp for MCP protocol, 404 otherwise - Request logging to stderr with timestamps, tool names, query args Daemon lifecycle (qmd.ts): - PID file + log file management with stale PID detection - Absolute paths in Bun.spawn (process.execPath + import.meta.path) so daemon works regardless of cwd - mkdirSync for cache dir on fresh installs - Removes top-level SIGTERM/SIGINT handlers before starting HTTP server so async cleanup in mcp.ts actually runs Move hybridQuery() and vectorSearchQuery() into store.ts as standalone functions that take a Store as first argument. Both CLI and MCP now call the identical pipeline, eliminating the class of bugs where one copy drifts from the other.
Shared pipeline (store.ts): - hybridQuery(): BM25 probe → expand → FTS+vec search → RRF → chunk → rerank (chunks only) → position-aware blending → dedup - vectorSearchQuery(): expand → vec search → dedup → sort - SearchHooks interface for optional progress callbacks - Constants: STRONG_SIGNAL_MIN_SCORE, STRONG_SIGNAL_MIN_GAP, RERANK_CANDIDATE_LIMIT (40), addLineNumbers() Bugs fixed by unification: - MCP now gets strong-signal short-circuit (was CLI-only) - Reranker candidate limit unified at 40 (MCP had 30) - File dedup added to hybrid query (MCP was missing it) - Collection filter pushed into searchVec DB query - Filter-then-slice ordering fixed (MCP was slice-then-filter) * feat: type-routed query expansion — lex→FTS, vec/hyde→vector expandQuery() now returns typed ExpandedQuery[] instead of string[], preserving the lex/vec/hyde type info from the LLM's GBNF-structured output. hybridQuery() and vectorSearchQuery() route searches by type: lex queries go to FTS only, vec/hyde go to vector only. Previously, every expanded query ran through BOTH backends — keyword variants wasted embedding forward passes, semantic paraphrases wasted BM25 lookups. Type routing eliminates ~4 calls/query with zero quality loss (cross-backend noise actually hurt RRF fusion). Cache format changed from newline-separated text to JSON (preserves types). Old cache entries gracefully re-expand on first access. CLI expansion tree now shows query types: ├─ original query ├─ lex: keyword variant ├─ vec: semantic meaning └─ hyde: hypothetical document... 
Benchmark (5 queries, 1756-doc index, warm LLM, Apple Silicon):
Metric               Old (untyped)   New (typed)   Delta
Avg backend calls    10.0            6.0           -40%
Total wall time      1278ms          549ms         -57%
Avg saved/query      —               —             146ms
"authentication setup"            12 → 7 calls   511 → 112ms
"database migration strategy"     10 → 6 calls   182 → 106ms
"how to handle errors in API"     10 → 6 calls   216 → 121ms
"meeting notes from last week"    10 → 6 calls   228 → 110ms
"performance optimization"        8 → 5 calls    141 → 100ms
Savings come from skipped embed() calls (~30-80ms each). FTS is synchronous SQLite (~0ms), so lex→FTS routing is free while vec/hyde→vector-only avoids wasted embedding passes. * fix: MCP query snippets now use reranker's best chunk, not full body extractSnippet() was scanning the entire document body for keyword matches to build the snippet. But hybridQuery() already identified the most relevant chunk via cross-attention reranking — rescanning the full body is redundant and can land on a less relevant section if the query terms appear elsewhere in the document. CLI was already using bestChunk (set during the refactor). MCP was still using body — a pre-existing inconsistency, not a regression. * feat: dynamic MCP instructions + tool annotations The MCP server now generates instructions at startup from actual index state and injects them into the initialize response. LLMs see collection names, document counts, content descriptions, and search strategy guidance in their system prompt — zero tool calls needed for orientation. Previously, the only guidance was generic static tool descriptions and a user-invocable "query" prompt that no LLM would discover on its own. An LLM connecting to QMD had no idea what collections existed, what they contained, or how to scope searches effectively. * change default port to 8181 * fix: BM25 score normalization was inverted The normalization formula `1 / (1 + |bm25|)` is a decreasing function of match strength.
FTS5 BM25 scores are negative where more negative = better match (e.g., -10 is strong, -0.5 is weak). The formula mapped:
strong match (raw -10) → 1/(1+10) = 9% ← should be highest
weak match (raw -0.5) → 1/(1+0.5) = 67% ← should be lowest
Three downstream effects: 1. `--min-score 0.5` (or MCP minScore: 0.5) filtered OUT strong matches and kept only weak ones. The MCP instructions recommend this threshold. 2. CLI `formatScore()` color bands never showed green for BM25 results (best matches scored ~9%, green threshold is 70%). 3. The strong signal optimization in hybridQuery (skip ~2s LLM expansion when BM25 already has a clear winner) was dead code — strong matches scored ~0.09, never reaching the 0.85 threshold. Fix: `|x| / (1 + |x|)` — same (0,1) range, monotonic, no per-query normalization needed, but now correctly maps strong → high, weak → low. The normalization was born broken (Math.max(0, x) clamped all negative BM25 to 0 → every score = 1.0), then PR #76 changed to Math.abs which made scores vary but inverted the direction. Neither state was ever correct. * fix: rerank cache key ignores chunk content The rerank cache key was (query, file, model) but the actual text sent to the reranker is a keyword-selected chunk that varies by query terms. Two different queries hitting the same file can select different chunks, but the second query gets a stale cached score from the first chunk. Example:
Query "auth flow" → selects chunk about authentication → score 0.92
Query "auth tokens" → same file, selects chunk about tokens → cache HIT on (query, file, model) → returns 0.92 from wrong chunk
Fix: include full chunk text in cache key. getCacheKey() already SHA-256 hashes its inputs, so this adds no key bloat — just disambiguation. Old cache entries become natural misses (different key shape) and re-warm on next query. * rename MCP tools for clarity, rewrite descriptions for LLM tool selection Rename MCP tools: vsearch → vector_search, query → deep_search.
LLMs see these names — self-documenting names reduce reliance on descriptions for tool selection. CLI commands stay unchanged (qmd vsearch, qmd query) — different namespace, users type those. Rewrite all search tool descriptions to be action-oriented: - search: "Search by keyword. Finds documents containing exact words and phrases in the query." - vector_search: "Search by meaning. Finds relevant documents even when they use different words than the query — handles synonyms, paraphrases, and related concepts." - deep_search: "Deep search. Auto-expands the query into variations, searches each by keyword and meaning, and reranks for top hits across all results." Rewrite instructions ladder — each tool says what it does, no "start here" / "escalate as needed" strategy language. Delete the "query" prompt (registerPrompt) — it restated what descriptions + instructions already cover. No LLM proactively calls prompts/get to learn how to use tools. * suppress HTTP server logs during tests	2026-02-10 16:37:33 -05:00
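The BM25 normalization fix in 785bbcf319 is easy to check numerically. The sketch below places the two formulas from the commit message side by side; the function names `normalizeInverted` and `normalizeFixed` are hypothetical and not taken from the qmd source.

```typescript
// Hypothetical helper names; only the two formulas come from the commit message.

// Formula the commit describes as inverted: 1 / (1 + |bm25|).
function normalizeInverted(rawBm25: number): number {
  return 1 / (1 + Math.abs(rawBm25));
}

// Fixed formula: |x| / (1 + |x|).
function normalizeFixed(rawBm25: number): number {
  const magnitude = Math.abs(rawBm25);
  return magnitude / (1 + magnitude);
}

// FTS5 BM25 scores are negative; more negative means a stronger match.
const strongRaw = -10;
const weakRaw = -0.5;

const invertedStrong = normalizeInverted(strongRaw); // 1/11, about 0.09
const invertedWeak = normalizeInverted(weakRaw);     // 1/1.5, about 0.67
const fixedStrong = normalizeFixed(strongRaw);       // 10/11, about 0.91
const fixedWeak = normalizeFixed(weakRaw);           // 0.5/1.5, about 0.33
```

For any input the two results sum to 1, which is why the old formula ranked matches in exactly the reverse order. With the fix, a raw score of -10 clears the 0.85 strong-signal threshold mentioned in the commit.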
Matt Galligan	63028fd5e9	feat: add Claude Code plugin support with inline status check (#99) - Add marketplace.json for Claude Code plugin installation - Simplify skill status check to inline `qmd status` (portable across agents) - Update SKILL.md MCP section, reference mcp-setup.md for manual config - Clean up mcp-setup.md (remove redundant prerequisites) - Rename MCP-SETUP.md to mcp-setup.md Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 14:14:24 -05:00
Algimantas Krasauskas	f6a987a642	Add skills.sh integration for AI agent discovery (#64)	2026-01-29 18:27:50 -08:00

18 Commits