Add _ciMode flag to LlamaCpp that throws immediately on embedBatch,
generate, expandQuery, and rerank when CI=true — prevents silent 30s
timeouts. Skip MCP HTTP Transport tests in CI (they instantiate a real
LlamaCpp). Bump vitest/bun test timeouts to 60s for slower CI runners.
- Bump better-sqlite3 from ^11 to ^12.4.5 for Node 25 support (prebuilds
+ V8 API compat). Closes#257.
- Add bin/qmd shell wrapper that detects bun vs node install and execs
with the matching runtime, preventing native module ABI mismatches
when installed via bun. Closes#319.
Move frontends into src/cli/ and src/mcp/ to separate them from the
core library. The MCP server is fully rewritten to import only from
the SDK (src/index.ts) — zero direct store.ts/collections.ts/llm.ts
access.
- src/qmd.ts → src/cli/qmd.ts
- src/formatter.ts → src/cli/formatter.ts
- src/mcp.ts → src/mcp/server.ts (rewritten to use QMDStore SDK)
- New src/maintenance.ts: Maintenance class for CLI housekeeping
- SDK gains: getDocumentBody(), getDefaultCollectionNames(),
extractSnippet/addLineNumbers/DEFAULT_MULTI_GET_MAX_BYTES exports,
getDefaultDbPath re-export, InternalStore type export
- package.json bin/scripts updated for new paths
- All 692 tests pass
Replace three separate search methods (query, search, structuredSearch)
with a single search(options) that accepts either a query string
(auto-expanded) or pre-expanded queries. Add searchLex/searchVector
convenience methods and expandQuery for manual control.
Unify StructuredSubSearch and ExpandedQuery into a single ExpandedQuery
type with { type, query } used throughout the pipeline. Add skipRerank
option to hybridQuery and structuredSearch for fast no-LLM searches.
New SDK surface:
- search({ query, intent, rerank, limit, ... })
- search({ queries: expanded })
- searchLex(query, opts)
- searchVector(query, opts)
- expandQuery(query, { intent })
HuggingFace filenames are case-sensitive. The documented filename
'qwen3-embedding-0.6b-q8_0.gguf' (lowercase) returns 404. The correct
filename is 'Qwen3-Embedding-0.6B-Q8_0.gguf' (original case from the
HuggingFace repo).
Co-Authored-By: Oz <oz-agent@warp.dev>
Allow QMD to be used as a library (`import { createStore } from '@tobilu/qmd'`)
in addition to CLI and MCP modes. The constructor requires explicit dbPath and
either a configPath (YAML file) or inline config object — no defaults assumed,
making it safe to embed in any application.
- Add src/index.ts entry point with QMDStore interface exposing search,
retrieval, collection/context management, and index health
- Add setConfigSource() to collections.ts for inline config support
(in-memory config with no file I/O)
- Add main/types/exports fields to package.json
- Add SDK documentation section to README
- Add 56 unit tests covering constructor, collections, contexts, search,
document retrieval, config isolation, YAML persistence, and lifecycle
Add optional `intent` parameter that steers query expansion, reranking,
chunk selection, and snippet extraction without searching on its own.
When a query like "performance" is ambiguous (web-perf vs team health vs
fitness), intent provides background context that disambiguates results
across all pipeline stages:
- expandQuery: includes intent in LLM prompt ("Query intent: {intent}")
- rerank: prepends intent to rerank query for Qwen3-Reranker
- chunk selection: intent terms scored at 0.5x weight vs query terms
- snippet extraction: intent terms scored at 0.3x weight
- strong-signal bypass: disabled when intent provided
Available via CLI (--intent flag or intent: line in query documents),
MCP (intent field on query tool), and programmatic API.
Adapted from PR #180 (thanks @vyalamar).
- Cap rerank contexts at 4 to avoid VRAM exhaustion on high-core machines
- Deduplicate identical chunk texts before sending to reranker
- Cache rerank scores by chunk content instead of file path — same text
from different files now shares a single reranker call
- Add truncation cache to avoid re-tokenizing duplicate documents
Convert emoji codepoints to hex representation (e.g. 🐘 → 1f418) instead
of crashing, so files like 🐘.md can be indexed without halting the
entire update process.
Fixes#302