Commit Graph

41 Commits

Author SHA1 Message Date
Haitao Pan
47bd3ded44 feat(pg): add switchable PostgreSQL backend + OpenClaw/Hermes memory bridge
Add an optional PostgreSQL backend (QMD_BACKEND=pg) alongside the
unchanged default SQLite path. PG store uses pgvector (HNSW) for vectors
and pg_jieba + pg_trgm for full-text/Chinese tokenization, with a
namespace column isolating multi-agent memory (openclaw/hermes).

- src/pg/: config, db-pg, schema bootstrap, memory store
- MCP memory_add/memory_search/memory_get tools; qmd pg status + memory CLI
- connection via QMD_PG_URL/DATABASE_URL/qmd config, stunnel TLS 5443
- tests: pg-config (unit) + pg-memory integration (gated on QMD_PG_URL) + pg-compose
- docs/plan: plan, usage, test report, changelog; track docs/**/*.md

SQLite path: zero regression (typecheck clean, 249 passed / 6 skipped).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 19:13:04 +08:00
Tobi Lutke
4383accf0f
release: v2.1.0 2026-04-05 18:18:41 -04:00
Tobi Lutke
9c0d100a09
chore: pin all dependencies to exact versions
Remove ^ ranges from all dependencies, optionalDependencies, and
devDependencies. Lockfile ensures reproducible installs.
2026-04-05 18:17:02 -04:00
Tobi Lutke
cc32c9958d
fix: approve native build scripts and update vitest
Add pnpm.onlyBuiltDependencies to whitelist packages that need
install/postinstall scripts (better-sqlite3, esbuild, node-llama-cpp,
tree-sitter-*). Without this, pnpm silently skips native compilation
causing all tests that touch SQLite to fail.

Also bumps vitest from ^3.0.0 to ^3.2.4.
2026-04-05 18:13:31 -04:00
James Risberg
244ddf5ecb feat: AST-aware chunking for code files via tree-sitter
Add opt-in AST-aware chunk boundary detection for code files using
web-tree-sitter. When enabled with `--chunk-strategy auto`, code files
(.ts, .tsx, .js, .jsx, .py, .go, .rs) are chunked at function, class,
and import boundaries instead of arbitrary text positions. Default
behavior (`regex`) is unchanged — no surprises on upgrade.

In testing on QMD's own codebase, AST mode split 42% fewer function
bodies across chunk boundaries compared to regex-only chunking.

Usage:
  qmd embed --chunk-strategy auto
  qmd query "search terms" --chunk-strategy auto

What's included:
- Language detection from file extension with support for TypeScript,
  JavaScript (including arrow functions and function expressions),
  Python, Go, and Rust
- Per-language tree-sitter queries with scored break points aligned to
  the existing markdown scale (class=100, function=90, type=80, import=60)
- AST break points merged with regex break points — highest score wins
  at each position, so embedded markdown (comments, docstrings) still
  benefits from regex patterns
- Refactored chunking core: chunkDocumentWithBreakPoints() extracted,
  mergeBreakPoints() added, async chunkDocumentAsync() wrapper for AST
- ChunkStrategy type ("auto" | "regex") threaded through
  generateEmbeddings(), hybridQuery(), structuredSearch(), CLI, and SDK
- getASTStatus() health check wired into `qmd status`
- Parse failures log a warning and fall back to regex — never crash

Hardening:
- Grammar packages are optionalDependencies with pinned versions to
  prevent ABI breaks from semver drift
- web-tree-sitter is a direct dependency (pinned)
- Errors are logged (not silently swallowed) for debuggability
- Tested on both Node.js and Bun (Bun is actually faster)

Testing:
- 26 unit tests (test/ast.test.ts) — all 4 languages, error handling
- 7 integration tests (test/store.test.ts) — merge, equivalence, bypass
- Standalone test-ast-chunking.mjs with 63 synthetic tests and a
  real-collection performance scanner (npx tsx test-ast-chunking.mjs ~/code)
- Validated end-to-end with qmd embed + qmd query on QMD's own codebase
- Zero markdown regressions across all test paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 01:22:39 -04:00
Ryan Malia
e988c5d286 fix: pin zod to exact 4.2.1 to fix tsc build failure (#379)
The caret range ^4.2.1 allows npm to resolve zod 4.3.x, which has
breaking type changes against @modelcontextprotocol/sdk. Source builds
fail with TypeScript errors. Pinning to exact 4.2.1 resolves this.

See: https://github.com/tobi/qmd/issues/379
2026-03-11 21:43:09 -07:00
Tobi Lutke
21a5dcc853
release: v2.0.1 2026-03-10 20:59:27 -04:00
Tobi Lutke
8478ddb666
release: v2.0.0 2026-03-10 11:53:25 -04:00
Tobi Lutke
b252219add
fix(deps): bump better-sqlite3 to ^12.4.5, add runtime-aware bin wrapper
- Bump better-sqlite3 from ^11 to ^12.4.5 for Node 25 support (prebuilds
  + V8 API compat). Closes #257.
- Add bin/qmd shell wrapper that detects bun vs node install and execs
  with the matching runtime, preventing native module ABI mismatches
  when installed via bun. Closes #319.
2026-03-10 11:43:00 -04:00
Tobi Lutke
c68904fe08
refactor: move CLI and MCP to subdirectories, MCP consumes SDK
Move frontends into src/cli/ and src/mcp/ to separate them from the
core library. The MCP server is fully rewritten to import only from
the SDK (src/index.ts) — zero direct store.ts/collections.ts/llm.ts
access.

- src/qmd.ts → src/cli/qmd.ts
- src/formatter.ts → src/cli/formatter.ts
- src/mcp.ts → src/mcp/server.ts (rewritten to use QMDStore SDK)
- New src/maintenance.ts: Maintenance class for CLI housekeeping
- SDK gains: getDocumentBody(), getDefaultCollectionNames(),
  extractSnippet/addLineNumbers/DEFAULT_MULTI_GET_MAX_BYTES exports,
  getDefaultDbPath re-export, InternalStore type export
- package.json bin/scripts updated for new paths
- All 692 tests pass
2026-03-10 11:39:55 -04:00
Tobi Lutke
032f26edca
release: v1.1.6 2026-03-09 17:23:14 -04:00
Tobi Lutke
040c6fa904
feat: add SDK/library mode for programmatic access
Allow QMD to be used as a library (`import { createStore } from '@tobilu/qmd'`)
in addition to CLI and MCP modes. The constructor requires explicit dbPath and
either a configPath (YAML file) or inline config object — no defaults assumed,
making it safe to embed in any application.

- Add src/index.ts entry point with QMDStore interface exposing search,
  retrieval, collection/context management, and index health
- Add setConfigSource() to collections.ts for inline config support
  (in-memory config with no file I/O)
- Add main/types/exports fields to package.json
- Add SDK documentation section to README
- Add 56 unit tests covering constructor, collections, contexts, search,
  document retrieval, config isolation, YAML persistence, and lifecycle
2026-03-08 15:59:22 -04:00
Tobi Lutke
da9cf691fd
release: v1.1.5 2026-03-07 20:16:32 -04:00
Tobi Lutke
b838f74c8c
release: v1.1.2 2026-03-07 15:58:28 -04:00
Tobias Lütke
cb5d84ff07
Merge pull request #225 from ilepn/fix/sqlite-vec-windows-package-name
fix(package.json): correct Windows sqlite-vec package name + add linux-arm64
2026-03-07 14:28:55 -04:00
Tobias Lütke
f75c668e46
Merge pull request #310 from giladgd/nodeLlamaCppUseBuildAutoAttempt
feat: use `build: "autoAttempt"` on `getLlama`
2026-03-07 14:24:52 -04:00
Tobi Lutke
2ae1baba2f
release: v1.1.1 2026-03-06 14:48:47 -04:00
Gilad S.
3095041e0f feat: use build: "autoAttempt" on getLlama 2026-03-06 07:02:50 +00:00
Tobi Lutke
0e0feb6f2b
release: v1.1.0 2026-02-22 11:09:36 -04:00
Tobi Lutke
0b57711d32
refactor: replace bash wrapper with standard #!/usr/bin/env node shebang
The qmd bin was a custom bash script that discovered node via hardcoded
fallback paths (mise, asdf, nvm, homebrew). This was nonstandard and
caused ABI mismatches when installed via bun (native modules compiled
for bun but executed with node).

Now uses the standard npm bin convention: dist/qmd.js with a node
shebang, added by the build script. The isMain guard resolves symlinks
so it works when npm/bun create symlinked bin entries.

Also converts all dynamic require() calls in tests to ESM imports, and
adds container-based smoke tests (test/smoke-install.sh) that verify
install + run under both node and bun via mise in a Debian container.
2026-02-22 11:09:36 -04:00
Tobi Lütke
6ac7c6837e
chore: release 1.0.8 2026-02-19 06:56:17 -05:00
ilepn
3f946f7cb3 fix(package.json): correct Windows sqlite-vec package name + add linux-arm64
The optionalDependencies entry `sqlite-vec-win32-x64` does not exist on npm
(404). The correct package name is `sqlite-vec-windows-x64` — the sqlite-vec
project maps Node's `process.platform` value `win32` to `windows` in its
`platformPackageName()` function.

Also adds the missing `sqlite-vec-linux-arm64` entry to match all 5 platforms
declared in sqlite-vec's own optionalDependencies.

Fixes #224

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:18:00 +02:00
Tobi Lutke
0697bc59d6
release: v1.0.7 2026-02-18 15:36:54 -04:00
Tobi Lutke
51c03d9445
release: v1.0.6 2026-02-16 09:08:34 -04:00
Tobi Lutke
6d399bc50a
release: v1.0.5 2026-02-16 08:47:23 -04:00
Tobi Lutke
09803a75b7
feat: compile to JS for npm, release system, full changelog
- Add tsc build step (tsconfig.build.json) so npm package ships
  compiled JS instead of raw TypeScript requiring tsx at runtime
- Update qmd wrapper and daemon spawn to use dist/qmd.js in
  production while keeping tsx for development
- Add self-installing pre-push hook validating v* tag pushes:
  package.json version match, changelog entry, CI status
- Add release.sh script that renames [Unreleased] to versioned
  entry, bumps package.json, commits, and tags
- Add extract-changelog.sh for cumulative GitHub release notes
- Update publish workflow with build step and GitHub release creation
- Flesh out CHANGELOG.md with full history from 0.1.0 through 1.0.0
  in Keep-a-Changelog format with PR/contributor attributions
- Add release standards and changelog guidelines to CLAUDE.md
2026-02-16 08:42:32 -04:00
Tobi Lutke
870d3aed3b
test: move all tests to flat test/ directory
No more src/models/ and src/integration/ subfolders to forget about.
All 9 test files live in test/, one command runs everything:

  npx vitest run test/
  bun test test/
2026-02-15 21:37:47 -04:00
Tobi Lutke
dc64166a2a
release: v1.0.0
Node.js compatibility, parallel embedding/reranking, flash attention,
GPU auto-detection, and restructured test suite.
2026-02-15 17:02:00 -04:00
Tobi Lutke
4df5505bd6
Merge origin/nodejs: Node.js compat, perf improvements, vitest
Brings in Node.js compatibility (tsx, vitest), GPU auto-detection,
parallel embedding/reranking contexts, and flash attention support.
Preserves @tobilu/qmd package scope and publish config from main.
2026-02-15 16:52:30 -04:00
Tobi Lutke
13e8473455
docs: update node usage and bump version
Update README installation and quick-start commands to Node examples.
- replace bun install/link commands with npm-based Node workflow
- bump package version to 0.9.9 for CLI and MCP metadata
- keep Bun guidance as optional development/runtime note
2026-02-15 16:44:47 -04:00
Tobi Lutke
00ff084fd9
chore: fix bin path, add author, use token-based npm publish 2026-02-15 15:14:45 -04:00
Tobi Lutke
5d73752b47
chore: rename package scope to @tobilu/qmd 2026-02-15 15:07:26 -04:00
Tobi Lutke
2279389415
chore: set up npm publishing as @tobi/qmd v0.9.0
- Scope package to @tobi/qmd, version 0.9.0
- Add files whitelist, publishConfig, repo metadata
- Add CI workflow (bun tests on ubuntu + macos, bun latest + 1.1.0)
- Add publish workflow (triggers on v* tags, publishes to npm)
- Add release script for version bumping + changelog generation
- Add LICENSE (MIT) and initial CHANGELOG.md
- Update install instructions to use @tobi/qmd
2026-02-15 14:31:23 -04:00
Tobi Lutke
537d15a9e6
fix: proper cleanup of Metal GPU resources in tests
Add test-preload.ts with global afterAll hook that ensures llama.cpp
Metal resources are properly disposed before process exit, avoiding
GGML_ASSERT failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 14:20:09 -04:00
Tobi Lutke
c85889df12
fixes 2025-12-21 14:50:17 -04:00
Tobi Lutke
d383b5c226
Migrate to node-llama-cpp and add structured query expansion
- Replace Ollama HTTP API with node-llama-cpp for local GGUF models
- Add structured query expansion using JSON schema grammar:
  - Generates lexical query (for BM25), vector query, and HyDE
  - Tree-style CLI output showing query types
- Fix vector search: use cosine distance instead of L2
- Format queries with embeddinggemma nomic-style prompts
- Rename ollama_cache table to llm_cache
- Add disposeDefaultLlamaCpp() for clean process exit

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 18:03:41 -04:00
Tobi Lutke
691c56d051
Add YAML-based collections configuration system
- Create src/collections.ts module for managing collections in YAML
- Collections defined in ~/.config/qmd/index.yml instead of SQLite
- Support for nested contexts at any path level
- Global context applies to all collections
- Functions: load/save config, add/remove/rename collections
- Context management: add, remove, find best match for path
- Add yaml package dependency
- Include example-index.yml showing the clean YAML format

This is the foundation for removing collections and path_contexts
tables from SQLite, moving all configuration to YAML.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-13 09:56:56 -05:00
Tobi Lutke
529e989d83
Refactor: Move TypeScript source files to src/ directory
Move all .ts files to src/ to clean up the project root:
- Created src/ directory and moved all TypeScript source and test files
- Updated qmd shell wrapper to point to src/qmd.ts
- Updated package.json scripts to use src/ paths
- Updated documentation (CLAUDE.md, README.md) to reflect new structure
- All imports remain relative within src/, no changes needed
- Tests pass with same results (192 pass, 75 fail - existing issues)

This improves project organization and makes the root directory cleaner.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-12 17:12:09 -05:00
Tobi Lutke
bab46dacb2
Refactor: extract store, LLM, and formatter modules with comprehensive tests
- Extract store.ts: database operations, search, document retrieval
  - createStore() factory pattern for clean DB lifecycle management
  - Unified DocumentResult type with optional body loading
  - Snippet extraction with diff-style headers (@@ -line,count @@)

- Extract llm.ts: LLM abstraction layer with Ollama implementation
  - Clean interface for embed, generate, rerank operations
  - High-level rerankerLogprobsCheck with logprob-based scoring
  - Query expansion support

- Extract formatter.ts: output formatting utilities
  - Support for CLI, JSON, CSV, MD, XML formats
  - MCP-specific CSV formatting

- Extract mcp.ts: MCP server using createStore() pattern
  - Single DB connection for server lifetime (fixes closed DB errors)
  - URL-decode resource paths for proper space/special char handling

- Add comprehensive test suites (215 tests total)
  - store.test.ts: 96 tests covering all store operations
  - llm.test.ts: 60 tests for LLM abstraction
  - mcp.test.ts: 59 tests for MCP endpoints and resources
  - All tests use mocked Ollama (errors on unmocked calls)

- Add bun run inspector script for MCP debugging

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 16:33:32 -05:00
Tobi Lutke
25ac53848f
Add MCP server for AI agent integration
- Add `qmd mcp` command to start stdio-based MCP server
- Expose tools: qmd_search, qmd_vsearch, qmd_query, qmd_get, qmd_status
- Add index health warnings for unembedded docs and stale indexes
- Return CSV format with text/csv mime type for search results
- Add MCP documentation and configuration examples to README
- Add @modelcontextprotocol/sdk and zod dependencies

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 14:59:56 -05:00
Tobi Lutke
39193ea252
Initial commit: QMD - Quick Markdown Search
A CLI tool for searching markdown knowledge bases using hybrid retrieval:
- BM25 full-text search via SQLite FTS5
- Vector semantic search via sqlite-vec + Ollama embeddings
- LLM re-ranking with qwen3-reranker (logprobs-based scoring)
- Reciprocal Rank Fusion with weighted queries and position-aware blending

Features:
- `qmd add .` - Index markdown files in current directory
- `qmd embed` - Generate vector embeddings
- `qmd search` - BM25 full-text search
- `qmd vsearch` - Vector similarity search
- `qmd query` - Hybrid search with query expansion + reranking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 19:16:16 -05:00