The `qmd get` command was documented to support docid lookups
(e.g., `qmd get "#abc123"` or `qmd get abc123`), but the
implementation in getDocument() never actually handled docids.
The findDocumentByDocid() function existed in store.ts and worked
correctly, but getDocument() in qmd.ts reimplemented document
lookup without calling it.
This adds docid detection at the start of getDocument() to resolve
docids to virtual paths before other path handling.
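
The detection step described above might look roughly like the following. This is a minimal sketch, not the code in qmd.ts: `parseDocid` and `resolveInput` are hypothetical names, the `lookup` callback stands in for findDocumentByDocid() (whose real signature is not shown in this message), and the pattern assumes a docid is six hexadecimal characters.

```typescript
// Assumption: a docid is six hex characters, optionally prefixed with "#".
const DOCID_PATTERN = /^#?([0-9a-f]{6})$/i;

// Returns the normalized docid, or null when the input is not docid-shaped.
function parseDocid(input: string): string | null {
  const match = DOCID_PATTERN.exec(input.trim());
  if (match === null) return null;
  return (match[1] ?? "").toLowerCase();
}

// Resolve docid-shaped input to a virtual path before any other path
// handling; anything else (or an unknown docid) passes through unchanged.
function resolveInput(
  input: string,
  lookup: (docid: string) => string | undefined,
): string {
  const docid = parseDocid(input);
  if (docid === null) return input;
  return lookup(docid) ?? input;
}
```

Falling back to the original input when the lookup misses keeps a file that happens to be named like a docid reachable through the normal path handling.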
Co-authored-by: Joshua Mitchell <jlelonmitchell@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The code uses Qwen3-1.7B (~2.2GB) for query expansion, but the README
documented Qwen3-0.6B (~640MB) in three places:
- Model requirements table
- Architecture diagram
- Code configuration sample
This caused confusion when users saw a 2GB+ download instead of 640MB.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, the "Run 'qmd embed' to update embeddings" message was
printed after each collection was indexed, repeating the same global
count multiple times. Now it's shown once at the end of the update.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add detailed comments explaining why two-step query is necessary
- Add regression test for sqlite-vec JOIN hang bug
- Link to PR in comments for future reference
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
sqlite-vec virtual tables don't work correctly with JOINs in the same
query - they cause the query to hang indefinitely.
Changes:
- searchVec: Rewrite to use two-step approach
  1. Query vectors_vec table alone (no JOINs)
  2. Look up document info separately using result hash_seqs
- vsearch: Change from Promise.all to sequential for loop
  (node-llama-cpp embedding context doesn't handle concurrent calls)
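
The two-step searchVec approach can be sketched as below. Only `vectors_vec` and `hash_seq` come from this message; the `embedding` column, the `chunk_docs` table, the `path` column, and all function names are placeholders chosen for illustration, and the database calls themselves are left out so the sketch stays self-contained.

```typescript
interface KnnRow { hash_seq: string; distance: number }
interface DocRow { hash_seq: string; path: string }

// Step 1: KNN query against the virtual table alone (no JOINs).
// The column name `embedding` is an assumption.
const KNN_SQL =
  "SELECT hash_seq, distance FROM vectors_vec WHERE embedding MATCH ? AND k = ?";

// Step 2: an ordinary lookup keyed by the hash_seq values from step 1.
// `chunk_docs` and `path` are hypothetical names. Callers should skip
// this step entirely when step 1 returned no rows.
function buildLookupSql(hashSeqs: readonly string[]): { sql: string; params: string[] } {
  const placeholders = hashSeqs.map(() => "?").join(", ");
  return {
    sql: `SELECT hash_seq, path FROM chunk_docs WHERE hash_seq IN (${placeholders})`,
    params: hashSeqs.slice(),
  };
}

// Re-attach distances in KNN order, dropping rows with no matching document.
function mergeResults(
  knn: readonly KnnRow[],
  docs: readonly DocRow[],
): Array<DocRow & { distance: number }> {
  const byKey = new Map<string, DocRow>();
  for (const doc of docs) byKey.set(doc.hash_seq, doc);
  const merged: Array<DocRow & { distance: number }> = [];
  for (const row of knn) {
    const doc = byKey.get(row.hash_seq);
    if (doc !== undefined) merged.push({ ...doc, distance: row.distance });
  }
  return merged;
}
```

Merging in application code costs one extra query but keeps the virtual table out of any JOIN.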
This fixes vsearch and hybrid query commands that were hanging at
"Searching N vector queries..."
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Empty files have nothing useful to index or embed. Previously they would
be indexed with an empty body, causing confusing "1 need embedding" status
messages that could never be resolved.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Issue #11: Collection filter (-c) SQL error
- Fixed searchVec to properly parameterize collection filter
- Changed collectionId check from !== undefined to truthy
- Added test for searchVec with collection filter
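
A minimal sketch of the parameterized, truthy-checked filter, under assumptions: `collectionFilter` is a hypothetical helper and `d.collection` is a placeholder column name, not necessarily what searchVec uses.

```typescript
interface SqlFragment { clause: string; params: string[] }

// A truthy check treats both undefined and "" as "no filter", so an empty
// string never produces a dangling or malformed WHERE fragment.
function collectionFilter(collectionId?: string): SqlFragment {
  if (!collectionId) return { clause: "", params: [] };
  // The value is bound as a parameter and never interpolated into the SQL.
  return { clause: " AND d.collection = ?", params: [collectionId] };
}
```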
Issue #10: Non-ASCII filename support
- Updated handelize() to use Unicode property escapes (\p{L}\p{N})
- Now supports Cyrillic, Japanese, and other Unicode filenames
- Updated tests to verify Unicode filename handling
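
For illustration, a Unicode-aware slug function along these lines could look like the sketch below. It is not the real handelize(): the name `handelizeSketch` is hypothetical, and the exact rules (trimming dashes, lowercasing the extension, mapping `___` to a folder separator) are assumptions pieced together from descriptions elsewhere in this log.

```typescript
// Keep letters and digits from any script; collapse everything else to "-".
function handelizeSketch(path: string): string {
  return path
    .replace(/___/g, "/") // triple underscore becomes a folder separator
    .split("/")
    .map((segment, i, all) => {
      const isLast = i === all.length - 1;
      // Only the final segment can carry a file extension.
      const dot = isLast ? segment.lastIndexOf(".") : -1;
      const stem = dot > 0 ? segment.slice(0, dot) : segment;
      const ext = dot > 0 ? segment.slice(dot).toLowerCase() : "";
      const slug = stem
        .toLowerCase()
        .replace(/[^\p{L}\p{N}]+/gu, "-") // Unicode property escapes need the "u" flag
        .replace(/^-+|-+$/g, "");
      return slug + ext;
    })
    .filter((segment) => segment.length > 0)
    .join("/");
}
```

An ASCII-only class such as `[^a-z0-9]` would reduce a Cyrillic or Japanese filename to nothing but dashes, which is the failure this change addresses.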
Also:
- Fixed expandQuery to filter out lex entries when includeLexical=false
- Updated expandQuery tests to match actual behavior
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix SQL syntax error when collectionId is empty string (searchFTS, searchVec)
- Add 1-second timeout to llama.dispose() to prevent indefinite hang
- Add process.exit(0) after cleanup for clean CLI exit
- Include hash/docid in search results mapping
- Update query expansion to use structured Queryable types
- Switch to Qwen3-1.7B model for better query expansion
- Improve bun discovery in qmd wrapper script
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of calling countTokens() multiple times during binary search
for chunk boundaries, tokenize the document once upfront and slice
token arrays. This reduces tokenizer calls from O(chunks × iterations)
to O(1) per document.
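
The idea can be sketched as follows, assuming the tokenizer has already produced a token-id array for the whole document. `chunkTokens` and its `overlap` parameter are hypothetical; the real chunker may pick boundaries differently (for example at paragraph breaks).

```typescript
// Slice one pre-tokenized document into chunks of at most maxTokens tokens.
// No tokenizer calls happen here: chunk boundaries are plain array indices.
function chunkTokens(
  tokens: readonly number[],
  maxTokens: number,
  overlap = 0,
): number[][] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new RangeError("require maxTokens > 0 and 0 <= overlap < maxTokens");
  }
  const chunks: number[][] = [];
  const step = maxTokens - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    // Stop once a chunk has reached the end of the document.
    if (start + maxTokens >= tokens.length) break;
  }
  return chunks;
}
```

Each chunk can then be detokenized once if the text is needed, instead of re-counting tokens for every candidate boundary.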
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous implementation spawned a subprocess for every file during
indexing (e.g., 4500 subprocess spawns for a large collection). This
caused resource exhaustion and random hangs. Using Node's native
realpathSync is orders of magnitude faster.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The installPhase was trying to copy qmd.ts from root, but it's at
src/qmd.ts. Updated to copy the src directory and point the wrapper
to the correct path.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix ReferenceError in vectorIndex(): firstResult was used but never
  defined. Added code to embed first chunk to get embedding dimensions.
- Fix 87 TypeScript errors across codebase:
  - formatter.ts: Define MultiGetFile type locally (was missing from store.ts)
  - collections.ts: Add non-null assertion for array access
  - mcp.ts: Fix StatusResult type to match store.ts CollectionInfo,
    add list parameter to ResourceTemplate, fix undefined checks
  - qmd.ts: Fix boolean/string type coercions, undefined array access
  - llm.test.ts: Update expandQuery tests for Queryable[] return type,
    fix array access assertions
  - store.test.ts: Add non-null assertions for array access in tests
  - eval-harness.ts: Fix array access assertion
Let node-llama-cpp handle context size and sequences automatically.
The mutex still serializes generation calls for safety.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reduce context sequences from 4 to 1 to minimize VRAM usage
  when multiple models (embed, generate, rerank) are loaded
- Add mutex to serialize generation calls to prevent "No sequences left"
  error when concurrent requests occur with single sequence
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dispose() now just calls llama.dispose() which cascades to models/contexts
  per node-llama-cpp lifecycle docs
- Remove disposeDefaultLlamaCpp calls from tests - they don't help with
  the Metal cleanup crash
- Use singleton getDefaultLlamaCpp() in llm tests for consistency
The Metal backend crash at process exit is a known llama.cpp issue:
https://github.com/ggml-org/llama.cpp/pull/17869
All tests pass - the abort happens after test completion.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Metal backend crash happens regardless of whether we dispose or not.
It's a known llama.cpp issue during process exit static destructor cleanup:
https://github.com/ggml-org/llama.cpp/pull/17869
All 297 tests pass - the abort happens after tests complete.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Metal crash was caused by not disposing resources in the right order
at the right time. The fix:
1. Restore proper dispose() that disposes contexts → models → llama in order
2. Move disposeDefaultLlamaCpp() to global afterAll (after all tests complete)
3. Keep disposed flag to prevent double-dispose
The issue was that disposing per-suite broke tests that share llama,
and not disposing at all left orphaned Metal resources at process exit.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 6 "fusion" queries designed to test cases where neither BM25 nor
vector search alone succeeds, but combining them with RRF does:
- "how much runway before running out of money" → fundraising
- "datacenter replication sync strategy" → distributed-systems
- "splitting data for training and testing" → machine-learning
- "JSON response codes error messages" → api-design
- "video calls camera async messaging" → remote-work
- "CI/CD pipeline testing coverage" → product-launch
The fusion test verifies:
1. Hybrid achieves ≥50% Hit@3 on these multi-signal queries
2. Hybrid outperforms or matches the best individual method
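
For reference, reciprocal rank fusion and a Hit@k check can be sketched as below. This is the textbook RRF formula with the conventional constant k = 60; the constant and any weighting used in this project are not stated in this message, and both function names are hypothetical.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// so a document ranked moderately well by both methods can beat one that
// only a single method ranks first.
function reciprocalRankFusion(
  rankings: readonly (readonly string[])[],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hit@k: the fraction of queries whose expected document is in the top k.
function hitRateAtK(
  cases: readonly { expected: string; ranked: readonly string[] }[],
  k: number,
): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) => c.ranked.slice(0, k).indexOf(c.expected) !== -1).length;
  return hits / cases.length;
}
```

In the usage below, document "b" is second in both lists and still comes out on top after fusion:

```typescript
const fused = reciprocalRankFusion([["a", "b", "c"], ["d", "b", "e"]]);
console.log(fused[0]?.id); // "b"
```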
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add disposed flag to prevent double-dispose
- Don't explicitly dispose llama resources in dispose() - just clear refs
- Let process exit handle Metal cleanup naturally
- Remove disposeDefaultLlamaCpp call from eval tests
Note: llama.cpp Metal backend still crashes at process exit due to
ggml-metal cleanup issues. This is a known upstream issue:
https://github.com/ggml-org/llama.cpp/pull/17869
All tests pass (12/12); the abort happens after test completion.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Pass model and embeddedAt parameters to insertEmbedding
- Convert embedding to Float32Array for sqlite-vec compatibility
- Make hybrid search thresholds conditional on vector availability
(falls back to BM25-only thresholds when no embeddings exist)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three test suites with different thresholds:
- BM25: easy≥80%, medium≥15%, hard≥15%, overall≥40%
- Vector: easy≥60%, medium≥40%, hard≥30%, overall≥50%
- Hybrid (RRF): easy≥80%, medium≥50%, hard≥35%, overall≥60%
Hybrid should outperform individual methods on semantic queries.
Vector/hybrid tests have 60-120s timeouts for embedding generation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add enableProductionMode() that qmd.ts calls at startup
- getDefaultDbPath() throws in test mode unless INDEX_PATH is set
- Update store.test.ts to expect throws for default path tests
- Add eval.test.ts with 18 BM25 quality tests (easy/medium/hard)
- Tests now cannot accidentally write to ~/.cache/qmd/
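
The guard can be pictured roughly as below. This is a sketch under assumptions: the `Sketch` suffixes mark these as stand-ins for the real functions, the environment is passed in explicitly to keep the example self-contained, and the `index.sqlite` file name is a placeholder.

```typescript
// Off by default, so any code path that forgets to opt in behaves like a test.
let productionMode = false;

function enableProductionModeSketch(): void {
  productionMode = true;
}

function getDefaultDbPathSketch(env: Record<string, string | undefined>): string {
  const override = env["INDEX_PATH"];
  if (override) return override; // an explicit path is always allowed
  if (!productionMode) {
    throw new Error("Refusing to use the default index path in test mode; set INDEX_PATH");
  }
  const home = env["HOME"] ?? "";
  return `${home}/.cache/qmd/index.sqlite`; // file name is a placeholder
}
```

Making the safe behavior the default means only the CLI entry point has to opt in to the real cache directory.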
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Skip expensive LLM query expansion when initial BM25 search has
strong signals (top result score > 0.7). This saves LLM calls and
latency for queries that already match well.
- Run initial BM25 search first
- Check if top result has strong score
- Skip expansion if signal is strong
- Reuse initial search results for retrieval
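
As a sketch of the decision above: `hasStrongSignal` and `planSearch` are hypothetical names, the results are assumed to be sorted best-first, and scores are assumed to be normalized so that 0.7 is a meaningful cutoff.

```typescript
interface Scored { score: number }

const STRONG_SIGNAL_SCORE = 0.7; // threshold from this message

// True when the top BM25 hit already scores above the threshold.
function hasStrongSignal(results: readonly Scored[]): boolean {
  const top = results[0];
  return top !== undefined && top.score > STRONG_SIGNAL_SCORE;
}

// Decide whether to pay for LLM expansion; either way the initial
// results are kept so the first search is not repeated.
function planSearch<T extends Scored>(
  initial: readonly T[],
): { expand: boolean; reuse: readonly T[] } {
  return { expand: !hasStrongSignal(initial), reuse: initial };
}
```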
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of reranking just 1 keyword-matched chunk per doc, now:
- Select top 3 chunks per document (by keyword score)
- Rerank all selected chunks
- Aggregate scores using top-2 average (rewards consistency)
- Use best-scoring chunk for snippet display
This improves ranking for long documents where the keyword-matched
chunk isn't always the most relevant to the query.
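
The selection and aggregation steps might be sketched like this. All names and field shapes are hypothetical, and the handling of documents with a single chunk (its score is used as-is) is an assumption.

```typescript
interface Chunk { doc: string; seq: number; keywordScore: number }

// Keep the top `perDoc` chunks of each document by keyword score.
function selectTopChunks(chunks: readonly Chunk[], perDoc = 3): Chunk[] {
  const byDoc = new Map<string, Chunk[]>();
  for (const chunk of chunks) {
    const list = byDoc.get(chunk.doc) ?? [];
    list.push(chunk);
    byDoc.set(chunk.doc, list);
  }
  const selected: Chunk[] = [];
  byDoc.forEach((list) => {
    list.sort((a, b) => b.keywordScore - a.keywordScore);
    selected.push(...list.slice(0, perDoc));
  });
  return selected;
}

// Document score: the mean of its two best rerank scores, so a document
// with two good chunks outranks one with a single lucky chunk.
function aggregateTop2(rerankScores: readonly number[]): number {
  const top = rerankScores.slice().sort((a, b) => b - a).slice(0, 2);
  if (top.length === 0) return 0;
  return top.reduce((sum, s) => sum + s, 0) / top.length;
}
```

The best-scoring chunk (the first element after sorting by rerank score) is then the natural choice for the snippet.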
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added safety-net deduplication after reranking to prevent the same
file from appearing multiple times in results. Uses a Set to keep only
the first (highest-scored) occurrence of each file.
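
A minimal sketch of that deduplication, assuming results arrive sorted best-first and carry a `file` field (both the field name and `dedupeByFile` are hypothetical):

```typescript
// Keep the first occurrence of each file; with best-first input this is
// also the highest-scored one.
function dedupeByFile<T extends { file: string }>(results: readonly T[]): T[] {
  const seen = new Set<string>();
  return results.filter((result) => {
    if (seen.has(result.file)) return false;
    seen.add(result.file);
    return true;
  });
}
```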
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Models are now automatically unloaded after 2 minutes of inactivity
to free memory when running as MCP server. Key changes:
- Add inactivityTimeoutMs config option (default: 2 minutes)
- Add touchActivity() called after each model operation
- Add unloadModels() to free memory while keeping instance alive
- Timer uses unref() so it doesn't keep process alive
- Models reload lazily on next operation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed "A hypothetical document excerpt that would answer" to
"Write a brief example passage that answers the query" to make the
model generate actual hypothetical answer content.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace Ollama HTTP API with node-llama-cpp for local GGUF models
- Add structured query expansion using JSON schema grammar:
  - Generates lexical query (for BM25), vector query, and HyDE
  - Tree-style CLI output showing query types
- Fix vector search: use cosine distance instead of L2
- Format queries with embeddinggemma nomic-style prompts
- Rename ollama_cache table to llm_cache
- Add disposeDefaultLlamaCpp() for clean process exit
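
To illustrate why the distance metric matters for the vector search fix above, here is a small standalone comparison. It only demonstrates the two metrics; how the metric is configured in the vector table is not shown in this message.

```typescript
function dot(a: readonly number[], b: readonly number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] ?? 0) * (b[i] ?? 0);
  return sum;
}

// Cosine distance depends only on direction: 0 for parallel vectors,
// 1 for orthogonal ones.
function cosineDistance(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new RangeError("dimension mismatch");
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (denom === 0) throw new RangeError("cosine distance is undefined for a zero vector");
  return 1 - dot(a, b) / denom;
}

// L2 (Euclidean) distance also grows with differences in magnitude.
function l2Distance(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new RangeError("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}
```

Two embeddings pointing the same way, such as `[3, 4]` and `[6, 8]`, have cosine distance 0 but L2 distance 5, so L2 can rank a semantically identical vector as far away when embeddings are not normalized.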
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add support for collection/path.md format in get command (checks if
  first component is a known collection before treating as filesystem path)
- Add comprehensive output format tests verifying qmd:// URIs, docid,
  and context in JSON, CSV, MD, XML, files, and CLI formats
- Add path normalization tests for various input formats:
  qmd://, //, qmd:////, collection/path, and path:line suffix
- Add isolated test environments (createIsolatedTestEnv) to prevent
  YAML config conflicts between test suites
- Add test fixture files test1.md and test2.md for path tests
- Update runQmd helper to accept custom configDir parameter
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The get command now supports --line-numbers flag to prefix each line
with its line number. When combined with --from, line numbers start
from the specified line.
Example:
  qmd get file.md --line-numbers
  qmd get file.md --from 10 -l 5 --line-numbers
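
The numbering behavior can be sketched as below. The function names, the `N: ` prefix format, and the reading of `-l` as a maximum line count are assumptions made for illustration; the actual output format is not shown in this message.

```typescript
// Pick the requested window of lines (1-based `from`, optional max count).
function selectLines(
  text: string,
  from = 1,
  maxLines?: number,
): { lines: string[]; start: number } {
  const all = text.split("\n");
  const start = Math.max(1, from);
  const end = maxLines === undefined ? all.length : start - 1 + maxLines;
  return { lines: all.slice(start - 1, end), start };
}

// Prefix each line with its number, counting from `start` so that numbers
// match the original file even when only a window is shown.
function addLineNumbersSketch(lines: readonly string[], start = 1): string {
  return lines.map((line, i) => `${start + i}: ${line}`).join("\n");
}
```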
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- searchFTS filters by collection name (collection param works with YAML-based collections)
- getStatus returns correct structure
- getStatus counts documents correctly
- getStatus reports collection info
All 278 tests now pass.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Features:
- Add short document IDs (docid) - first 6 chars of hash - to all search outputs
- Add --line-numbers CLI option and lineNumbers param for MCP tools
- Add handelize() function for token-friendly filenames (lowercase, special chars to dash, preserves extension)
- Convert triple underscore `___` to folder separator in filenames
- Change displayPath format to include collection name (collection/path)
- Make line-numbers default for MCP search snippets
Changes:
- store.ts: Add getDocid(), findDocumentByDocid(), handelize() functions
- formatter.ts: Add docid to all formatters, addLineNumbers() helper
- qmd.ts: Add --line-numbers option, use handelize during indexing
- mcp.ts: Remove resource listing, lineNumbers default for snippets
- Update all tests to expect new displayPath format and handelize behavior
- Update CLAUDE.md with docid documentation
All 274 tests pass.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
**Test Fixes:**
- Fixed tilde expansion test to create collection with home directory path
- Fixed test expectations for displayPath vs filepath separation
- Fixed MCP test config to use isolated YAML config directory
- Fixed MCP mock to return correct logprobs format
- Fixed qmd_query test to use r.filepath instead of r.file
- Fixed CLI multi-get test to use fresh database for isolation
- Fixed multiGet function to parse filepath (virtual) instead of displayPath
**Bug Fixes:**
- Fixed multiGet to use virtual paths for parsing collection/path info
- Fixed findDocuments selectCols to separate virtual_path and display_path
- Fixed context loading in findDocuments to use virtual paths
**New Test:**
- Added hierarchical context test verifying global + collection + path contexts
  are all included and joined with double newlines
**Results:** 261 passing / 0 failing (100% pass rate)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update all direct SQL queries in tests to use new schema
- Replace display_path column with path
- Add joins with content table to get document bodies
- Use computed virtual_path for filepath field
Test results: 249 passing / 11 failing (94.3% pass rate)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update fuzzy matching functions to return relative paths
- Fix findDocument to properly separate displayPath and filepath
- Update MCP test schema to use content-addressable storage
- Remove deprecated getCollectionIdByName function references
- Fix MCP collection filtering to work post-search
- Update test expectations for YAML-based collections
- Fix integration test expectations for path formats
Test results: 244 passing / 16 failing (93.8% pass rate)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update add-context test to use virtual path format (qmd://collection/)
- Fix collection filter test to provide explicit collection names with --name
- Add debug output for failing tests
- Update MCP queries to use new schema (path, collection instead of display_path, filepath)
- Fix resource list to construct virtual paths properly: qmd://collection/path
- Fix resource read handler to parse virtual paths and join with content table
- Prevents double-encoding issues like qmd://qmd%3A//archive/...
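
One way to picture the virtual-path handling is the sketch below. Both helpers are hypothetical and deliberately simplified (no percent-encoding, no line suffixes); they only show building `qmd://collection/path` from its parts and refusing to wrap an existing URI a second time.

```typescript
const SCHEME = "qmd://";

// Split a virtual path into its collection and collection-relative path.
function parseVirtualPath(uri: string): { collection: string; path: string } | null {
  const match = /^qmd:\/\/([^/]+)\/(.*)$/.exec(uri);
  if (match === null) return null;
  return { collection: match[1] ?? "", path: match[2] ?? "" };
}

// Build a virtual path from parts. Input that is already a virtual path is
// returned unchanged rather than being nested inside another qmd:// prefix.
function buildVirtualPath(collection: string, relativePath: string): string {
  if (parseVirtualPath(relativePath) !== null) return relativePath;
  return `${SCHEME}${collection}/${relativePath.replace(/^\/+/, "")}`;
}
```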