feat: compile to JS for npm, release system, full changelog

- Add tsc build step (tsconfig.build.json) so npm package ships
  compiled JS instead of raw TypeScript requiring tsx at runtime
- Update qmd wrapper and daemon spawn to use dist/qmd.js in
  production while keeping tsx for development
- Add self-installing pre-push hook validating v* tag pushes:
  package.json version match, changelog entry, CI status
- Add release.sh script that renames [Unreleased] to versioned
  entry, bumps package.json, commits, and tags
- Add extract-changelog.sh for cumulative GitHub release notes
- Update publish workflow with build step and GitHub release creation
- Flesh out CHANGELOG.md with full history from 0.1.0 through 1.0.0
  in Keep-a-Changelog format with PR/contributor attributions
- Add release standards and changelog guidelines to CLAUDE.md
This commit is contained in:
Tobi Lutke 2026-02-16 08:42:05 -04:00
parent 77c6eba159
commit 09803a75b7
No known key found for this signature in database
18 changed files with 725 additions and 121 deletions

View File

@ -33,6 +33,23 @@ jobs:
node-version: 22
registry-url: https://registry.npmjs.org
- run: npm run build
- run: npm publish --provenance --access public
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
- name: Extract release notes
id: notes
run: |
VERSION="${GITHUB_REF_NAME#v}"
NOTES=$(./scripts/extract-changelog.sh "$VERSION")
# Write to file for gh release (avoids quoting issues)
echo "$NOTES" > /tmp/release-notes.md
- name: Create GitHub release
run: |
gh release create "$GITHUB_REF_NAME" \
--title "$GITHUB_REF_NAME" \
--notes-file /tmp/release-notes.md
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

1
.gitignore vendored
View File

@ -1,4 +1,5 @@
node_modules/
dist/
.npmrc
*.sqlite
.DS_Store

View File

@ -1,68 +1,275 @@
# Changelog
All notable changes to QMD will be documented in this file.
## [Unreleased]
## [1.0.0] - 2026-02-15
### Node.js Compatibility
QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking
through parallel GPU contexts. GPU auto-detection replaces the unreliable
`gpu: "auto"` with explicit CUDA/Metal/Vulkan probing.
QMD now runs on both **Node.js (>=22)** and **Bun**. Install with `npm install -g @tobilu/qmd` or `bun install -g @tobilu/qmd` — your choice. The `qmd` wrapper auto-detects Node.js via `tsx` and works out of the box with mise, asdf, nvm, and Homebrew installs.
### Changes
### Performance
- **Parallel embedding & reranking** — multiple contexts split work across CPU cores (or VRAM on GPU), delivering up to **2.7x faster reranking** and significantly faster embedding on multi-core machines
- **Flash attention** — ~20% less VRAM per reranking context, enabling more parallel contexts on GPU
- **Right-sized contexts** — reranker context dropped from 40960 to 2048 tokens (17x less memory), since chunks are capped at ~900 tokens
- **Adaptive parallelism** — automatically scales context count based on available VRAM (GPU) or CPU math cores
- **CPU thread splitting** — each context runs on its own cores for true parallelism instead of contending on a single context
### GPU Auto-Detection
- Probes for CUDA, Metal, and Vulkan at startup — uses the best available backend
- Falls back gracefully to CPU with a warning if GPU init fails
- `qmd status` now shows device info (GPU type, VRAM usage)
### Test Suite
- Tests split into `src/*.test.ts` (unit), `src/models/*.test.ts` (model), and `src/integration/*.test.ts` (CLI/integration)
- Vitest config for Node.js; bun test still works for Bun
- New `eval-bm25` and `store.helpers.unit` test suites
- Runtime: support Node.js (>=22) alongside Bun via a cross-runtime SQLite
abstraction layer (`src/db.ts`). `bun:sqlite` on Bun, `better-sqlite3` on
Node. The `qmd` wrapper auto-detects a suitable Node.js install via PATH,
then falls back to mise, asdf, nvm, and Homebrew locations.
- Performance: parallel embedding & reranking via multiple LlamaContext
instances — up to 2.7x faster on multi-core machines.
- Performance: flash attention for ~20% less VRAM per reranking context,
enabling more parallel contexts on GPU.
- Performance: right-sized reranker context (40960 → 2048 tokens, 17x less
memory) since chunks are capped at ~900 tokens.
- Performance: adaptive parallelism — context count computed from available
VRAM (GPU) or CPU math cores rather than hardcoded.
- GPU: probe for CUDA, Metal, Vulkan explicitly at startup instead of
relying on node-llama-cpp's `gpu: "auto"`. `qmd status` shows device info.
- Tests: reorganized into flat `test/` directory with vitest for Node.js and
bun test for Bun. New `eval-bm25` and `store.helpers.unit` suites.
### Fixes
- Prevent VRAM waste from duplicate context creation during concurrent loads
- Collection-aware FTS filtering for scoped keyword search
---
- Prevent VRAM waste from duplicate context creation during concurrent
`embedBatch` calls — initialization lock now covers the full path.
- Collection-aware FTS filtering so scoped keyword search actually restricts
results to the requested collection.
## [0.9.0] - 2026-02-15
Initial public release.
First published release on npm as `@tobilu/qmd`. MCP HTTP transport with
daemon mode cuts warm query latency from ~16s to ~10s by keeping models
loaded between requests.
### Features
### Changes
- **Hybrid search pipeline** — BM25 full-text + vector similarity + LLM reranking with Reciprocal Rank Fusion
- **Smart chunking** — scored markdown break points keep sections, paragraphs, and code blocks intact (~900 tokens/chunk, 15% overlap)
- **Query expansion** — fine-tuned Qwen3 1.7B model generates search variations for better recall
- **Cross-encoder reranking** — Qwen3-Reranker scores candidates with position-aware blending
- **Vector embeddings** — EmbeddingGemma 300M via node-llama-cpp, all on-device
- **MCP server** — stdio and HTTP transports for Claude Desktop, Claude Code, and any MCP client
- **Collection management** — index multiple directories with glob patterns
- **Context annotations** — add descriptions to collections and paths for richer search
- **Document IDs** — 6-char content hash for stable references across re-indexes
- **Multi-get** — retrieve multiple documents by glob pattern, comma list, or docids
- **Multiple output formats** — JSON, CSV, Markdown, XML, files list
- **Claude Code plugin** — inline status checks and MCP integration
- MCP: HTTP transport with daemon lifecycle — `qmd mcp --http --daemon`
starts a background server, `qmd mcp stop` shuts it down. Models stay warm
in VRAM between queries. #149 (thanks @igrigorik)
- Search: type-routed query expansion preserves lex/vec/hyde type info and
routes to the appropriate backend. Eliminates ~4 wasted backend calls per
query (10.0 → 6.0 calls, 1278ms → 549ms). #149 (thanks @igrigorik)
- Search: unified pipeline — extracted `hybridQuery()` and
`vectorSearchQuery()` to `store.ts` so CLI and MCP share identical logic.
Fixes a class of bugs where results differed between the two. #149 (thanks
@igrigorik)
- MCP: dynamic instructions generated at startup from actual index state —
LLMs see collection names, doc counts, and content descriptions. #149
(thanks @igrigorik)
- MCP: tool renames (vsearch → vector_search, query → deep_search) with
rewritten descriptions for better tool selection. #149 (thanks @igrigorik)
- Integration: Claude Code plugin with inline status checks and MCP
integration. #99 (thanks @galligan)
### Fixes
- Handle dense content (code) that tokenizes beyond expected chunk size
- Proper cleanup of Metal GPU resources
- SQLite-vec readiness verification after extension load
- Reactivate deactivated documents on re-index
- BM25 score normalization with Math.abs
- Bun UTF-8 path corruption workaround
- BM25 score normalization — formula was inverted (`1/(1+|x|)` instead of
`|x|/(1+|x|)`), so strong matches scored *lowest*. Broke `--min-score`
filtering and made the "strong signal" short-circuit dead code. #76 (thanks
@dgilperez)
- Normalize Unicode paths to NFC for macOS compatibility. #82 (thanks
@c-stoeckl)
- Handle dense content (code) that tokenizes beyond expected chunk size.
- Proper cleanup of Metal GPU resources on process exit.
- SQLite-vec readiness verification after extension load.
- Reactivate deactivated documents on re-index instead of creating duplicates.
- Bun UTF-8 path corruption workaround for non-ASCII filenames.
- Disable following symlinks in glob.scan to avoid infinite loops.
## [0.8.0] - 2026-01-28
Fine-tuned query expansion model trained with GRPO replaces the stock Qwen3
0.6B. The training pipeline scores expansions on named entity preservation,
format compliance, and diversity — producing noticeably better lexical
variations and HyDE documents.
### Changes
- LLM: deploy GRPO-trained (Group Relative Policy Optimization) query
expansion model, hosted on HuggingFace and auto-downloaded on first use.
Better preservation of proper nouns and technical terms in expansions.
- LLM: `/only:lex` mode for single-type expansions — useful when you know
which search backend will help.
- LLM: HyDE output moved to first position so vector search can start
embedding while other expansions generate.
- LLM: session lifecycle management via `withLLMSession()` pattern — ensures
cleanup even on failure, similar to database transactions.
- Integration: org-mode title extraction support. #50 (thanks @sh54)
- Integration: SQLite extension loading in Nix devshell. #48 (thanks @sh54)
- Integration: AI agent discovery via skills.sh. #64 (thanks @Algiras)
### Fixes
- Use sequential embedding on CPU-only systems — parallel contexts caused a
race condition where contexts competed for CPU cores, making things slower.
#54 (thanks @freeman-jiang)
- Fix `collectionName` column in vector search SQL (was still using old
`collectionId` from before YAML migration). #61 (thanks @jdvmi00)
- Fix Qwen3 sampling params to prevent repetition loops — stock
temperature/top-p caused occasional infinite repeat patterns.
- Add `--index` option to CLI argument parser (was documented but not wired
up). #84 (thanks @Tritlo)
- Fix DisposedError during slow batch embedding. #41 (thanks @wuhup)
## [0.7.0] - 2026-01-09
First community contributions. The project gained external contributors,
surfacing bugs that only appear in diverse environments — Homebrew sqlite-vec
paths, case-sensitive model filenames, and sqlite-vec JOIN incompatibilities.
### Changes
- Indexing: native `realpathSync()` replaces `readlink -f` subprocess spawn
per file. On a 5000-file collection this eliminates 5000 shell spawns,
~15% faster. #8 (thanks @burke)
- Indexing: single-pass tokenization — chunking algorithm tokenized each
document twice (count then split); now tokenizes once and reuses. #9
(thanks @burke)
### Fixes
- Fix `vsearch` and `query` hanging — sqlite-vec's virtual table doesn't
support the JOIN pattern used; rewrote to subquery. #23 (thanks @mbrendan)
- Fix MCP server exiting immediately after startup — process had no active
handles keeping the event loop alive. #29 (thanks @mostlydev)
- Fix collection filter SQL to properly restrict vector search results.
- Support non-ASCII filenames in collection filter.
- Skip empty files during indexing instead of crashing on zero-length content.
- Fix case sensitivity in Qwen3 model filename resolution. #15 (thanks
@gavrix)
- Fix sqlite-vec loading on macOS with Homebrew (`BREW_PREFIX` detection).
#42 (thanks @komsit37)
- Fix Nix flake to use correct `src/qmd.ts` path. #7 (thanks @burke)
- Fix docid lookup with quotes support in get command. #36 (thanks
@JoshuaLelon)
- Fix query expansion model size in documentation. #38 (thanks @odysseus0)
## [0.6.0] - 2025-12-28
Replaced Ollama HTTP API with node-llama-cpp for all LLM operations. Ollama
adds convenience but also a running server dependency. node-llama-cpp loads
GGUF models directly in-process — zero external dependencies. Models
auto-download from HuggingFace on first use.
### Changes
- LLM: structured query expansion via JSON schema grammar constraints.
Model produces typed expansions — **lexical** (BM25 keywords), **vector**
(semantic rephrasings), **HyDE** (hypothetical document excerpts) — so each
routes to the right backend instead of sending everything everywhere.
- LLM: lazy model loading with 2-minute inactivity auto-unload. Keeps memory
low when idle while avoiding ~3s model load on every query.
- Search: conditional query expansion — when BM25 returns strong results, the
expensive LLM expansion is skipped entirely.
- Search: multi-chunk reranking — documents with multiple relevant chunks
scored by aggregating across all chunks rather than best single chunk.
- Search: cosine distance for vector search (was L2).
- Search: embeddinggemma nomic-style prompt formatting.
- Testing: evaluation harness with synthetic test documents and Hit@K metrics
for BM25, vector, and hybrid RRF.
## [0.5.0] - 2025-12-13
Collections and contexts moved from SQLite tables to YAML at
`~/.config/qmd/index.yml`. SQLite was overkill for config — you can't share
it, and it's opaque. YAML is human-readable and version-controllable. The
migration was extensive (35+ commits) because every part of the system that
touched collections or contexts had to be updated.
### Changes
- Config: YAML-based collections and contexts replace SQLite tables.
`collections` and `path_contexts` tables dropped from schema. Collections
support an optional `update:` command (e.g., `git pull`) before re-index.
- CLI: `qmd collection add/list/remove/rename` commands with `--name` and
`--mask` glob pattern support.
- CLI: `qmd ls` virtual file tree — list collections, files in a collection,
or files under a path prefix.
- CLI: `qmd context add/list/check/rm` with hierarchical context inheritance.
A query to `qmd://notes/2024/jan/` inherits context from `notes/`,
`notes/2024/`, and `notes/2024/jan/`.
- CLI: `qmd context add / "text"` for global context across all collections.
- CLI: `qmd context check` audit command to find paths without context.
- Paths: `qmd://` virtual URI scheme for portable document references.
`qmd://notes/ideas.md` works regardless of where the collection lives on
disk. Works in `get`, `multi-get`, `ls`, and context commands.
- CLI: document IDs (docid) — first 6 chars of content hash for stable
references. Shown as `#abc123` in search results, usable with `get` and
`multi-get`.
- CLI: `--line-numbers` flag for get command output.
## [0.4.0] - 2025-12-10
MCP server for AI agent integration. Without it, agents had to shell out to
`qmd search` and parse CLI output. The monolithic `qmd.ts` (1840 lines) was
split into focused modules with the project's first test suite (215 tests).
### Changes
- MCP: stdio server with tools for search, vector search, hybrid query,
document retrieval, and status. Runs over stdio transport for Claude
Desktop and MCP clients.
- MCP: spec-compliant with June 2025 MCP specification — removed non-spec
`mimeType`, added `isError: true` to errors, `structuredContent` for
machine-readable results, proper URI encoding.
- MCP: simplified tool naming (`qmd_search` → `search`) since MCP already
namespaces by server.
- Architecture: extract `store.ts` (1221 LOC), `llm.ts` (539 LOC),
`formatter.ts` (359 LOC), `mcp.ts` (503 LOC) from monolithic `qmd.ts`.
- Testing: 215 tests (store: 96, llm: 60, mcp: 59) with mocked Ollama for
fast, deterministic runs. Before this: zero tests.
## [0.3.0] - 2025-12-08
Document chunking for vector search. A 5000-word document about many topics
gets a single embedding that averages everything together, matching poorly for
specific queries. Chunking produces one embedding per ~900-token section with
focused semantic signal.
### Changes
- Search: markdown-aware chunking — prefers heading boundaries, then paragraph
breaks, then sentence boundaries. 15% overlap between chunks ensures
cross-boundary queries still match.
- Search: multi-chunk scoring bonus (+0.02 per additional chunk, capped at
+0.1 for 5+ chunks). Documents relevant in multiple sections rank higher.
- CLI: display paths show collection-relative paths and extracted titles
(from H1 headings or YAML frontmatter) instead of raw filesystem paths.
- CLI: `--all` flag returns all matches (use with `--min-score` to filter).
- CLI: byte-based progress bar with ETA for `embed` command.
- CLI: human-readable time formatting ("15m 4s" instead of "904.2s").
- CLI: documents >64KB truncated with warning during embedding.
## [0.2.0] - 2025-12-08
### Changes
- CLI: `--json`, `--csv`, `--files`, `--md`, `--xml` output format flags.
`--json` for programmatic access, `--files` for piping, `--md`/`--xml` for
LLM consumption, `--csv` for spreadsheets.
- CLI: `qmd status` shows index health — document count, size, embedding
coverage, time since last update.
- Search: weighted RRF — original query gets 2x weight relative to expanded
queries since the user's actual words are a more reliable signal.
## [0.1.0] - 2025-12-07
Initial implementation. Built in a single day for searching personal markdown
notes, journals, and meeting transcripts.
### Changes
- Search: SQLite FTS5 with BM25 ranking. Chose SQLite over Elasticsearch
because QMD is a personal tool — single binary, no server dependencies.
- Search: sqlite-vec for vector similarity. Same rationale: in-process, no
external vector database.
- Search: Reciprocal Rank Fusion to combine BM25 and vector results. RRF is
parameter-free and handles missing signals gracefully.
- LLM: Ollama for embeddings, reranking, and query expansion. Later replaced
with node-llama-cpp in 0.6.0.
- CLI: `qmd add`, `qmd embed`, `qmd search`, `qmd vsearch`, `qmd query`,
`qmd get`. ~1800 lines of TypeScript in a single `qmd.ts` file.
[Unreleased]: https://github.com/tobi/qmd/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/tobi/qmd/releases/tag/v1.0.0
[0.9.0]: https://github.com/tobi/qmd/releases/tag/v0.9.0
[0.9.0]: https://github.com/tobi/qmd/compare/v0.8.0...v0.9.0

View File

@ -149,4 +149,17 @@ bun test --preload ./src/test-preload.ts test/
## Do NOT compile
- Never run `bun build --compile` - it overwrites the shell wrapper and breaks sqlite-vec
- The `qmd` file is a shell script that runs `bun src/qmd.ts` - do not replace it
- The `qmd` file is a shell script that runs compiled JS from `dist/` - do not replace it
- `npm run build` compiles TypeScript to `dist/` via `tsc -p tsconfig.build.json`
## Releasing
Use `/release <version>` to cut a release. Full changelog standards,
release workflow, and git hook setup are documented in the
[release skill](skills/release/SKILL.md).
Key points:
- Add changelog entries under `## [Unreleased]` **as you make changes**
- The release script renames `[Unreleased]``[X.Y.Z] - date` at release time
- Credit external PRs with `#NNN (thanks @username)`
- GitHub releases roll up the full minor series (e.g. 1.2.0 through 1.2.3)

View File

@ -6,6 +6,8 @@ QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking
![QMD Architecture](assets/qmd-architecture.png)
You can read more about QMD's progress in the [CHANGELOG](CHANGELOG.md).
## Quick Start
```sh
@ -23,7 +25,7 @@ qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd collection add ~/work/docs --name docs
# Add context to help with search results
# Add context to help with search results, each piece of context will be returned when matching sub documents are returned. This works as a tree. This is the key feature of QMD as it allows LLMs to make much better contextual choices when selecting documents. Don't sleep on it!
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"

View File

@ -7,14 +7,14 @@
"qmd": "qmd"
},
"files": [
"src/**/*.ts",
"!src/**/*.test.ts",
"!src/test-preload.ts",
"dist/",
"qmd",
"LICENSE",
"CHANGELOG.md"
],
"scripts": {
"prepare": "[ -d .git ] && ./scripts/install-hooks.sh || true",
"build": "tsc -p tsconfig.build.json",
"test": "vitest run --reporter=verbose test/",
"qmd": "tsx src/qmd.ts",
"index": "tsx src/qmd.ts index",

2
qmd
View File

@ -43,4 +43,4 @@ while [[ -L "$SOURCE" ]]; do
done
SCRIPT_DIR="$(cd -P "$(dirname "$SOURCE")" && pwd)"
exec "$NODE" --import tsx "$SCRIPT_DIR/src/qmd.ts" "$@"
exec "$NODE" "$SCRIPT_DIR/dist/qmd.js" "$@"

78
scripts/extract-changelog.sh Executable file
View File

@ -0,0 +1,78 @@
#!/usr/bin/env bash
set -euo pipefail
# Extract cumulative release notes from CHANGELOG.md.
#
# For a given version (e.g. 1.0.5), extracts all entries from the current
# minor series back to x.x.0 (e.g. 1.0.0 through 1.0.5). This means each
# GitHub release restates the full arc of changes for the minor series.
#
# The [Unreleased] section is included — it contains the content that will
# become [X.Y.Z] when the release script runs. If the version is already
# released, [Unreleased] may be empty and is omitted.
#
# Fails if neither [Unreleased] nor [X.Y.Z] has content in the changelog.
#
# Usage: scripts/extract-changelog.sh <version>
# Example: scripts/extract-changelog.sh 1.0.5
# -> extracts [Unreleased] + [1.0.5], [1.0.4], ..., [1.0.0]
VERSION="${1:?Usage: extract-changelog.sh <version>}"
# Parse major.minor.patch from version
IFS='.' read -r MAJOR MINOR PATCH <<< "$VERSION"
if [[ ! -f CHANGELOG.md ]]; then
echo "CHANGELOG.md not found" >&2
exit 1
fi
# Extract [Unreleased] section and all [X.Y.Z] sections matching our minor series.
OUTPUT=""
CAPTURING=false
UNRELEASED_CONTENT=""
IN_UNRELEASED=false
while IFS= read -r line; do
if [[ "$line" =~ ^##\ \[Unreleased\] ]]; then
CAPTURING=true
IN_UNRELEASED=true
elif [[ "$line" =~ ^##\ \[([0-9]+\.[0-9]+\.[0-9]+)\] ]]; then
IN_UNRELEASED=false
ENTRY_VERSION="${BASH_REMATCH[1]}"
IFS='.' read -r E_MAJOR E_MINOR E_PATCH <<< "$ENTRY_VERSION"
if [[ "$E_MAJOR" == "$MAJOR" && "$E_MINOR" == "$MINOR" ]]; then
CAPTURING=true
OUTPUT+="$line"$'\n'
else
CAPTURING=false
fi
elif [[ "$line" =~ ^##\ ]]; then
IN_UNRELEASED=false
CAPTURING=false
elif $CAPTURING; then
if $IN_UNRELEASED; then
UNRELEASED_CONTENT+="$line"$'\n'
else
OUTPUT+="$line"$'\n'
fi
fi
done < CHANGELOG.md
# Only include [Unreleased] if it has non-blank content
TRIMMED=$(echo "$UNRELEASED_CONTENT" | sed '/^[[:space:]]*$/d')
if [[ -n "$TRIMMED" ]]; then
OUTPUT="## [Unreleased]"$'\n'"$UNRELEASED_CONTENT$OUTPUT"
fi
# Fail if we got nothing
TRIMMED_OUTPUT=$(echo "$OUTPUT" | sed '/^[[:space:]]*$/d')
if [[ -z "$TRIMMED_OUTPUT" ]]; then
echo "error: no changelog content found for $VERSION" >&2
echo "Expected either:" >&2
echo " ## [Unreleased] (with content)" >&2
echo " ## [$VERSION] - YYYY-MM-DD" >&2
exit 1
fi
printf '%s' "$OUTPUT"

19
scripts/install-hooks.sh Executable file
View File

@ -0,0 +1,19 @@
#!/usr/bin/env bash
set -euo pipefail
# Self-installing git hooks for qmd
# Called from package.json "prepare" script after bun install
REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
HOOKS_DIR="$REPO_ROOT/.git/hooks"
if [[ ! -d "$HOOKS_DIR" ]]; then
echo "Not a git repository, skipping hook install"
exit 0
fi
# Install pre-push hook
cp "$REPO_ROOT/scripts/pre-push" "$HOOKS_DIR/pre-push"
chmod +x "$HOOKS_DIR/pre-push"
echo "Installed git hooks: pre-push"

112
scripts/pre-push Executable file
View File

@ -0,0 +1,112 @@
#!/usr/bin/env bash
set -euo pipefail
# Pre-push hook: validates v* tag pushes before they reach the remote.
#
# Checks:
# 1. package.json version matches the tag
# 2. CHANGELOG.md has a "## [{version}] - {date}" entry
# 3. CI passed upstream on GitHub for the tagged commit
#
# Installed automatically by: bun install (via prepare script)
while read -r local_ref local_sha remote_ref remote_sha; do
# Only validate v* tag pushes
if [[ "$local_ref" != refs/tags/v* ]]; then
continue
fi
# Skip tag deletions
if [[ "$local_sha" == "0000000000000000000000000000000000000000" ]]; then
continue
fi
TAG="${local_ref#refs/tags/}"
VERSION="${TAG#v}"
echo "Validating release $TAG..."
# --- 1. package.json version must match the tag ---
PKG_VERSION=$(jq -r .version package.json)
if [[ "$PKG_VERSION" != "$VERSION" ]]; then
echo ""
echo "ABORT: package.json version is $PKG_VERSION but tag is $TAG"
echo "Run: jq --arg v '$VERSION' '.version = \$v' package.json > tmp && mv tmp package.json"
exit 1
fi
echo " package.json version: $PKG_VERSION"
# --- 2. CHANGELOG.md must have an entry for this version ---
if [[ ! -f CHANGELOG.md ]]; then
echo ""
echo "ABORT: CHANGELOG.md not found"
exit 1
fi
if ! grep -q "^## \[$VERSION\] - " CHANGELOG.md; then
echo ""
echo "ABORT: CHANGELOG.md has no entry for this release"
echo "Expected heading: ## [$VERSION] - $(date +%Y-%m-%d)"
echo ""
echo "Write the changelog entry first. See CLAUDE.md for guidelines."
exit 1
fi
echo " CHANGELOG.md: entry found"
# --- 3. CI must have passed on GitHub for this commit ---
COMMIT=$(git rev-parse "$TAG" 2>/dev/null || git rev-parse HEAD)
if ! command -v gh &>/dev/null; then
echo " CI check: skipped (gh CLI not installed)"
continue
fi
# Check GitHub Actions check runs
CHECK_JSON=$(gh api "repos/{owner}/{repo}/commits/$COMMIT/check-runs" 2>/dev/null || echo "")
if [[ -z "$CHECK_JSON" ]]; then
echo " CI check: skipped (could not reach GitHub API)"
continue
fi
TOTAL=$(echo "$CHECK_JSON" | jq -r '.total_count // 0' 2>/dev/null || echo "0")
if [[ "$TOTAL" -eq 0 ]] 2>/dev/null; then
# No checks found — commit may not be pushed yet
MAIN_SHA=$(git rev-parse origin/main 2>/dev/null || echo "")
if [[ "$COMMIT" != "$MAIN_SHA" ]]; then
echo ""
echo "WARNING: No CI runs found for $COMMIT and it's not the tip of origin/main."
echo "Push the commit to main first and wait for CI to pass."
read -p "Push tag anyway? [y/N] " -n 1 -r </dev/tty
echo ""
[[ $REPLY =~ ^[Yy]$ ]] || exit 1
else
echo " CI check: no runs found (commit is on main)"
fi
else
FAILED=$(echo "$CHECK_JSON" | jq '[.check_runs // [] | .[] | select(.conclusion == "failure")] | length' 2>/dev/null || echo "0")
PENDING=$(echo "$CHECK_JSON" | jq '[.check_runs // [] | .[] | select(.status != "completed")] | length' 2>/dev/null || echo "0")
if [[ "$FAILED" -gt 0 ]] 2>/dev/null; then
echo ""
echo "ABORT: CI failed for commit $COMMIT"
echo "Check: https://github.com/tobi/qmd/commit/$COMMIT"
exit 1
fi
if [[ "$PENDING" -gt 0 ]] 2>/dev/null; then
echo ""
echo "WARNING: CI still running for commit $COMMIT ($PENDING pending)"
read -p "Push tag anyway? [y/N] " -n 1 -r </dev/tty
echo ""
[[ $REPLY =~ ^[Yy]$ ]] || exit 1
else
echo " CI check: all passed"
fi
fi
echo "All checks passed for $TAG"
done
exit 0

View File

@ -2,6 +2,11 @@
set -euo pipefail
# QMD Release Script
#
# Renames the [Unreleased] section in CHANGELOG.md to the new version,
# bumps package.json, commits, and creates a tag. The actual publish
# happens via GitHub Actions when the tag is pushed.
#
# Usage: ./scripts/release.sh [patch|minor|major|<version>]
# Examples:
# ./scripts/release.sh patch # 0.9.0 -> 0.9.1
@ -41,74 +46,60 @@ bump_version() {
}
NEW=$(bump_version "$CURRENT" "$BUMP")
DATE=$(date +%Y-%m-%d)
echo "New version: $NEW"
echo ""
# Confirm
# --- Validate CHANGELOG.md ---
if [[ ! -f CHANGELOG.md ]]; then
echo "Error: CHANGELOG.md not found" >&2
exit 1
fi
# The [Unreleased] section must have content
if ! grep -q "^## \[Unreleased\]" CHANGELOG.md; then
echo "Error: no [Unreleased] section in CHANGELOG.md" >&2
echo "" >&2
echo "Add your changes under an [Unreleased] heading first:" >&2
echo "" >&2
echo " ## [Unreleased]" >&2
echo "" >&2
echo " ### Changes" >&2
echo " - Your change here" >&2
exit 1
fi
# --- Preview release notes ---
echo "--- Release notes (will appear on GitHub) ---"
./scripts/extract-changelog.sh "$NEW"
echo "--- End ---"
echo ""
# --- Confirm ---
read -p "Release v$NEW? [y/N] " -n 1 -r
echo ""
[[ $REPLY =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
# Gather commits since last tag (or all if no tags)
LAST_TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo "")
if [[ -n "$LAST_TAG" ]]; then
RANGE="$LAST_TAG..HEAD"
else
RANGE="HEAD"
fi
# --- Rename [Unreleased] -> [X.Y.Z] - date, add fresh [Unreleased] ---
echo ""
echo "Commits since ${LAST_TAG:-beginning}:"
git log "$RANGE" --oneline --no-decorate
echo ""
sed -i '' "s/^## \[Unreleased\].*/## [$NEW] - $DATE/" CHANGELOG.md
# Generate changelog entry
DATE=$(date +%Y-%m-%d)
ENTRY="## [$NEW] - $DATE"$'\n'$'\n'
# Insert a new empty [Unreleased] section after the header
awk '
/^## \['"$NEW"'\]/ && !done {
print "## [Unreleased]\n"
done = 1
}
{ print }
' CHANGELOG.md > CHANGELOG.md.tmp && mv CHANGELOG.md.tmp CHANGELOG.md
# Collect conventional commits
FEATS=$(git log "$RANGE" --oneline --no-decorate --grep="^feat" | sed 's/^[a-f0-9]* feat[:(]/- /' | sed 's/)$//' || true)
FIXES=$(git log "$RANGE" --oneline --no-decorate --grep="^fix" | sed 's/^[a-f0-9]* fix[:(]/- /' | sed 's/)$//' || true)
OTHER=$(git log "$RANGE" --oneline --no-decorate --grep="^feat" --grep="^fix" --grep="^docs" --grep="^chore" --grep="^refactor" --invert-grep | sed 's/^[a-f0-9]* /- /' || true)
# --- Bump version and commit ---
if [[ -n "$FEATS" ]]; then
ENTRY+="### Features"$'\n'$'\n'"$FEATS"$'\n'$'\n'
fi
if [[ -n "$FIXES" ]]; then
ENTRY+="### Fixes"$'\n'$'\n'"$FIXES"$'\n'$'\n'
fi
if [[ -n "$OTHER" ]]; then
ENTRY+="### Other"$'\n'$'\n'"$OTHER"$'\n'$'\n'
fi
# Add link reference
LINK="[$NEW]: https://github.com/tobi/qmd/compare/v$CURRENT...v$NEW"
# Show what will be added
echo "--- Changelog entry ---"
echo "$ENTRY"
echo "$LINK"
echo "--- End ---"
echo ""
read -p "Looks good? [y/N] " -n 1 -r
echo ""
[[ $REPLY =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
# Update package.json version
jq --arg v "$NEW" '.version = $v' package.json > package.json.tmp && mv package.json.tmp package.json
# Prepend changelog entry (after the header line)
if [[ -f CHANGELOG.md ]]; then
# Insert after "# Changelog" header and any blank lines
awk -v entry="$ENTRY$LINK" '
/^# Changelog/ { print; getline; print; print ""; print entry; print ""; next }
{ print }
' CHANGELOG.md > CHANGELOG.md.tmp && mv CHANGELOG.md.tmp CHANGELOG.md
else
echo "# Changelog"$'\n'$'\n'"$ENTRY$LINK" > CHANGELOG.md
fi
# Commit and tag
git add package.json CHANGELOG.md
git commit -m "release: v$NEW"
git tag -a "v$NEW" -m "v$NEW"
@ -116,9 +107,6 @@ git tag -a "v$NEW" -m "v$NEW"
echo ""
echo "Created commit and tag v$NEW"
echo ""
echo "Next steps:"
echo " git push origin main --tags # push to GitHub"
echo " npm publish # publish to npm"
echo "Next: push to trigger the publish workflow"
echo ""
echo "Or both at once:"
echo " git push origin main --tags && npm publish"
echo " git push origin main --tags"

114
skills/release/SKILL.md Normal file
View File

@ -0,0 +1,114 @@
---
name: release
description: Manage releases for this project. Validates changelog, installs git hooks, and cuts releases. Use when user says "/release", "release 1.0.5", "cut a release", or asks about the release process. NOT auto-invoked by the model.
disable-model-invocation: true
---
# Release
Cut a release, validate the changelog, and ensure git hooks are installed.
## Usage
`/release 1.0.5` or `/release patch` (bumps patch from current version).
## Process
When the user triggers `/release <version>`:
1. **Install hooks** — run `scripts/install-hooks.sh` (idempotent)
2. **Validate changelog** — confirm `## [Unreleased]` has content
3. **Preview** — show the user what will be released (unreleased content + minor series rollup via `scripts/extract-changelog.sh`)
4. **Ask for confirmation** — do NOT proceed without explicit user approval
5. **Run `scripts/release.sh <version>`** — renames `[Unreleased]`, bumps version, commits, tags
6. **Remind** — tell the user to `git push origin main --tags`
If any step fails, stop and explain. Never force-push or skip validation.
## Changelog Standard
The changelog lives in `CHANGELOG.md` and follows [Keep a Changelog](https://keepachangelog.com/) conventions.
### Heading format
- `## [Unreleased]` — accumulates entries between releases
- `## [X.Y.Z] - YYYY-MM-DD` — released versions
The release script renames `[Unreleased]``[X.Y.Z] - date` and inserts a
fresh empty `[Unreleased]` section automatically.
### Structure of a release entry
Each version entry has two parts:
**1. Highlights (optional, 1-4 sentences of prose)**
Immediately after the version heading, before any `###` section. This is the
elevator pitch — what would you tell someone in 30 seconds? Only include for
releases with significant changes. Skip for small patches.
```markdown
## [1.1.0] - 2026-03-01
QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking
through parallel contexts. GPU auto-detection replaces the unreliable
`gpu: "auto"` with explicit CUDA/Metal/Vulkan probing.
```
**2. Detailed changelog (`### Changes` and `### Fixes`)**
```markdown
### Changes
- Runtime: support Node.js (>=22) alongside Bun. The `qmd` wrapper
auto-detects a suitable install via PATH. #149 (thanks @igrigorik)
- Performance: parallel embedding & reranking — up to 2.7x faster on
multi-core machines.
### Fixes
- Prevent VRAM waste from duplicate context creation during concurrent
`embedBatch` calls. #152 (thanks @jkrems)
```
### Writing guidelines
- **Explain the why, not just the what.** The changelog is for users.
- **Include numbers.** "2.7x faster", "17x less memory".
- **Group by theme, not by file.** "Performance" not "Changes to llm.ts".
- **Don't list every commit.** Aggregate related changes.
- **Credit contributors:** end bullets with `#NNN (thanks @username)` for
external PRs. No need to credit the repo owner.
### What not to include
- Internal refactors with no user-visible effect
- Dependency bumps (unless fixing a user-facing bug)
- CI/tooling changes (unless affecting the release artifact)
- Test additions (unless validating a fix worth mentioning)
## GitHub Release Notes
Each GitHub release includes the full changelog for the **minor series** back
to x.x.0. Releasing v1.2.3 includes entries for 1.2.3, 1.2.2, 1.2.1, and
1.2.0. The `scripts/extract-changelog.sh` script handles this, and the
publish workflow (`publish.yml`) calls it to populate the GitHub release.
## Git Hooks
The pre-push hook (`scripts/pre-push`) blocks `v*` tag pushes unless:
1. `package.json` version matches the tag
2. `CHANGELOG.md` has a `## [X.Y.Z] - date` entry for the version
3. CI passed on GitHub for the tagged commit
Run `skills/release/scripts/install-hooks.sh` to install (also runs
automatically via `bun install` prepare script).
## Scripts
- [`scripts/install-hooks.sh`](scripts/install-hooks.sh) — install/update git hooks
- Project scripts used during release:
- `scripts/release.sh` — rename [Unreleased], bump version, commit, tag
- `scripts/extract-changelog.sh` — extract minor series notes for GitHub release
- `scripts/pre-push` — pre-push validation hook

View File

@ -0,0 +1,38 @@
#!/usr/bin/env bash
set -euo pipefail
# Install git hooks for release validation.
# Idempotent — safe to run multiple times.
REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
if [[ -z "$REPO_ROOT" ]]; then
echo "Error: not in a git repository" >&2
exit 1
fi
HOOKS_DIR="$REPO_ROOT/.git/hooks"
SOURCE="$REPO_ROOT/scripts/pre-push"
if [[ ! -f "$SOURCE" ]]; then
echo "Error: scripts/pre-push not found at $SOURCE" >&2
exit 1
fi
# Install pre-push hook
if [[ -L "$HOOKS_DIR/pre-push" ]] && [[ "$(readlink "$HOOKS_DIR/pre-push")" == "$SOURCE" ]]; then
echo "pre-push hook: already installed (symlink)"
elif [[ -f "$HOOKS_DIR/pre-push" ]]; then
# Existing hook that isn't our symlink — back it up
BACKUP="$HOOKS_DIR/pre-push.backup.$(date +%s)"
echo "pre-push hook: backing up existing hook to $(basename "$BACKUP")"
mv "$HOOKS_DIR/pre-push" "$BACKUP"
ln -sf "$SOURCE" "$HOOKS_DIR/pre-push"
echo "pre-push hook: installed (symlink → scripts/pre-push)"
else
ln -sf "$SOURCE" "$HOOKS_DIR/pre-push"
echo "pre-push hook: installed (symlink → scripts/pre-push)"
fi
# Ensure the source is executable
chmod +x "$SOURCE"
echo "Done."

View File

@ -494,6 +494,7 @@ export class LlamaCpp implements LLM {
// Detect available GPU types and use the best one.
// We can't rely on gpu:"auto" — it returns false even when CUDA is available
// (likely a binary/build config issue in node-llama-cpp).
// @ts-expect-error node-llama-cpp API compat
const gpuTypes = await getLlamaGpuTypes();
// Prefer CUDA > Metal > Vulkan > CPU
const preferred = (["cuda", "metal", "vulkan"] as const).find(g => gpuTypes.includes(g));
@ -733,7 +734,7 @@ export class LlamaCpp implements LLM {
contextSize: LlamaCpp.RERANK_CONTEXT_SIZE,
flashAttention: true,
...(threads > 0 ? { threads } : {}),
}));
} as any));
} catch {
if (this.rerankContexts.length === 0) {
// Flash attention might not be supported — retry without it
@ -828,7 +829,7 @@ export class LlamaCpp implements LLM {
if (n === 1) {
// Single context: sequential (no point splitting)
const context = contexts[0]!;
const embeddings = [];
const embeddings: ({ embedding: number[]; model: string } | null)[] = [];
for (const text of texts) {
try {
const embedding = await context.getEmbeddingFor(text);

View File

@ -682,6 +682,6 @@ export async function startMcpHttpServer(port: number, options?: { quiet?: boole
}
// Run if this is the main module
if (fileURLToPath(import.meta.url) === process.argv[1] || process.argv[1]?.endsWith("/mcp.ts")) {
if (fileURLToPath(import.meta.url) === process.argv[1] || process.argv[1]?.endsWith("/mcp.ts") || process.argv[1]?.endsWith("/mcp.js")) {
startMcpServer().catch(console.error);
}

View File

@ -2207,7 +2207,7 @@ async function showVersion(): Promise<void> {
}
// Main CLI - only run if this is the main module
if (fileURLToPath(import.meta.url) === process.argv[1] || process.argv[1]?.endsWith("/qmd.ts")) {
if (fileURLToPath(import.meta.url) === process.argv[1] || process.argv[1]?.endsWith("/qmd.ts") || process.argv[1]?.endsWith("/qmd.js")) {
const cli = parseCLI();
if (cli.values.version) {
@ -2489,8 +2489,11 @@ if (fileURLToPath(import.meta.url) === process.argv[1] || process.argv[1]?.endsW
mkdirSync(cacheDir, { recursive: true });
const logPath = resolve(cacheDir, "mcp.log");
const logFd = openSync(logPath, "w"); // truncate — fresh log per daemon run
const tsxLoader = pathJoin(dirname(fileURLToPath(import.meta.url)), "..", "node_modules", "tsx", "dist", "esm", "index.mjs");
const child = nodeSpawn(process.execPath, ["--import", tsxLoader, fileURLToPath(import.meta.url), "mcp", "--http", "--port", String(port)], {
const selfPath = fileURLToPath(import.meta.url);
const spawnArgs = selfPath.endsWith(".ts")
? ["--import", pathJoin(dirname(selfPath), "..", "node_modules", "tsx", "dist", "esm", "index.mjs"), selfPath, "mcp", "--http", "--port", String(port)]
: [selfPath, "mcp", "--http", "--port", String(port)];
const child = nodeSpawn(process.execPath, spawnArgs, {
stdio: ["ignore", logFd, logFd],
detached: true,
});

View File

@ -23,7 +23,7 @@ import {
formatDocForEmbedding,
type RerankDocument,
type ILLMSession,
} from "./llm";
} from "./llm.js";
import {
findContextForPath as collectionsFindContextForPath,
addContext as collectionsAddContext,
@ -37,7 +37,7 @@ import {
setGlobalContext,
loadConfig as collectionsLoadConfig,
type NamedCollection,
} from "./collections";
} from "./collections.js";
// =============================================================================
// Configuration

11
tsconfig.build.json Normal file
View File

@ -0,0 +1,11 @@
{
"extends": "./tsconfig.json",
"compilerOptions": {
"noEmit": false,
"outDir": "dist",
"declaration": true,
"noImplicitAny": false
},
"include": ["src/**/*.ts"],
"exclude": ["src/**/*.test.ts", "src/test-preload.ts", "src/bench-*.ts"]
}