From dc64166a2a02e2625e2bfcf9bbfe17553b8d6445 Mon Sep 17 00:00:00 2001
From: Tobi Lutke
Date: Sun, 15 Feb 2026 17:02:00 -0400
Subject: [PATCH] release: v1.0.0

Node.js compatibility, parallel embedding/reranking, flash attention,
GPU auto-detection, and restructured test suite.
---
 CHANGELOG.md | 34 ++++++++++++++++++++++++++++++++++
 package.json |  2 +-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 148860d..43b0fd1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,39 @@
 
 All notable changes to QMD will be documented in this file.
 
+## [1.0.0] - 2026-02-15
+
+### Node.js Compatibility
+
+QMD now runs on both **Node.js (>=22)** and **Bun**. Install with `npm install -g @tobilu/qmd` or `bun install -g @tobilu/qmd` — your choice. The `qmd` wrapper auto-detects Node.js via `tsx` and works out of the box with mise, asdf, nvm, and Homebrew installs.
+
+### Performance
+
+- **Parallel embedding & reranking** — multiple contexts split work across CPU cores (or VRAM on GPU), delivering up to **2.7x faster reranking** and significantly faster embedding on multi-core machines
+- **Flash attention** — ~20% less VRAM per reranking context, enabling more parallel contexts on GPU
+- **Right-sized contexts** — reranker context dropped from 40960 to 2048 tokens (17x less memory), since chunks are capped at ~900 tokens
+- **Adaptive parallelism** — automatically scales context count based on available VRAM (GPU) or CPU math cores
+- **CPU thread splitting** — each context runs on its own cores for true parallelism instead of contending on a single context
+
+### GPU Auto-Detection
+
+- Probes for CUDA, Metal, and Vulkan at startup — uses the best available backend
+- Falls back gracefully to CPU with a warning if GPU init fails
+- `qmd status` now shows device info (GPU type, VRAM usage)
+
+### Test Suite
+
+- Tests split into `src/*.test.ts` (unit), `src/models/*.test.ts` (model), and `src/integration/*.test.ts` (CLI/integration)
+- Vitest config for Node.js; bun test still works for Bun
+- New `eval-bm25` and `store.helpers.unit` test suites
+
+### Fixes
+
+- Prevent VRAM waste from duplicate context creation during concurrent loads
+- Collection-aware FTS filtering for scoped keyword search
+
+---
+
 ## [0.9.0] - 2026-02-15
 
 Initial public release.
@@ -30,5 +63,6 @@ Initial public release.
 - BM25 score normalization with Math.abs
 - Bun UTF-8 path corruption workaround
 
+[1.0.0]: https://github.com/tobi/qmd/releases/tag/v1.0.0
 [0.9.0]: https://github.com/tobi/qmd/releases/tag/v0.9.0
 
diff --git a/package.json b/package.json
index f82c0c6..1382b44 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@tobilu/qmd",
-  "version": "0.9.9",
+  "version": "1.0.0",
   "description": "Query Markup Documents - On-device hybrid search for markdown files with BM25, vector search, and LLM reranking",
   "type": "module",
   "bin": {