docs: improve qmd skill guidance

2026-05-19 15:22:14 -04:00
commit 105c577b3b
parent 2b250f3dca
4 changed files with 142 additions and 89 deletions
--- a/skills/qmd/SKILL.md
+++ b/skills/qmd/SKILL.md
@@ -11,86 +11,107 @@ allowed-tools: Bash(qmd:*), mcp__qmd__*

 # QMD - Query Markdown Documents

-QMD is a local search and retrieval engine for markdown collections: notes, docs,
-wikis, transcripts, and project knowledge bases. Use it before generic web search
-when the user is asking about something that may already live in their indexed
-local markdown.
+## How search works

-## Status Check
+QMD searches local markdown collections: notes, docs, wikis, transcripts, and
+project knowledge bases. Use it before web search when the answer may already be
+in indexed local files.

-Start by checking what QMD can see:
+The workflow is always:
+
+1. Search for candidate documents.
+2. Retrieve the full source with `qmd get` or `qmd multi-get`.
+3. Answer from retrieved text, citing paths or docids.
+
+Do not answer from snippets alone when the user needs facts, decisions, quotes,
+or nuance. Snippets are only leads.
+
+Typical loop:

 ```bash
-qmd collection list
-qmd ls
+qmd search "merchant reality support interviews" -n 5
+# leads: #abc123 concepts/customer-proximity.md; #def432 sources/merchant-call.md
+qmd multi-get "#abc123,#def432" --md
 ```

-For health details:
+For harder searches, use `qmd query` structured queries with `intent:`, `lex:`,
+`vec:`, and `hyde:` fields.

-```bash
-qmd status
+When reporting what you retrieved, a compact note is enough; do not paste whole
+files unless needed:
+
+```text
+Retrieved:
+- #abc123 concepts/customer-proximity.md
+- #def432 sources/merchant-call.md
 ```

-If QMD is missing:
+## Pick the right search mode

-```bash
-npm install -g @tobilu/qmd
-```
-
-## Retrieval Workflow
-
-1. **Discover collections** with `qmd collection list` or `qmd ls`.
-2. **Search first**, usually with a small result count.
-3. **Retrieve source documents** with `qmd get` or `qmd multi-get`.
-4. **Answer from the retrieved text**, citing file paths or docids.
-5. **If results are weak**, rewrite the query using a different search mode.
-
-Do not answer from search-result snippets alone when the user needs substance.
-Fetch the document.
-
-## Search Modes
-
-### Fast lexical search
-
-Use BM25 when you know names, exact terms, titles, identifiers, or code symbols:
+Use **BM25 lexical search** when you know exact words, titles, names, code
+symbols, or rare phrases:

 ```bash
 qmd search "cockpit OKR Goodhart" -n 10
 qmd search '"AI Before Headcount"' -c concepts -n 5
 ```

-Good `lex` queries are short: 2-6 discriminative terms, quoted phrases when exact,
-and no filler words.
-
-### Hybrid query search
-
-Use `qmd query` when semantic recall, query expansion, vector search, or reranking
-matters more than speed:
+Use **hybrid semantic search** when the user describes an idea indirectly, uses
+different wording than the source, or needs conceptual recall:

 ```bash
 qmd query "decision quality depends on surfacing assumptions and context" -n 10
 qmd query --json --explain "metrics as cockpit instruments but not OKRs"
 ```

-`qmd query` may initialize local models. If models/GPU are unavailable, slow, or
-crashing, fall back to `qmd search` and use better lexical terms.
-
-### Structured queries
-
-For subtle wiki/doc searches, structured queries are usually strongest:
+Use **structured queries** for hard searches. They combine exact anchors with
+semantic recall:

 ```bash
 qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
 ```

-Use this pattern when the user's wording is indirect:
+Structured query fields:

-- `intent:` disambiguates the target.
-- `lex:` anchors exact names, phrases, aliases, and rare terms.
-- `vec:` adds the semantic paraphrase.
-- `hyde:` describes the document that would answer the query.
+- `intent:` states what you are trying to find and what to avoid.
+- `lex:` uses exact terms, aliases, titles, and rare words.
+- `vec:` paraphrases the idea in natural language.
+- `hyde:` describes the document or answer that would satisfy the request.

-Put the best query first; early searches receive more weight in fusion.
+If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
+better lexical terms.
+
+## Retrieve sources
+
+Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:
+
+```bash
+qmd get "#abc123"
+qmd get qmd://concepts/ai-before-headcount.md --full
+qmd multi-get "#abc123,#def432" --md
+qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --md
+qmd multi-get 'sources/podcast-2025-*.md' -l 80
+```
+
+Use `multi-get` when comparing several hits or gathering context across pages.
+Use `--full` when the exact source matters.
+
+## Discover what is indexed
+
+```bash
+qmd collection list
+qmd ls
+qmd status
+```
+
+Add collection filters when broad searches drift into the wrong corpus:
+
+```bash
+qmd search "headcount autonomous agents" -c concepts -n 10
+qmd query "merchant support product reality" -c concepts -c sources -n 10
+```
+
+Omit `-c` to search everything.

 ## MCP Tool: `query`

@@ -109,35 +130,13 @@ When using the MCP server, prefer structured searches:
 }
 ```

-### Query Types
+Query types:

 - `lex` — BM25 keyword search. Best for exact terms, names, titles, and code.
 - `vec` — vector semantic search. Best for natural-language concepts.
 - `hyde` — vector search using a hypothetical answer/document passage.

-## Retrieval Commands
-
-```bash
-qmd get "#abc123"                         # retrieve by docid
-qmd get qmd://concepts/ai-before-headcount.md --full
-qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --md
-qmd multi-get 'sources/podcast-2025-*.md' -l 80
-```
-
-Use `multi-get` when comparing several hits or gathering context across pages.
-Use `--full` when the exact source matters.
-
-## Collection Filtering
-
-```bash
-qmd search "headcount autonomous agents" -c concepts -n 10
-qmd query "merchant support product reality" -c concepts -c sources -n 10
-```
-
-Omit `-c` / `collections` to search everything. Add collection filters when a
-broad query drifts into the wrong corpus.
-
-## Query Craft
+## Query craft

 Good QMD searches mix three things:

@@ -158,19 +157,31 @@ qmd query $'intent: Find the customer proximity concept, not generic customer de
 qmd search "six-week cadence WhatsApp merchant relationships Shawn Ryan" -c sources -n 10
 ```

-## Setup
+## Setup and maintenance
+
+Only mutate indexes when the user asked for setup or maintenance. Searching and
+retrieving are safe; collection/index mutation is not a casual first step.

 ```bash
 npm install -g @tobilu/qmd
 qmd collection add ~/notes --name notes
+qmd update
 qmd embed
 ```

-Only add collections or generate embeddings when the user asked for setup or
-index maintenance. Searching and retrieving are safe; collection/index mutation is
-not a casual first step.
+Health and diagnostics:

-## MCP Setup
+```bash
+qmd doctor
+qmd status
+qmd pull
+```
+
+`qmd doctor` checks config, model cache, device/GPU setup, vector fingerprints,
+and common environment overrides. If a model-backed command fails, run it before
+changing configuration.
+
+## MCP setup

 See `references/mcp-setup.md` for Claude Code, Claude Desktop, OpenClaw, and HTTP
 server configuration.
@@ -188,5 +199,5 @@ server configuration.
 - **Ambiguous user wording needs intent.** Add `intent:` rather than hoping query
  expansion guesses the right domain.
 - **Collection names matter.** Search `concepts` for synthesized wiki pages,
-  `sources` for transcripts/raw source pages, and docs collections for code/project
-  documentation.
+  `sources` for transcripts/raw source pages, and docs collections for code or
+  project documentation.
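
Reviewer sketch (not part of the commit): the search-then-retrieve loop and the structured query fields described in the SKILL.md changes above could be wrapped in small helpers like the ones below. Everything here is hypothetical. `buildStructuredQuery`, `extractDocids`, and `multiGetArgs` are illustrative names, and the docid pattern (`#` followed by six hex characters) is only inferred from the `#abc123` and `#def432` examples, not from qmd's documented output format.

```typescript
// Hypothetical helpers; none of these exist in qmd itself.

interface StructuredQuery {
  intent?: string;
  lex?: string;
  vec?: string;
  hyde?: string;
}

// Join the populated fields into the newline-separated text that the
// SKILL.md examples pass to `qmd query`, in intent/lex/vec/hyde order.
function buildStructuredQuery(q: StructuredQuery): string {
  const order: (keyof StructuredQuery)[] = ["intent", "lex", "vec", "hyde"];
  return order
    .filter((key) => q[key] !== undefined && q[key]!.trim() !== "")
    .map((key) => `${key}: ${q[key]!.trim()}`)
    .join("\n");
}

// Assumption: docids look like "#abc123" (a "#" plus six hex characters).
// Returns each docid once, in order of first appearance.
function extractDocids(searchOutput: string): string[] {
  const matches = searchOutput.match(/#[0-9a-f]{6}\b/g) ?? [];
  return Array.from(new Set(matches));
}

// Build the argument list for `qmd multi-get "<id>,<id>" --md`,
// the form used in the "Typical loop" example.
function multiGetArgs(docids: string[]): string[] {
  return ["multi-get", docids.join(","), "--md"];
}
```

Passing the result of `multiGetArgs` as an argument array (rather than a shell string) avoids having to quote the `#` characters, which is one reason to keep the query building separate from process spawning.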
--- a/src/cli/qmd.ts
+++ b/src/cli/qmd.ts
@@ -3004,6 +3004,35 @@ function copyDirectoryContents(sourceDir: string, targetDir: string): void {
  }
 }

+function installedSkillStubContent(): string {
+  return `---
+name: qmd
+description: Bootstrap QMD search instructions from the installed qmd CLI. Use when users ask to find notes, retrieve documents, inspect a wiki, or answer from indexed local markdown.
+license: MIT
+compatibility: Requires qmd CLI. Run \`qmd skill show\` for version-matched instructions.
+allowed-tools: Bash(qmd:*), mcp__qmd__*
+---
+
+# QMD - Query Markdown Documents
+
+This installed skill is intentionally a small bootstrap so it does not go stale
+when the qmd package updates.
+
+Load the full, version-matched QMD instructions from the CLI:
+
+!\`qmd skill show\`
+
+If your agent does not support bang-command expansion, run:
+
+\`\`\`bash
+qmd skill show
+\`\`\`
+
+Then follow those instructions. In short: search first, fetch full sources with
+\`qmd get\` or \`qmd multi-get\`, and answer from retrieved text rather than snippets.
+`;
+}
+
 function writeSkillInstall(targetDir: string, force: boolean): void {
  if (pathExists(targetDir)) {
    if (!force) {
@@ -3018,6 +3047,7 @@ function writeSkillInstall(targetDir: string, force: boolean): void {
  }

  copyDirectoryContents(skill.dir, targetDir);
+  writeFileSync(resolve(targetDir, "SKILL.md"), installedSkillStubContent(), "utf-8");
 }

 function outputSkillsJson(payload: unknown): void {
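
Reviewer sketch (not part of the commit): the ordering in `writeSkillInstall` matters, because the packaged skill directory is copied first and only then is `SKILL.md` overwritten with the bootstrap text, so files such as `references/mcp-setup.md` survive. The standalone example below illustrates that order under stated assumptions. `installWithStub` is a hypothetical stand-in, `cpSync` replaces the repository's own `copyDirectoryContents`, and the file contents are placeholders.

```typescript
import { cpSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-in for writeSkillInstall: copy the whole skill
// directory, then replace only SKILL.md with the bootstrap stub.
function installWithStub(sourceDir: string, targetDir: string, stub: string): void {
  cpSync(sourceDir, targetDir, { recursive: true });
  writeFileSync(join(targetDir, "SKILL.md"), stub, "utf-8");
}

// Build a throwaway source skill with a full SKILL.md and one reference file.
const sketchRoot = mkdtempSync(join(tmpdir(), "qmd-skill-sketch-"));
const sketchSource = join(sketchRoot, "source");
const sketchTarget = join(sketchRoot, "target");
mkdirSync(join(sketchSource, "references"), { recursive: true });
writeFileSync(join(sketchSource, "SKILL.md"), "# full skill text\n", "utf-8");
writeFileSync(join(sketchSource, "references", "mcp-setup.md"), "# QMD MCP Server Setup\n", "utf-8");

installWithStub(sketchSource, sketchTarget, "# QMD - Query Markdown Documents\n\n!`qmd skill show`\n");

// SKILL.md now holds the stub; the reference file is the copied original.
const installedSkill = readFileSync(join(sketchTarget, "SKILL.md"), "utf-8");
const installedReference = readFileSync(join(sketchTarget, "references", "mcp-setup.md"), "utf-8");
console.log(installedSkill.includes("!`qmd skill show`"), installedReference.trim());
```

This is the same property the updated `legacy skill install` test asserts: the stub marker is present, the full-skill heading is absent, and the reference file is intact.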
--- a/test/cli.test.ts
+++ b/test/cli.test.ts
@@ -293,7 +293,7 @@ describe("CLI Skills", () => {
    expect(stdout).not.toContain("This file is a discovery stub");
  });

-  test("legacy skill install writes the canonical skill", async () => {
+  test("legacy skill install writes a qmd skill show bootstrap", async () => {
    const installDir = join(testDir, "skill-install-target");
    await mkdir(installDir, { recursive: true });

@@ -305,8 +305,9 @@ describe("CLI Skills", () => {
    const installedSkillDir = join(installDir, ".agents", "skills", "qmd");
    const installed = readFileSync(join(installedSkillDir, "SKILL.md"), "utf8");
    expect(installed).toContain("# QMD - Query Markdown Documents");
-    expect(installed).toContain("## MCP Tool: `query`");
-    expect(installed).not.toContain("This file is a discovery stub");
+    expect(installed).toContain("!`qmd skill show`");
+    expect(installed).toContain("qmd get");
+    expect(installed).not.toContain("## MCP Tool: `query`");
    expect(readFileSync(join(installedSkillDir, "references", "mcp-setup.md"), "utf8")).toContain("# QMD MCP Server Setup");
  });
 });
@@ -378,7 +379,9 @@ describe("CLI Skill Commands", () => {
    expect(exitCode).toBe(0);

    const skillDir = join(projectDir, ".agents", "skills", "qmd");
-    expect(readFileSync(join(skillDir, "SKILL.md"), "utf-8")).toContain("# QMD - Query Markdown Documents");
+    const installed = readFileSync(join(skillDir, "SKILL.md"), "utf-8");
+    expect(installed).toContain("# QMD - Query Markdown Documents");
+    expect(installed).toContain("!`qmd skill show`");
    expect(existsSync(join(projectDir, ".claude", "skills", "qmd"))).toBe(false);
    expect(stdout).toContain(`✓ Installed QMD skill to ${skillDir}`);
    expect(stdout).toContain("Tip: create a Claude symlink manually");
@@ -396,9 +399,9 @@ describe("CLI Skill Commands", () => {
    const skillDir = join(fakeHome, ".agents", "skills", "qmd");
    const claudeLink = join(fakeHome, ".claude", "skills", "qmd");

-    expect(readFileSync(join(skillDir, "SKILL.md"), "utf-8")).toContain("# QMD - Query Markdown Documents");
+    expect(readFileSync(join(skillDir, "SKILL.md"), "utf-8")).toContain("!`qmd skill show`");
    expect(lstatSync(claudeLink).isSymbolicLink()).toBe(true);
-    expect(readFileSync(join(claudeLink, "SKILL.md"), "utf-8")).toContain("# QMD - Query Markdown Documents");
+    expect(readFileSync(join(claudeLink, "SKILL.md"), "utf-8")).toContain("!`qmd skill show`");
    expect(stdout).toContain(`✓ Installed QMD skill to ${skillDir}`);
    expect(stdout).toContain(`✓ Linked Claude skill at ${claudeLink}`);
  });
@@ -416,7 +419,7 @@ describe("CLI Skill Commands", () => {

    const skillDir = join(fakeHome, ".agents", "skills", "qmd");
    expect(lstatSync(skillDir).isSymbolicLink()).toBe(false);
-    expect(readFileSync(join(skillDir, "SKILL.md"), "utf-8")).toContain("# QMD - Query Markdown Documents");
+    expect(readFileSync(join(skillDir, "SKILL.md"), "utf-8")).toContain("!`qmd skill show`");
    expect(stdout).toContain(`✓ Claude already sees the skill via ${join(fakeHome, ".claude", "skills")}`);
  });

--- a/test/package.test.ts
+++ b/test/package.test.ts
@@ -50,9 +50,18 @@ describe("package grammar distribution", () => {
    expect(pkg.files, "published package files").toContain("skills/");
    const qmdSkill = readFileSync(new URL("skills/qmd/SKILL.md", root), "utf8");
    expect(qmdSkill).toContain("# QMD - Query Markdown Documents");
+    expect(qmdSkill).toContain("## How search works");
    expect(qmdSkill).toContain("## MCP Tool: `query`");
    expect(qmdSkill).not.toContain("This file is a discovery stub");

+    const firstSixtyLines = qmdSkill.split(/\r?\n/).slice(0, 60).join("\n");
+    expect(firstSixtyLines).toContain("Search for candidate documents");
+    expect(firstSixtyLines).toContain("qmd search");
+    expect(firstSixtyLines).toContain('qmd multi-get "#abc123,#def432"');
+    expect(firstSixtyLines).toContain("Retrieved:");
+    expect(firstSixtyLines).toContain("qmd query");
+    expect(firstSixtyLines).toContain("structured queries");
+
    const scriptPath = join(root.pathname, "scripts", "check-package-grammars.mjs");
    const script = readFileSync(scriptPath, "utf8");
    expect(script).toContain("tree-sitter-typescript/tree-sitter-typescript.wasm");
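
Reviewer sketch (not part of the commit): the `firstSixtyLines` assertions above check that the core workflow guidance sits near the top of `SKILL.md`. If more phrases get added, a small helper that reports which ones are missing may give clearer failures than a chain of `toContain` calls. `missingFromHead` is a hypothetical name, not something in the repository.

```typescript
// Hypothetical helper: return the phrases that do NOT appear within the
// first `lineCount` lines of `text`. Uses the same line splitting as the
// test above, so CRLF files behave the same way.
function missingFromHead(text: string, lineCount: number, needles: string[]): string[] {
  const head = text.split(/\r?\n/).slice(0, lineCount).join("\n");
  return needles.filter((needle) => !head.includes(needle));
}

// Placeholder document: "qmd query" only appears on line 4.
const sampleSkill = ["# QMD", "qmd search \"terms\" -n 5", "", "qmd query \"idea\""].join("\n");
console.log(missingFromHead(sampleSkill, 2, ["qmd search", "qmd query"]));
```

In a test, asserting that the returned array is empty would surface every missing phrase in one failure message.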