docs(qmd-skill): structured-query-first; cite docid + lines; no sed

- Make structured `qmd query` with intent:/lex:/vec:/hyde: the default search mode, and emphasize that the caller authors the expansion rather than leaning on the built-in query-expansion model.
- Tell the caller to cite the #docid and exact line numbers now printed by get/multi-get, and to slice files with the :from:count suffix or --from/-l instead of piping through sed/head/tail.
- Document --full-path for handing the on-disk path to editor tools.
- Bump skill version to 2.2.0 and record the behavior changes under ## [Unreleased] in CHANGELOG.md.
- Update the package smoke test that pinned the old 'structured queries' wording to match the new, more specific intro phrasing.
2026-05-28 10:56:13 -07:00 · 2026-05-28 10:56:13 -07:00 · fa8f904a9d
commit fa8f904a9d
parent 41bc3a27d8
3 changed files with 130 additions and 20 deletions
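
For reviewers who want the `:from:count` semantics in one place, here is a minimal TypeScript sketch of the rules the changelog and skill text describe: a trailing `:from` or `:from:count` is split off the path or docid, explicit `--from`/`-l` values win over the suffix, and output lines are numbered from the first line read. The names `parseRangeSuffix`, `resolveRange`, and `sliceNumbered` are hypothetical and only illustrate the documented behavior; this is not qmd's actual implementation, and edge cases (for example, file names that themselves end in `:digits`) are not handled.

```typescript
// Hypothetical illustration of the documented `:from:count` rules.
// None of these names come from the qmd codebase.
interface LineRange {
  target: string;            // path or docid with the suffix removed
  from: number | undefined;  // 1-based first line, if given
  count: number | undefined; // number of lines to read, if given
}

// Split a trailing `:from` or `:from:count` off a path or docid.
// The lazy prefix keeps the `:` inside `qmd://` from being treated as a suffix.
function parseRangeSuffix(spec: string): LineRange {
  const match = /^(.*?):(\d+)(?::(\d+))?$/.exec(spec);
  if (!match) {
    return { target: spec, from: undefined, count: undefined };
  }
  return {
    target: match[1] ?? spec,
    from: Number(match[2]),
    count: match[3] === undefined ? undefined : Number(match[3]),
  };
}

// Explicit flags (modelled here as `from` and `lines`) override the suffix.
function resolveRange(
  spec: string,
  flags: { from?: number; lines?: number } = {},
): LineRange {
  const parsed = parseRangeSuffix(spec);
  return {
    target: parsed.target,
    from: flags.from ?? parsed.from,
    count: flags.lines ?? parsed.count,
  };
}

// Return the requested window with `N: ` prefixes that start at `from`.
function sliceNumbered(text: string, from = 1, count?: number): string[] {
  const lines = text.split("\n");
  const start = Math.max(from, 1) - 1;
  const end = count === undefined ? lines.length : start + count;
  return lines.slice(start, end).map((line, i) => `${start + i + 1}: ${line}`);
}
```

Under these assumptions, `resolveRange("#abc123:5:2", { lines: 1 })` yields a count of 1, matching the changelog note that explicit flags override the suffix, and a `:200:60` suffix covers lines 200 through 259.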
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,32 @@

 ## [Unreleased]

+### Features
+
+- `qmd get` now accepts a `:from:count` suffix on a path or docid (e.g.
+  `qmd get "#abc123:120:40"` reads 40 lines starting at line 120). Explicit
+  `--from`/`-l` flags still override the suffix. The MCP `get` tool accepts the
+  same suffix.
+- `qmd get` and `qmd multi-get` are now **line-numbered by default** and print
+  the document's `#docid` and `qmd://` path in the output header. Disable line
+  numbers with `--no-line-numbers`. The MCP `get`/`multi_get` tools default
+  `lineNumbers` to `true` to match.
+- `qmd multi-get` now includes the `#docid` in every output format
+  (`--md`, `--json`, `--csv`, `--xml`, `--files`, and the default CLI view),
+  consistent with `qmd search`.
+- `qmd get` and `qmd multi-get` accept `--full-path`, which replaces the
+  `qmd://` path + `#docid` with the document's on-disk filesystem path (handy for
+  piping into `Read`/`Edit`/an editor). Falls back to the canonical `qmd://` +
+  docid header when the file no longer exists on disk.
+
+### Docs
+
+- qmd skill: emphasize reading line ranges with `get`'s built-in
+  `:from:count` suffix / `--from`/`-l` flags instead of piping through
+  `sed`/`head`/`tail`; cite the docid and line numbers now present in retrieval
+  output; and author structured `intent:`/`lex:`/`vec:`/`hyde:` queries yourself
+  rather than relying on built-in query expansion.
+
 ## [2.5.2] - 2026-05-22

 ### Fixes
--- a/skills/qmd/SKILL.md
+++ b/skills/qmd/SKILL.md
@@ -5,7 +5,7 @@ license: MIT
 compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
 metadata:
  author: tobi
-  version: "2.1.0"
+  version: "2.2.0"
 allowed-tools: Bash(qmd:*), mcp__qmd__*
 ---

@@ -34,8 +34,13 @@ qmd search "merchant reality support interviews" -n 5
 qmd multi-get "#abc123,#def432" --md
 ```

-For harder searches, use `qmd query` structured queries with `intent:`, `lex:`,
-`vec:`, and `hyde:` fields.
+**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
+fields that you write yourself.** You are a better query expander than the
+built-in model: you know the user's actual goal, the domain vocabulary, and the
+nearby-but-wrong concepts to avoid. Do not just paste the user's words into
+`qmd query "..."` and hope the expansion model guesses right — supply the
+`intent:` and craft the lexical and semantic terms deliberately (see
+[Pick the right search mode](#pick-the-right-search-mode)).

 When reporting what you retrieved, a compact note is enough; do not paste whole
 files unless needed:
@@ -56,28 +61,37 @@ qmd search "cockpit OKR Goodhart" -n 10
 qmd search '"AI Before Headcount"' -c concepts -n 5
 ```

-Use **hybrid semantic search** when the user describes an idea indirectly, uses
-different wording than the source, or needs conceptual recall:
-
-```bash
-qmd query "decision quality depends on surfacing assumptions and context" -n 10
-qmd query --json --explain "metrics as cockpit instruments but not OKRs"
-```
-
-Use **structured queries** for hard searches. They combine exact anchors with
-semantic recall:
+Use **`qmd query` with structured fields** when the user describes an idea
+indirectly, uses different wording than the source, or needs conceptual recall.
+**This is the default mode — write the fields yourself rather than leaning on
+query expansion.** Combine exact anchors with semantic recall:

 ```bash
 qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
 ```

-Structured query fields:
+Structured query fields (you author each one — do not delegate this to the
+expansion model):

-- `intent:` states what you are trying to find and what to avoid.
-- `lex:` uses exact terms, aliases, titles, and rare words.
-- `vec:` paraphrases the idea in natural language.
+- `intent:` states what you are trying to find **and what to avoid**. Always
+  supply this. It steers ranking away from nearby-but-wrong concepts.
+- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
+  in the source. This is your own keyword expansion.
+- `vec:` paraphrases the idea in natural language, in source-like wording.
 - `hyde:` describes the document or answer that would satisfy the request.

+You do not need all four every time, but you should almost always write at least
+`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
+throws away the context only you have and relies on the built-in expander to
+reconstruct it — prefer the structured form.
+
+If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
+use `qmd search`, not bare `qmd query`. To inspect how a structured query ranks:
+
+```bash
+qmd query --json --explain $'intent: ...\nlex: ...\nvec: ...'  # inspect ranking
+```
+
 If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
 better lexical terms.

@@ -87,14 +101,77 @@ Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:

 ```bash
 qmd get "#abc123"
-qmd get qmd://concepts/ai-before-headcount.md --full
+qmd get qmd://concepts/ai-before-headcount.md
 qmd multi-get "#abc123,#def432" --md
 qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --md
 qmd multi-get 'sources/podcast-2025-*.md' -l 80
 ```

 Use `multi-get` when comparing several hits or gathering context across pages.
-Use `--full` when the exact source matters.
+
+### Output is line-numbered and carries the docid — cite both
+
+`get` and `multi-get` are **line-numbered by default** and always print the
+document's `#docid` and `qmd://` path. So `get` output looks like:
+
+```text
+qmd://concepts/note.md  #abc123
+---
+
+1: # Metrics as instruments
+2:
+3: Treat dashboards like cockpit instruments...
+```
+
+Cite the docid and exact line numbers in your answer, and use the numbers to ask
+for the next slice. Pass `--no-line-numbers` only when you need raw content to
+copy verbatim (e.g. reproducing a code block).
+
+When you need to open or edit the underlying file (e.g. hand a path to `Read`,
+`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
+header with the document's on-disk path, falling back to the canonical header if
+the file no longer exists on disk:
+
+```text
+$ qmd get "#abc123" --full-path
+/Users/you/notes/concepts/note.md
+---
+
+1: # Metrics as instruments
+```
+
+### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
+
+`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
+`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
+resolution, virtual-path lookups, line numbering, and the header, and it is
+slower and more error-prone.
+
+The most compact form is a `:from:count` suffix right on the path or docid —
+prefer it:
+
+```bash
+qmd get "#abc123:120:40"                  # 40 lines starting at line 120
+qmd get qmd://concepts/note.md:200:60     # lines 200–259
+qmd get "#abc123:120"                      # from line 120 to end of file
+qmd get "#abc123" --from 120 -l 40         # equivalent, using flags
+```
+
+Suffix and flags:
+
+- `<path>:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
+  for reading around a search hit.**
+- `<path>:<from>` — start at `<from>`, read to end of file.
+- `--from <line>` / `-l <lines>` — flag equivalents. Explicit flags override the
+  suffix, so `... :5:2 -l 1` reads 1 line.
+- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
+
+Wrong: `qmd get "#abc123" | sed -n '120,160p'`
+Right: `qmd get "#abc123:120:40"`
+
+Search results include a `:line` anchor on each hit — feed it straight into
+`qmd get path:line:<n>` to read a window around the match (line numbers in the
+output will start at `line`).

 ## Discover what is indexed

@@ -189,6 +266,12 @@ server configuration.
 ## Pitfalls

 - **Do not stop at snippets.** Fetch documents before making claims.
+- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
+  suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
+  line-numbered; piping breaks docid resolution, the header, and virtual paths.
+- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
+  yourself. A bare `qmd query "user sentence"` discards the context only you
+  have. You expand the query; the model just ranks.
 - **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
  faster and often better.
 - **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and
--- a/test/package.test.ts
+++ b/test/package.test.ts
@@ -60,7 +60,8 @@ describe("package grammar distribution", () => {
    expect(firstSixtyLines).toContain('qmd multi-get "#abc123,#def432"');
    expect(firstSixtyLines).toContain("Retrieved:");
    expect(firstSixtyLines).toContain("qmd query");
-    expect(firstSixtyLines).toContain("structured queries");
+    // The skill must teach structured, self-authored queries near the top.
+    expect(firstSixtyLines).toContain("Default to structured");

    const scriptPath = join(root.pathname, "scripts", "check-package-grammars.mjs");
    const script = readFileSync(scriptPath, "utf8");