docs(qmd-skill): structured-query-first; cite docid + lines; no sed

- Make structured `qmd query` with intent:/lex:/vec:/hyde: the default search
  mode, and emphasize that the caller authors the expansion rather than
  leaning on the built-in query-expansion model.
- Tell the caller to cite the #docid and exact line numbers now printed by
  get/multi-get, and to slice files with the :from:count suffix or --from/-l
  instead of piping through sed/head/tail.
- Document --full-path for handing the on-disk path to editor tools.
- Bump skill version to 2.2.0 and record the behavior changes under
  ## [Unreleased] in CHANGELOG.md.
- Update the package smoke test that pinned the old 'structured queries'
  wording to match the new, more specific intro phrasing.
parent 41bc3a27d8
commit fa8f904a9d
26
CHANGELOG.md
@@ -2,6 +2,32 @@

 ## [Unreleased]

+### Features
+
+- `qmd get` now accepts a `:from:count` suffix on a path or docid (e.g.
+  `qmd get "#abc123:120:40"` reads 40 lines starting at line 120). Explicit
+  `--from`/`-l` flags still override the suffix. The MCP `get` tool accepts the
+  same suffix.
+- `qmd get` and `qmd multi-get` are now **line-numbered by default** and print
+  the document's `#docid` and `qmd://` path in the output header. Disable line
+  numbers with `--no-line-numbers`. The MCP `get`/`multi_get` tools default
+  `lineNumbers` to `true` to match.
+- `qmd multi-get` now includes the `#docid` in every output format
+  (`--md`, `--json`, `--csv`, `--xml`, `--files`, and the default CLI view),
+  consistent with `qmd search`.
+- `qmd get` and `qmd multi-get` accept `--full-path`, which replaces the
+  `qmd://` path + `#docid` with the document's on-disk filesystem path (handy for
+  piping into `Read`/`Edit`/an editor). Falls back to the canonical `qmd://` +
+  docid header when the file no longer exists on disk.
+
+### Docs
+
+- qmd skill: emphasize reading line ranges with `get`'s built-in
+  `:from:count` suffix / `--from`/`-l` flags instead of piping through
+  `sed`/`head`/`tail`; cite the docid and line numbers now present in retrieval
+  output; and author structured `intent:`/`lex:`/`vec:`/`hyde:` queries yourself
+  rather than relying on built-in query expansion.
+
 ## [2.5.2] - 2026-05-22

 ### Fixes

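The `:from:count` suffix and its interaction with `--from`/`-l` can be illustrated with a small TypeScript sketch. This is not qmd's actual parser: `parseLineSuffix`, `resolveRange`, `lastLine`, and the `LineRange` shape are hypothetical names, and the code only mirrors the behavior stated in the changelog entry above (a trailing `:from` or `:from:count`, with explicit flags taking precedence).

```typescript
// Hypothetical sketch of the documented suffix rules; not qmd's real code.
interface LineRange {
  target: string;             // path or docid with the suffix removed
  from: number | undefined;   // 1-based first line to read
  count: number | undefined;  // lines to read; undefined means "to end of file"
}

// Split a trailing ":from" or ":from:count" off a path or docid.
function parseLineSuffix(spec: string): LineRange {
  const match = /^(.*?):(\d+)(?::(\d+))?$/.exec(spec);
  if (!match) return { target: spec, from: undefined, count: undefined };
  return {
    target: match[1],
    from: Number(match[2]),
    count: match[3] === undefined ? undefined : Number(match[3]),
  };
}

// Explicit flags (--from / -l) override whatever the suffix said.
function resolveRange(
  spec: string,
  flags: { from?: number; lines?: number } = {},
): LineRange {
  const parsed = parseLineSuffix(spec);
  return {
    target: parsed.target,
    from: flags.from ?? parsed.from,
    count: flags.lines ?? parsed.count,
  };
}

// Last line covered by a bounded range, or undefined when it is open-ended.
function lastLine(range: LineRange): number | undefined {
  if (range.from === undefined || range.count === undefined) return undefined;
  return range.from + range.count - 1;
}
```

Under these assumptions, `resolveRange("#abc123:120:40")` describes 40 lines starting at line 120, and `resolveRange("#abc123:5:2", { lines: 1 })` reads a single line because the flag wins over the suffix.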
@@ -5,7 +5,7 @@ license: MIT
 compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
 metadata:
   author: tobi
-  version: "2.1.0"
+  version: "2.2.0"
 allowed-tools: Bash(qmd:*), mcp__qmd__*
 ---

@@ -34,8 +34,13 @@ qmd search "merchant reality support interviews" -n 5
 qmd multi-get "#abc123,#def432" --md
 ```

-For harder searches, use `qmd query` structured queries with `intent:`, `lex:`,
-`vec:`, and `hyde:` fields.
+**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
+fields that you write yourself.** You are a better query expander than the
+built-in model: you know the user's actual goal, the domain vocabulary, and the
+nearby-but-wrong concepts to avoid. Do not just paste the user's words into
+`qmd query "..."` and hope the expansion model guesses right — supply the
+`intent:` and craft the lexical and semantic terms deliberately (see
+[Pick the right search mode](#pick-the-right-search-mode)).

 When reporting what you retrieved, a compact note is enough; do not paste whole
 files unless needed:
@@ -56,28 +61,37 @@ qmd search "cockpit OKR Goodhart" -n 10
 qmd search '"AI Before Headcount"' -c concepts -n 5
 ```

-Use **hybrid semantic search** when the user describes an idea indirectly, uses
-different wording than the source, or needs conceptual recall:
-
-```bash
-qmd query "decision quality depends on surfacing assumptions and context" -n 10
-qmd query --json --explain "metrics as cockpit instruments but not OKRs"
-```
-
-Use **structured queries** for hard searches. They combine exact anchors with
-semantic recall:
+Use **`qmd query` with structured fields** when the user describes an idea
+indirectly, uses different wording than the source, or needs conceptual recall.
+**This is the default mode — write the fields yourself rather than leaning on
+query expansion.** Combine exact anchors with semantic recall:

 ```bash
 qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
 ```

-Structured query fields:
+Structured query fields (you author each one — do not delegate this to the
+expansion model):

-- `intent:` states what you are trying to find and what to avoid.
-- `lex:` uses exact terms, aliases, titles, and rare words.
-- `vec:` paraphrases the idea in natural language.
+- `intent:` states what you are trying to find **and what to avoid**. Always
+  supply this. It steers ranking away from nearby-but-wrong concepts.
+- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
+  in the source. This is your own keyword expansion.
+- `vec:` paraphrases the idea in natural language, in source-like wording.
 - `hyde:` describes the document or answer that would satisfy the request.

+You do not need all four every time, but you should almost always write at least
+`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
+throws away the context only you have and relies on the built-in expander to
+reconstruct it — prefer the structured form.
+
+If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
+that is a job for `qmd search`, not bare `qmd query`:
+
+```bash
+qmd query --json --explain $'intent: ...\nlex: ...\nvec: ...' # inspect ranking
+```
+
 If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
 better lexical terms.

@@ -87,14 +101,77 @@ Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:

 ```bash
 qmd get "#abc123"
-qmd get qmd://concepts/ai-before-headcount.md --full
+qmd get qmd://concepts/ai-before-headcount.md
 qmd multi-get "#abc123,#def432" --md
 qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --md
 qmd multi-get 'sources/podcast-2025-*.md' -l 80
 ```

 Use `multi-get` when comparing several hits or gathering context across pages.
-Use `--full` when the exact source matters.

+### Output is line-numbered and carries the docid — cite both
+
+`get` and `multi-get` are **line-numbered by default** and always print the
+document's `#docid` and `qmd://` path. So `get` output looks like:
+
+```text
+qmd://concepts/note.md #abc123
+---
+
+1: # Metrics as instruments
+2:
+3: Treat dashboards like cockpit instruments...
+```
+
+Cite the docid and exact line numbers in your answer, and use the numbers to ask
+for the next slice. Pass `--no-line-numbers` only when you need raw content to
+copy verbatim (e.g. reproducing a code block).
+
+When you need to open or edit the underlying file (e.g. hand a path to `Read`,
+`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
+header with the document's on-disk path, falling back to the canonical header if
+the file no longer exists on disk:
+
+```text
+$ qmd get "#abc123" --full-path
+/Users/you/notes/concepts/note.md
+---
+
+1: # Metrics as instruments
+```
+
+### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
+
+`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
+`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
+resolution, virtual-path lookups, line numbering, and the header, and it is
+slower and more error-prone.
+
+The most compact form is a `:from:count` suffix right on the path or docid —
+prefer it:
+
+```bash
+qmd get "#abc123:120:40" # 40 lines starting at line 120
+qmd get qmd://concepts/note.md:200:60 # lines 200–259
+qmd get "#abc123:120" # from line 120 to end of file
+qmd get "#abc123" --from 120 -l 40 # equivalent, using flags
+```
+
+Suffix and flags:
+
+- `<path>:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
+  for reading around a search hit.**
+- `<path>:<from>` — start at `<from>`, read to end of file.
+- `--from <line>` / `-l <lines>` — flag equivalents. Explicit flags override the
+  suffix, so `... :5:2 -l 1` reads 1 line.
+- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
+
+Wrong: `qmd get "#abc123" | sed -n '120,160p'`
+Right: `qmd get "#abc123:120:40"`
+
+Search results include a `:line` anchor on each hit — feed it straight into
+`qmd get path:line:<n>` to read a window around the match (line numbers in the
+output will start at `line`).
+
 ## Discover what is indexed

@@ -189,6 +266,12 @@ server configuration.
 ## Pitfalls

 - **Do not stop at snippets.** Fetch documents before making claims.
+- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
+  suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
+  line-numbered; piping breaks docid resolution, the header, and virtual paths.
+- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
+  yourself. A bare `qmd query "user sentence"` discards the context only you
+  have. You expand the query; the model just ranks.
 - **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
   faster and often better.
 - **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and

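The skill's guidance on self-authored structured queries can be sketched as a small helper that assembles the newline-separated text passed to `qmd query`. This is an illustration only: `StructuredQuery` and `buildStructuredQuery` are hypothetical names that do not come from qmd. The field order follows the example in the skill. The guard is stricter than the skill's "almost always" wording and is an assumption of this sketch.

```typescript
// Hypothetical helper; the field names come from the skill, everything else
// (type name, function name, the guard) is an assumption for illustration.
interface StructuredQuery {
  intent: string; // what to find and what to avoid
  lex?: string;   // exact terms, aliases, titles, rare words
  vec?: string;   // natural-language paraphrase
  hyde?: string;  // description of the document that would answer the request
}

const FIELD_ORDER = ["intent", "lex", "vec", "hyde"] as const;

function buildStructuredQuery(q: StructuredQuery): string {
  if (!q.intent.trim()) {
    throw new Error("intent: is required");
  }
  // Assumption: enforce "intent: plus one of lex:/vec:" as a hard rule.
  if (!q.lex?.trim() && !q.vec?.trim()) {
    throw new Error("write intent: plus at least one of lex: or vec:");
  }
  const lines: string[] = [];
  for (const field of FIELD_ORDER) {
    // Collapse internal whitespace so each field stays on a single line.
    const value = q[field]?.replace(/\s+/g, " ").trim();
    if (value) lines.push(`${field}: ${value}`);
  }
  return lines.join("\n");
}
```

The returned string corresponds to the single argument written as `$'intent: ...\nlex: ...\nvec: ...'` in the skill's shell examples.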
@@ -60,7 +60,8 @@ describe("package grammar distribution", () => {
     expect(firstSixtyLines).toContain('qmd multi-get "#abc123,#def432"');
     expect(firstSixtyLines).toContain("Retrieved:");
     expect(firstSixtyLines).toContain("qmd query");
-    expect(firstSixtyLines).toContain("structured queries");
+    // The skill must teach structured, self-authored queries near the top.
+    expect(firstSixtyLines).toContain("Default to structured");

     const scriptPath = join(root.pathname, "scripts", "check-package-grammars.mjs");
     const script = readFileSync(scriptPath, "utf8");