Ryan Malia 7488fe8094 docs: CLI reference, collection-flag semantics, MCP params, bench, new commands

Documents previously-undocumented surface area surfaced by onboarding feedback
and the bench discoverability report:

- README: collection filtering (-c semantics), collection show/include/exclude/
  update-cmd, --intent/--no-rerank/-C/--full-path, --format <kind> (legacy
  output booleans noted as aliases), vector-search/deep-search aliases, embed
  memory flags, a sample --explain trace, MCP tool parameter reference, qmd
  doctor/init, get :from:count + --no-line-numbers, and a Benchmarking section
  for qmd bench.
- README: removed the misleading `qmd update --pull` example; --pull is parsed
  but never consumed, so it points to `qmd collection update-cmd` (the real
  per-collection pre-reindex mechanism) instead.
- docs/SYNTAX.md: drop the non-existent `q` MCP parameter (the query tool/REST
  endpoint accept only `searches`); add a Scoping section.
- server.ts: buildInstructions now advertises the plural `collections` parameter
  to match the schema (singular was silently stripped, yielding unscoped
  results), and the `get` instruction documents the full file.md:from:count
  range suffix instead of only file.md:100.

Refs #25, #181, #217, #372, #520, #576

2026-06-07 13:37:41 -07:00

QMD Query Syntax

QMD queries are structured documents with typed sub-queries. Each line specifies a search type and query text.

Grammar

query          = expand_query | query_document ;
expand_query   = text | explicit_expand ;
explicit_expand= "expand:" text ;
query_document = [ intent_line ] typed_line { typed_line } ;
intent_line    = "intent:" text newline ;
typed_line     = type ":" text newline ;
type           = "lex" | "vec" | "hyde" ;
text           = quoted_phrase | plain_text ;
quoted_phrase  = '"' { character } '"' ;
plain_text     = { character } ;
newline        = "\n" ;
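
As a rough illustration of the grammar above, the following Python sketch classifies a raw query string as either an expand query or a query document. It is not QMD's parser: `ParsedQuery` and `parse_query` are hypothetical names, and the sketch accepts an `intent:` line at any position, which is more lenient than the grammar.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

SEARCH_TYPES = ("lex", "vec", "hyde")
KNOWN_PREFIXES = SEARCH_TYPES + ("intent", "expand")


@dataclass
class ParsedQuery:
    """Hypothetical result type; not part of QMD."""
    kind: str                                   # "expand" or "document"
    text: str = ""                              # set for expand queries only
    intent: Optional[str] = None
    searches: List[Dict[str, str]] = field(default_factory=list)


def split_prefix(line: str) -> Tuple[Optional[str], str]:
    """Split 'type: text' into (type, text); untyped lines yield (None, line)."""
    head, sep, rest = line.partition(":")
    if sep and head in KNOWN_PREFIXES:
        return head, rest.strip()
    return None, line


def parse_query(raw: str) -> ParsedQuery:
    # Empty lines are ignored and surrounding whitespace is trimmed.
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty query")

    prefix, text = split_prefix(lines[0])
    if len(lines) == 1 and prefix in (None, "expand"):
        return ParsedQuery(kind="expand", text=text)

    parsed = ParsedQuery(kind="document")
    for line in lines:
        prefix, text = split_prefix(line)
        if prefix in SEARCH_TYPES:
            parsed.searches.append({"type": prefix, "query": text})
        elif prefix == "intent":
            if parsed.intent is not None:
                raise ValueError("at most one intent: line is allowed")
            parsed.intent = text
        else:
            # Untyped lines and expand: are not allowed inside a document.
            raise ValueError(f"unexpected line in query document: {line!r}")
    if not parsed.searches:
        raise ValueError("intent: cannot appear alone")
    return parsed
```

The `searches` list deliberately uses the same `type`/`query` keys as the MCP/HTTP payload, so a parsed document can be forwarded without reshaping.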

Query Types

Type	Method	Description
`lex`	BM25	Keyword search with exact matching
`vec`	Vector	Semantic similarity search
`hyde`	Vector	Hypothetical document embedding

Default Behavior

A QMD query is either a single expand query or a multi-line query document. Any single-line query with no prefix is treated as an expand query and passed to the expansion model, which emits lex, vec, and hyde variants automatically.

# These are equivalent and cannot be combined with typed lines:
how does authentication work
expand: how does authentication work

Lex Query Syntax

Lex queries support special syntax for precise keyword matching:

lex_query   = { lex_term } ;
lex_term    = negation | phrase | word ;
negation    = "-" ( phrase | word ) ;
phrase      = '"' { character } '"' ;
word        = { letter | digit | "'" } ;

Syntax	Meaning	Example
`word`	Prefix match	`perf` matches "performance"
`"phrase"`	Exact phrase	`"rate limiter"`
`-word`	Exclude term	`-sports`
`-"phrase"`	Exclude phrase	`-"test data"`

Examples

lex: CAP theorem consistency
lex: "machine learning" -"deep learning"
lex: auth -oauth -saml
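
To make the lex term grammar concrete, here is a minimal Python tokenizer sketch. It is an assumption-laden illustration rather than QMD's implementation: `parse_lex` is a hypothetical name, and words are limited to ASCII letters, digits, and apostrophes.

```python
import re
from typing import List, Tuple

# One lex_term: an optional "-" followed by a quoted phrase or a bare word.
# Assumption: "letter" and "digit" are restricted to ASCII in this sketch.
LEX_TERM = re.compile(r"""(-?)(?:"([^"]*)"|([A-Za-z0-9']+))""")


def parse_lex(query: str) -> List[Tuple[str, str, bool]]:
    """Return (kind, text, negated) for each term in a lex query."""
    terms = []
    for match in LEX_TERM.finditer(query):
        negated = match.group(1) == "-"
        if match.group(2) is not None:
            terms.append(("phrase", match.group(2), negated))
        else:
            terms.append(("word", match.group(3), negated))
    return terms
```

For the second example above, `parse_lex('"machine learning" -"deep learning"')` yields one positive phrase and one negated phrase.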

Vec Query Syntax

Vec queries are natural language questions. No special syntax — just write what you're looking for.

vec: how does the rate limiter handle burst traffic
vec: what is the tradeoff between consistency and availability

Hyde Query Syntax

Hyde queries are hypothetical answer passages (50-100 words). Write what you expect the answer to look like.

hyde: The rate limiter uses a sliding window algorithm with a 60-second window. When a client exceeds 100 requests per minute, subsequent requests return 429 Too Many Requests.

Multi-Line Queries

Combine multiple query types for best results. The first query line gets 2x weight when results are fused.

lex: rate limiter algorithm
vec: how does rate limiting work in the API
hyde: The API implements rate limiting using a token bucket algorithm...
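
This document does not name the fusion method, so the sketch below assumes weighted reciprocal rank fusion purely to show what "2x weight for the first query" could mean. The function name `fuse` and the constant `k = 60` are illustrative assumptions, not QMD internals.

```python
from collections import defaultdict
from typing import Dict, List


def fuse(ranked_lists: List[List[str]], k: int = 60) -> List[str]:
    """Merge per-query rankings; the first list counts double.

    Assumption: reciprocal rank fusion, score = weight / (k + rank).
    """
    scores: Dict[str, float] = defaultdict(float)
    for index, ranking in enumerate(ranked_lists):
        weight = 2.0 if index == 0 else 1.0
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Under this assumption, the order of lines in a query document matters: a document that ranks first for the leading query outranks one that ranks first only for a later query.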

Expand Queries

An expand query stands alone; it's not mixed with typed lines. You can either rely on the default untyped form or add the explicit expand: prefix:

expand: error handling best practices
# equivalent
error handling best practices

Both forms call the local query expansion model, which generates lex, vec, and hyde variations automatically.

Intent

An optional intent: line provides background context to disambiguate queries. It steers query expansion, reranking, and snippet extraction but does not search on its own.

- At most one intent: line per query document
- intent: cannot appear alone; at least one lex:, vec:, or hyde: line is required
- Intent is also available via the --intent CLI flag or MCP intent parameter

intent: web page load times and Core Web Vitals
lex: performance
vec: how to improve performance

Without intent, "performance" is ambiguous (web-perf? team health? fitness?). With intent, the search pipeline preferentially selects and ranks web-performance content.

Constraints

- Top-level query must be either a standalone expand query or a multi-line document
- Query documents allow only lex, vec, hyde, and intent typed lines (no expand: inside)
- lex syntax (-term, "phrase") only works in lex queries
- At most one intent: line per query document; cannot appear alone
- Empty lines are ignored
- Leading/trailing whitespace is trimmed

Scoping

Restrict queries to specific collections with -c (CLI) or collections (MCP/SDK):

# CLI — by collection name (see `qmd collection list`)
qmd query -c docs "how does auth work"
qmd query -c docs -c notes $'lex: auth\nvec: authentication flow'

For MCP / HTTP, pass a plural collections array (OR match):

{ "searches": [ { "type": "lex", "query": "auth" } ], "collections": ["docs", "notes"] }

-c/collections matches by collection name and works from any directory. Multiple values are OR-combined. Without scoping, all default-included collections are searched; collections marked excluded (qmd collection exclude <name>) are skipped unless explicitly named. In MCP the parameter is the plural collections array — a singular collection is silently ignored.
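
The scoping rules above can be modeled in a few lines of Python. This is a sketch of the described behavior only: `resolve_collections` is a hypothetical helper, the name-to-excluded mapping is an assumed data shape, and unknown requested names are simply dropped here because the document does not say how QMD treats them.

```python
from typing import Dict, List, Optional


def resolve_collections(
    all_collections: Dict[str, bool],
    requested: Optional[List[str]] = None,
) -> List[str]:
    """Return the collection names a query would search.

    all_collections maps collection name -> True if marked excluded.
    """
    if requested:
        wanted = set(requested)
        # Explicitly named collections are searched even if excluded.
        return [name for name in all_collections if name in wanted]
    # Unscoped: search only default-included collections.
    return [name for name, excluded in all_collections.items() if not excluded]
```

With `{"docs": False, "notes": False, "archive": True}`, an unscoped query covers `docs` and `notes`, while naming `archive` explicitly brings it back in.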

MCP/HTTP API

The query tool and the REST /query endpoint accept a structured query with a searches array. There is no q string parameter; searches is required:

{
  "searches": [
    { "type": "lex", "query": "CAP theorem" },
    { "type": "vec", "query": "consistency vs availability" }
  ],
  "collections": ["docs"],
  "limit": 10
}

With intent:

{
  "searches": [
    { "type": "lex", "query": "performance" }
  ],
  "intent": "web page load times and Core Web Vitals"
}
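
A client might assemble this request body as in the Python sketch below. The field names (`searches`, `collections`, `intent`, `limit`) come from the examples above; `build_query_payload` is a hypothetical helper, and the transport (MCP call or HTTP request) is intentionally left out because its details are not specified here.

```python
import json
from typing import List, Optional, Sequence, Tuple

SEARCH_TYPES = ("lex", "vec", "hyde")


def build_query_payload(
    searches: Sequence[Tuple[str, str]],
    collections: Optional[List[str]] = None,
    intent: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Serialize (type, query) pairs into the structured query JSON."""
    if not searches:
        raise ValueError("searches is required")
    for search_type, _ in searches:
        if search_type not in SEARCH_TYPES:
            raise ValueError(f"unknown search type: {search_type!r}")

    payload = {"searches": [{"type": t, "query": q} for t, q in searches]}
    if collections:
        # Plural key: a singular "collection" would be silently ignored.
        payload["collections"] = list(collections)
    if intent:
        payload["intent"] = intent
    if limit is not None:
        payload["limit"] = limit
    return json.dumps(payload)
```

Optional fields are omitted rather than sent as null, which keeps the body identical in shape to the hand-written examples.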

CLI

# Single line (implicit expand)
qmd query "how does auth work"

# Multi-line with types
qmd query $'lex: auth token\nvec: how does authentication work'

# Structured
qmd query $'lex: keywords\nvec: question\nhyde: hypothetical answer...'

# With intent (inline)
qmd query $'intent: web performance and latency\nlex: performance\nvec: how to improve performance'

# With intent (flag)
qmd query --intent "web performance and latency" "performance"
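
When calling the CLI from a script, the same commands can be built as an argument list. The sketch below uses only the `-c` and `--intent` flags shown in this document; `qmd_query_argv` is a hypothetical helper, and it only constructs the arguments without running anything.

```python
from typing import List, Optional, Sequence


def qmd_query_argv(
    query: str,
    collections: Sequence[str] = (),
    intent: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a `qmd query` invocation."""
    argv = ["qmd", "query"]
    for name in collections:
        argv += ["-c", name]          # repeated -c values are OR-combined
    if intent:
        argv += ["--intent", intent]
    argv.append(query)                # multi-line documents stay one argument
    return argv
```

Passing the resulting list to `subprocess.run` (with `qmd` installed) avoids the shell's `$'...'` quoting, since the newlines travel inside a single argument.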
