OpenClaw gateway turns are async: chat.send returns a runId quickly, and the app
then polls tasks.get. Previously every tasks.get re-queried the gateway, so a WS
blip or reconnect that lost the gateway's in-memory run state surfaced as
not_found / socket_closed. The already-finished result was lost, and the client
either hard-failed or polled forever.
Make tasks.get resilient by leaning on the per-session store (s.sessions),
whose lifetime is independent of the bridge<->gateway WebSocket:
- T8: cache a gateway-confirmed terminal result (final client-facing shape, after
download-URL decoration + inline-content stripping) into sess.lastResult and
serve it on subsequent polls, so a later gateway not_found cannot lose it.
- T7: when the gateway can't confirm (unavailable / socket closed / not_found) but
the run is still within budget, synthesize a running handle so the client keeps
polling across a transient blip — run tracking decoupled from WS lifetime.
- T9: when the run is past its DeadlineAt and the gateway still can't confirm,
return a deterministic `interrupted` terminal (OPENCLAW_RUN_DEADLINE_EXCEEDED).
Correctness guards:
- startOpenClawGatewayTask resets State/ProgressTerminal when a session is reused
for a new turn, so a prior turn's terminal can't be mis-served for a new runId.
- cache lookups verify the cached runId matches the requested runId (defense in depth).
Design note: T7 is handled at the tasks.get layer (re-correlate by runId via the
durable session store) rather than rewiring gatewayruntime's pending map — lower
risk, equivalent effect. A killed in-flight request surfaces as a gateway error
that the new fallback absorbs. T9 only force-terminates when the gateway is
unconfirmed, never when it explicitly reports running (avoids killing legit long
runs; the client-side deadline T3 covers that case).
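
Illustrative sketch of the T7/T8/T9 decision order described above. The types
and names (taskResult, session, resolveTask, errUnconfirmed) are hypothetical
simplifications, not the actual bridge code; only sess.lastResult, DeadlineAt,
and OPENCLAW_RUN_DEADLINE_EXCEEDED come from this message. It assumes the cache
is consulted before the gateway answer and that DeadlineAt is always set.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical, simplified shapes for illustration only.
type taskResult struct {
	RunID  string
	Status string // "running", "completed", "interrupted", ...
	Code   string
}

type session struct {
	RunID      string
	DeadlineAt time.Time
	lastResult *taskResult // cached gateway-confirmed terminal (T8)
}

var errUnconfirmed = errors.New("gateway could not confirm run")

func isTerminal(status string) bool { return status != "" && status != "running" }

// resolveTask decides what a poll for runID returns, given the gateway's
// answer (gw, gwErr) and the durable session entry.
func resolveTask(sess *session, runID string, gw *taskResult, gwErr error, now time.Time) (*taskResult, error) {
	if sess == nil {
		return gw, gwErr // no session: pass the gateway answer through
	}
	// T8: serve a cached terminal, but only for the matching runId.
	if c := sess.lastResult; c != nil && c.RunID == runID {
		return c, nil
	}
	if gwErr == nil && gw != nil {
		// Gateway confirmed. Cache terminals; an explicit "running" is
		// returned as-is and never force-terminated here.
		if isTerminal(gw.Status) && gw.RunID == runID && sess.RunID == runID {
			sess.lastResult = gw
		}
		return gw, nil
	}
	if gwErr == nil {
		gwErr = errUnconfirmed
	}
	if sess.RunID != runID {
		return nil, gwErr // session tracks a different run: no fallback
	}
	if now.Before(sess.DeadlineAt) {
		// T7: within budget, keep the client polling.
		return &taskResult{RunID: runID, Status: "running"}, nil
	}
	// T9: past the deadline and still unconfirmed.
	return &taskResult{RunID: runID, Status: "interrupted", Code: "OPENCLAW_RUN_DEADLINE_EXCEEDED"}, nil
}

func main() {
	now := time.Now()
	sess := &session{RunID: "run-1", DeadlineAt: now.Add(time.Minute)}

	r, _ := resolveTask(sess, "run-1", nil, errUnconfirmed, now)
	fmt.Println(r.Status) // running (T7)

	r, _ = resolveTask(sess, "run-1", &taskResult{RunID: "run-1", Status: "completed"}, nil, now)
	fmt.Println(r.Status) // completed, now cached

	r, _ = resolveTask(sess, "run-1", nil, errUnconfirmed, now.Add(time.Hour))
	fmt.Println(r.Status) // completed (T8 cache wins over T9)
}
```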
Tests: internal/acp/openclaw_run_registry_test.go (terminal detection, within-budget
keep-polling, past-deadline interrupt, cache hit/replay, cross-runId isolation,
no-session not_found). go vet + full acp package green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A gateway that doesn't implement xworkmate.session.prepare returns an
"unknown method: xworkmate.session.prepare" error. isOpenClawUnknownMethodError
gated on a string code allowlist {"", INVALID_REQUEST, METHOD_NOT_FOUND}, but real
gateways send a numeric JSON-RPC code (e.g. -32002 / -32601) which shared.StringArg
stringifies to "-32002". The matcher then returned false, so the graceful fallback
(openClawFallbackSessionPreparePayload) never fired and every turn hard-failed with
"-32002: unknown method: xworkmate.session.prepare".
Match on the unambiguous message ("unknown method" + the method name) instead of the
stringified numeric code. Add a regression test covering numeric codes and guarding
against swallowing unrelated errors / other method names.
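
Minimal sketch of message-based matching as described above. The function name
and signature (isUnknownMethodError taking the message and method name) are
assumptions for illustration; the real isOpenClawUnknownMethodError may take a
different error shape.

```go
package main

import (
	"fmt"
	"strings"
)

// isUnknownMethodError reports whether msg says the gateway does not
// implement the given method. It deliberately ignores the error code, which
// may arrive as a string ("METHOD_NOT_FOUND") or a stringified number ("-32002").
func isUnknownMethodError(msg, method string) bool {
	method = strings.ToLower(strings.TrimSpace(method))
	if method == "" {
		return false // never match on an empty method name
	}
	m := strings.ToLower(msg)
	return strings.Contains(m, "unknown method") && strings.Contains(m, method)
}

func main() {
	const method = "xworkmate.session.prepare"
	fmt.Println(isUnknownMethodError("-32002: unknown method: xworkmate.session.prepare", method)) // true
	fmt.Println(isUnknownMethodError("unknown method: chat.send", method))                         // false
	fmt.Println(isUnknownMethodError("-32002: session store unavailable", method))                 // false
}
```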
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The runtime-release matrix builds linux+darwin × amd64+arm64, but each job
wrote its checksum to SHA256SUMS-<arch> (arch only). The linux/<arch> and
darwin/<arch> jobs therefore emitted the same filename, and the two files
overwrote each other under the publish job's `merge-multiple: true` download. The merged
SHA256SUMS ended up with only 2 of the 4 platforms, so consumers of the
missing tarballs (notably xworkmate-bridge-linux-arm64.tar.gz) failed with
"missing checksum" — breaking the console offline arm64 package build.
Name the per-job file SHA256SUMS-<os>-<arch> so all four are unique and the
merged SHA256SUMS lists every published tarball.
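
The collision can be modelled in a few lines. This is only an illustration of
the naming problem, not the workflow itself (which is CI configuration); the
target type and uniqueNames helper are hypothetical.

```go
package main

import "fmt"

type target struct{ OS, Arch string }

// uniqueNames returns how many distinct filenames a naming scheme yields
// across the build matrix. Files sharing a name overwrite each other when
// all artifacts are merged into one directory.
func uniqueNames(matrix []target, name func(target) string) int {
	seen := map[string]struct{}{}
	for _, t := range matrix {
		seen[name(t)] = struct{}{}
	}
	return len(seen)
}

func main() {
	matrix := []target{
		{"linux", "amd64"}, {"linux", "arm64"},
		{"darwin", "amd64"}, {"darwin", "arm64"},
	}
	archOnly := func(t target) string { return "SHA256SUMS-" + t.Arch }
	osArch := func(t target) string { return "SHA256SUMS-" + t.OS + "-" + t.Arch }

	fmt.Println(uniqueNames(matrix, archOnly)) // 2: two checksum files are lost
	fmt.Println(uniqueNames(matrix, osArch))   // 4: one per published tarball
}
```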
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add ExpectedFileCounts field to openClawArtifactContract to support per-extension file count validation
- Add normalizeOpenClawArtifactExtCountMap and openClawPositiveInt helpers
- Propagate expectedFileCountByExtension from contract/metadata/xworkmateArtifactConstraints
- Replace the hard-coded 2-minute chat timeout with openClawAgentWaitTimeout so the timeout is dynamic
- Add test coverage for normalize result and web contract
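
Rough sketch of what per-extension count normalization could look like. The
rules here (lowercase keys, strip a leading dot, keep only positive whole
numbers) and the names normalizeExtCounts / positiveInt are assumptions for
illustration; the actual normalizeOpenClawArtifactExtCountMap and
openClawPositiveInt helpers may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// positiveInt accepts the numeric types a decoded JSON payload commonly
// carries and returns the value only if it is a whole number greater than 0.
func positiveInt(v any) (int, bool) {
	switch n := v.(type) {
	case int:
		if n > 0 {
			return n, true
		}
	case int64:
		if n > 0 {
			return int(n), true
		}
	case float64: // encoding/json decodes numbers into float64
		if n > 0 && n == float64(int(n)) {
			return int(n), true
		}
	}
	return 0, false
}

// normalizeExtCounts canonicalizes extension keys and drops invalid entries.
// If two raw keys normalize to the same extension, which one wins is
// unspecified in this sketch.
func normalizeExtCounts(raw map[string]any) map[string]int {
	out := map[string]int{}
	for ext, v := range raw {
		key := strings.ToLower(strings.TrimPrefix(strings.TrimSpace(ext), "."))
		n, ok := positiveInt(v)
		if key == "" || !ok {
			continue
		}
		out[key] = n
	}
	return out
}

func main() {
	got := normalizeExtCounts(map[string]any{
		".PDF": float64(2), // JSON-style number
		"md":   1,
		"png":  0,     // dropped: not positive
		"csv":  "two", // dropped: not a number
	})
	fmt.Println(got["pdf"], got["md"], len(got)) // 2 1 2
}
```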