- Stop checking out the old private mono-repo `ai-workspace-infra`.
- Check out the split public repositories `ai-workspace-infra/playbooks` and `ai-workspace-infra/iac_modules` separately.
- Remove the `CODEX_GITHUB_PERSONAL_ACCESS_TOKEN` (`INFRA_REPO_TOKEN`) dependency from Vault, as it is no longer needed for public repos.
Documents the YAML->generate.py->terraform->cmdb.json->ansible flow, the FQDN
inventory_hostname contract, the two execution models, the Vault-OIDC pipeline,
the non-empty/fail-fast checks, and the key fixes that make it work end to end.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
DEEPSEEK/NVIDIA/OLLAMA_API_KEY live in kv/data/openclaw (not CICD); vault-action
reads them from that path in the same step. Policy grants read on both
kv/data/CICD and kv/data/openclaw.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- VAULT_KV -> kv/data/CICD (shared CICD secrets), map existing keys to outputs:
CODEX_GITHUB_PERSONAL_ACCESS_TOKEN->INFRA_REPO_TOKEN,
SSH_PRIVATE_DEPLOY_KEY[_B64]->ANSIBLE_SSH_KEY[_B64],
CLOUDFLARE_DNS_API_TOKEN direct; VULTR_API_KEY/LLM keys same name.
- docs: policy reads kv/data/CICD; field table maps existing keys; note the
three LLM keys still need to be added to kv/CICD, and SSH_PUBLIC_DEPLOY_KEY
must match hosts.yaml.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- deploy job: read ANSIBLE_SSH_KEY_B64 (preferred) + ANSIBLE_SSH_KEY (fallback)
from Vault, decode and write the key to ~/.ssh/id_deploy, then self-check it
with ssh-keygen -y. This matches the org SSH-deploy runbook (avoids
multiline-key libcrypto errors).
- docs/operations/vault-github-actions.md: full Vault role/policy/jwt/KV setup
for github-actions-xworkspace-console, mirroring the existing org records.
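
A minimal sketch of the deploy-job key handling described above, not the
workflow's actual step. The function names `resolve_deploy_key` and
`write_deploy_key` are hypothetical. Only the two variable names, the
~/.ssh/id_deploy path and the `ssh-keygen -y` check come from this change.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the private key, preferring the base64 variable
# and falling back to the raw multiline variable.
resolve_deploy_key() {
  if [ -n "${ANSIBLE_SSH_KEY_B64:-}" ]; then
    printf '%s' "$ANSIBLE_SSH_KEY_B64" | base64 -d
  elif [ -n "${ANSIBLE_SSH_KEY:-}" ]; then
    printf '%s\n' "$ANSIBLE_SSH_KEY"
  else
    echo "no deploy key provided" >&2
    return 1
  fi
}

# Hypothetical helper: write the key with owner-only permissions, then ask
# ssh-keygen to derive the public key, which fails on a malformed key.
write_deploy_key() {
  local dest="${1:-$HOME/.ssh/id_deploy}"
  mkdir -p "$(dirname "$dest")"
  (umask 077; resolve_deploy_key > "$dest")
  ssh-keygen -y -f "$dest" >/dev/null
}
```

Calling `write_deploy_key` with no argument targets ~/.ssh/id_deploy. A
non-zero exit from `ssh-keygen -y` stops the job before any ssh attempt.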
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Expand the all-in-one setup guide (zh+en) into a full reference of the
bootstrap script's supported options, grouped by purpose: subcommands
(uninstall/--purge), public-exposure & security, unified auth-token chain,
runtime modes, offline package, performance/locks, source/version overrides.
Fix the inaccurate TOKEN var -> AI_WORKSPACE_AUTH_TOKEN (the real precedence
chain). Sourced from scripts/setup-ai-workspace-all-in-one.sh.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- deploy-ai-workspace-iac.yaml: deploy job now ssh-es to each host and runs
the official curl|bash bootstrap locally (host-side ansible -c local,
offline-accelerated), instead of running all-in-one from the runner (which
breaks on roles/agent_skills delegate_to: localhost). provision job kept as
the batch-provision mode.
- docs/operations: record final console fix (local python static backend),
caddy/public-access architecture, and debian13/ubuntu26.04/macOS verification.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Records the IaC->inventory->deploy linkage, offline-package linkage
verification, the local-on-host execution finding, the 5 fixes applied to
playbooks, and the remaining console static-serve + pipeline TODOs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document the six macOS issues found and fixed during end-to-end
verification of the all-in-one install: litellm dependency version-probe
SyntaxError (TC-028), prisma generator PATH (TC-029), QMD plist undefined
nodejs_version (TC-030), QMD better-sqlite3 Node ABI mismatch (TC-031),
XFCE/XRDP apt-on-macOS (TC-032), and litellm DATABASE_URL password
percent-encoding / P1013 (TC-033), each with its playbooks commit. Update
the fix-dimension summary and the runtime delivery plan status.
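
For TC-033, a sketch of percent-encoding a password before embedding it in a
connection URL. This is an illustration only: the playbooks fix may be
implemented differently, and the `urlencode` name plus the user, host, port
and database in the usage note are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Percent-encode every character outside the RFC 3986 unreserved set.
# Intended for ASCII input; LC_ALL=C keeps the character ranges ASCII-only.
urlencode() {
  local LC_ALL=C s="$1" out="" c i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [A-Za-z0-9._~-]) out+="$c" ;;
      *) out+="$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s' "$out"
}
```

Usage, with placeholder values:
`DATABASE_URL="postgresql://litellm:$(urlencode "$pw")@127.0.0.1:5432/litellm"`.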
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`uninstall` / `uninstall --purge` previously removed services and (on
purge) `rm -rf`'d a hand-maintained list of paths with no output, so users
could not see what would be — or had been — deleted (TC-MAC-026).
Add a pre-flight `print_uninstall_summary` that lists the apps/services to
be removed (launchd agents on macOS; systemd units + docker containers on
Linux) and, when --purge is set, every target path with its current
[present]/[absent] status. Centralize the purge paths into a single
source-of-truth inventory and route deletions through a `purge_path`
helper that prints `removed:` / `absent (skipped):` per path. Document the
subcommands in the usage header. Behavior is otherwise unchanged.
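
The path-reporting part of this change can be sketched roughly as below. The
inventory entries and the `print_purge_targets` name are hypothetical, and the
bodies are assumptions. Only the `purge_path` name and the `removed:`,
`absent (skipped):` and [present]/[absent] strings come from this change.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical single source-of-truth inventory of purge targets.
PURGE_PATHS=(
  "${TMPDIR:-/tmp}/example-workspace-cache"
  "${TMPDIR:-/tmp}/example-workspace-state"
)

# Pre-flight: list each target with its current status; deletes nothing.
print_purge_targets() {
  local p
  for p in "${PURGE_PATHS[@]}"; do
    if [ -e "$p" ] || [ -L "$p" ]; then
      printf '  [present] %s\n' "$p"
    else
      printf '  [absent]  %s\n' "$p"
    fi
  done
}

# Delete one path and report what happened.
purge_path() {
  local p="$1"
  if [ -e "$p" ] || [ -L "$p" ]; then
    rm -rf -- "$p"
    printf 'removed: %s\n' "$p"
  else
    printf 'absent (skipped): %s\n' "$p"
  fi
}
```

Because the summary and the deletion loop read the same array, the list shown
to the user cannot drift from the list that is actually removed.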
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add patch_playbook_postgres_macos() to rewrite the postgres macos.yml install
from the community.general.homebrew module (which can select a crashing stale
Intel Homebrew) to a brew command using the PATH brew, matching the playbooks
repo fix. Documents TC-MAC-018.
postgresql_deploy_mode defaults to compose (Docker) and the admin password is
generated via a /root password-file lookup, both of which fail on a native
macOS deploy (no Docker, /root not writable). The role already ships a native
path (macos.yml, Homebrew postgresql@16). In the script's Darwin block, set
postgresql_deploy_mode=native and pass postgresql_admin_password directly
(highest-precedence extra-var, bypassing the /root lookup). Linux unchanged.
Documents TC-MAC-017.
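
A sketch of what such a Darwin block might look like, assuming the script
collects extra-vars in an array. The `EXTRA_VARS` array and the
`add_postgres_macos_overrides` function are hypothetical names. The two
variable names are the ones described above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical accumulator for ansible-playbook arguments.
EXTRA_VARS=()

# On Darwin, force the native Homebrew path and pass the admin password
# directly as an extra-var. On any other OS this is a no-op.
add_postgres_macos_overrides() {
  local os="$1" admin_password="$2"
  [ "$os" = "Darwin" ] || return 0
  EXTRA_VARS+=(-e "postgresql_deploy_mode=native")
  EXTRA_VARS+=(-e "postgresql_admin_password=${admin_password}")
}
```

A caller would run `add_postgres_macos_overrides "$(uname -s)" "$password"`
and later expand `"${EXTRA_VARS[@]}"` on the ansible-playbook command line.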
Root cause of the repeated 'Bootstrap Vault admin userpass auth' failure was
not macOS-specific: init_vault_admin.sh derived entity_id by logging in as the
user, but the login MFA enforcement it creates makes that login MFA-gated on
re-runs (dev Vault persists across deploys), yielding 'missing entityID'.
patch_playbook_vault_macos() now rewrites init_vault_admin.sh to resolve
entity_id via the userpass entity-alias (creating entity+alias on first run),
matching the same fix landed in the playbooks repo. Removes the temporary
no_log/file-dump diagnostic. Documents TC-MAC-016.
The 'vault : Bootstrap Vault admin userpass auth' task runs
init_vault_admin.sh, which checks for vault, jq, curl and base64 via
require_cmd. macOS has no jq by default (the apt deps
task is Darwin-skipped) and ansible.builtin.script uses a minimal PATH without
/opt/homebrew/bin. Extend patch_playbook_vault_macos() to brew install jq and
add environment PATH to the bootstrap task. Idempotent; verified. TC-MAC-015.
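
To illustrate the failure mode, a sketch of a dependency check plus the PATH
adjustment. The body of `require_cmd` here is an assumption and may not match
the one in init_vault_admin.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed shape of a dependency check: report every missing command on
# stderr, then return non-zero if any was missing.
require_cmd() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Prepend the Apple Silicon Homebrew prefix once, so a brew-installed jq
# is found even when the inherited PATH is minimal.
case ":$PATH:" in
  *":/opt/homebrew/bin:"*) ;;
  *) PATH="/opt/homebrew/bin:$PATH" ;;
esac
```

With that PATH in place, `require_cmd vault jq curl base64` succeeds once jq
has been installed through Homebrew.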
The common role's 'Base | *' tasks (timedatectl timezone, /etc/hostname,
hostname, /etc/hosts, ssh hardening, fail2ban, file limits, firewall) all run
with become: true against Linux-only tooling/paths and fail on macOS — the
reported timedatectl failure is just the first. Add patch_playbook_common_macos()
(post-clone, Darwin-only) that appends an ansible_os_family != 'Darwin' guard to
the whole Base block. Idempotent; verified against the real role; Linux
unchanged. Documents TC-MAC-014.
The vault role's 'Ensure standalone Vault directories exist' task creates
/etc/vault.d and /opt/vault/data with owner: root and lacks the Darwin guard
its sibling tasks have, so it fails under macOS become=false. Unlike the
bridge dir (owned by the service user, fixable via -e), this owner: root is
hardcoded and not overridable, so the role logic must change.
Since the role lives in a separate playbooks repo, reuse the existing
post-clone patch mechanism (cf. patch_playbook_user_systemd): add
patch_playbook_vault_macos() that, on Darwin only, guards the directory task,
makes vault dirs/binary OS-conditional (macOS -> ~/Library/Application
Support/vault[/data], /opt/homebrew/bin/vault; Linux unchanged), and creates
the user-owned data dir in macos.yml. Idempotent; verified against the real
role. Documents TC-MAC-013.
Switch the macOS bridge base dir to the Apple-standard per-user location
$HOME/Library/Application Support/cloud-neutral/xworkmate-bridge, while Linux
keeps /opt/cloud-neutral/xworkmate-bridge. Applied both as the Darwin -e
override in setup-ai-workspace-all-in-one.sh (the lever that reaches the
curl|bash path) and as an OS-conditional role default. Updates TC-MAC-012 and
the progress report with the not-pushed root cause of the 19:09 re-failure.
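
The OS-conditional choice can be sketched as a small function. The name
`bridge_base_dir` is hypothetical. The two paths are the ones named above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Pick the bridge base dir by OS name (defaults to the current `uname -s`).
bridge_base_dir() {
  local os="${1:-$(uname -s)}"
  case "$os" in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/cloud-neutral/xworkmate-bridge" ;;
    *)      printf '%s\n' "/opt/cloud-neutral/xworkmate-bridge" ;;
  esac
}
```

The macOS path contains a space, so the result has to stay quoted wherever it
is used. When passing it with -e, the JSON form of extra-vars is likely the
safer choice than key=value.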
macOS deploys run with ansible_become=false, so the bridge role default
xworkmate_bridge_base_dir=/opt/cloud-neutral failed with EACCES creating
/opt/cloud-neutral. Inject a Darwin -e override pointing the base dir at
$HOME/.local/state/cloud-neutral/xworkmate-bridge, matching existing macOS
overrides for gateway_openclaw/agent_skills/xworkspace_console. Documents the
failure and fix as TC-MAC-012.