xworkspace-console

Author	SHA1	Message	Date
Haitao Pan	6257cd41ea	backport: support customizable AI_WORKSPACE_AUTH_TOKEN in deployment workflow	2026-06-28 16:32:30 +08:00
Haitao Pan	b9c649af68	ci: backport release/* source validation workflow to release/v1.1.5 (#3) Make the existing release/v1.1.5 branch itself contain the gating workflow (pull_request_target uses the base branch's version). See iac_modules/docs/tldr-github-branch-model.md for details. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 12:41:18 +08:00
Haitao Pan	3ce3c6fb66	fix(iac): require Cloudflare DNS token	2026-06-27 13:48:20 +08:00
Haitao Pan	974904be13	ci: update workflow actions for node 24	2026-06-26 19:05:39 +08:00
Haitao Pan	338d057375	feat(ci): add provider key wiring toggles	2026-06-26 18:30:29 +08:00
Haitao Pan	50070c0708	fix(ci): pass tfstate credentials to inventory render	2026-06-26 18:15:35 +08:00
Haitao Pan	12b5805fb5	fix(ci): pass tfstate credentials to terraform apply	2026-06-26 18:12:21 +08:00
Haitao Pan	002257ce5b	fix(ci): source tf state region from vault	2026-06-26 18:10:28 +08:00
Haitao Pan	3b270f4959	fix(ci): pin aws tfstate region for s3 backend	2026-06-26 18:07:52 +08:00
Haitao Pan	8f8e925706	fix(ci): require tf state region from vault	2026-06-26 17:50:04 +08:00
Haitao Pan	a72e580ae6	fix(ci): default tf state region to us-east-1	2026-06-26 17:47:49 +08:00
Haitao Pan	5a76c5ed06	fix(deploy): on-host bootstrap defaults to online mode (pull fixed main playbooks) The deploy job ran curl|bash with no AI_WORKSPACE_OFFLINE_MODE -> auto -> stale offline package, which still ships the pinned-Chrome / root-PGDATA playbooks that were already fixed in playbooks main. Pipeline kept failing at the Chrome task. - run-on-host-bootstrap.sh: thread AI_WORKSPACE_OFFLINE_MODE (default off) into the remote env so the bootstrap git-clones latest main instead of the stale package. - workflow: add offline_mode input (off|auto|force, default off); flip back to auto once the offline package is republished with the fixes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 22:47:14 +08:00
Haitao Pan	09a8bae35d	fix(iac-workflow): make S3-compatible remote state mandatory (no local fallback) Previously 'Configure remote backend' had `if: TF_STATE_BUCKET != ''`, so when the gate evaluated empty the step was skipped and terraform silently fell back to local state — risking state loss on destroy. TF_STATE_* exist in Vault, so make the remote backend the default required path: - Validate step now requires TF_STATE_{ENDPOINT,BUCKET,ACCESS_KEY,SECRET_KEY} - 'Configure remote backend' always runs (renders backend.tf) - terraform init fails fast if TF_STATE_BUCKET empty (removed local-state else) - header comment updated: backend keys are required, not optional Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 22:09:44 +08:00
Haitao Pan	5ce6dad9bc	fix(iac-workflow): change TF_STATE_REGION fallback from us-east-1 to auto Cloudflare R2 S3-compatible backend requires region=auto; the previous fallback us-east-1 would cause terraform init to fail if Vault key is absent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 21:01:13 +08:00
Haitao Pan	e39b16e92f	fix(ci): checkout bootstrap helper in deploy job	2026-06-25 20:53:33 +08:00
Haitao Pan	fbfa32ca2a	fix(ci): poll on-host bootstrap logs across ssh reconnects	2026-06-25 20:48:20 +08:00
Haitao Pan	12d9bb327f	refactor(ci): move render_backend_tf.py to ai-workspace-infra vultr-vps/scripts/ The script moves from xworkspace-console/scripts/ into vultr-vps/scripts/ in ai-workspace-infra and is referenced through the existing Checkout iac_modules step, so no extra self-checkout of xw-console is needed; the paths in the workflow and CLAUDE.md are updated accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 12:02:48 +08:00
Haitao Pan	9b3687e189	fix(ci): remove all heredocs from the workflow and call external scripts instead - Delete the shell heredoc in the Configure remote backend step (it caused a YAML syntax error at L191) - Add external script scripts/render_backend_tf.py, which reads the TF_STATE_ENDPOINT env var and renders backend.tf - Add a Checkout xworkspace-console step to the provision job so scripts/ is available on the runner - Add CLAUDE.md, which explicitly forbids heredocs (shell/python) embedded in workflows and requires external scripts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 11:48:47 +08:00
Haitao Pan	f636366699	fix(ci): revert backend.tf rendering to a shell heredoc, fixing the YAML syntax error caused by inline Jinja2 Python The code lines of the inline Python script (python3 - <<'PYEOF'...PYEOF) started at column 1, outside the indentation of the YAML literal block, so YAML parsing failed for the entire workflow file and GitHub lost the workflow_dispatch trigger. Revert to a shell heredoc (<<TFEOF, unquoted, so variables expand) and keep the force_path_style → use_path_style upgrade. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 11:45:44 +08:00
Haitao Pan	4a6057d58b	fix(ci): render backend.tf with Jinja2 + update force_path_style → use_path_style - Switch the Configure remote backend step from a shell heredoc to Python Jinja2 rendering, avoiding shell quoting/escaping problems and matching the rendering style of generate.py - force_path_style was deprecated in Terraform 1.9; use use_path_style instead Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 11:37:25 +08:00
Haitao Pan	b4c051e6c0	fix(ci): write the R2 endpoint into backend.tf HCL instead of a -backend-config flag The endpoints block of the Terraform S3 backend can only be specified in HCL configuration and cannot be passed via the -backend-config command-line argument (Terraform rejects both endpoints={s3=...} and endpoints.s3=...). Change: the Configure remote backend step uses an unquoted heredoc to expand TF_STATE_ENDPOINT into backend.tf, and terraform init passes only bucket/key/region via -backend-config. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 11:29:35 +08:00
Haitao Pan	b9ec7a2e45	fix(ci): fix R2 TF state backend endpoint syntax + complete the prerequisites docs - Change the endpoints={s3="..."} HCL map syntax in terraform init -backend-config to the endpoints.s3=... dot syntax (the former is invalid in a -backend-config flag, so the R2 endpoint was not passed and Terraform fell back to the default AWS endpoint, where signing failed) - Complete the TLDR prerequisites comment at the top of the workflow (6 items) - Add docs/operations/iac-prerequisites.md (full prerequisites guide, including R2 setup) - Add a §7 cross-reference to vault-github-actions.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-25 11:13:16 +08:00
Haitao Pan	d225ff74e2	fix(ci): fix terraform s3 backend SignatureDoesNotMatch error in dynamically generated backend config - Add skip_s3_checksum = true and skip_metadata_api_check = true to s3 backend config - Use endpoints = { s3 = ... } instead of deprecated endpoint parameter in terraform init	2026-06-25 10:46:01 +08:00
Haitao Pan	4b1f809937	ci: checkout playbooks and iac_modules from public repos - Stop checking out the old private mono-repo `ai-workspace-infra`. - Checkout the split public repositories `ai-workspace-infra/playbooks` and `ai-workspace-infra/iac_modules` separately. - Remove `CODEX_GITHUB_PERSONAL_ACCESS_TOKEN` (`INFRA_REPO_TOKEN`) dependency from vault as it's no longer needed for public repos.	2026-06-25 10:14:15 +08:00
Haitao Pan	c2cd3035a4	ci(deploy-iac): fail fast on missing required Vault secrets Add a 'Validate required secrets' run-step after each job's Vault OIDC load step. It checks REQUIRED steps.vault.outputs.* are non-empty via env: mapping (never echoes secret values), and on any empty key prints a ::error:: naming the key + its Vault path then exit 1. The deploy job requires at least one of ANSIBLE_SSH_KEY_B64 / ANSIBLE_SSH_KEY. Optional keys (INFRA_REPO_TOKEN, TF_STATE_*) are not validated. Vault path strings in error messages reference the env.VAULT_KV[_OPENCLAW] vars rather than hardcoded literals. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 20:46:30 +08:00
Haitao Pan	fe479bc4b4	ci(deploy-iac): pass XWORKMATE_BRIDGE_DOMAIN (override or CMDB service_domains) to on-host bootstrap New optional 'bridge_domain' input overrides; otherwise derive from each host's cmdb.json host_vars.service_domains (first entry) and inject as XWORKMATE_BRIDGE_DOMAIN so the host sets /etc/hostname + xworkmate-bridge.caddy from it (on-host model has no inventory hostvars). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:56:45 +08:00
Haitao Pan	607c995a9a	ci+docs(vault): read LLM keys from kv/openclaw, SSH/infra/cloudflare from kv/CICD DEEPSEEK/NVIDIA/OLLAMA_API_KEY live in kv/data/openclaw (not CICD); vault-action reads them from that path in the same step. Policy grants read on both kv/data/CICD and kv/data/openclaw. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:35:25 +08:00
Haitao Pan	dba85dad04	docs(ci): fix header comment to kv/CICD + actual key names Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:31:30 +08:00
Haitao Pan	5d852e0285	ci+docs(vault): read shared kv/CICD with existing key names - VAULT_KV -> kv/data/CICD (shared CICD secrets), map existing keys to outputs: CODEX_GITHUB_PERSONAL_ACCESS_TOKEN->INFRA_REPO_TOKEN, SSH_PRIVATE_DEPLOY_KEY[_B64]->ANSIBLE_SSH_KEY[_B64], CLOUDFLARE_DNS_API_TOKEN direct; VULTR_API_KEY/LLM keys same name. - docs: policy reads kv/data/CICD; field table maps existing keys; note the three LLM keys still need to be added to kv/CICD, and SSH_PUBLIC_DEPLOY_KEY must match hosts.yaml. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:31:00 +08:00
Haitao Pan	04d349073e	ci+docs(vault): SSH key B64-preferred pattern + xworkspace-console Vault setup - deploy job: read ANSIBLE_SSH_KEY_B64 (preferred) + ANSIBLE_SSH_KEY (fallback) from Vault, decode/write ~/.ssh/id_deploy and ssh-keygen -y self-check — matches the org SSH-deploy runbook (avoids multiline-key libcrypto errors). - docs/operations/vault-github-actions.md: full Vault role/policy/jwt/KV setup for github-actions-xworkspace-console, mirroring the existing org records. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:21:01 +08:00
Haitao Pan	75d3098d1c	ci(deploy-iac): fetch secrets from Vault KV via GitHub OIDC Replace GitHub Actions Secrets with HashiCorp Vault (https://vault.svc.plus): - permissions: id-token: write; auth via hashicorp/vault-action@v2 (method=jwt, role=github-actions-xworkspace-console, audience=vault) — no static token. - Each job loads only the keys it needs from kv/data/github-actions/xworkspace-console (VULTR_API_KEY, INFRA_REPO_TOKEN, ANSIBLE_SSH_KEY, CLOUDFLARE_API_TOKEN, DEEPSEEK/NVIDIA/OLLAMA_API_KEY, optional TF_STATE_*). - Backend gating now keys off the Vault output (steps.vault.outputs.TF_STATE_BUCKET). - Drop unused 'playbook' input (deploy is on-host bootstrap). Pattern mirrors xworkmate-app/.github/workflows/build-and-release.yml. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:17:46 +08:00
Haitao Pan	b2c8c5d875	ci+docs: on-host bootstrap deploy job + console serving/verification updates - deploy-ai-workspace-iac.yaml: deploy job now ssh-es to each host and runs the official curl|bash bootstrap locally (host-side ansible -c local, offline-accelerated), instead of running all-in-one from the runner (which breaks on roles/agent_skills delegate_to: localhost). provision job kept as the batch-provision mode. - docs/operations: record final console fix (local python static backend), caddy/public-access architecture, and debian13/ubuntu26.04/macOS verification. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 09:44:22 +08:00
Haitao Pan	b039a36a69	ci: align deploy pipeline with shared scripts/templates layout generate.py moved to vultr-vps/scripts/ and provider/variables/cloud-init to templates/; run render/inventory from VPS_ROOT via scripts/generate.py, keep terraform -chdir in the env workdir. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 21:23:48 +08:00
Haitao Pan	7c46dffde2	ci: add IaC + Ansible + Cloudflare matrix deploy pipeline Matrix pipeline that provisions Vultr hosts via iac_modules vultr-vps ai-workspace env (Terraform), derives the deploy matrix from the rendered CMDB, deploys per-host with Ansible all-in-one, then syncs Cloudflare DNS. Pipelining off + PYTHONWARNINGS=ignore for Python 3.13 targets. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 21:02:32 +08:00
Haitao Pan	fd1fb5710c	ci(console-runtime): publish moving latest-runtime release The ai-workspace role's final-deployment step downloads the console runtime from a stable latest-runtime release (matching the bridge/qmd/litellm convention). Have the publish job refresh a moving `latest-runtime` release alongside the immutable `runtime-<sha>` one, carrying the same cross-compiled assets (darwin-arm64, linux-amd64, linux-arm64) + SHA256SUMS, so consumers get a predictable URL: releases/download/latest-runtime/xworkspace-console-runtime-<os>-<arch>.tar.gz Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 17:05:20 +08:00
Haitao Pan	198ca0c88a	chore: rename workflow to offline-package prefix and implement openclaw plugin macOS patch	2026-06-21 16:17:30 +08:00
Haitao Pan	04f653b0b3	ci: slim console runtime - universal dist + cross-compiled API, no macOS runner Build the dashboard dist once (platform-independent) and cross-compile the Go API (CGO disabled) for darwin-arm64, linux-amd64, linux-arm64 in a single ubuntu job. Drops the macOS runners and per-arch docker. node_modules is excluded from the runtime (nodejs role provides Node on target), so the tarballs only carry the API binary + built dist + scripts + manifest. macOS is arm64-only; Linux covers amd64/arm64 (Debian and Ubuntu share the binary).	2026-06-21 08:02:58 +00:00
Haitao Pan	da64de72bb	ci: unify runtime + offline into one pipeline (single build matrix) Merge offline-package workflow jobs into runtime-release.yaml: build (one linux+darwin matrix) -> publish (outputs runtime_tag) -> build-offline-package (matrix) -> test-offline-package (matrix) -> publish-release. One-directional deps: publish-release needs test-offline-package needs build-offline-package. Offline build uses the in-pipeline runtime_tag, and publish-release folds in the console-runtime download (from `aaf6c47`) plus the >2GiB split-upload. The standalone offline-package-ai-workspace-installer.yaml is now redundant (dispatch-only; safe to delete).	2026-06-21 07:50:34 +00:00
Haitao Pan	aaf6c47b69	ci: include console runtimes in offline release	2026-06-21 15:32:39 +08:00
Haitao Pan	77230a5fd4	ci: publish darwin runtime + split >2GiB offline packages A) runtime-release.yaml: add a native build-darwin job (macos-14 arm64 / macos-13 amd64) that builds the dashboard + cross-correct Go API and publishes xworkspace-console-runtime-darwin-{arm64,amd64}.tar.gz, fixing the macOS deploy 404. publish now needs both build jobs and globs all runtimes. B) offline-package workflow: GitHub caps release assets at 2 GiB. Split any package >= 2 GiB into 1900 MiB parts plus a <name>.parts manifest and upload the parts. The offline bootstrap (download_offline_split) falls back to the manifest and reassembles the parts when the single asset is absent. Verified the split/reassemble round-trips byte-for-byte.	2026-06-19 22:17:11 +00:00
Haitao Pan	2cb26128fb	fix: delete existing offline release assets before upload	2026-06-18 18:05:24 +08:00
Haitao Pan	c6335c2dcf	fix: preserve exact runtime asset names	2026-06-15 22:02:52 +08:00
Haitao Pan	6f85f4d183	feat: aggregate prebuilt workspace releases	2026-06-15 21:59:35 +08:00
Haitao Pan	73762d498f	feat: bundle Python 3.13 for Ubuntu 26.04	2026-06-15 15:44:59 +08:00
Haitao Pan	572f736ce5	fix: bundle portable offline runtime assets	2026-06-15 15:13:21 +08:00
Haitao Pan	a457a9f438	fix: consume packaged runtime sources offline	2026-06-15 14:43:48 +08:00
Haitao Pan	3b6b03da95	feat: prefer idempotent offline runtime installs	2026-06-15 14:32:36 +08:00
Haitao Pan	fe8f3e38e0	fix: upload offline packages with retries	2026-06-14 14:22:59 +08:00
Haitao Pan	65bb07ab06	feat: build offline AI Workspace installer packages	2026-06-14 13:50:36 +08:00

49 Commits