
Haitao Pan 01f1499a60 feat(ai-workspace): consume prebuilt console runtime for final deployment

The macOS console API previously ran via `go run .`, which fails under launchd's minimal PATH (no `go`) and recompiles on every launch. Switch to the same prebuilt-runtime consumption model the bridge/qmd/litellm runtimes already use.

The ai-workspace role now does final deployment only (never builds):
- download xworkspace-console-runtime-<os>-<arch>.tar.gz (incl. darwin-arm64) from the latest-runtime release, or use an offline-staged archive via XWORKSPACE_CONSOLE_RUNTIME_ARCHIVE;
- unpack to a per-user system dir (~/.local/share/xworkspace-console), idempotent via a sha256 marker;
- read manifest.json to resolve the prebuilt API binary and assert it is a present, executable native binary;
- on macOS, deploy a LaunchAgent that sources portal.env and execs the prebuilt binary directly: no go, no Homebrew, no PATH games.

The Go API is pure-Go (no cgo), so CI cross-compiles darwin-arm64 cleanly; this role only consumes that artifact.

Validated end-to-end on darwin-arm64: packaged binary serves :8788 (200 with token, 401 without) under launchd.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-22 17:04:55 +08:00
agent_skills	fix: make agent_skills defaults cross-platform (HOME, user, group)	2026-06-18 16:46:16 +08:00
ai_agent_runtime	fix(playbooks): use include_tasks for windows and force node24 path for openclaw	2026-06-21 19:52:14 +08:00
azure_dev_desktop_lifecycle/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
charts	feat(gpu_inference): add comprehensive GPU inference infrastructure with Sealos, Ray, and vLLM	2026-04-23 19:17:23 +08:00
cloud_cli_prereqs	fix(permissions): add missing become:true to all cross-platform /usr/local/bin writes	2026-06-19 18:56:47 +08:00
cloud_vm_inventory_emit/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
cloud_vm_request_validate/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
cloudflare_dns	Migrate XRDP and Cloudflare playbooks	2026-04-05 16:54:48 +08:00
cloudflare_svc_plus_dns	deploy: align console ingress and dns contract	2026-04-12 18:14:28 +08:00
dev_desktop_common	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
dev_desktop_debian_kde/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
dev_desktop_fedora_gnome/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
dev_desktop_windows	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
docker	fix: quote strings in caddy_base_dir Jinja (invalid unquoted expr)	2026-06-21 09:55:12 +00:00
gcp_dev_desktop_lifecycle/tasks	feat(playbooks): add cloud desktop bootstrap flow	2026-04-10 17:09:59 +08:00
github	chore: commit pending infra playbook changes including ssh initialization script	2026-06-19 18:09:16 +08:00
grafana-dashboard	feat(ansible): extract playbooks and roles into standalone repository	2025-12-21 19:09:46 +08:00
harden_ssh_root_key_only	Migrate XRDP and Cloudflare playbooks	2026-04-05 16:54:48 +08:00
readonly_ssh_user	Add readonly SSH audit user role and playbooks	2026-04-10 11:08:47 +08:00
vhosts	feat(ai-workspace): consume prebuilt console runtime for final deployment	2026-06-22 17:04:55 +08:00
README.md	feat(playbooks): add comprehensive vhosts roles and ops scripts	2025-12-21 19:23:19 +08:00

README.md

Playbook roles planning

This document clarifies what belongs under /playbooks/roles/ as host-level automation (Ansible) and what should instead be delivered through Helm charts. It also checks that both delivery paths cover the five capability tiers of the data platform: data warehouse → big data → ML → DL → large models.

Scope rules

Ansible roles: host-coupled configuration that is not itself a cloud resource (GPU driver/runtime, OS tuning, user/SSH prep, rendering on-host config files, database bootstrapping, etc.).
Helm charts: anything that runs as a Kubernetes workload (operators, clusters, services running in pods).

Base roles shared across tiers (Ansible)

GPU driver and CUDA stack installation.
Docker/containerd runtime setup.
System parameter tuning (kernel limits, hugepages, network stack), plus user home/SSH layout.
Database initialization tasks (e.g., bootstrap PostgreSQL/ClickHouse on hosts) and rendering templated configs such as ClickHouse/users.xml.
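
The base roles listed above could be sketched as a few Ansible tasks. This is a minimal illustration, not the repository's actual implementation: the file path, the `common_sysctl` and `common_nofile_limit` variables, the `users.xml.j2` template name, and the ClickHouse destination path and owner are all assumptions.

```yaml
# roles/common/tasks/main.yml (hypothetical path and variable names)

# Kernel limits: raise the open-file limit for all users.
- name: Raise open-file limits
  community.general.pam_limits:
    domain: "*"
    limit_type: "-"
    limit_item: nofile
    value: "{{ common_nofile_limit | default(1048576) }}"
  become: true

# Hugepages and network stack: driven by a dict such as
#   common_sysctl: { vm.nr_hugepages: 1024, net.core.somaxconn: 4096 }
- name: Apply kernel and network sysctl settings
  ansible.posix.sysctl:
    name: "{{ item.key }}"
    value: "{{ item.value }}"
    sysctl_set: true
    state: present
    reload: true
  loop: "{{ common_sysctl | default({}) | dict2items }}"
  become: true

# Config rendering: users.xml.j2 is an assumed template shipped with the role;
# the destination assumes a package-based ClickHouse install.
- name: Render ClickHouse users.xml from a template
  ansible.builtin.template:
    src: users.xml.j2
    dest: /etc/clickhouse-server/users.xml
    owner: clickhouse
    group: clickhouse
    mode: "0640"
  become: true
```

The tasks rely on the `community.general` and `ansible.posix` collections being installed on the control node.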

Coverage by capability tier

Tier	Host-focused roles (Ansible)	Kubernetes services (Helm)
Data warehouse	ClickHouse host bootstrap & config render; PostgreSQL init where needed.	—
Big data	JVM/runtime, local disks, and OS tuning for data nodes.	Spark Operator; Flink Operator; Kafka/Redpanda; MinIO.
ML	GPU runtime base (drivers, container runtime), Python ML base image prep; user workspace/SSH.	Ray Cluster; MLflow; JupyterHub.
DL	Same GPU/system tuning as the ML tier, plus inference node bootstrap (TensorRT/cuDNN as needed).	Triton Inference Server; LMDeploy (for deployment runtimes).
Large models	Secure SSH/user profiles and config templating for model storage/IO.	vLLM serving; model-specific Helm releases atop Ray/K8s.

Suggested role layout under `/playbooks/roles/`

common/ (new): shared tasks for system tuning, users/SSH, and package repos for GPU/runtime support.
gpu/: install GPU drivers + CUDA toolkit.
container_runtime/: install and configure Docker/containerd with GPU runtime integration.
database_init/: bootstrap on-host databases (e.g., PostgreSQL, ClickHouse), render config files (users.xml, etc.).
bigdata_node_prep/: OS/disk tuning for Spark/Flink/Kafka/Redpanda/MinIO hosts.
ml_node_prep/: Python/conda base, SSH workspace prep for ML workloads.
dl_inference_node/: TensorRT/cuDNN dependencies and runtime checks for Triton/LMDeploy nodes.
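
One way the suggested roles might be wired to hosts is a top-level playbook with one play per tier. The sketch below assumes the role names from the list above; the file name `site.yml` and the inventory group names (`warehouse`, `bigdata`, `gpu_nodes`, `dl_inference`) are hypothetical.

```yaml
# site.yml (hypothetical): inventory group names are assumptions
- name: Base configuration for every managed host
  hosts: all
  become: true
  roles:
    - common

- name: Data warehouse hosts
  hosts: warehouse
  become: true
  roles:
    - database_init

- name: Big data hosts
  hosts: bigdata
  become: true
  roles:
    - bigdata_node_prep

- name: GPU nodes for ML workloads
  hosts: gpu_nodes
  become: true
  roles:
    - gpu
    - container_runtime
    - ml_node_prep

- name: DL inference nodes
  hosts: dl_inference
  become: true
  roles:
    - gpu
    - container_runtime
    - dl_inference_node
```

Keeping `common` in its own play means every host receives the shared tuning and user/SSH setup, while tier-specific roles apply only to the groups that need them.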

Helm-delivered components should live under /playbooks/roles/charts/ or the repo’s Helm release structure. They include the Spark and Flink Operators, Kafka/Redpanda/MinIO, Ray Cluster, Triton, vLLM/LMDeploy, MLflow, and JupyterHub.
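
If a chart is released from an Ansible run rather than a separate Helm pipeline, the `kubernetes.core.helm` module is one option. The task below is only a sketch: the release name, namespace, chart directory under `roles/charts/`, and the `ray_cluster_values` variable are hypothetical.

```yaml
# Hypothetical task: release name, namespace, chart path, and values variable
# are assumptions for illustration.
- name: Install a Ray Cluster release from a chart kept in this repository
  kubernetes.core.helm:
    name: ray-cluster
    chart_ref: "{{ playbook_dir }}/roles/charts/ray-cluster"
    release_namespace: ml
    create_namespace: true
    values: "{{ ray_cluster_values | default({}) }}"
```

The module needs the `helm` binary and a working kubeconfig on the host that executes the task.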