playbooks/roles
Haitao Pan 477b52c516
fix(acp_server_opencode): detect opencode CLI at deploy time (portable across Debian/Ubuntu/macOS) (#22)
Stop assuming a fixed opencode path. Probe the real binary with 'command -v'
using the role PATH, then feed the resolved path to both the systemd unit and
the launchd plist (plist now also passes -opencode-bin). Falls back to the
OS-aware default when opencode is not yet installed.

Also remove the dead acp-bridge.service.j2 template: it was not deployed by any
task and referenced two undefined vars (acp_opencode_bridge_disabled_binary_path,
acp_opencode_bridge_opencode_binary_path) — a hardcoding landmine.

Co-authored-by: Haitao Pan <manbuzhe2009@qq.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 15:31:54 +08:00
agent_skills refactor(agent_skills): run on target host, git-clone sources, drop delegate_to localhost 2026-06-24 14:57:49 +08:00
ai_agent_runtime fix(ai_agent_runtime): resolver must verify browser actually runs, skip disabled stub 2026-06-26 10:42:06 +08:00
azure_dev_desktop_lifecycle/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
charts feat(gpu_inference): add comprehensive GPU inference infrastructure with Sealos, Ray, and vLLM 2026-04-23 19:17:23 +08:00
cloud_cli_prereqs fix(permissions): add missing become:true to all cross-platform /usr/local/bin writes 2026-06-19 18:56:47 +08:00
cloud_vm_inventory_emit/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
cloud_vm_request_validate/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
cloudflare_dns fix(cloudflare): prefer DNS scoped token 2026-06-27 13:48:19 +08:00
cloudflare_svc_plus_dns deploy: align console ingress and dns contract 2026-04-12 18:14:28 +08:00
dev_desktop_common feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
dev_desktop_debian_kde/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
dev_desktop_fedora_gnome/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
dev_desktop_windows feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
docker feat: implement postgresql.svc.plus docker deployment role 2026-06-26 10:00:00 +08:00
gcp_dev_desktop_lifecycle/tasks feat(playbooks): add cloud desktop bootstrap flow 2026-04-10 17:09:59 +08:00
github chore: commit pending infra playbook changes including ssh initialization script 2026-06-19 18:09:16 +08:00
grafana-dashboard feat(ansible): extract playbooks and roles into standalone repository 2025-12-21 19:09:46 +08:00
harden_ssh_root_key_only Migrate XRDP and Cloudflare playbooks 2026-04-05 16:54:48 +08:00
readonly_ssh_user Add readonly SSH audit user role and playbooks 2026-04-10 11:08:47 +08:00
vhosts fix(acp_server_opencode): detect opencode CLI at deploy time (portable across Debian/Ubuntu/macOS) (#22) 2026-06-28 15:31:54 +08:00
README.md feat(playbooks): add comprehensive vhosts roles and ops scripts 2025-12-21 19:23:19 +08:00

Playbook roles planning

This document clarifies what belongs under /playbooks/roles/ as host-level automation (Ansible) and what should instead be delivered through Helm charts. It also checks that both sides cover the five capability tiers of the data platform: data warehouse → big data → ML → DL → large models.

Scope rules

  • Ansible roles: host-coupled configuration that is not itself a cloud resource (GPU driver/runtime, OS tuning, user/SSH prep, rendering on-host config files, database bootstrapping, etc.).
  • Helm charts: anything that runs as a Kubernetes workload (operators, clusters, services running in pods).

Base roles shared across tiers (Ansible)

  • GPU driver and CUDA stack installation.
  • Docker/containerd runtime setup.
  • System parameter tuning (kernel limits, hugepages, network stack), plus user home/SSH layout.
  • Database initialization tasks (e.g., bootstrapping PostgreSQL/ClickHouse on hosts) and rendering templated configs such as ClickHouse's users.xml.
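
As a rough illustration of the system tuning and config rendering items above, a base role's tasks might look like the sketch below. The file path, the sysctl values, the template name users.xml.j2, and the handler name are assumptions made for this example and do not come from the existing roles.

```yaml
# roles/common/tasks/main.yml (hypothetical path; all values are placeholders)
- name: Raise the listen backlog limit
  ansible.posix.sysctl:
    name: net.core.somaxconn
    value: "4096"          # assumed value, tune per workload
    state: present
    sysctl_set: true
    reload: true
  become: true

- name: Reserve hugepages
  ansible.posix.sysctl:
    name: vm.nr_hugepages
    value: "128"           # assumed value
    state: present
  become: true

- name: Render ClickHouse users.xml from a template
  ansible.builtin.template:
    src: users.xml.j2      # hypothetical template shipped with the role
    dest: /etc/clickhouse-server/users.xml
    owner: clickhouse
    group: clickhouse
    mode: "0640"
  become: true
  # Assumes a handler with this exact name exists in the role's handlers/main.yml
  notify: Restart clickhouse-server
```

Every task here changes state on the host itself, which is the criterion the scope rules use for placing something in an Ansible role rather than a Helm chart.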

Coverage by capability tier

| Tier | Host-focused roles (Ansible) | Kubernetes services (Helm) |
| --- | --- | --- |
| Data warehouse | ClickHouse host bootstrap & config render; PostgreSQL init where needed. | |
| Big data | JVM/runtime, local disks, and OS tuning for data nodes. | Spark Operator; Flink Operator; Kafka/Redpanda; MinIO. |
| ML | GPU runtime base (drivers, container runtime), Python ML base image prep; user workspace/SSH. | Ray Cluster; MLflow; JupyterHub. |
| DL | Same GPU/system tuning plus inference node bootstrap (TensorRT/cuDNN as needed). | Triton Inference Server; LMDeploy (for deployment runtimes). |
| Large models | Secure SSH/user profiles and config templating for model storage/IO. | vLLM serving; model-specific Helm releases atop Ray/K8s. |

Suggested role layout under /playbooks/roles/

  • common/ (new): shared tasks for system tuning, users/SSH, and package repos for GPU/runtime support.
  • gpu/: install GPU drivers + CUDA toolkit.
  • container_runtime/: install and configure Docker/containerd with GPU runtime integration.
  • database_init/: bootstrap on-host databases (e.g., PostgreSQL, ClickHouse), render config files (users.xml, etc.).
  • bigdata_node_prep/: OS/disk tuning for Spark/Flink/Kafka/Redpanda/MinIO hosts.
  • ml_node_prep/: Python/conda base, SSH workspace prep for ML workloads.
  • dl_inference_node/: TensorRT/cuDNN dependencies and runtime checks for Triton/LMDeploy nodes.
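
To show how the suggested roles could compose, here is a minimal playbook sketch. The file name site.yml and the inventory group names (gpu_nodes, db_hosts, bigdata_nodes, inference_nodes) are hypothetical. The role names are the ones proposed in the list above, and most of them do not exist in the repository yet.

```yaml
# site.yml (hypothetical): maps the proposed roles onto assumed inventory groups
- name: Baseline for every managed host
  hosts: all
  become: true
  roles:
    - common

- name: Data warehouse hosts
  hosts: db_hosts            # assumed group name
  become: true
  roles:
    - database_init

- name: Big data nodes
  hosts: bigdata_nodes       # assumed group name
  become: true
  roles:
    - bigdata_node_prep

- name: GPU nodes for ML workloads
  hosts: gpu_nodes           # assumed group name
  become: true
  roles:
    - gpu
    - container_runtime
    - ml_node_prep

- name: DL inference nodes
  hosts: inference_nodes     # assumed group name
  become: true
  roles:
    - gpu
    - container_runtime
    - dl_inference_node
```

Listing gpu and container_runtime explicitly in each GPU play keeps the ordering visible. Declaring them as role dependencies in meta/main.yml would be an alternative design.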

Helm-delivered components should live under /playbooks/roles/charts/ or in the repo's Helm release structure. They include the Spark/Flink Operators, Kafka/Redpanda/MinIO, Ray Cluster, Triton, vLLM/LMDeploy, MLflow, and JupyterHub.