Compare commits


No commits in common. "main" and "release/v0.1" have entirely different histories.

42 changed files with 143 additions and 2069 deletions


@ -1,44 +0,0 @@
name: Validate Release PR
# Release-policy gate for release/* branches: only PRs from hotfix/* or PRs labeled cherry-pick/backport are accepted.
# See iac_modules/docs/tldr-github-branch-model.md
on:
  pull_request_target:
    types: [opened, synchronize, reopened, labeled, unlabeled]
permissions:
  contents: read
  pull-requests: read
jobs:
  validate-release-source:
    runs-on: ubuntu-latest
    if: startsWith(github.base_ref, 'release/')
    steps:
      - name: Check PR source branch
        run: |
          SRC="${{ github.head_ref }}"
          TGT="${{ github.base_ref }}"
          LABELS="${{ join(github.event.pull_request.labels.*.name, ',') }}"
          echo "🔍 Validating PR into release branch"
          echo " source: $SRC"
          echo " target: $TGT"
          echo " labels: $LABELS"
          if [[ "$SRC" =~ ^hotfix/ ]]; then
            echo "✅ Allowed: hotfix/* branch"
            exit 0
          fi
          if [[ "$LABELS" =~ (^|,)(cherry-pick|backport)(,|$) ]]; then
            echo "✅ Allowed: cherry-pick/backport labeled PR"
            exit 0
          fi
          echo "❌ Rejected."
          echo "release/* only accepts:"
          echo " - PRs from hotfix/*"
          echo " - PRs labeled cherry-pick or backport (backport/cherry-pick of a verified feature)"
          echo "Direct merges from main / develop / feature/* into release/* are prohibited."
          exit 1

184
README.md

@ -1,59 +1,157 @@
[🇺🇸 English](README.md) | [🇨🇳 中文](README.zh.md)
# 🌐 CloudNativeSuite
Cloud-Neutral · Cloud-Agnostic · AI-Enhanced Infrastructure Platform
# AI Workspace Infrastructure Modules (iac_modules)
---
`iac_modules` is the core Infrastructure-as-Code (IaC) repository for the AI Workspace ecosystem. It provides cloud-neutral, multi-cloud GitOps orchestrations, standard Terraform/Terragrunt modules, and automated deployment pipelines for establishing a resilient, scalable AI infrastructure platform.
<details>
<summary><strong>🇨🇳 Chinese README (click to expand)</strong></summary>
## About
<br>
- **Cloud-Neutral by Design**: Consistent resource abstraction across AWS, GCP, Azure, Alibaba Cloud, and Vultr.
- **GitOps Orchestration**: Automated multi-environment pipelines for bootstrap, landing zone, and account matrices.
- **VPN Overlay & Networking**: Includes configurations for establishing secure `vpn-overlay` (WireGuard, Xray) connections.
- **Standardized HCL**: A robust library of pre-configured `terraform-hcl-standard` modules for compute, networking, and security.
# 🌐 CloudNativeSuite
Cloud-Neutral · Multi-Cloud Agile · AI-Driven Infrastructure Platform
## Start TLDR
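CloudNativeSuite is a modern open-source platform built around the **Cloud-Neutral** principle.
It aims to give engineering teams a unified solution for **multi-cloud GitOps orchestration, observability, developer acceleration, and intelligent operations analytics**.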
> **Note:** These modules are designed to be consumed by CI/CD pipelines (e.g., GitHub Actions) and orchestration tools rather than being run entirely manually.
The platform consists of five core projects:
### Prerequisites
---
Ensure you have the following installed if you plan to run modules locally:
- Terraform >= 1.5.0
- Terragrunt >= 0.50.0
- Cloud Provider CLIs (AWS CLI, gcloud, etc.)
## 🧩 Repositories
### Usage
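| Project | Description |
|--------|-------------|
| **Modern-Container-Application-Reference-Architecture** | Cloud-neutral architecture example; a reference system for modern applications in multi-cloud/hybrid environments |
| **XControl** | Unified demo console for the X-series tools |
| **XStream** | Intelligent network accelerator for developers |
| **XScopeHub** | OpenTelemetry-based observability and AI-Ops analytics platform |
| **XCloudFlow** | Multi-cloud GitOps resource orchestration and automated deployment engine |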
1. Multi-cloud Pipelines:
Refer to the `.github/workflows` directory for automated matrices:
```bash
.github/workflows/iac-pipeline-mutli-cloud-bootstrap.yaml
.github/workflows/iac-pipeline-mutli-cloud-landingzone-baseline.yaml
```
---
2. Standard Terraform Modules:
Navigate to the module directory and initialize:
```bash
cd terraform-hcl-standard/
terraform init
terraform plan
```
## 🚀 Core Capabilities
3. Setup VPN Overlay:
```bash
cd vpn-overlay/
# Follow specific instructions for wireguard or xray setup
```
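### ☁️ Cloud-Neutral Multi-Cloud
- Consistent use across any cloud environment
- Seamless switching for cross-region/cross-cloud migration
- Unified resource model (Cloud-Agnostic)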
## Repository Structure
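### 🔁 Unified GitOps Workflow
- Automated multi-environment deployment
- Automatic generation, synchronization, and scheduling of CloudFlow Pipelines
- Continuous delivery and state drift detection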
- `terraform-hcl-standard/`: Core, reusable Terraform modules.
- `vpn-overlay/`: Secure networking and overlay configurations.
- `.github/workflows/`: GitOps pipelines and CI/CD matrices.
- `scripts/`: Helper scripts for environment setup and deployment.
- `example/`: Example implementations and reference architectures.
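### 🔭 Observability + AI-Ops
- Full-stack observation of Metrics / Logs / Traces
- Automated baseline analysis and intelligent alerting
- AI-based root-cause reasoning and remediation suggestions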
## Docs / Links
### ⚡ Developer Productivity
- XStream accelerates GitHub / DockerHub / AI APIs
- Intelligent routing & Zero-Trust tunnels
- Low-overhead, highly reliable network experience
- [Official Website](https://www.svc.plus/)
- [CloudNativeSuite Documentation](https://github.com/ai-workspace-infra)
### 🧩 Unified Control Plane
- View, manage, and demo everything in one place through XControl
- Showcases the collaborative workflow of CloudFlow, ScopeHub, and XStream
---
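## 🌟 Use Cases
- Multi-cloud GitOps orchestration and automated deployment
- Cloud-Neutral application architecture in practice
- AI-enhanced observability (intelligent diagnosis and automated analysis)
- Large-scale node automation/initialization
- Cross-border network acceleration for developers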
---
## 🌐 Official Website
https://www.svc.plus/
---
Contributions are welcome; let's build the next-generation Cloud-Neutral engineering system together.
</details>
---
<details>
<summary><strong>🇺🇸 English Version README (Click to expand)</strong></summary>
<br>
# 🌐 CloudNativeSuite
Cloud-Neutral · Multi-Cloud Ready · AI-Enhanced Infrastructure Platform
CloudNativeSuite is an open platform designed around the principle of **Cloud-Neutral** infrastructure —
helping teams deploy, observe, accelerate, and operate workloads consistently across any cloud environment.
The platform consists of five core projects:
---
## 🧩 Repositories
| Project | Description |
|--------|-------------|
| **Modern-Container-Application-Reference-Architecture** | Reference architecture for cloud-neutral multi-cloud application stacks |
| **XControl** | Unified demo & control plane for all X-Series tools |
| **XStream** | Developer network accelerator |
| **XScopeHub** | Observability suite powered by OpenTelemetry and AI-Ops analytics |
| **XCloudFlow** | Multi-cloud GitOps orchestrator for automated deployment pipelines |
---
## 🚀 Core Capabilities
### ☁️ Cloud-Neutral by Design
- Consistent experience across AWS, GCP, Azure, Alibaba Cloud, and on-prem
- Cross-cloud/cross-region migration without vendor lock-in
- Unified resource abstraction
### 🔁 GitOps Orchestration
- Automated multi-environment deployments
- Cross-cloud CloudFlow pipelines
- Continuous delivery with drift detection
### 🔭 Observability & AI-Ops
- Full-stack telemetry: metrics, logs, traces
- Automated anomaly detection
- AI-powered diagnostics and root-cause reasoning
### ⚡ Developer Productivity
- XStream accelerates GitHub / DockerHub / AI APIs
- Reality routing & zero-trust tunnels
- Designed for high-latency or restricted networks
### 🧩 Unified Control Plane
- XControl provides a consistent UI for demos & orchestration
- Shows how CloudFlow, ScopeHub, and XStream operate together end-to-end
---
## 🌟 Use Cases
- Multi-cloud GitOps orchestration
- Cloud-neutral architecture patterns
- AI-driven observability & diagnostics
- Distributed node automation
- Developer network acceleration
---
## 🌐 Official Website
https://www.svc.plus/
---
Contributions are welcome —
help us build the future of Cloud-Neutral Infrastructure.
</details>


@ -1,59 +0,0 @@
[🇺🇸 English](README.md) | [🇨🇳 中文](README.zh.md)
# AI Workspace Infrastructure Modules (iac_modules)
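`iac_modules` is the core Infrastructure-as-Code (IaC) repository for the AI Workspace ecosystem. It provides cloud-neutral, multi-cloud GitOps orchestration, standard Terraform/Terragrunt modules, and automated deployment pipelines for building a resilient, scalable AI infrastructure platform.
## About
- **Cloud-Neutral by Design**: Consistent resource abstraction across AWS, GCP, Azure, Alibaba Cloud, Vultr, and similar environments.
- **GitOps Orchestration**: Automated multi-environment pipelines covering bootstrap, landing zone, and multi-cloud account matrices.
- **VPN Overlay & Networking**: Includes configurations for establishing secure `vpn-overlay` (WireGuard, Xray) connections.
- **Standardized HCL**: A robust, pre-configured library of `terraform-hcl-standard` modules for compute, networking, and security.
## Quick Start (Start TLDR)
> **Note:** These modules are designed primarily to be invoked by CI/CD pipelines (e.g., GitHub Actions) and orchestration tools rather than run entirely by hand.
### Prerequisites
If you plan to run modules locally, make sure the following tools are installed:
- Terraform >= 1.5.0
- Terragrunt >= 0.50.0
- Cloud provider CLIs (AWS CLI, gcloud, etc.)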
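The minimum versions listed above can be checked before running anything locally. The following is a minimal sketch, not part of this repository: `version_ge` is a hypothetical helper name, and it assumes `sort -V` (version sort, as in GNU coreutils) is available.

```bash
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is >= B.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Example usage (commented out because it needs terraform and jq installed):
# tf_version="$(terraform version -json | jq -r '.terraform_version')"
# version_ge "$tf_version" 1.5.0 || echo "Terraform >= 1.5.0 required" >&2
```

Version sort is used instead of a plain string comparison so that `0.50.0` is correctly treated as newer than `0.9.0`.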
### Usage
1. Multi-cloud Pipelines:
Refer to the automated matrix configurations in the `.github/workflows` directory:
```bash
.github/workflows/iac-pipeline-mutli-cloud-bootstrap.yaml
.github/workflows/iac-pipeline-mutli-cloud-landingzone-baseline.yaml
```
2. Standard Terraform Modules:
Navigate to the module directory and initialize:
```bash
cd terraform-hcl-standard/
terraform init
terraform plan
```
3. Set up the VPN Overlay:
```bash
cd vpn-overlay/
# Follow the specific instructions for wireguard or xray setup
```
## Repository Structure
- `terraform-hcl-standard/`: Core, reusable Terraform modules.
- `vpn-overlay/`: Secure networking and overlay network configurations.
- `.github/workflows/`: GitOps pipelines and CI/CD matrices.
- `scripts/`: Helper scripts for environment setup and deployment.
- `example/`: Example implementations and reference architectures.
## Docs / Links
- [Official Website](https://www.svc.plus/)
- [CloudNativeSuite Documentation](https://github.com/ai-workspace-infra)


@ -1,14 +0,0 @@
# Documentation Coverage Matrix
This matrix tracks the bilingual canonical documentation set for `iac_modules` and maps it back to the current codebase and older docs.
| Category | EN | ZH | Current status | Existing references | Next check |
| --- | --- | --- | --- | --- | --- |
| Architecture | Yes | Yes | Seeded from current codebase and existing docs. | `CloudNeutral-Architecture-Blueprint-2025.md` | Keep diagrams and ownership notes synchronized with actual directories, services, and integration dependencies. |
| Design | Yes | Yes | Seeded from current codebase and existing docs. | `Multi-Cloud-Landing-Zone-Planning.md` | Promote one-off implementation notes into reusable design records when behavior, APIs, or deployment contracts change. |
| Deployment | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Verify deployment steps against current scripts, manifests, CI/CD flow, and environment contracts before each release. |
| User Guide | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Prefer workflow-oriented examples and keep screenshots or terminal snippets aligned with the latest UI or CLI behavior. |
| Developer Guide | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Keep setup and test commands tied to actual package scripts, Make targets, or language toolchains in this repository. |
| Vibe Coding Reference | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Review prompt templates and repo rules whenever the project adds new subsystems, protected areas, or mandatory verification steps. |


@ -1,38 +0,0 @@
# Infrastructure as Code Modules
This `docs/` directory now has a bilingual canonical layer for the current repository state.
## Quick Entry
- Coverage checklist: `docs/DOC_COVERAGE.md`
- English index: `docs/en/README.md`
- Chinese index: `docs/zh/README.md`
## Canonical Bilingual Pages
- `docs/en/architecture.md` / `docs/zh/architecture.md`
- `docs/en/design.md` / `docs/zh/design.md`
- `docs/en/deployment.md` / `docs/zh/deployment.md`
- `docs/en/user-guide.md` / `docs/zh/user-guide.md`
- `docs/en/developer-guide.md` / `docs/zh/developer-guide.md`
- `docs/en/vibe-coding-reference.md` / `docs/zh/vibe-coding-reference.md`
## Current Repo Context
- Root README: `🌐 CloudNativeSuite`
- Previous docs index: `Documentation`
- Manifest evidence: repository structure and scripts only
- Active code and ops directories: `scripts/`, `example/`
## Existing Docs To Reconcile
- `AWS-Landing-Zone-Baseline.md`
- `Alicloud-Landing-Zone-Baseline.md`
- `CloudNeutral-Architecture-Blueprint-2025.md`
- `Multi-Cloud-Landing-Zone-Planning.md`
- `Vultr-Landing-Zone-Baseline.md`
- `cilium-egress-vxlan-crosscluster.md`
- `landingzone/alicloud-landingzone-mvp-single-account.md`
- `virtual-cloud-README.md`


@ -1,30 +0,0 @@
# Infrastructure as Code Modules Documentation
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
## Current state snapshot
- Root README title: `🌐 CloudNativeSuite`
- Build/runtime evidence: repository structure and scripts only
- Primary directories detected: `scripts/`, `example/`
- Existing docs count: 8
## Canonical pages
- [Architecture](architecture.md)
- [Design](design.md)
- [Deployment](deployment.md)
- [User Guide](user-guide.md)
- [Developer Guide](developer-guide.md)
- [Vibe Coding Reference](vibe-coding-reference.md)
## Legacy docs to fold in
- `AWS-Landing-Zone-Baseline.md`
- `Alicloud-Landing-Zone-Baseline.md`
- `CloudNeutral-Architecture-Blueprint-2025.md`
- `Multi-Cloud-Landing-Zone-Planning.md`
- `Vultr-Landing-Zone-Baseline.md`
- `cilium-egress-vxlan-crosscluster.md`
- `landingzone/alicloud-landingzone-mvp-single-account.md`
- `virtual-cloud-README.md`


@ -1,24 +0,0 @@
# Architecture
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page as the canonical bilingual overview of system boundaries, major components, and repo ownership.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- `CloudNeutral-Architecture-Blueprint-2025.md`
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Keep diagrams and ownership notes synchronized with actual directories, services, and integration dependencies.


@ -1,24 +0,0 @@
# Deployment
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page to standardize deployment prerequisites, supported topologies, operational checks, and rollback notes.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- No directly matching legacy docs were detected; this page is currently the canonical seed.
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Verify deployment steps against current scripts, manifests, CI/CD flow, and environment contracts before each release.


@ -1,24 +0,0 @@
# Design
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page to consolidate design decisions, ADR-style tradeoffs, and roadmap-sensitive implementation notes.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- `Multi-Cloud-Landing-Zone-Planning.md`
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Promote one-off implementation notes into reusable design records when behavior, APIs, or deployment contracts change.


@ -1,24 +0,0 @@
# Developer Guide
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page to document local setup, project structure, test surfaces, and contribution conventions tied to the current codebase.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- No directly matching legacy docs were detected; this page is currently the canonical seed.
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Keep setup and test commands tied to actual package scripts, Make targets, or language toolchains in this repository.


@ -1,24 +0,0 @@
# User Guide
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page to document primary user/operator tasks, everyday workflows, and navigation to existing how-to material.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- No directly matching legacy docs were detected; this page is currently the canonical seed.
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Prefer workflow-oriented examples and keep screenshots or terminal snippets aligned with the latest UI or CLI behavior.


@ -1,24 +0,0 @@
# Vibe Coding Reference
This repository contains infrastructure modules, baseline blueprints, and platform landing zone reference material.
Use this page to align AI-assisted coding prompts, repo boundaries, safe edit rules, and documentation update expectations.
## Current code-aligned notes
- Documentation target: `iac_modules`
- Repo kind: `infra-modules`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `scripts/`, `example/`
- Package scripts snapshot: No package.json scripts were detected.
## Existing docs to reconcile
- No directly matching legacy docs were detected; this page is currently the canonical seed.
## What this page should cover next
- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Review prompt templates and repo rules whenever the project adds new subsystems, protected areas, or mandatory verification steps.


@ -1,127 +0,0 @@
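# Release v1.1.5: Pre-Release Preparation Log
> Records the pre-release preparation round of 2026-06-28 across 7 repositories: closing out the stability improvements, creating release branches and tags, two-tier branch protection, the branch-model gate, and review of the first hotfix PR.
> Companion docs: the branch model policy is in [tldr-github-branch-model.md](tldr-github-branch-model.md); stability root causes and improvements are in `xworkmate-app/docs/cases/06-gateway-turn-stability-and-robustness.md`.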
---
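## 0. TL;DR
- **All stability improvements have landed**: T1-T13 + S0 + S5 (stabilizing the four-layer gateway turn chain + persistent run store + observability + stable plugin installation + prompt trimming); merged into main and verified live.
- **Release branch + tag created in 7 repos**: `release/v1.1.5` (branch and tag share the same name; push with explicit refspecs).
- **Two-tier branch protection**: `main` is lightly protected (PR required, force-push disabled); `release/v*` is strictly protected (1 review + status checks + conversation resolution + linear history + no admin exemption).
- **Release gate**: `release/*` only accepts PRs from `hotfix/*` or PRs labeled `cherry-pick`/`backport` (workflow pending deployment).
- **First hotfix PR #16** passed review (PDF deliverable ordering); pending merge.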
---
## 1. Repository Inventory and Release Baseline
| Repository | owner/repo | remote | `release/v1.1.5` baseline commit |
|---|---|---|---|
| xworkmate-app | `ai-workspace-lab/xworkmate-app` | SSH | `b95be41` |
| xworkmate-bridge | `ai-workspace-lab/xworkmate-bridge` | SSH | `188ca4b` |
| xworkspace-console | `ai-workspace-lab/xworkspace-console` | SSH | `3ce3c6f` |
| openclaw-multi-session-plugins | `ai-workspace-lab/openclaw-multi-session-plugins` | SSH | `849972a` |
| playbooks | `ai-workspace-infra/playbooks` | SSH | `d806ba9` |
| iac_modules | `ai-workspace-infra/iac_modules` | SSH (**corrected from Cloud-Neutral-Workshop**) | `e489fa7` |
| xworkspace-core-skills | `ai-workspace-lab/xworkspace-core-skills` | SSH (**corrected from https**) | `c6b1a03` |
> Two remote corrections: `iac_modules` → `ai-workspace-infra`; `xworkspace-core-skills` → SSH (https + PAT lacks the `workflow` scope, so it cannot write `.github/workflows/`).
---
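## 2. Stability Improvements Closed Out (content shipping in v1.1.5)
Full root causes and design are in cases/06. This round confirmed the following as **merged into main and verified live**:
| Item | Repository | Content |
|---|---|---|
| T1 | playbooks | Caddy `/acp*` timeout aligned with the bridge's 60 min (30m→70m) |
| T2 | playbooks | Streaming config added for non-`/acp` routes (`flush_interval -1` + long timeout) |
| T3 | app | Hard deadline added to running-state polling (DeadlineAt) |
| T4 | app | `Stop` made locally authoritative (clears pending) |
| T5 | app | Transport interruption degrades to "continuing in background, reconnecting" with bounded polling (terminal state once 6 attempts are exhausted) |
| T6 | app | Consistency between failure paths and pending cleanup |
| T7/T8/T9 | bridge | Persistent per-session run store, decoupled from the WS connection lifecycle; DeadlineAt falls back to interrupted |
| T10/T11/T12 | bridge | Error semantics (`retryable`/`poll`) + runId threaded through logs + three metrics exposed via `/api/ping` |
| S0 | runtime | Stable plugin installation (real directory + `openclaw plugins install`); all 6 plugins survive a restart |
| S5 | app | Trimmed workspace context in gateway prompts; removed conflicting App-local paths |
**Live health (2026-06-27/28)**: the bridge's running commit == main HEAD (no drift); all three `/api/ping.metrics` values are 0; the gateway is stable with 6 plugins; the four-layer E2E path works end to end (a 438 B summary.md lands in the task scope).
### Known backlog (not in v1.1.5; recorded here)
- **S1**: fallback scan for a missing `expectedArtifactDirs`. It was merged and then reverted (`0280893`→`81f65e3`); it needs to be redone after decoupling "scan hint" from "blocking export".
- **T8b**: persisting the bridge run store across process restarts. This is purely incremental hardening, estimated at ~250-350 lines; not done.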
---
## 3. Release Branch and Tag
Created from `main` in each repo (branch and tag share the **same name**, `release/v1.1.5`):
```bash
git branch release/v1.1.5 main
git tag release/v1.1.5 main
# Same name for branch and tag → push with explicit refspecs to avoid ambiguity
git push origin refs/heads/release/v1.1.5:refs/heads/release/v1.1.5 \
refs/tags/release/v1.1.5:refs/tags/release/v1.1.5
```
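> When working locally with the same-named branch/tag, use `git checkout refs/heads/release/v1.1.5` to disambiguate (note that checking out a fully qualified ref leaves HEAD detached).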
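Because the branch and tag share one name, the two explicit refspecs are easy to mistype. A small sketch that generates them from the release name; `release_refspecs` is a hypothetical helper, not something this repository ships:

```bash
#!/usr/bin/env bash
# release_refspecs NAME: print the explicit branch and tag refspecs for NAME,
# one per line, so `git push` never has to guess between refs/heads and refs/tags.
release_refspecs() {
  local name="$1"
  printf 'refs/heads/%s:refs/heads/%s\n' "$name" "$name"
  printf 'refs/tags/%s:refs/tags/%s\n' "$name" "$name"
}

# Example usage (commented out because it pushes to a remote):
# git push origin $(release_refspecs release/v1.1.5)
```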
---
## 4. Two-Tier Branch Protection (applied)
### 4.1 `main`: light protection (7 repos)
| Rule | Value |
|---|---|
| Require PR | ✅ (`required_approving_review_count: 0`; authors can merge their own PRs) |
| Force-push disabled | ✅ `allow_force_pushes: false` |
| Deletion disabled | ✅ `allow_deletions: false` |
| Admin exemption | ✅ can bypass (`enforce_admins: false`) |
| Status checks / linear history | ❌ not enforced |
### 4.2 `release/v1.1.5`: strict protection (7 repos)
| Rule | Value |
|---|---|
| Direct push disabled | ✅ (via PR) |
| Force-push / deletion disabled | ✅ / ✅ |
| No admin exemption | ✅ `enforce_admins: true` |
| Required review | `required_approving_review_count: 1` + `dismiss_stale_reviews: true` |
| Status checks | `strict: true` (contexts to be added once the source-validation workflow has produced a check) |
| Conversation resolution | ✅ `required_conversation_resolution: true` |
| Linear history | ✅ `required_linear_history: true` (no merge commits) |
> API note: `required_pull_request_reviews`/`required_status_checks`/`restrictions` must be supplied (they may be `null`), otherwise the API returns 422; the `/` in the branch name must be encoded in the path segment as `release%2Fv1.1.5`.
> Writing to `.github/workflows/` requires a token with the `workflow` scope, or pushing over SSH instead (current token scopes: `repo, read:org, project, gist, admin:public_key`, **no workflow**).
---
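## 5. Release Gate: PR Source Validation
`release/*` only accepts PRs from `hotfix/*` or PRs labeled `cherry-pick`/`backport`; direct merges from main/develop/feature are prohibited. Validation is done by `.github/workflows/validate-release-pr.yml` (see tldr-github-branch-model.md §4).
- ⏳ **Status: pending deployment**. Because main now requires PRs, a PR must be opened in each of the 7 repos to merge the workflow; the existing `release/v1.1.5` needs the workflow backported before it actually takes effect for PRs targeting it (`pull_request_target` uses the workflow version from the base branch).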
---
## 6. Review of the First Hotfix PR
**[PR #16](https://github.com/ai-workspace-lab/xworkmate-app/pull/16)**: `fix(artifacts): prioritize PDF deliverables in sidebar`
- Path: `hotfix/pdf-sync-anomaly-release` → `release/v1.1.5` (✅ compliant)
- Change: 2 files, +67/-0; the sort comparator now prioritizes PDFs and root-directory files
- Evidence: `flutter test desktop_thread_artifact_service_test.dart` **4/4 passed**; `flutter analyze` reports **No issues** on the changed files
- Conclusion: **approved, ready to merge** (single commit, satisfies linear history)
- nit (non-blocking): in the PDF depth tie-break, `if(a==pdf)` implies b is also a PDF; a comment could make this explicit
---
## 7. Remaining Actions (pre-release checklist)
- [ ] Deploy `validate-release-pr.yml` to main in the 7 repos (via PR)
- [ ] Backport the workflow to `release/v1.1.5` (hotfix PR) and add its check name to `required_status_checks.contexts`
- [ ] Approve + squash-merge PR #16
- [ ] (Optional) Add existing CI checks (app: `build-and-release`/`pr-tests`/`release-e2e`) to the release/v* required checks as needed
- [ ] Schedule the backlog: redo S1, T8b cross-restart persistence


@ -1,219 +0,0 @@
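# TL;DR: GitHub Branch Model and Release Management Policy
> Applies to all delivery repositories (7 repos). **Core idea: `main` is the integration trunk, "latest but not guaranteed most stable"; `release/v*` is the "frozen, stable release branch"; the two are governed by different rules.**
> First rollout: `release/v1.1.5` (2026-06-28; protection + source validation applied across all 7 repos).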
---
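## 1. One-Line Summary
| Branch | Role | What may enter | Stability |
|---|---|---|---|
| `main` | Integration trunk; receives all kinds of verified PRs | Any reviewed PR | Latest, **not guaranteed most stable** |
| `release/v*` | Frozen release snapshot (e.g. `release/v1.1.5`) | **Only** `hotfix/*` + verified cherry-pick/backport | **Most stable**; under the strictest protection |
| `hotfix/*` | Emergency fixes for a release branch | Cut from `release/v*`; once fixed, PR back into `release/v*` (then backport to `main`) | N/A |
| `feature/*` | Regular feature development | PR → `main`; **must not** go directly into `release/*` | N/A |
> There is no `release/latest`: the "latest release" is expressed by tag/version semantics, so an alias branch is redundant.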
---
## 2. Protected Branch Rules (`release/v*`)
`release/v1.1.5` in the 7 repos has the following GitHub Branch Protection applied (via REST API `PUT /repos/{owner}/{repo}/branches/{branch}/protection`):
| Rule | Setting | Meaning |
|---|---|---|
| Direct push disabled | ✅ (PR enforced) | All changes go through a PR |
| Force push disabled | `allow_force_pushes: false` | History cannot be rewritten |
| Branch deletion disabled | `allow_deletions: false` | The release snapshot cannot be deleted |
| No admin exemption | `enforce_admins: true` | Admins are bound by the same constraints |
| Required PR review | `required_approving_review_count: 1` | At least 1 approval |
| Stale reviews dismissed | `dismiss_stale_reviews: true` | Re-approval is required after a new push |
| Required status checks | `required_status_checks.strict: true` | Includes the source-validation workflow (see §4) |
| Conversation resolution required | `required_conversation_resolution: true` | All review discussions must be resolved |
| Linear history | `required_linear_history: true` | No merge commits; Squash/Rebase only |
### One-Step Apply (any new `release/vX.Y.Z`)
```bash
TOKEN=$(gh auth token)
SLUG=ai-workspace-lab/xworkmate-app # owner/repo
BRANCH=release/vX.Y.Z
curl -s -X PUT \
-H "Authorization: Bearer $TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
-d '{
"enforce_admins": true,
"required_pull_request_reviews": {"required_approving_review_count": 1, "dismiss_stale_reviews": true},
"required_status_checks": {"strict": true, "contexts": []},
"required_conversation_resolution": true,
"required_linear_history": true,
"restrictions": null,
"allow_force_pushes": false,
"allow_deletions": false
}' \
"https://api.github.com/repos/$SLUG/branches/$(python3 -c "import urllib.parse,sys;print(urllib.parse.quote(sys.argv[1],safe=''))" "$BRANCH")/protection"
```
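> ⚠️ The API requires `required_pull_request_reviews`/`required_status_checks`/`restrictions` to be supplied (they may be `null`); otherwise it returns 422 `weren't supplied`.
> ⚠️ The branch name contains `/`; the URL path segment must be encoded as `release%2FvX.Y.Z`.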
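The path-segment encoding can also be done in pure bash when `python3` is not at hand. This is a minimal sketch under one stated assumption: the branch name contains no characters other than `/` that need percent-encoding (true for names such as `release/v1.1.5`). `encode_branch` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# encode_branch NAME: percent-encode every "/" in a branch name for use as a
# single URL path segment. Assumption: "/" is the only character needing it.
encode_branch() {
  local branch="$1"
  printf '%s\n' "${branch//\//%2F}"
}

# Example usage (commented out because it calls the GitHub API):
# curl -s "https://api.github.com/repos/$SLUG/branches/$(encode_branch "$BRANCH")/protection"
```

For arbitrary branch names, a full URL-quoting routine like the `urllib.parse.quote` call in the script above is the safer choice.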
---
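## 3. Release Policy: What May Enter `release/v*`
**Only two kinds of merges are accepted:**
1. ✅ Fixes from a `hotfix/*` branch (cut from the corresponding `release/v*`)
2. ✅ **Cherry-pick / Backport** of a verified feature (PR labeled `cherry-pick` or `backport`)
**Explicitly prohibited:**
- ❌ Direct merges from `develop` / `main` / `master`
- ❌ Direct merges from an ordinary `feature/*`
> Branch Protection itself cannot restrict the source branch, so the GitHub Actions workflow in §4 validates the source on the PR and blocks the merge as a required status check.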
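The policy above can be tried locally before opening a PR. The sketch below mirrors the two acceptance rules as a bash function; `release_pr_allowed` is a hypothetical name, and the comma-separated label format is an assumption chosen to match how a workflow would typically join label names.

```bash
#!/usr/bin/env bash
# release_pr_allowed SRC LABELS: succeed when a PR into release/* is allowed.
#   SRC    - source (head) branch name
#   LABELS - comma-separated label names (assumed format), may be empty
release_pr_allowed() {
  local src="$1" labels="$2"
  # Anchoring on commas means only exact label names match,
  # so a label such as "no-backport-needed" does not pass.
  local label_re='(^|,)(cherry-pick|backport)(,|$)'
  [[ "$src" =~ ^hotfix/ ]] && return 0
  [[ "$labels" =~ $label_re ]] && return 0
  return 1
}

# Example usage:
# release_pr_allowed "feature/login" "bug,backport" && echo "allowed"
```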
---
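## 4. PR Source Validation Workflow (`release/*` only)
GitHub Branch Protection cannot restrict a PR's source branch, so each repo deploys `.github/workflows/validate-release-pr.yml`, which validates the source of PRs whose `base = release/*` and is included in the required status checks.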
```yaml
name: Validate Release PR
on:
  pull_request_target:
    # labeled/unlabeled re-run the check when the PR's labels change
    types: [opened, synchronize, reopened, labeled, unlabeled]
jobs:
  validate-release-source:
    runs-on: ubuntu-latest
    if: startsWith(github.base_ref, 'release/')
    steps:
      - name: Check PR source branch
        run: |
          SRC="${{ github.head_ref }}"
          LABELS="${{ join(github.event.pull_request.labels.*.name, ',') }}"
          if [[ "$SRC" =~ ^hotfix/ ]]; then echo "✅ hotfix/*"; exit 0; fi
          # Anchor on commas so only exact label names match
          if [[ "$LABELS" =~ (^|,)(cherry-pick|backport)(,|$) ]]; then echo "✅ cherry-pick/backport label"; exit 0; fi
          echo "❌ release/* only accepts PRs from hotfix/* or PRs labeled cherry-pick/backport"; exit 1
```
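> After deployment, add the workflow's check name to each repo's `required_status_checks.contexts` so that it becomes a hard merge gate.
> **Status: the workflow file has not been deployed yet** (pending explicit authorization to push to main in the 7 repos).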
---
## 5. Branch Model Flowchart
```mermaid
flowchart TD
subgraph DEV["Day-to-day development"]
F["feature/*"] -->|PR + 1 review| M["main<br/>(integration trunk · latest, not most stable)"]
BUG["bugfix/*"] -->|PR + 1 review| M
end
subgraph RELEASE["Release freeze"]
M -->|cut release branch + tag| R["release/vX.Y.Z<br/>(stable release snapshot · strictest protection)"]
end
subgraph MAINT["Post-release maintenance"]
R -.->|cut from| HF["hotfix/*"]
HF -->|PR ✅ allowed| R
FEAT["verified feature<br/>(cherry-pick/backport label)"] -->|PR ✅ allowed| R
end
M -.->|❌ direct merge prohibited| R
F -.->|❌ direct merge prohibited| R
HF -.->|backport fix to trunk| M
classDef stable fill:#1b5e20,stroke:#fff,color:#fff;
classDef main fill:#0d47a1,stroke:#fff,color:#fff;
classDef work fill:#37474f,stroke:#fff,color:#fff;
class R stable;
class M main;
class F,BUG,HF,FEAT work;
```
**Key paths:**
- Normal flow: `feature/*` → (PR) → `main` → (freeze + tag) → `release/vX.Y.Z`
- Emergency fix: `release/vX.Y.Z` → `hotfix/*` → (PR) → `release/vX.Y.Z` → (backport) → `main`
- Backfilling a feature: a verified commit on `main` → cherry-pick/backport PR (labeled) → `release/vX.Y.Z`
- 🚫 Rejected: `main` / `develop` / `feature/*` directly → `release/*`
---
## 6. Current Rollout Status (2026-06-28)
| Repository | owner/repo | `release/v1.1.5` branch + tag | Protection rules | Source-validation workflow |
|---|---|---|---|---|
| xworkmate-app | `ai-workspace-lab/xworkmate-app` | ✅ | ✅ full | ⏳ pending deployment |
| xworkmate-bridge | `ai-workspace-lab/xworkmate-bridge` | ✅ | ✅ full | ⏳ pending deployment |
| xworkspace-console | `ai-workspace-lab/xworkspace-console` | ✅ | ✅ full | ⏳ pending deployment |
| openclaw-multi-session-plugins | `ai-workspace-lab/openclaw-multi-session-plugins` | ✅ | ✅ full | ⏳ pending deployment |
| playbooks | `ai-workspace-infra/playbooks` | ✅ | ✅ full | ⏳ pending deployment |
| iac_modules | `ai-workspace-infra/iac_modules` | ✅ | ✅ full | ⏳ pending deployment |
| xworkspace-core-skills | `ai-workspace-lab/xworkspace-core-skills` | ✅ | ✅ full | ⏳ pending deployment |
> Branch and tag share the name `release/v1.1.5`: push with explicit refspecs (`refs/heads/...` + `refs/tags/...`); for a local checkout use `git checkout refs/heads/release/v1.1.5` to avoid ambiguity.
> The `iac_modules` remote has been unified to `git@github.com:ai-workspace-infra/iac_modules.git`.
---
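## 7. Remaining Actions
1. ~~**Deploy the source-validation workflow** to main in the 7 repos~~ ✅ 2026-06-28 (via PR + squash merge).
2. ~~Backport the workflow to the existing `release/v1.1.5`~~ ✅ 2026-06-28 (see the emergency procedure in §8).
3. Once the workflow has run for the first time and produced a check name, add it to each repo's `required_status_checks.contexts` so that it becomes a hard merge gate.
4. For each subsequent release (`release/vX.Y.Z`): cut branch + tag → run the §2 one-step script to apply protection.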
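For the per-release step, a dry-run loop over the seven repositories from §6 can help confirm nothing is skipped. This is only a sketch: `plan_release` is a hypothetical helper that prints the plan and executes nothing, and the wording of each printed line is illustrative.

```bash
#!/usr/bin/env bash
# The seven delivery repositories, as listed in the rollout status table.
RELEASE_REPOS=(
  ai-workspace-lab/xworkmate-app
  ai-workspace-lab/xworkmate-bridge
  ai-workspace-lab/xworkspace-console
  ai-workspace-lab/openclaw-multi-session-plugins
  ai-workspace-infra/playbooks
  ai-workspace-infra/iac_modules
  ai-workspace-lab/xworkspace-core-skills
)

# plan_release VERSION: print one planned action line per repository (dry run).
plan_release() {
  local version="$1" slug
  for slug in "${RELEASE_REPOS[@]}"; do
    printf '%s: cut release/%s from main, tag it, then apply protection\n' "$slug" "$version"
  done
}

# Example usage:
# plan_release vX.Y.Z
```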
---
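## 8. Emergency Procedure: Urgent Merges into a Strictly Protected `release/v*`
**When it applies**: `release/v*` must be changed directly (e.g. backporting the gate workflow, an urgent config fix), but **no second reviewer is currently available** (only the PR author's own account exists, code-agent-bot is not ready, etc.).
> ⚠️ This procedure **temporarily lowers** branch protection and is a high-risk operation. Always: ① work on one repo at a time and keep the window minimal; ② restore immediately when done; ③ record the protection state before and after; ④ use it only for small hotfix/backport changes, **never for feature merges**.
### Standard steps (per repo)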
```bash
TOKEN=$(gh auth token); SLUG="<owner/repo>"; PR="<pr_number>"; BR=release/v1.1.5
# Strict-protection payload; only required_approving_review_count is parameterized
protect() { # $1 = review_count
  curl -s -o /dev/null -w "%{http_code}" -X PUT \
    -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.github.v3+json" \
    -d "{\"enforce_admins\":true,\"required_pull_request_reviews\":{\"required_approving_review_count\":$1,\"dismiss_stale_reviews\":true},\"required_status_checks\":{\"strict\":true,\"contexts\":[]},\"required_conversation_resolution\":true,\"required_linear_history\":true,\"restrictions\":null,\"allow_force_pushes\":false,\"allow_deletions\":false}" \
    "https://api.github.com/repos/$SLUG/branches/$(printf %s "$BR" | sed 's#/#%2F#g')/protection"
}
# 1) Create the hotfix branch + PR first (no protection change needed)
git switch -c "hotfix/<fix>" "origin/$BR" # make the change, push, then: gh pr create --base "$BR"
# 2) Temporarily lower required reviews to 0 (window opens)
protect 0
# 3) Squash-merge (linear history requires squash/rebase; merge commits are not allowed)
gh pr merge "$PR" --repo "$SLUG" --squash --delete-branch
# 4) Immediately restore required reviews to 1 (window closes)
protect 1
```
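Since the window must close even when the merge step fails, the lower/merge/restore sequence can be wrapped so that the restore always runs. The following is a sketch under stated assumptions: `with_lowered_reviews` is a hypothetical helper, and the `protect` defined here is an offline stand-in that only logs; in real use, drop the stand-in and rely on the `protect` function from the steps above.

```bash
#!/usr/bin/env bash
# Offline stand-in for protect() so this sketch runs without network access.
# Assumption: the real protect() takes the required review count as $1.
protect() { echo "protect $1"; }

# with_lowered_reviews CMD...: lower required reviews to 0, run CMD,
# then restore to 1 regardless of whether CMD succeeded.
with_lowered_reviews() {
  local status=0
  protect 0 || return 1
  "$@" || status=$?
  protect 1 || { echo "failed to restore protection" >&2; return 2; }
  return "$status"
}

# Example usage (commented out because it merges a real PR):
# with_lowered_reviews gh pr merge "$PR" --repo "$SLUG" --squash --delete-branch
```

The wrapper returns the wrapped command's exit status, so a failed merge is still reported after protection has been restored. It does not inspect the HTTP code that the real `protect` prints; checking that value is left to the caller.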
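### Key constraints
- **Do not touch `enforce_admins`**: keep it `true` and change only the review count.
- **Merge with squash/rebase only**: `required_linear_history: true` forbids merge commits.
- **Status checks remain strict**: with `contexts: []` no check is enforced, so nothing blocks; once a check name is added to contexts, make sure the PR's check has passed before an emergency merge.
- **Three-part verification**: the PR is `MERGED` + the target file exists on the branch + protection is restored to `review=1`.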
```bash
gh pr view "$PR" --repo "$SLUG" --json state --jq '.state' # → MERGED
curl -s -H "Authorization: Bearer $TOKEN" "https://api.github.com/repos/$SLUG/contents/<path>?ref=$BR" | jq -r '.path // .message'
curl -s -H "Authorization: Bearer $TOKEN" "https://api.github.com/repos/$SLUG/branches/${BR/\//%2F}/protection" | jq '.required_pull_request_reviews.required_approving_review_count' # → 1
```
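> **Prefer the regular path**: when a second reviewer is available, use `hotfix/* → PR → approval by someone else → squash merge`; no protection change is needed. Use the emergency procedure only when no reviewer is available and the change is urgent.
>
> **First execution record (2026-06-28)**: `validate-release-pr.yml` was backported to `release/v1.1.5` in 7 repos through PRs `app#21 / bridge#12 / console#3 / plugins#3 / playbooks#20 / iac#213 / core-skills#4`; all are MERGED and protection is restored to `review=1`.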

View File

@ -1,30 +0,0 @@
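# Infrastructure-as-Code Module Repository Documentation
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
## Current status snapshot
- Root README title: `🌐 CloudNativeSuite`
- Build and runtime evidence: repository structure and scripts only
- Automatically detected main directories: `scripts/`, `example/`
- Existing document count: 8
## Core bilingual documents
- [Architecture](architecture.md)
- [Design](design.md)
- [Deployment](deployment.md)
- [User Guide](user-guide.md)
- [Developer Guide](developer-guide.md)
- [Vibe Coding Reference](vibe-coding-reference.md)
## Legacy documents pending consolidation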
- `AWS-Landing-Zone-Baseline.md`
- `Alicloud-Landing-Zone-Baseline.md`
- `CloudNeutral-Architecture-Blueprint-2025.md`
- `Multi-Cloud-Landing-Zone-Planning.md`
- `Vultr-Landing-Zone-Baseline.md`
- `cilium-egress-vxlan-crosscluster.md`
- `landingzone/alicloud-landingzone-mvp-single-account.md`
- `virtual-cloud-README.md`

View File

@ -1,24 +0,0 @@
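# Architecture
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page is the bilingual overview entry point for system boundaries, core components, and repository responsibilities.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- `CloudNeutral-Architecture-Blueprint-2025.md`
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- Keep diagrams and responsibility descriptions in sync as the directory structure, service relationships, and integration dependencies change.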

View File

@ -1,24 +0,0 @@
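# Deployment
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page consolidates deployment prerequisites, supported topologies, operational checks, and rollback considerations.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- No directly corresponding historical document has been found yet; this page is currently the canonical starting point for this category.
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- Before each release, re-verify the deployment steps against the current scripts, manifests, CI/CD pipelines, and environment contracts.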

View File

@ -1,24 +0,0 @@
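# Design
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page collects design decisions, ADR-style trade-off records, and implementation notes related to the roadmap.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- `Multi-Cloud-Landing-Zone-Planning.md`
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- When behavior, APIs, or deployment contracts change, promote one-off implementation notes into reusable design records.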

View File

@ -1,24 +0,0 @@
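# Developer Guide
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page records the local development environment, project structure, test surface, and contribution conventions that fit the current codebase.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- No directly corresponding historical document has been found yet; this page is currently the canonical starting point for this category.
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- Keep environment setup and test commands mapped to scripts, Make targets, or language toolchains that actually exist.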

View File

@ -1,24 +0,0 @@
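# User Guide
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page records the day-to-day tasks and common workflows of the main user and operator roles, plus entry points to existing operational documents.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- No directly corresponding historical document has been found yet; this page is currently the canonical starting point for this category.
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- Prefer workflow-oriented examples, and make sure screenshots or terminal snippets match the latest UI/CLI behavior.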

View File

@ -1,24 +0,0 @@
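# Vibe Coding Reference
This repository contains infrastructure modules, baseline blueprints, and platform Landing Zone reference material.
This page unifies AI-assisted development prompts, repository boundaries, safe-editing rules, and documentation sync requirements.
## Notes aligned with the current code
- Target repository: `iac_modules`
- Repository type: `infra-modules`
- Build and runtime basis: repository structure and scripts only
- Main implementation and operations directories: `scripts/`, `example/`
- `package.json` scripts snapshot: No package.json scripts were detected.
## Existing documents still to be consolidated
- No directly corresponding historical document has been found yet; this page is currently the canonical starting point for this category.
## What to add to this page next
- Describe what is already implemented first, then add future plans; avoid writing only the vision without the current state.
- Keep terminology consistent with the root README, the build manifests, and the actual directory layout.
- Gradually link and consolidate the historical runbooks, specs, and subsystem notes listed above into this page.
- When the project adds subsystems, protected directories, or mandatory verification steps, update the prompt templates and repository rules accordingly.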

View File

@ -1,116 +0,0 @@
# Skill: release-branch-policy
## Purpose
Standardize release branch policy across Cloud-Neutral Toolkit repos:
- `main` is the **preview** branch (fast iteration, integrates frequently).
- `release/*` branches are **production release lines** and must be protected.
- Updates to `release/*` happen via **local cherry-pick** by release managers (process gate).
This skill includes:
- A policy doc (this file)
- A ruleset JSON template (GitHub Rulesets API)
- A `gh` script to apply the ruleset to one or many repos
- A sync script to copy this skill into all local sub-repos
- A script to generate a cross-repo release manifest (for tag association)
Non-goals:
- This skill does NOT create/push `release/v0.1` or tags automatically.
## Policy
### Branch Roles
- `main`: preview
- Accepts PRs and merges normally.
- May be ahead of production at any time.
- `release/*`: production
- No force-push.
- Require linear history.
- Prefer "cherry-pick into release branch" as the only change mechanism (process).
- Restrict who can update release branches to release managers (enforced via GitHub Rulesets/Branch protection UI).
### “Cherry-Pick Only” Clarification
GitHub branch rules cannot reliably guarantee "only cherry-pick" as a technical constraint.
We treat it as a **process rule**:
1. A change lands in `main`.
2. Release manager cherry-picks specific commits onto `release/<version>`.
3. Release manager pushes the updated release branch.
### “No PR / No Push” Clarification
If you literally forbid both:
- PR merges to `release/*`, and
- any push to `release/*`
then the branch becomes non-updatable.
What we implement is:
- No force-push, no deletion, linear history (enforceable).
- Only release managers can update `release/*` (enforceable via "restrict updates" / bypass actors).
- "Cherry-pick only" (process rule).
### Tags
For milestone releases like `v0.1`:
- Use an annotated tag named `v0.1` (per-repo).
- Prefer tags on `release/<version>` tip.
If you need SemVer tags, follow governance: `<repo>-vX.Y.Z`.
### Cross-Repo Tag Association
Git tags are per-repo; GitHub does not provide a first-class "one tag links all repos" concept.
We represent "release v0.1 across repos" by committing a **release manifest** file in the control repo, generated from local git state:
- repo name
- release branch tip SHA
- tag tip SHA
Use: `skills/release-branch-policy/scripts/generate_release_manifest.sh v0.1`
## Ruleset Requirements (release/*)
Enforce at minimum:
- block deletion
- block force-push (non-fast-forward)
- require linear history
Optional (recommended if you have stable CI):
- require status checks
- require signed commits
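As a rough sketch of how the minimum requirements above could be checked before a payload is applied, the snippet below compares a ruleset document against the three mandatory rule types. The helper name `missing_rule_types` is hypothetical; the rule type strings are the ones used in this skill's ruleset JSON template.

```python
# Rule types this policy treats as mandatory for release/* branches.
REQUIRED_RULE_TYPES = {"deletion", "non_fast_forward", "required_linear_history"}


def missing_rule_types(ruleset: dict) -> set:
    """Return the mandatory rule types that the given ruleset does not declare."""
    present = {rule.get("type") for rule in ruleset.get("rules", [])}
    return REQUIRED_RULE_TYPES - present


if __name__ == "__main__":
    # A deliberately incomplete example ruleset (illustrative data only).
    example = {"rules": [{"type": "deletion"}, {"type": "non_fast_forward"}]}
    print(sorted(missing_rule_types(example)))
```

Running such a check in CI or before `apply_ruleset.sh` would catch a template edit that silently drops one of the enforced rules.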
## Tools
### 1) Apply Ruleset (GitHub Rulesets)
Script: `skills/release-branch-policy/scripts/apply_ruleset.sh`
- Applies (create/update) a repo ruleset targeting `refs/heads/release/*`
- Uses `gh api` and a JSON payload
- Does not modify branches/tags
### 2) Sync Skill Into All Local Sub-Repos
Script: `skills/release-branch-policy/scripts/sync_skill_to_subrepos.sh`
- Copies this skill folder into each local repo under `/Users/shenlan/workspaces/cloud-neutral-toolkit/*`
- Skips repos without `.git`
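- Replaces any existing `skills/release-branch-policy` folder in each target repo (the script removes it before copying) and skips the source repo itself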
### 3) Generate Release Manifest (Cross-Repo Association)
Script: `skills/release-branch-policy/scripts/generate_release_manifest.sh`
- Generates `releases/<version>.yaml` in the current working directory (default)
- Does not push or create refs
## Operator Checklist
- Confirm `main` is treated as preview across repos (docs + CI naming).
- Apply ruleset to every repo that has production releases.
- Document "cherry-pick only" in release runbooks.
- Verify bypass actors (release managers) in GitHub UI if needed.

View File

@ -1,17 +0,0 @@
{
"name": "Release Branch Protection (release/*)",
"target": "branch",
"enforcement": "active",
"conditions": {
"ref_name": {
"include": ["refs/heads/release/*"],
"exclude": []
}
},
"rules": [
{ "type": "deletion" },
{ "type": "non_fast_forward" },
{ "type": "required_linear_history" }
]
}

View File

@ -1,57 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'EOF'
Apply GitHub Ruleset to protect release/* branches.
Usage:
apply_ruleset.sh <owner/repo> [<owner/repo> ...]
Notes:
- Requires: gh (authenticated), jq
- Does NOT create/push branches or tags.
- Ruleset payload is in: skills/release-branch-policy/references/ruleset.release-branches.json
EOF
}
if [[ $# -lt 1 || "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
usage
exit 0
fi
if ! command -v gh >/dev/null 2>&1; then
echo "missing: gh" >&2
exit 1
fi
if ! command -v jq >/dev/null 2>&1; then
echo "missing: jq" >&2
exit 1
fi
SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
PAYLOAD_FILE="${SKILL_DIR}/references/ruleset.release-branches.json"
if [[ ! -f "${PAYLOAD_FILE}" ]]; then
echo "payload not found: ${PAYLOAD_FILE}" >&2
exit 1
fi
NAME="$(jq -r '.name' < "${PAYLOAD_FILE}")"
for OWNER_REPO in "$@"; do
echo ">>> ${OWNER_REPO}"
# Find existing ruleset by name.
existing_id="$(
gh api "repos/${OWNER_REPO}/rulesets" --jq ".[] | select(.name == \"${NAME}\") | .id" 2>/dev/null || true
)"
if [[ -n "${existing_id}" ]]; then
echo "Updating ruleset id=${existing_id}"
gh api -X PUT "repos/${OWNER_REPO}/rulesets/${existing_id}" --input "${PAYLOAD_FILE}" >/dev/null
else
echo "Creating ruleset"
gh api -H "Accept: application/vnd.github+json" -X POST "repos/${OWNER_REPO}/rulesets" --input "${PAYLOAD_FILE}" >/dev/null
fi
done

View File

@ -1,116 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'EOF'
Generate a cross-repo release manifest (read-only) from local git state.
Usage:
generate_release_manifest.sh <version> [--base <dir>] [--out <file>]
Examples:
generate_release_manifest.sh v0.1
generate_release_manifest.sh v0.1 --out releases/v0.1.yaml
generate_release_manifest.sh v0.1 --base /Users/shenlan/workspaces/cloud-neutral-toolkit
Notes:
- This script does NOT create/push branches or tags.
- It inspects local refs only; if your local remotes are stale, run 'git fetch --all --tags' per repo first.
- "Cross-repo association" is represented by this manifest file (repo -> release branch tip + tag tip).
EOF
}
if [[ $# -lt 1 || "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
usage
exit 0
fi
VERSION="$1"
shift || true
BASE="/Users/shenlan/workspaces/cloud-neutral-toolkit"
OUT=""
while [[ $# -gt 0 ]]; do
case "$1" in
--base)
BASE="${2:-}"
shift 2
;;
--out)
OUT="${2:-}"
shift 2
;;
*)
echo "unknown arg: $1" >&2
usage >&2
exit 2
;;
esac
done
if [[ -z "${OUT}" ]]; then
mkdir -p "releases"
OUT="releases/${VERSION}.yaml"
fi
if [[ ! -d "${BASE}" ]]; then
echo "missing base dir: ${BASE}" >&2
exit 1
fi
REL_BRANCH="release/${VERSION}"
tmp="$(mktemp)"
trap 'rm -f "$tmp"' EXIT
{
echo "version: ${VERSION}"
echo "generated_at_utc: \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\""
echo "base_dir: \"${BASE}\""
echo "release_branch: \"${REL_BRANCH}\""
echo "repos:"
} >"$tmp"
for d in "${BASE}"/*; do
[[ -d "$d" ]] || continue
[[ -d "$d/.git" ]] || continue
name="$(basename "$d")"
remote_url="$(cd "$d" && git config --get remote.origin.url 2>/dev/null || true)"
rel_ref=""
rel_sha=""
if (cd "$d" && git show-ref --verify --quiet "refs/remotes/origin/${REL_BRANCH}"); then
rel_ref="refs/remotes/origin/${REL_BRANCH}"
rel_sha="$(cd "$d" && git rev-parse "refs/remotes/origin/${REL_BRANCH}")"
elif (cd "$d" && git show-ref --verify --quiet "refs/heads/${REL_BRANCH}"); then
rel_ref="refs/heads/${REL_BRANCH}"
rel_sha="$(cd "$d" && git rev-parse "refs/heads/${REL_BRANCH}")"
fi
tag_sha=""
if (cd "$d" && git show-ref --tags --quiet --verify "refs/tags/${VERSION}"); then
tag_sha="$(cd "$d" && git rev-parse "${VERSION}^{}" 2>/dev/null || git rev-parse "${VERSION}" 2>/dev/null || true)"
fi
  {
    echo "  - name: \"${name}\""
    echo "    path: \"${d}\""
    if [[ -n "${remote_url}" ]]; then
      echo "    remote: \"${remote_url}\""
    else
      echo "    remote: \"\""
    fi
    echo "    release:"
    echo "      branch: \"${REL_BRANCH}\""
    echo "      ref: \"${rel_ref}\""
    echo "      sha: \"${rel_sha}\""
    echo "    tag:"
    echo "      name: \"${VERSION}\""
    echo "      sha: \"${tag_sha}\""
  } >>"$tmp"
done
mv "$tmp" "$OUT"
echo "wrote: ${OUT}"

View File

@ -1,55 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'EOF'
Copy this skill into all local Cloud-Neutral Toolkit sub-repos.
Usage:
sync_skill_to_subrepos.sh
Copies:
skills/release-branch-policy -> <repo>/skills/release-branch-policy
Notes:
- Local path root: /Users/shenlan/workspaces/cloud-neutral-toolkit
- Skips directories without .git
EOF
}
if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
usage
exit 0
fi
SRC_SKILL="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
SRC_REPO_ROOT="$(cd "${SRC_SKILL}/../.." && pwd)"
realpath_py() {
python3 - "$1" <<'PY'
import os, sys
print(os.path.realpath(sys.argv[1]))
PY
}
BASE="/Users/shenlan/workspaces/cloud-neutral-toolkit"
if [[ ! -d "${BASE}" ]]; then
echo "missing base dir: ${BASE}" >&2
exit 1
fi
for d in "${BASE}"/*; do
[[ -d "$d" ]] || continue
[[ -d "$d/.git" ]] || continue
# Don't delete our own source while syncing.
if [[ "$(realpath_py "$d")" == "$(realpath_py "$SRC_REPO_ROOT")" ]]; then
echo ">>> skipping source repo $d"
continue
fi
mkdir -p "$d/skills"
echo ">>> syncing to $d"
rm -rf "$d/skills/release-branch-policy"
cp -R "${SRC_SKILL}" "$d/skills/release-branch-policy"
done

View File

@ -1,86 +0,0 @@
# Skill: terraform-yaml-render-pattern
## Purpose
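Binding rule: resources under `iac_modules/terraform-hcl-standard/**` that can be created in batches (typically multi-host VPS) **must** follow the pattern "YAML description → Python+Jinja2 renders explicit HCL blocks → Terraform apply → Python merges and generates CMDB/Inventory"; loops **must not** be written in HCL.
This is a binding skill; wherever another practice conflicts with this file, this file wins.
Companion constraints: see `iac_modules/terraform-hcl-standard/AGENTS.md`.
Reference implementation (the baseline; new envs follow this structure): `terraform-hcl-standard/vultr-vps/envs/ai-workspace/`
## Pattern (mandatory data flow)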
```
hosts.yaml (the single hand-edited entry point: resource description / CMDB source)
   │  generate.py render: loops happen on the Python+Jinja2 side
   ├─▶ generated_hosts.tf            one explicit module/resource/data block per instance / per key
   └─▶ terraform.auto.tfvars.json ──▶ variables.tf (global section -> variables)
   ▼  terraform apply
output "cmdb_runtime" (runtime facts only: ip / instance_id / resolved os_id)
   │  generate.py inventory: merges static YAML fields with runtime output
   ├─▶ cmdb.json (the IaC ↔ Ansible contract)
   └─▶ inventory.ini
   ▼  Ansible dynamic inventory: playbooks/inventory/terraform_cmdb.py (reads cmdb.json only)
```
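The render step in the data flow above can be sketched as follows. This is a simplified illustration, not the repository's `generate.py`: it swaps Jinja2 for plain `str.format` to stay dependency-free, and the module arguments `label` and `plan` as well as the `host_` name prefix are assumptions made for the example.

```python
import re


def tf_id(value) -> str:
    """Turn an arbitrary name into a Terraform-safe identifier."""
    return re.sub(r"[^0-9a-zA-Z_]", "_", str(value))


# One explicit module block per host; the argument names are hypothetical.
BLOCK = '''module "host_{ident}" {{
  source = "../../modules/compute"
  label  = "{name}"
  plan   = "{plan}"
}}
'''


def render_hosts(hosts) -> str:
    """Expand the host list into explicit blocks; the loop lives in Python, not HCL."""
    return "\n".join(
        BLOCK.format(ident=tf_id(h["name"]), name=h["name"], plan=h["plan"])
        for h in hosts
    )


if __name__ == "__main__":
    sample = [
        {"name": "ai-debian13", "plan": "vc2-2c-4gb"},
        {"name": "ai-ubuntu2604", "plan": "vc2-2c-4gb"},
    ]
    print(render_hosts(sample))
```

Because every host becomes its own named block, adding or removing a host changes only that block in the generated file, which keeps diffs small and resource addresses stable.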
## Rules
### MUST NOT
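- Do not use `for_each` / `count` / `dynamic` in an env's `.tf` files.
- Do not render with `templatefile()` plus HCL template control structures such as `%{ for }` / `%{ if }`.
### MUST
- Resource information is described in a `hosts.yaml` declaration (kept under `config/resources/`, per the reference layout in this file); multiple resources are expanded by Jinja2 into **multiple uniquely named explicit blocks**.
- The YAML global section is passed to `variables.tf` via `terraform.auto.tfvars.json`; per-instance fields reach the `.tf` through Jinja2.
- Secrets go through environment variables (for example `TF_VAR_vultr_api_key`) and are **forbidden** in YAML/tfvars; public keys may be stored in YAML.
- The shared `scripts/generate.py` (parameterized by `--resources`/`--workdir`) provides the two subcommands `render` and
`inventory` (responsibilities as in the diagram above); do not keep a copy in each env.
- Terraform outputs only facts that are known at runtime; static fields (os_name/plan/groups/host_vars…) are merged by Python.
- Rendered artifacts (`generated_hosts.tf`, `terraform.auto.tfvars.json`, `cmdb.json`, `inventory.ini`)
are added to `.gitignore` and not committed.
- In `inventory.ini`, host_var values that contain spaces are quoted (`key="a b c"`).
- The Ansible dynamic inventory consumes only `cmdb.json` and is not coupled directly to tfstate; after an IaC change, rerun `generate.py inventory`.
### SHOULD
- Reuse existing modules such as `modules/compute`; do not rewrite provider resources inside an env.
- Every submodule that uses a provider declares `required_providers` (with the correct `source`).
- Resolve `os_id` from `os_name` with `data "vultr_os"` to avoid hard-coded IDs that drift; when the name cannot be resolved, giving `os_id` directly is allowed.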
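The quoting rule for `inventory.ini` in the MUST list can be sketched like this. It follows the approach of quoting every host_var value, so values with spaces are always covered. The function name `inventory_line` is illustrative, and the sketch does not escape double quotes embedded in a value.

```python
def inventory_line(hostname, ansible_host, ansible_user, host_vars) -> str:
    """Build one INI inventory line with every host_var value double-quoted."""
    parts = [
        hostname,
        f"ansible_host={ansible_host}",
        f"ansible_user={ansible_user}",
    ]
    for key, value in host_vars.items():
        # Quote unconditionally so values such as "a b c" survive INI parsing.
        parts.append(f'{key}="{value}"')
    return " ".join(parts)


if __name__ == "__main__":
    # Example data only; the address is from a documentation range.
    print(inventory_line("web1.example.com", "203.0.113.10", "root", {"note": "a b c"}))
```

Assembling the whole line in Python keeps the Jinja2 template down to a plain expression, which avoids whitespace surprises from template block trimming.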
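## Reference Layout (layered by responsibility)
Declaration, reusable templates, and composition are separated into three layers; the env directory is only the working directory:
```
<provider>-vps/
  config/resources/<name>-hosts.yaml          # declaration: the single hand-edited entry point (global / ssh_keys / hosts)
  templates/                                  # shared: reusable .tf files and Jinja2 templates
    provider.tf variables.tf cloud-init.yaml  # shared .tf/config, copied into the workdir on render
    hosts.tf.j2 inventory.ini.j2              # rendering templates
  scripts/                                    # shared: composition logic (independent of any specific env)
    generate.py                               # render + inventory, parameterized by --resources / --workdir
    provision.sh                              # one step: render -> apply -> inventory -> (optional) ansible
  modules/<resource>/                         # reused resource modules
  envs/<name>/                                # working directory (terraform workdir)
    README.md .gitignore                      # the only two committed files; the rest are rendered artifacts + tfstate
    # rendered artifacts (land in the workdir, not committed): provider.tf / variables.tf / cloud-init.yaml /
    #   generated_hosts.tf / terraform.auto.tfvars.json / cmdb.json / inventory.ini
```
> Three shared layers: **declarations** live in `config/resources/`, **reusable .tf files and templates** live in `templates/`,
> and **composition logic** lives in `scripts/`; the env is reduced to a working directory. `scripts/generate.py render` copies
> the provider/variables/cloud-init files from `templates/` into the workdir and renders `generated_hosts.tf`,
> so that the workdir becomes a root module Terraform can run on its own. Adding another set of hosts only takes one more
> `config/resources/*.yaml` plus one workdir, reusing the same scripts and templates.
## Operator Checklist (self-check before committing)
- `terraform fmt` produces no diff and `terraform validate` passes.
- `python3 generate.py render` produces valid `.tf` (`validate` passes).
- The generated `inventory.ini` is parsed correctly by `ansible-inventory -i <file> --graph`.
- Rendered artifacts are ignored by `.gitignore`; no secrets are committed.
- The HCL contains no `for_each`/`count`/`dynamic`/`templatefile` control structures.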
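The last checklist item lends itself to automation. The following is a heuristic sketch only: it uses regular expressions rather than an HCL parser, so it does not skip comments or string literals, and it flags any `templatefile(` call, which is stricter than the rule itself. The names `FORBIDDEN` and `forbidden_constructs` are hypothetical.

```python
import re

# Patterns for the constructs this policy bans from env-level .tf files.
FORBIDDEN = {
    "for_each": re.compile(r"^\s*for_each\s*=", re.M),
    "count": re.compile(r"^\s*count\s*=", re.M),
    "dynamic": re.compile(r'^\s*dynamic\s+"', re.M),
    "templatefile": re.compile(r"\btemplatefile\s*\("),
}


def forbidden_constructs(hcl_text: str) -> list:
    """Return the sorted names of banned constructs found in the given HCL text."""
    return sorted(name for name, pattern in FORBIDDEN.items() if pattern.search(hcl_text))


if __name__ == "__main__":
    sample = 'resource "vultr_instance" "a" {\n  count = 2\n}\n'
    print(forbidden_constructs(sample))
```

A check like this could run over the rendered `generated_hosts.tf` as a pre-commit or CI step, failing when the returned list is non-empty.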

View File

@ -1,92 +0,0 @@
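# AGENTS.md: Terraform module constraints (mandatory)
Scope: `iac_modules/terraform-hcl-standard/**`. This file is a **binding specification**;
it **must be followed** whenever a Terraform env is created or modified under this directory. In case of conflict, this file wins.
## 0. Core pattern (MUST)
A new env that orchestrates resources in batches (for example multi-host VPS) **must** use the data flow below
and **must not** loop inside HCL:
```
YAML describes the resources
   │ (rendered by Python + Jinja2; loops happen here)
Explicit Terraform module/resource blocks ── one independent block per instance/resource, no for_each
terraform apply
Python merges "static YAML information" + "Terraform runtime output"
CMDB (cmdb.json) + Ansible inventory (inventory.ini / dynamic inventory)
```
**Reference implementation (the baseline; new envs follow this structure):**
`vultr-vps/envs/ai-workspace/`
**Companion binding skill (full rules and self-check list):**
`../skills/terraform-yaml-render-pattern/SKILL.md`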
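## 1. Ban on HCL control structures (MUST NOT)
- ❌ Do not use `for_each`, `count`, or `dynamic` blocks in an env's `.tf` files.
- ❌ Do not render with `templatefile()` plus HCL template control structures such as `%{ for }`/`%{ if }`.
- ✅ Pure functions are allowed: `concat`, `merge`, `coalesce`, `jsonencode`, and so on (they are not control structures).
- ✅ "Multiple copies" of a resource are expanded by Python+Jinja2 at generation time into **multiple explicit blocks**;
each host / each key maps to one uniquely named `module`/`resource`/`data` block.
> Rationale: loops and conditions are concentrated in one place (Python/Jinja2), HCL stays flat, auditable, and
> diff-friendly, resource names stay stable, and rebuilds caused by drifting for_each keys are avoided.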
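## 2. Resource description and variable passing (MUST)
- Resource information **must** be declared in `config/resources/<name>-hosts.yaml` (or an equivalent `*.yaml`) as the single hand-edited entry point;
it is **not** placed in the env directory.
- The YAML global section is rendered into `terraform.auto.tfvars.json` and **passed to `variables.tf`**;
per-instance fields are expanded by Jinja2 into the generated `.tf`.
- Secrets (API keys, private keys, and the like) are **forbidden** in YAML / tfvars and **must** go through environment variables
(for example `TF_VAR_vultr_api_key`). Public keys may be stored in YAML.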
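## 3. Renderer conventions (MUST)
Composition logic **must** be concentrated in the shared `scripts/generate.py` (parameterized by `--resources`/`--workdir`,
not copied into each env), with at least two subcommands:
- `render`: `config/resources/*.yaml` → `generated_hosts.tf` in the workdir +
`provider.tf`/`variables.tf`/`cloud-init.yaml` (copied from `templates/`) + `terraform.auto.tfvars.json`.
- `inventory`: `terraform output` (runtime facts) + YAML (static fields) →
`cmdb.json` + `inventory.ini`.
Conventions:
- Shared scripts go in `scripts/`; shared .tf files and Jinja2 templates go in `templates/` (`provider.tf`/
`variables.tf`/`cloud-init.yaml`/`*.j2`); the env serves only as the terraform working directory;
identifiers are sanitized into valid HCL names by the `tf_id` filter.
- Terraform outputs only facts that are **known at runtime** (such as `cmdb_runtime`: ip / instance_id /
resolved os_id). Static fields (os_name/plan/groups/host_vars and so on) are merged from YAML by Python.
- In `inventory.ini`, host_var values that contain spaces **must** be quoted (`key="a b c"`).
- Rendered artifacts (`generated_hosts.tf`, `terraform.auto.tfvars.json`, `cmdb.json`,
`inventory.ini`) are derived files and **must** be added to `.gitignore` and not committed.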
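One consequence of sanitizing names with a `tf_id`-style filter is that two different YAML names can collapse into the same HCL identifier. The sketch below shows the substitution and a collision check that a renderer could run before writing any `.tf`. The helper `tf_id_collisions` is a hypothetical addition, and names that begin with a digit are not handled here.

```python
import re
from collections import defaultdict


def tf_id(value) -> str:
    """Replace every character outside [0-9a-zA-Z_] with an underscore."""
    return re.sub(r"[^0-9a-zA-Z_]", "_", str(value))


def tf_id_collisions(names) -> dict:
    """Map each sanitized identifier to the original names when more than one shares it."""
    seen = defaultdict(list)
    for name in names:
        seen[tf_id(name)].append(name)
    return {ident: group for ident, group in seen.items() if len(group) > 1}


if __name__ == "__main__":
    # "web-1" and "web_1" both sanitize to "web_1" and would produce duplicate blocks.
    print(tf_id_collisions(["web-1", "web_1", "db"]))
```

Failing fast on a non-empty result avoids duplicate block names that Terraform would otherwise reject only at `validate` time.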
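## 4. Module reuse and OS resolution (SHOULD)
- Instances **should** reuse existing modules such as `modules/compute` rather than rewrite provider resources inside an env.
- Every submodule that uses a provider **must** declare `required_providers` (with the correct `source`);
otherwise Terraform wrongly assumes `hashicorp/<name>`.
- The OS **should** be resolved from `os_name` to `os_id` with `data "vultr_os"`, to avoid hard-coded IDs that drift;
when the name cannot be resolved, giving `os_id` directly in the YAML is allowed.
## 5. IaC ↔ Ansible integration (MUST)
- The CMDB (`cmdb.json`) is the contract between IaC and Ansible. The dynamic inventory on the Ansible side,
`playbooks/inventory/terraform_cmdb.py`, consumes **only** `cmdb.json` and is not coupled directly to tfstate.
- After an IaC change, `generate.py inventory` **must** be rerun to refresh the CMDB/inventory and keep the two consistent.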
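## 6. Self-check before committing (MUST)
- `terraform fmt` produces no diff and `terraform validate` passes.
- `python3 generate.py render` produces valid `.tf` (`validate` passes).
- The generated `inventory.ini` is parsed correctly by `ansible-inventory -i <file> --graph`.
- Confirm that rendered artifacts are ignored by `.gitignore` and that no secrets are committed.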

View File

@ -1,47 +0,0 @@
# =============================================================================
# ai-workspace resource description (CMDB source data)
#
# This is the single hand-maintained entry point. generate.py reads this file:
#   - global   -> terraform.auto.tfvars.json (passed to variables.tf)
#   - ssh_keys -> explicit vultr_ssh_key resource blocks in generated_hosts.tf
#   - hosts    -> explicit per-host module/data blocks in generated_hosts.tf (no for_each)
#
# After editing the YAML, run: python3 generate.py render
# =============================================================================
global:
  region: nrt # default region: Tokyo. Alternatives: ewr/sgp/fra ...
  name_prefix: ai-workspace
  user_data_file: cloud-init.yaml
# Login public keys injected into instances (public keys are not sensitive and may be committed; do not put private keys or API keys here)
ssh_keys:
  - name: ai-workspace-admin
    public: "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDEsuS135lzjVvlH2iNrKz23lDFr7b686xs4d2HINP2glFPmgkgx1D6Dqwisb1UbhWHZmUUzRxXeNlE8fiaO0TXN/C0dsdUxgopnQRyakcA+gfJqqb38Syx8eqdC7mQy9ygOf763dWm6d/SYZ8WgNWLldk4QF9DiZOW9K22DMtY4/1Cqe/YE/WGpOMVr9T9BwvmOjarjWp2OPbx6RVlSOd735Mze5X+cJ9QqdLaisCiSoJ3j9S6dulcxm+7ghPfATvxlJyZWSrRrVqnmV45lPbeuUHlIEyuK1PK2MS6NtUP03ZhdRYJQKZLECpR5xAO/BliOtDdRornvHV1gutYD8/n3IS8sRVzYPvN9DuOhzBnBQUgciu2++R8zMfdVoH7mSbsE8u++vMcBk3UJ1Op0Ct+trl2bsnue96cAnoiII08JKwAaczD5uZIGhdkGV8zKnChNCjzCxP0i4PV/MYW04eWmH+E8G81zq4ZsvrvPYmilBbRrkwHvvbPba3SSb2F2As= shenlan@shenlandeMacBook-Air-2.local"
# Host list. The name field acts as the map key; each machine is rendered as one independent module block.
hosts:
  - name: ai-debian13
    os_name: "Debian 13 x64 (trixie)" # actual Vultr image name (includes trixie); os_id: 2625 may be written instead
    plan: vc2-2c-4gb # 2 vCPU / 4 GB (temporarily downsized; switch back to vc2-4c-8gb after validation)
    backups: false # backups disabled
    enable_ipv6: true # a public IPv4 is assigned by default; IPv6 is enabled in addition
    ansible_user: root
    groups: [ai_workspace, debian]
    tags: [debian]
    host_vars:
      role: primary
      # Comma-separated service domains; the cloudflare_dns role creates A records for this host's IP from them
      service_domains: xworkmate-bridge-debian-13.svc.plus
  - name: ai-ubuntu2604
    os_name: "Ubuntu 26.04 LTS x64"
    plan: vc2-2c-4gb
    backups: false
    enable_ipv6: true
    ansible_user: root
    groups: [ai_workspace, ubuntu]
    tags: [ubuntu]
    host_vars:
      role: secondary
      service_domains: xworkmate-bridge-ubuntu-26.svc.plus

View File

@ -1,19 +0,0 @@
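# Artifacts rendered by generate.py (derived from config/resources/*.yaml and templates/, copied or
# rendered into this working directory; none of them are committed)
generated_hosts.tf
provider.tf
variables.tf
cloud-init.yaml
terraform.auto.tfvars.json
cmdb.json
inventory.ini
# S3-compatible backend rendered by render_backend_tf.py (contains the endpoint, derived per environment; not committed)
backend.tf
# terraform runtime files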
.terraform/
.terraform.lock.hcl
*.tfstate
*.tfstate.*
terraform.tfvars
__pycache__/

View File

@ -1,94 +0,0 @@
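# ai-workspace env: YAML resource description + Python/Jinja2 rendering + Ansible dynamic inventory
Built on the `vultr-vps` module, this env links **host creation by IaC ↔ Ansible inventory** while honoring these constraints:
- **No HCL control structures** (no `for_each`/`count`/`dynamic`/`templatefile` loops).
- **Python + Jinja2 render the YAML** into explicit Terraform `module`/`resource`/`data` blocks.
- Resource declarations live in the shared `config/resources/ai-workspace-hosts.yaml`; the global section is passed to `variables.tf` via `terraform.auto.tfvars.json`.
By default two machines are created (both **4 vCPU / 8 GB, public IP, backups off**): `ai-debian13` (Debian 13) and `ai-ubuntu2604` (Ubuntu 26.04).
## Directory layers (declaration / shared templates / shared scripts / working directory)
```
vultr-vps/
  config/resources/ai-workspace-hosts.yaml   # declaration: the single hand-edited entry point / CMDB source
  templates/                                 # shared: provider / variables / cloud-init / rendering templates
    provider.tf variables.tf cloud-init.yaml hosts.tf.j2 inventory.ini.j2
  scripts/                                   # shared: composition logic (reusable by several declarations)
    generate.py provision.sh
  modules/{compute,iam,...}                  # reused resource modules
  envs/ai-workspace/                         # working directory (terraform workdir): only README/.gitignore
                                             # everything else is rendered artifacts + tfstate, all gitignored
```
> This env has been reduced to a "working directory": composition logic moved to the shared `scripts/`, and reusable .tf files
> and templates moved to the shared `templates/`. Adding another set of hosts only takes one `config/resources/<name>-hosts.yaml`
> and one workdir, reusing the same scripts and templates.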
## Data flow
```
config/resources/ai-workspace-hosts.yaml
   │  scripts/generate.py render --resources <yaml> --workdir <env>
   │  (reads templates/hosts.tf.j2; copies templates/{provider,variables,cloud-init} into the workdir)
   ├─▶ workdir/generated_hosts.tf (explicit per-host module blocks, no for_each)
   ├─▶ workdir/{provider.tf, variables.tf, cloud-init.yaml} (copied from templates/)
   └─▶ workdir/terraform.auto.tfvars.json ──▶ variables.tf
        └─▶ terraform -chdir=workdir apply ──▶ output cmdb_runtime (ip/instance_id/os_id)
YAML (static fields) + cmdb_runtime ──scripts/generate.py inventory──▶ workdir/{cmdb.json, inventory.ini}
playbooks/inventory/terraform_cmdb.py (Ansible dynamic inventory reads cmdb.json)
```
> All loop logic lives on the Python/Jinja2 side: every host and every SSH key is expanded into an
> independent explicit block in `generated_hosts.tf`; no control structure appears in the HCL.
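The `inventory` half of the data flow merges static YAML fields with Terraform runtime facts and refuses to continue when a fact is missing. A condensed sketch of that merge follows; it keeps only a few fields, raises `ValueError` where a CLI would exit, and the names `first_fqdn` and `build_cmdb` are illustrative rather than taken from the repository.

```python
def first_fqdn(service_domains, fallback):
    """Return the first non-empty entry of a comma-separated domain list, else the fallback."""
    for domain in (service_domains or "").split(","):
        if domain.strip():
            return domain.strip()
    return fallback


def build_cmdb(hosts, runtime):
    """Merge static host declarations with runtime facts keyed by host name.

    Entries are keyed by the host's first service domain (or its name), and the
    merge fails when ip or instance_id is missing for any host.
    """
    cmdb, problems = {}, []
    for host in hosts:
        name = host["name"]
        facts = runtime.get(name, {})
        fqdn = first_fqdn((host.get("host_vars") or {}).get("service_domains"), name)
        entry = {
            "name": name,
            "ip": facts.get("ip"),
            "instance_id": facts.get("instance_id"),
            "groups": host.get("groups") or [],
        }
        for field in ("ip", "instance_id"):
            if not str(entry[field] or "").strip():
                problems.append(f"{fqdn}: missing {field}")
        cmdb[fqdn] = entry
    if problems:
        raise ValueError("; ".join(problems))
    return cmdb


if __name__ == "__main__":
    hosts = [{"name": "a", "host_vars": {"service_domains": "a.example.com"}}]
    print(build_cmdb(hosts, {"a": {"ip": "203.0.113.5", "instance_id": "i-1"}}))
```

Rejecting empty runtime facts up front prevents an inventory with a blank `ansible_host`, which could otherwise lead Ansible to connect to the wrong machine without any error.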
## Files
| File | Role |
|------|------|
| `../../config/resources/ai-workspace-hosts.yaml` | **The single hand-edited entry point**: global / ssh_keys / hosts resource declaration (CMDB source) |
| `../../templates/{hosts.tf.j2, inventory.ini.j2}` | Jinja2 rendering templates (host module/data blocks, static inventory) |
| `../../templates/{provider.tf, variables.tf, cloud-init.yaml}` | Shared .tf/config files, copied into the working directory on render |
| `../../scripts/generate.py` | `render`: YAML + templates → workdir artifacts; `inventory`: tf output + YAML → cmdb.json + inventory.ini |
| `../../scripts/provision.sh` | One step: render → apply → inventory → (optional) run Ansible |
| `../../../../../playbooks/inventory/terraform_cmdb.py` | Ansible dynamic inventory script |
| `.gitignore` / `README.md` in this directory | The only two committed files in the working directory |
> `generated_hosts.tf` / `provider.tf` / `variables.tf` / `cloud-init.yaml` /
> `terraform.auto.tfvars.json` / `cmdb.json` / `inventory.ini` in the working directory are all rendered artifacts and are ignored by `.gitignore`.
## Usage
```bash
export TF_VAR_vultr_api_key=xxxxxxxx
# Edit config/resources/ai-workspace-hosts.yaml (fill in a real ssh public key / adjust hosts), then:
../../scripts/provision.sh # render + create + generate inventory
# Drive Ansible with the dynamic inventory
cd ../../../../../playbooks
ansible ai_workspace -i inventory/terraform_cmdb.py -m ping
ansible-playbook -i inventory/terraform_cmdb.py setup-ai-workspace-all-in-one.yml
```
Step by step (from the vultr-vps root):
```bash
python3 scripts/generate.py render # -> rendered artifacts in envs/ai-workspace/
terraform -chdir=envs/ai-workspace init && terraform -chdir=envs/ai-workspace apply
python3 scripts/generate.py inventory # -> cmdb.json + inventory.ini
```
> Multiple host sets: add `config/resources/<name>-hosts.yaml` plus one workdir and reuse the same scripts:
> `scripts/generate.py render --resources config/resources/<name>-hosts.yaml --workdir envs/<name>`
## Adjusting hosts
Edit `../../config/resources/ai-workspace-hosts.yaml` to add or remove machines or to change plan/region/groups/host_vars, then rerun `scripts/generate.py render`.
`groups` decides which Ansible groups a host joins; `host_vars` end up in the inventory line and in the dynamic inventory's `hostvars`.
> The OS is resolved from `os_name` to `os_id` automatically (avoiding hard-coded image IDs that drift). If a name is not found, check it with
> `vultr-cli os list`, or write `os_id` directly on that host.

View File

@ -60,7 +60,7 @@ resource "vultr_instance" "this" {
plan = var.plan
os_id = var.os_id
enable_ipv6 = var.enable_ipv6
-  backups     = var.backups ? "enabled" : "disabled"
+  backups     = var.backups
tags = var.tags
vpc_ids = var.vpc_id == null ? [] : [var.vpc_id]
ssh_key_ids = var.ssh_key_ids

View File

@ -1,8 +0,0 @@
terraform {
required_providers {
vultr = {
source = "vultr/vultr"
version = "~> 2.19"
}
}
}

View File

@ -1,8 +0,0 @@
terraform {
required_providers {
vultr = {
source = "vultr/vultr"
version = "~> 2.19"
}
}
}

View File

@ -1,241 +0,0 @@
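#!/usr/bin/env python3
"""Shared renderer: resource declarations (config/resources) -> Terraform resources / Ansible inventory.

Layering (this script does not depend on any particular env and can be reused by several declarations):
  - Declaration:      ../config/resources/<name>-hosts.yaml (override with --resources)
  - Shared templates: ../templates/{provider.tf, variables.tf, cloud-init.yaml,
                      hosts.tf.j2, inventory.ini.j2}
  - Working dir:      ../envs/<name>/ (override with --workdir; rendered artifacts + tfstate land here, gitignored)

Design notes (to satisfy the constraints):
  - No for_each/count or other control structures in HCL; Python + Jinja2 expand the YAML
    into explicit module/resource/data blocks, one by one, in generated_hosts.tf.
  - The YAML global section is rendered into terraform.auto.tfvars.json and passed to variables.tf.
  - After apply, Terraform runtime output is merged with static YAML fields into cmdb.json,
    then inventory.ini is rendered; both are consumed by Ansible (including the dynamic inventory script).

Subcommands:
  render     YAML + templates -> workdir/{generated_hosts.tf, provider.tf,
             variables.tf, cloud-init.yaml, terraform.auto.tfvars.json}
  inventory  terraform output(cmdb_runtime) + YAML -> workdir/{cmdb.json, inventory.ini}
"""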
import argparse
import json
import os
import re
import shutil
import subprocess
import sys

import yaml
from jinja2 import Environment, FileSystemLoader

SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
# scripts/ -> vultr-vps root
VULTR_VPS_ROOT = os.path.abspath(os.path.join(SCRIPT_DIR, ".."))
TEMPLATE_DIR = os.path.join(VULTR_VPS_ROOT, "templates")
DEFAULT_RESOURCES = os.path.join(
    VULTR_VPS_ROOT, "config", "resources", "ai-workspace-hosts.yaml"
)
DEFAULT_WORKDIR = os.path.join(VULTR_VPS_ROOT, "envs", "ai-workspace")
# Static files copied from templates/ into the workdir on render (makes the workdir a standalone root module).
COPY_INTO_WORKDIR = ["provider.tf", "variables.tf", "cloud-init.yaml"]
# Defaults for optional per-host fields (defined in one place to avoid scattered hard-coded literals).
DEFAULT_PLAN = "vc2-4c-8gb"
DEFAULT_ANSIBLE_USER = "root"


def _tf_id(value):
    """Turn an arbitrary name into a valid Terraform identifier."""
    return re.sub(r"[^0-9a-zA-Z_]", "_", str(value))


def _jinja():
    env = Environment(
        loader=FileSystemLoader(TEMPLATE_DIR),
        trim_blocks=True,
        lstrip_blocks=False,
        keep_trailing_newline=True,
    )
    env.filters["tf_id"] = _tf_id
    return env


def load_yaml(path):
    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh) or {}


def _terraform_output(workdir, name):
    out = subprocess.check_output(
        ["terraform", f"-chdir={workdir}", "output", "-json", name],
        stderr=subprocess.PIPE,
    )
    return json.loads(out)
def cmd_render(args):
    resources, workdir = args.resources, args.workdir
    os.makedirs(workdir, exist_ok=True)
    data = load_yaml(resources)
    glob = data.get("global", {}) or {}
    ssh_keys = data.get("ssh_keys", []) or []
    hosts = data.get("hosts", []) or []
    rendered = (
        _jinja()
        .get_template("hosts.tf.j2")
        .render(ssh_keys=ssh_keys, hosts=hosts, true=True, false=False)
    )
    with open(os.path.join(workdir, "generated_hosts.tf"), "w", encoding="utf-8") as fh:
        fh.write(rendered)
    # Copy the shared provider/variables/cloud-init files into the workdir so that it becomes a
    # root module Terraform can run on its own (these are rendered artifacts, ignored by env/.gitignore).
    for name in COPY_INTO_WORKDIR:
        shutil.copyfile(os.path.join(TEMPLATE_DIR, name), os.path.join(workdir, name))
    tfvars = {
        "region": glob.get("region", "nrt"),
        "name_prefix": glob.get("name_prefix", "ai-workspace"),
        "user_data_file": glob.get("user_data_file", "cloud-init.yaml"),
    }
    with open(
        os.path.join(workdir, "terraform.auto.tfvars.json"), "w", encoding="utf-8"
    ) as fh:
        json.dump(tfvars, fh, indent=2, ensure_ascii=False)
        fh.write("\n")
    print(f" resources: {os.path.relpath(resources, VULTR_VPS_ROOT)}")
    print(f" workdir: {os.path.relpath(workdir, VULTR_VPS_ROOT)}")
    print(
        f" wrote generated_hosts.tf + {', '.join(COPY_INTO_WORKDIR)}"
        " + terraform.auto.tfvars.json"
    )
    print(f" next: terraform -chdir={workdir} init && terraform -chdir={workdir} apply")
def cmd_inventory(args):
resources, workdir = args.resources, args.workdir
data = load_yaml(resources)
glob = data.get("global", {}) or {}
hosts = data.get("hosts", []) or []
default_region = glob.get("region", "nrt")
try:
runtime = _terraform_output(workdir, "cmdb_runtime")
except (OSError, subprocess.CalledProcessError) as exc:
msg = getattr(exc, "stderr", b"") or b""
sys.exit(
f"无法读取 terraform 输出 cmdb_runtime请先在 {workdir} terraform apply\n"
+ msg.decode(errors="replace")
)
cmdb = {}
groups = {}
for host in hosts:
name = host["name"]
rt = runtime.get(name, {})
host_vars = dict(host.get("host_vars", {}) or {})
host_vars.setdefault("os_name", host.get("os_name", ""))
host_vars.setdefault("plan", host.get("plan", DEFAULT_PLAN))
host_vars.setdefault("region", host.get("region") or default_region)
# inventory_hostname = service_domains 的首个 FQDN动态取自资源声明 yaml
# 无 service_domains 时回退到 name。CMDB / inventory / 分组均以此为键。
sd = (host_vars.get("service_domains") or "").split(",")
fqdn = next((d.strip() for d in sd if d.strip()), "") or name
cmdb[fqdn] = {
"name": name,
"fqdn": fqdn,
"ip": rt.get("ip"),
"instance_id": rt.get("instance_id"),
"os_id": rt.get("os_id"),
"os_name": host.get("os_name", ""),
"plan": host.get("plan", DEFAULT_PLAN),
"region": host.get("region") or default_region,
"ansible_user": host.get("ansible_user", DEFAULT_ANSIBLE_USER),
"groups": host.get("groups", []) or [],
"tags": host.get("tags", []) or [],
"host_vars": host_vars,
}
for group in cmdb[fqdn]["groups"] or ["ungrouped"]:
groups.setdefault(group, []).append(fqdn)
    # Non-empty check: the runtime facts (ip/instance_id) must come back from the
    # terraform output; otherwise the downstream inventory renders an empty
    # ansible_host and silently connects to the wrong host. Abort on any missing
    # value (non-empty is required by default).
    problems = []
    for fqdn, h in cmdb.items():
        if not str(h.get("ip") or "").strip():
            problems.append(f"  - {fqdn} (name={h['name']}): missing runtime ip")
        if not str(h.get("instance_id") or "").strip():
            problems.append(f"  - {fqdn} (name={h['name']}): missing instance_id")
    if problems:
        sys.exit(
            "CMDB non-empty check failed: the hosts below lack terraform runtime facts; "
            f"make sure terraform apply has completed in {workdir} and that output "
            "cmdb_runtime covers them:\n"
            + "\n".join(problems)
        )
    with open(os.path.join(workdir, "cmdb.json"), "w", encoding="utf-8") as fh:
        json.dump(cmdb, fh, indent=2, ensure_ascii=False)
        fh.write("\n")
    # Each host's full line is assembled in Python (including the quoted
    # host_vars); the template only outputs expressions, so Jinja2 trim_blocks
    # cannot swallow the newline that follows a block tag at the end of a line.
    lines = {}
    for name, host in cmdb.items():
        parts = [
            name,
            f"ansible_host={host['ip']}",
            f"ansible_user={host['ansible_user']}",
        ]
        for k, v in host["host_vars"].items():
            parts.append(f'{k}="{v}"')
        lines[name] = " ".join(parts)
    rendered = (
        _jinja()
        .get_template("inventory.ini.j2")
        .render(
            cmdb=cmdb,
            lines=lines,
            groups={g: sorted(m) for g, m in sorted(groups.items())},
        )
    )
    with open(os.path.join(workdir, "inventory.ini"), "w", encoding="utf-8") as fh:
        fh.write(rendered)
    rel = os.path.relpath(workdir, VULTR_VPS_ROOT)
    print(f" wrote {os.path.join(rel, 'cmdb.json')}")
    print(f" wrote {os.path.join(rel, 'inventory.ini')}")


def _add_common(p):
    p.add_argument(
        "--resources", default=DEFAULT_RESOURCES, help="path to the resource declaration YAML"
    )
    p.add_argument("--workdir", default=DEFAULT_WORKDIR, help="terraform working directory")


def main():
    parser = argparse.ArgumentParser(description=__doc__)
    sub = parser.add_subparsers(dest="cmd", required=True)
    r = sub.add_parser("render", help="YAML + templates -> rendered artifacts in workdir")
    r.set_defaults(func=cmd_render)
    _add_common(r)
    i = sub.add_parser(
        "inventory", help="terraform output + YAML -> cmdb.json + inventory.ini"
    )
    i.set_defaults(func=cmd_inventory)
    _add_common(i)
    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
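
The key derivation and the non-empty check in `cmd_inventory` can be tried in isolation. The following is a minimal sketch, not code from generate.py: `inventory_key` and `missing_runtime_facts` are hypothetical helper names, and the sample hosts use documentation-only addresses.

```python
def inventory_key(name, host_vars):
    """Return the first non-empty FQDN in service_domains, else the host name."""
    domains = (host_vars.get("service_domains") or "").split(",")
    return next((d.strip() for d in domains if d.strip()), "") or name


def missing_runtime_facts(cmdb):
    """List a problem string for every host that lacks ip or instance_id."""
    problems = []
    for fqdn, host in cmdb.items():
        for field in ("ip", "instance_id"):
            # None, "", and whitespace-only values all count as missing.
            if not str(host.get(field) or "").strip():
                problems.append(f"{fqdn} (name={host['name']}): missing {field}")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, assumed for illustration only.
    key = inventory_key("node1", {"service_domains": "ai.example.com, api.example.com"})
    cmdb = {key: {"name": "node1", "ip": "192.0.2.10", "instance_id": None}}
    print(key)
    print(missing_runtime_facts(cmdb))
```

Note that two hosts whose first service domain is identical would map to the same key; the sketch, like the loop above, keeps only the last one.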

@ -1,57 +0,0 @@
#!/usr/bin/env bash
# Shared one-shot orchestration script: YAML declaration -> render TF -> apply -> CMDB/inventory -> (optional) Ansible
#
# Location: vultr-vps/scripts/provision.sh (same directory as generate.py; reusable across multiple resource declarations)
#
# Environment variables:
#   TF_VAR_vultr_api_key   required
#   RESOURCES              resource declaration YAML (default: config/resources/ai-workspace-hosts.yaml)
#   WORKDIR                terraform working directory (default: envs/ai-workspace)
#
# Usage:
#   export TF_VAR_vultr_api_key=xxxx
#   ./provision.sh                 # render + create + generate inventory
#   ./provision.sh ping            # then ping the ai_workspace group
#   ./provision.sh playbook setup-ai-workspace-all-in-one.yml
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
VULTR_VPS_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# vultr-vps -> terraform-hcl-standard -> iac_modules -> ai-workspace-infra
REPO_ROOT="$(cd "${VULTR_VPS_ROOT}/../../.." && pwd)"
PLAYBOOKS_DIR="${REPO_ROOT}/playbooks"
DYN_INV="${PLAYBOOKS_DIR}/inventory/terraform_cmdb.py"
RESOURCES="${RESOURCES:-${VULTR_VPS_ROOT}/config/resources/ai-workspace-hosts.yaml}"
WORKDIR="${WORKDIR:-${VULTR_VPS_ROOT}/envs/ai-workspace}"
GEN=("python3" "${SCRIPT_DIR}/generate.py")
echo "==> [1/4] render (${RESOURCES##*/} -> ${WORKDIR})"
"${GEN[@]}" render --resources "${RESOURCES}" --workdir "${WORKDIR}"
echo "==> [2/4] terraform init & apply"
terraform -chdir="${WORKDIR}" init -input=false >/dev/null
terraform -chdir="${WORKDIR}" apply -auto-approve -input=false
echo "==> [3/4] inventory (terraform output + YAML -> cmdb.json + inventory.ini)"
"${GEN[@]}" inventory --resources "${RESOURCES}" --workdir "${WORKDIR}"
echo "==> [4/4] done. Generated: ${WORKDIR}/{cmdb.json,inventory.ini}"
action="${1:-none}"
case "${action}" in
  none)
    echo "Run manually: ansible ai_workspace -i ${DYN_INV} -m ping"
    ;;
  ping)
    ( cd "${PLAYBOOKS_DIR}" && ansible ai_workspace -i "${DYN_INV}" -m ping )
    ;;
  playbook)
    pb="${2:?usage: provision.sh playbook <playbook.yml>}"
    ( cd "${PLAYBOOKS_DIR}" && ansible-playbook -i "${DYN_INV}" "${pb}" )
    ;;
  *)
    echo "Unknown action: ${action} (supported: none|ping|playbook)" >&2
    exit 1
    ;;
esac
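
The four stages of provision.sh can also be written down as a dry-run plan, which is handy for reviewing the exact commands before anything is created. This is a sketch under assumptions: `build_plan` is a hypothetical helper, the paths in the demo are placeholders, and nothing is executed.

```python
import os
import shlex


def build_plan(script_dir, resources, workdir):
    """Return the commands provision.sh runs for stages 1-3, as argv lists."""
    gen = ["python3", os.path.join(script_dir, "generate.py")]
    common = ["--resources", resources, "--workdir", workdir]
    return [
        gen + ["render"] + common,
        ["terraform", f"-chdir={workdir}", "init", "-input=false"],
        ["terraform", f"-chdir={workdir}", "apply", "-auto-approve", "-input=false"],
        gen + ["inventory"] + common,
    ]


if __name__ == "__main__":
    # Placeholder paths, assumed for illustration only.
    plan = build_plan("scripts", "config/resources/hosts.yaml", "envs/ai-workspace")
    for argv in plan:
        print(shlex.join(argv))  # shell-quoted, ready to paste
```

Keeping each command as a list rather than a string avoids quoting mistakes when a path contains spaces.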

@ -1,45 +0,0 @@
#!/usr/bin/env python3
"""
Render the Terraform S3 backend configuration file (backend.tf).

Usage:
    TF_STATE_ENDPOINT=https://... TF_STATE_REGION=us-east-1 python3 render_backend_tf.py [output_path]

Writes backend.tf to the current directory by default; run it from the
terraform working directory.
"""
import os
import sys

endpoint = os.environ.get("TF_STATE_ENDPOINT", "")
if not endpoint:
    print("ERROR: TF_STATE_ENDPOINT is not set", file=sys.stderr)
    sys.exit(1)

region = os.environ.get("TF_STATE_REGION", "")
if not region:
    print("ERROR: TF_STATE_REGION is not set", file=sys.stderr)
    sys.exit(1)

output = sys.argv[1] if len(sys.argv) > 1 else "backend.tf"

content = f"""\
terraform {{
  backend "s3" {{
    endpoints = {{ s3 = "{endpoint}" }}
    region = "{region}"
    skip_credentials_validation = true
    skip_region_validation = true
    skip_requesting_account_id = true
    skip_metadata_api_check = true
    skip_s3_checksum = true
    use_path_style = true
  }}
}}
"""

with open(output, "w") as f:
    f.write(content)

print(f"backend.tf written to {output}")
print(f" endpoint = {endpoint[:40]}...")
print(f" region = {region}")
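
The same rendering can be expressed as a pure function, which makes the required settings and the output easy to check without touching the environment or the filesystem. This is a sketch, not the script above: `render_backend` and `REQUIRED` are hypothetical names, and the endpoint in the demo is a placeholder.

```python
REQUIRED = ("TF_STATE_ENDPOINT", "TF_STATE_REGION")


def render_backend(env):
    """Build the backend.tf text from a mapping such as os.environ."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    endpoint = env["TF_STATE_ENDPOINT"]
    region = env["TF_STATE_REGION"]
    lines = [
        "terraform {",
        '  backend "s3" {',
        f'    endpoints = {{ s3 = "{endpoint}" }}',
        f'    region = "{region}"',
        "    skip_credentials_validation = true",
        "    skip_region_validation = true",
        "    skip_requesting_account_id = true",
        "    skip_metadata_api_check = true",
        "    skip_s3_checksum = true",
        "    use_path_style = true",
        "  }",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    print(render_backend({
        "TF_STATE_ENDPOINT": "https://s3.example.com",
        "TF_STATE_REGION": "us-east-1",
    }))
```

Reporting every missing setting at once saves a second run when both variables are unset.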

@ -1,8 +1,6 @@
 terraform {
   backend "s3" {
-    endpoints = {
-      s3 = var.object_storage_endpoint
-    }
+    endpoint = var.object_storage_endpoint
     bucket = var.state_bucket
     key = var.state_key
     region = var.region
@ -11,8 +9,6 @@ terraform {
     skip_credentials_validation = true
     skip_region_validation = true
     skip_requesting_account_id = true
-    skip_metadata_api_check = true
-    skip_s3_checksum = true
     force_path_style = true
   }
 }

@ -1,6 +0,0 @@
#cloud-config
package_update: true
package_upgrade: true
runcmd:
- echo "Provisioned by Terraform ai-workspace env" > /etc/motd
- systemctl enable --now ssh || systemctl enable --now sshd || true

@ -1,69 +0,0 @@
# =============================================================================
# Rendered by generate.py from hosts.yaml. Do not edit by hand.
# Regenerate: python3 generate.py render
#
# Constraint: no HCL control structures such as for_each/count/dynamic.
# Every host and every SSH key is rendered as its own explicit
# module/resource/data block.
# (The loops run on the Python + Jinja2 side, not in HCL.)
# =============================================================================
{% for key in ssh_keys -%}
resource "vultr_ssh_key" "{{ key.name | tf_id }}" {
  name = "{{ key.name }}"
  ssh_key = "{{ key.public }}"
}
{% endfor -%}
{% for host in hosts -%}
{%- set hid = host.name | tf_id -%}
{%- if host.os_id is not defined or host.os_id is none %}
data "vultr_os" "{{ hid }}" {
  filter {
    name = "name"
    values = ["{{ host.os_name }}"]
  }
}
{% endif %}
module "compute_{{ hid }}" {
  source = "../../modules/compute"
  label = "${var.name_prefix}-{{ host.name }}"
  region = {% if host.region %}"{{ host.region }}"{% else %}var.region{% endif %}
  plan = "{{ host.plan | default('vc2-4c-8gb') }}"
  os_id = {% if host.os_id %}{{ host.os_id }}{% else %}data.vultr_os.{{ hid }}.id{% endif %}
  enable_ipv6 = {{ 'true' if host.get('enable_ipv6', true) else 'false' }}
  backups = {{ 'true' if host.get('backups', false) else 'false' }}
  tags = concat([var.name_prefix], {{ host.get('tags', []) | tojson }})
  ssh_key_ids = [{% for k in ssh_keys %}vultr_ssh_key.{{ k.name | tf_id }}.id{{ ", " if not loop.last }}{% endfor %}]
  user_data = file(var.user_data_file)
}
{% endfor -%}
# Runtime facts: expose only the dynamic values that are known after apply.
# Static fields (os_name/plan/groups/host_vars, etc.) are merged in from
# hosts.yaml by generate.py, and everything is written to cmdb.json together.
output "cmdb_runtime" {
  description = "name -> {ip, instance_id, os_id}"
  value = {
{% for host in hosts -%}
{%- set hid = host.name | tf_id %}
    "{{ host.name }}" = {
      ip = module.compute_{{ hid }}.main_ip
      instance_id = module.compute_{{ hid }}.instance_id
      os_id = {% if host.os_id %}{{ host.os_id }}{% else %}data.vultr_os.{{ hid }}.id{% endif %}
    }
{% endfor -%}
  }
}
output "hosts_summary" {
  description = "name -> public_ip"
  value = {
{% for host in hosts -%}
{%- set hid = host.name | tf_id %}
    "{{ host.name }}" = module.compute_{{ hid }}.main_ip
{% endfor -%}
  }
}
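
The template relies on a `tf_id` filter whose implementation is not shown here. A plausible stand-in is sketched below, purely as an assumption about its behavior: it maps a host or key name to a string that is safe to use in a Terraform block label. Terraform identifiers may contain letters, digits, underscores, and hyphens, and must not start with a digit; this sketch is stricter and also replaces hyphens.

```python
import re


def tf_id(name):
    """Hypothetical stand-in for the template's tf_id filter."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    ident = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # A Terraform identifier must not start with a digit (or be empty).
    if not re.match(r"[A-Za-z_]", ident):
        ident = "_" + ident
    return ident


if __name__ == "__main__":
    for sample in ("ai-workspace-01", "1node", "web.example.com"):
        print(sample, "->", tf_id(sample))
```

A mapping like this is lossy: `a-b` and `a_b` collapse to the same identifier, so a real implementation would need a uniqueness check across hosts.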

@ -1,13 +0,0 @@
# =============================================================================
# Rendered by generate.py from cmdb.json. Do not edit by hand.
# Regenerate: python3 generate.py inventory
# =============================================================================
{% for group, members in groups.items() %}
[{{ group }}]
{% for name in members %}
{{ lines[name] }}
{% endfor %}
{% endfor %}
[all:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
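
To see the shape of the file this template produces, the same layout can be built in plain Python. This is a sketch under assumptions: `render_inventory` is a hypothetical function, the sample hosts are made up, and the exact blank lines in the real output depend on Jinja2 whitespace settings that are not shown here.

```python
SSH_ARGS = "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"


def render_inventory(groups, lines):
    """Return INI inventory text: one section per group, then [all:vars]."""
    out = []
    for group in sorted(groups):
        out.append(f"[{group}]")
        # One pre-assembled line per member, in a stable order.
        out.extend(lines[name] for name in sorted(groups[group]))
        out.append("")
    out.append("[all:vars]")
    out.append(f"ansible_ssh_common_args='{SSH_ARGS}'")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, assumed for illustration only.
    lines = {
        "a.example.com": "a.example.com ansible_host=192.0.2.10 ansible_user=root",
        "b.example.com": "b.example.com ansible_host=192.0.2.11 ansible_user=root",
    }
    print(render_inventory({"ai_workspace": ["b.example.com", "a.example.com"]}, lines), end="")
```

The `[all:vars]` setting turns off SSH host key verification, which suits freshly created hosts whose keys change on every rebuild but should not be carried over to long-lived machines.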

@ -1,25 +0,0 @@
# Values are supplied by generate.py from the `global` section of hosts.yaml
# through terraform.auto.tfvars.json (YAML -> variables.tf).
# generate.py renders generated_hosts.tf with Jinja2; the module/resource
# blocks do not use for_each/count or other HCL control structures.
#
# vultr_api_key and the provider: see ../../templates/provider.tf,
# handled by `generate.py render`.
variable "region" {
  description = "Default deployment region, used when a host does not set its own region"
  type = string
  default = "nrt"
}
variable "name_prefix" {
  description = "Prefix for instance labels"
  type = string
  default = "ai-workspace"
}
variable "user_data_file" {
  description = "Path to the cloud-init script"
  type = string
  default = "cloud-init.yaml"
}