Commit Graph

202 Commits

Author SHA1 Message Date
shenlan
bf20dd62c3 Fix GPU cluster playbook SSH setup 2025-06-26 00:07:02 +08:00
shenlan
68830ab067 Merge pull request #22 from svc-design/codex/fix-passwordless-ssh-access-issue
Fix GPU k8s ssh precheck user
2025-06-25 23:50:18 +08:00
shenlan
556058036f fix(gpu-k8s): use inventory ssh user for precheck 2025-06-25 23:50:03 +08:00
shenlan
ceb07b5a4c Merge pull request #21 from svc-design/codex/配置本机-ssh-key
Add SSH precheck for gpu-k8s role
2025-06-25 23:39:29 +08:00
shenlan
4131181bc6 Authorize ops host key on all cluster nodes 2025-06-25 23:39:12 +08:00
shenlan
d1de70b020 gpu-k8s: precheck SSH connectivity 2025-06-25 23:29:49 +08:00
shenlan
a1c023f216 Merge pull request #20 from svc-design/codex/fix-templating-error-in-ip-resolution
Fix IP resolution templating
2025-06-25 23:14:18 +08:00
shenlan
310e4aef12 Fix IP resolution templating 2025-06-25 23:14:00 +08:00
shenlan
214018e607 Merge pull request #19 from svc-design/codex/修复---masters---nodes-未获取ip
Fix GPU Kubernetes IP resolution
2025-06-25 23:07:52 +08:00
shenlan
8bdf5fb17e fix gpu-k8s role ip resolution 2025-06-25 23:07:35 +08:00
shenlan
050d327fc9 Merge pull request #18 from svc-design/codex/fix-permission-issue-with-get_labring_registry.sh
Fix GPU role variable checks
2025-06-25 22:56:06 +08:00
shenlan
6fd798d4f4 support hostnames for gpu k8s role 2025-06-25 22:55:46 +08:00
shenlan
17d43001e0 Merge pull request #17 from svc-design/codex/fix--sudo--a-password-is-required--error
Fix LabRing registry prefix task sudo issue
2025-06-25 22:28:34 +08:00
shenlan
7d76bc170e Move LabRing registry script into role 2025-06-25 22:27:46 +08:00
shenlan
4989b26dd6 Merge pull request #16 from svc-design/codex/修复labring注册表脚本未找到错误
Fix gpu-k8s role script path
2025-06-25 22:13:28 +08:00
shenlan
00ab7a116c fix gpu-k8s role script path 2025-06-25 22:13:14 +08:00
shenlan
e9f1337e4d Merge pull request #15 from svc-design/codex/根据节点ip选择镜像地址
Implement automatic LabRing registry selection
2025-06-25 22:05:50 +08:00
shenlan
5019dc008c Merge branch 'main' into codex/根据节点ip选择镜像地址 2025-06-25 22:04:13 +08:00
shenlan
206f649406 feat: auto-select labring registry 2025-06-25 21:47:00 +08:00
shenlan
892ed22100 Merge pull request #14 from svc-design/codex/fix-invalid-ip-range-format-error
Fix GPU k8s role default version
2025-06-25 21:42:29 +08:00
shenlan
35fca24f2e gpu-k8s: separate kubernetes version 2025-06-25 21:42:15 +08:00
shenlan
20b6647639 Merge pull request #13 from svc-design/codex/fix-sealos-installation-404-error
Update gpu-k8s role to pull latest Sealos
2025-06-25 21:33:40 +08:00
shenlan
3933de1764 gpu role: fetch latest sealos and install tools 2025-06-25 21:33:22 +08:00
shenlan
9d19914dec Merge pull request #12 from svc-design/codex/修正roles/vhosts/gpu-k8s/配置与sealos初始化
Fix GPU K8S role and add ssh trust setup
2025-06-25 21:17:42 +08:00
shenlan
b6bbd93cea Add SSH trust role and enhance gpu-k8s setup 2025-06-25 21:17:30 +08:00
shenlan
524aaf6d0a Merge pull request #11 from svc-design/codex/更新-readme.md-并创建子文档
Update README with docs reference
2025-06-25 20:44:00 +08:00
shenlan
79576fb9b6 Merge pull request #10 from svc-design/codex/修复roles/vhosts/gpu-k8s/问题
Fix NVIDIA repo URLs for gpu role
2025-06-25 20:41:29 +08:00
shenlan
1664a8cddd docs: add repo structure overview 2025-06-25 20:40:54 +08:00
shenlan
b8f6cc7648 Fix NVIDIA repository URLs 2025-06-25 20:40:38 +08:00
shenlan
0e47349cc9 Merge pull request #9 from svc-design/codex/fix-missing-nvidia-container-runtime-package
Fix GPU role packages
2025-06-25 20:28:42 +08:00
shenlan
85caf2f3b0 Fix NVIDIA runtime install 2025-06-25 20:28:19 +08:00
shenlan
bb81e13de9 Merge pull request #8 from svc-design/feature/gpu_k8s_cluster
add inventory/gpu_k8s_cluster
2025-06-25 20:20:36 +08:00
Haitao Pan
a6f5a5e518 add inventory/gpu_k8s_cluster 2025-06-25 20:07:36 +08:00
shenlan
904503e316 Merge pull request #6 from svc-design/codex/安装sealos-cli到install_cluster.yml 2025-06-24 12:26:15 +08:00
shenlan
f90010956d Define ops_host in demo playbook 2025-06-24 12:25:57 +08:00
shenlan
ef3108b0d1 Merge pull request #5 from svc-design/te0hq5-codex/使用ansible安装k8s并配置gpu驱动
Refine gpu-k8s role variables
2025-06-24 11:31:57 +08:00
shenlan
d29198d1cd Merge branch 'main' into te0hq5-codex/使用ansible安装k8s并配置gpu驱动 2025-06-24 11:30:37 +08:00
shenlan
e15044b26d Improve gpu-k8s role variable handling 2025-06-24 11:26:14 +08:00
shenlan
a090604968 Merge pull request #2 from svc-design/svc-design-patch-1
Update setup-k3s-cluster-with-br0.sh
2025-06-24 10:57:40 +08:00
shenlan
d118531b94 Merge pull request #4 from svc-design/codex/使用ansible安装k8s并配置gpu驱动
Add gpu-k8s role and documentation
2025-06-24 10:57:14 +08:00
shenlan
8c9f7d26fc Add gpu-k8s ansible role and docs 2025-06-24 10:45:35 +08:00
shenlan
b4fc61c397 Merge pull request #3 from svc-design/feature/deepflow-agent-playbook-and-tools
feat: add deepflow agent playbook and deployment tools
2025-06-16 11:57:27 +08:00
Haitao Pan
9b15bed30a feat: add deepflow agent playbook and deployment tools
- add initial deepflow-agent-playbook (inventory, playbook, roles)
- add iptables whitelist enforce script
- add deepflow agent batch deploy script
- add initial .gitignore
2025-06-16 11:01:52 +08:00
Haitao Pan
29da696228 add chart: update-server and website-homepage 2025-06-11 16:20:35 +08:00
Haitao Pan
584496cfa8 add pulp-operator-repo-gateway.yaml 2025-06-08 10:13:22 +08:00
Haitao Pan
62d223a518 add pulp-operator 2025-06-08 09:35:32 +08:00
shenlan
2e921a9e6d Update setup-k3s-cluster-with-br0.sh 2025-05-29 13:55:46 +08:00
Haitao Pan
1b291ec749 feat(playbook): support cross-platform nginx install and cert generation 2025-05-25 12:16:01 +08:00
Haitao Pan
eab3a66076 feat(playbook): add macOS/Linux compatible config tasks 2025-05-25 12:13:16 +08:00
Haitao Pan
4ca48d9401 feat(playbook): add real-world config tasks like nginx setup and SSL update 2025-05-25 12:10:21 +08:00