Compare commits
No commits in common. "main" and "codex/fix-agent-install-first-run-exit" have entirely different histories.
2
.gitignore
vendored
@ -4,7 +4,6 @@
|
||||
#########################################################
|
||||
# pigsty-ca and other certs
|
||||
#########################################################
|
||||
|
||||
files/*.key
|
||||
files/*.crt
|
||||
files/pki/*
|
||||
@ -27,7 +26,6 @@ docker/data/
|
||||
#########################################################
|
||||
# tmp files
|
||||
#########################################################
|
||||
.env
|
||||
# IDE files
|
||||
.idea/
|
||||
.code/
|
||||
|
||||
132
README.md
@ -5,7 +5,7 @@
|
||||
|
||||
**Observability.svc.plus** is an observability solution strictly following the Apache 2.0 license.
|
||||
|
||||
> **Focus**: Monitoring & Observability (监控/可观测). Integrating OpenTelemetry (OTel), VictoriaMetrics, and DeepFlow-based network observability without long-term raw-flow lock-in.
|
||||
> **Focus**: Monitoring & Observability (监控/可观测). Integrating OpenTelemetry (OTel), with future plans to incorporate DeepFlow Agent and other open-source NPM (Network Performance Monitoring) probes.
|
||||
|
||||
[Website](https://svc.plus/) | [Public Demo](https://svc.plus/services) | [Blog](https://svc.plus/blogs) | [Support](https://www.svc.plus/support)
|
||||
|
||||
@ -31,70 +31,6 @@ flowchart LR
|
||||
|
||||
## 3) Start
|
||||
|
||||
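The current recommendation is a hybrid deployment onto existing hosts.

1. Update DNS first so that `observability.svc.plus` points to `us-xhttp.svc.plus`.
2. On `us-xhttp.svc.plus`, run the Server side example below to deploy the central server.
3. Then, on the other existing hosts, run the Client side example below so that collected data is sent back to `observability.svc.plus`.

Hosts currently onboarded:

- `us-xhttp.svc.plus`: keeps hosting its existing services and also hosts `observability.svc.plus`
- `openclaw.svc.plus`: runs the agent, which collects data and reports it to the central server
- `jp-xhttp.svc.plus`: runs the agent, which collects data and reports it to the central server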
|
||||
|
||||
### Ansible (Recommended)
|
||||
|
||||
#### Server side
|
||||
|
||||
First export the Cloudflare token, then run the server-side deployment on `us-xhttp.svc.plus`. `deploy_observability_service.yml` first updates the `observability.svc.plus` record on Cloudflare to a non-proxied record pointing at `us-xhttp.svc.plus`, then waits for public DNS to take effect before continuing the deployment. This makes it more likely that Caddy's first automatic certificate issuance succeeds.

```bash
export CLOUDFLARE_API_TOKEN=...
ansible-playbook -i <your-inventory> deploy_observability_service.yml -l us-xhttp.svc.plus
```
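The DNS wait that `deploy_observability_service.yml` performs before deploying can be sketched in Python. This is only an illustration, assuming the JSON shape returned by `https://cloudflare-dns.com/dns-query` with `Accept: application/dns-json` (a `Status` field and an `Answer` list whose `data` values end with a trailing dot). `cname_is_live` and `wait_for_cname` are hypothetical helper names, not part of the playbook; the defaults of 30 retries, 10 seconds apart, mirror `dns_wait_retries` and `dns_wait_delay`.

```python
import time
from typing import Callable


def cname_is_live(doh_response: dict, target: str) -> bool:
    """Return True when a DNS-over-HTTPS JSON answer shows the CNAME target."""
    # Status 0 means NOERROR; a missing field is treated as "not ready yet".
    if doh_response.get("Status", 1) != 0:
        return False
    # Resolvers return fully qualified names, hence the trailing dot.
    wanted = target + "."
    return any(answer.get("data") == wanted
               for answer in doh_response.get("Answer", []))


def wait_for_cname(fetch: Callable[[], dict], target: str,
                   retries: int = 30, delay: float = 10,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `fetch` until the CNAME is visible or the retries run out."""
    for attempt in range(retries):
        if cname_is_live(fetch(), target):
            return True
        if attempt < retries - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

`fetch` is injected so that the polling logic can be exercised without network access; in real use it would issue the HTTPS query and return the parsed JSON.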
|
||||
|
||||
To add a layer of basic authentication to `/ingest/*`, enable it as part of the server-side deployment:

```bash
export CLOUDFLARE_API_TOKEN=...
ansible-playbook -i <your-inventory> deploy_observability_service.yml -l us-xhttp.svc.plus \
  -e observability_ingest_basic_auth_enabled=true \
  -e observability_ingest_basic_auth_user=ingest \
  -e observability_ingest_basic_auth_password='<strong-password>'
```
|
||||
|
||||
#### Client side (agent)
|
||||
|
||||
Then, on the collection hosts, run `node.yml` in push mode:

```bash
ansible-playbook -i <your-inventory> node.yml \
  -l openclaw.svc.plus,jp-xhttp.svc.plus \
  -e node_monitor_mode=push \
  -e observability_endpoint=https://observability.svc.plus/
```
|
||||
|
||||
If ingest basic authentication is enabled on the server, the collection side must pass the same credentials:

```bash
ansible-playbook -i <your-inventory> node.yml \
  -l openclaw.svc.plus,jp-xhttp.svc.plus \
  -e node_monitor_mode=push \
  -e observability_endpoint=https://observability.svc.plus/ \
  -e observability_ingest_basic_auth_enabled=true \
  -e observability_ingest_basic_auth_user=ingest \
  -e observability_ingest_basic_auth_password='<strong-password>'
```
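For reference, a client sending to a basic-auth-protected ingest endpoint has to attach a standard HTTP Basic `Authorization` header. The sketch below shows how that header is derived from the user and password. It assumes plain RFC 7617 Basic auth; `build_ingest_request` is a hypothetical helper for illustration and is not how the playbook or `vector` is implemented.

```python
import base64
import urllib.request


def build_ingest_request(url: str, user: str, password: str,
                         body: bytes = b"") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying Basic credentials."""
    # Basic auth is base64("user:password") prefixed with the scheme name.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

The request is only constructed here, so the snippet can be inspected offline; sending it would be a separate `urllib.request.urlopen(request)` call.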
|
||||
|
||||
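> `node_monitor_mode=push` deploys `node_exporter + process_exporter + vector` on the remote host and actively pushes metrics / logs to `observability.svc.plus`. `vector` now always belongs to the collection-side tasks; the server-side `infra.yml` no longer deploys it by default.
>
> If the collection side runs on the same machine as the Victoria server, the playbook automatically routes metrics / logs through the local `127.0.0.1` ingest. Across hosts it defaults to `https://observability.svc.plus/` and automatically appends `/ingest/metrics/api/v1/write` and `/ingest/logs/insert`.
>
> `observability_ingest_basic_auth_*` only protects the `/ingest/*` write endpoints and does not affect the other site pages exposed by Caddy. The server and the collection side must use the same credentials.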
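The cross-host path completion described in the note can be illustrated with a short Python sketch. It only covers the cross-host case and assumes the two suffixes named above; `ingest_urls` is a hypothetical helper, and the actual completion logic lives in the playbook and may differ.

```python
from urllib.parse import urljoin

# Suffixes taken from the note above.
METRICS_PATH = "ingest/metrics/api/v1/write"
LOGS_PATH = "ingest/logs/insert"


def ingest_urls(endpoint: str) -> dict:
    """Derive the metrics and logs write URLs from a base endpoint."""
    # Normalise the base so that a missing trailing slash does not matter.
    base = endpoint if endpoint.endswith("/") else endpoint + "/"
    return {
        "metrics": urljoin(base, METRICS_PATH),
        "logs": urljoin(base, LOGS_PATH),
    }
```

With `observability_endpoint=https://observability.svc.plus/`, this yields one URL ending in `/ingest/metrics/api/v1/write` and one ending in `/ingest/logs/insert`.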
|
||||
|
||||
### Script Installers
|
||||
|
||||
### Server side
|
||||
|
||||
```bash
|
||||
@ -118,66 +54,10 @@ curl -fsSL https://raw.githubusercontent.com/cloud-neutral-toolkit/observability
|
||||
> - logs endpoint: `/ingest/logs/insert`
|
||||
> - The script automatically verifies installation after setup.
|
||||
|
||||
### Optional: DeepFlow Agent on Client
|
||||
|
||||
If you have deployed DeepFlow with `deepflow.yml`, you can install `deepflow-agent` on client nodes via the same script:
|
||||
### Remote client example (clawdbot.svc.plus)
|
||||
|
||||
```bash
|
||||
# example: endpoint exposed by caddy grpc ingress (deepflow_grpc_domain:443)
|
||||
curl -fsSL https://raw.githubusercontent.com/cloud-neutral-toolkit/observability.svc.plus/main/scripts/agent-install.sh \
|
||||
| bash -s -- \
|
||||
--endpoint https://observability.svc.plus/ingest/otlp \
|
||||
--deepflow-agent \
|
||||
--deepflow-grpc-endpoint deepflow-agent.svc.plus:443 \
|
||||
--deepflow-agent-download-url https://example.com/path/to/deepflow-agent
|
||||
```
|
||||
|
||||
> If `deepflow-agent` binary already exists on host, replace `--deepflow-agent-download-url` with `--deepflow-agent-bin /path/to/deepflow-agent`.
|
||||
|
||||
## 🚀 DeepFlow Deployment (Server Side)
|
||||
|
||||
This repo now provides dedicated DeepFlow roles:
|
||||
|
||||
- `deepflow_mysql`
|
||||
- `deepflow_clickhouse_s3`
|
||||
- `deepflow_server`
|
||||
- `deepflow_connector`
|
||||
- `deepflow_agent`
|
||||
|
||||
Quick start:
|
||||
|
||||
```bash
|
||||
./configure -c deepflow/deepflow
|
||||
vi pigsty.yml # adjust domain/password/ports
|
||||
./deploy.yml
|
||||
./docker.yml
|
||||
./deepflow.yml
|
||||
./infra.yml -t caddy # apply deepflow_grpc_domain ingress
|
||||
```
|
||||
|
||||
Default inventory template: `conf/deepflow/deepflow.yml`
|
||||
|
||||
### Lightweight Topology
|
||||
|
||||
- `deepflow-server` stays containerized with Docker Compose
|
||||
- ClickHouse is kept as short-retention local storage
|
||||
- MinIO/S3 is optional in lightweight mode
|
||||
- `deepflow_connector` exports selected DeepFlow L4/L7 metrics to VictoriaMetrics
|
||||
- `deepflow_agent` supports `binary/systemd`, `docker`, and rendered `k8s` manifests
|
||||
- default `deepflow_agent_profile=lite` keeps `pcap` enabled and disables built-in `vector`
|
||||
|
||||
### Remote client example (openclaw.svc.plus)
|
||||
|
||||
```bash
|
||||
ssh root@openclaw.svc.plus \
|
||||
'curl -fsSL https://raw.githubusercontent.com/cloud-neutral-toolkit/observability.svc.plus/main/scripts/agent-install.sh \
|
||||
| bash -s -- --endpoint https://observability.svc.plus/ingest/otlp'
|
||||
```
|
||||
|
||||
### Remote client example (jp-xhttp.svc.plus)
|
||||
|
||||
```bash
|
||||
ssh root@jp-xhttp.svc.plus \
|
||||
ssh root@clawdbot.svc.plus \
|
||||
'curl -fsSL https://raw.githubusercontent.com/cloud-neutral-toolkit/observability.svc.plus/main/scripts/agent-install.sh \
|
||||
| bash -s -- --endpoint https://observability.svc.plus/ingest/otlp'
|
||||
```
|
||||
@ -185,18 +65,18 @@ ssh root@jp-xhttp.svc.plus \
|
||||
### Optional SSH manager env example
|
||||
|
||||
```bash
|
||||
SSH_SERVER_CLAWBOT_HOST=openclaw.svc.plus
|
||||
SSH_SERVER_CLAWBOT_HOST=clawdbot.svc.plus
|
||||
SSH_SERVER_CLAWBOT_USER=root
|
||||
SSH_SERVER_CLAWBOT_KEYPATH=~/.ssh/id_rsa
|
||||
SSH_SERVER_CLAWBOT_PORT=22
|
||||
SSH_SERVER_CLAWBOT_DESCRIPTION=openclaw_server
|
||||
SSH_SERVER_CLAWBOT_DESCRIPTION=clawdbot_server
|
||||
```
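The variables above follow a `SSH_SERVER_<NAME>_<FIELD>` naming pattern. As a rough sketch of how such variables could be grouped per server, the hypothetical `parse_ssh_servers` helper below assumes that `<NAME>` contains no underscore and that `PORT` is numeric; the tool that actually consumes these variables is not shown in this repository and may parse them differently.

```python
import re

# SSH_SERVER_<NAME>_<FIELD>, e.g. SSH_SERVER_CLAWBOT_HOST
_PATTERN = re.compile(r"^SSH_SERVER_([A-Z0-9]+)_([A-Z]+)$")


def parse_ssh_servers(env: dict) -> dict:
    """Group SSH_SERVER_* variables into one dict per server name."""
    servers: dict = {}
    for key, value in env.items():
        match = _PATTERN.match(key)
        if not match:
            continue  # ignore unrelated environment variables
        name, field = match.groups()
        servers.setdefault(name, {})[field.lower()] = value
    for server in servers.values():
        if "port" in server:
            server["port"] = int(server["port"])
    return servers
```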
|
||||
|
||||
## 4) Features
|
||||
|
||||
- **Observability First**: SOTA monitoring for PG / Infra / Node based on VictoriaMetrics, Grafana, and OpenTelemetry.
|
||||
- **OTel Integration**: Native support for OpenTelemetry, facilitating unified trace, metric, and log ingestion.
|
||||
- **DeepFlow Ready**: Lightweight DeepFlow server/agent deployment with short-lived flow storage and VictoriaMetrics archiving for high-value protocol metrics.
|
||||
- **Future Ready**: Planned integration for DeepFlow Agent and other open-source NPM probes for deep network and application observability.
|
||||
- **Reliable Base**: Robust self-healing HA clusters, PITR, and secure infrastructure.
|
||||
- **Maintainable**: One-Cmd Deploy, IaC support, and easy customization.
|
||||
- **Controllable**: Self-sufficient Cloud Neutral FOSS. Run on bare Linux.
|
||||
|
||||
@ -3,11 +3,11 @@ forks = 10
|
||||
nocows = 1
|
||||
timeout = 15
|
||||
pipelining = True
|
||||
inventory = observability.yml
|
||||
inventory = pigsty.yml
|
||||
host_key_checking = False
|
||||
command_warnings = False
|
||||
deprecation_warnings = False
|
||||
force_valid_group_names = ignore
|
||||
use_persistent_connections = True
|
||||
allow_world_readable_tmpfiles = False
|
||||
ansible_managed = 'ansible managed: %Y-%m-%d %H:%M:%S'
|
||||
ansible_managed = 'ansible managed: %Y-%m-%d %H:%M:%S'
|
||||
@ -1,115 +0,0 @@
|
||||
---
#==============================================================#
# File : deepflow.yml
# Desc : observability config for running DeepFlow stack
# Ctime : 2026-02-04
# Mtime : 2026-02-04
# License : Apache-2.0 @ https://pigsty.io/docs/about/license/
#==============================================================#

# how to use this template:
#
# curl -fsSL https://repo.pigsty.io/get | bash; cd ~/pigsty
# ./bootstrap # prepare local repo & ansible
# ./configure -c deepflow/deepflow # use this deepflow config template
# vi pigsty.yml # IMPORTANT: CHANGE CREDENTIALS / DOMAIN
# ./deploy.yml # install infra stack
# ./docker.yml # install docker & docker-compose
# ./deepflow.yml # install deepflow with compose + optional connector/agent

all:
  children:

    deepflow:
      hosts: { 10.10.10.10: {} }
      vars:
        deepflow_enabled: true
        deepflow_mysql_enabled: true
        deepflow_clickhouse_s3_enabled: true
        deepflow_connector_enabled: true
        deepflow_agent_enabled: false

        deepflow_deploy_profile: lite
        deepflow_storage_mode: short_ttl

        deepflow_data: /data/deepflow

        # role: deepflow_mysql
        deepflow_mysql_port: 13306
        deepflow_mysql_root_password: DeepFlow.Root.ChangeMe
        deepflow_mysql_user: deepflow
        deepflow_mysql_password: DeepFlow.MySQL.ChangeMe
        deepflow_mysql_database: deepflow

        # role: deepflow_clickhouse_s3
        deepflow_clickhouse_http_port: 18123
        deepflow_clickhouse_tcp_port: 19000
        deepflow_clickhouse_retention_hours: 24
        deepflow_s3_enabled: false
        deepflow_minio_api_port: 19090
        deepflow_minio_console_port: 19091
        deepflow_s3_bucket: deepflow
        deepflow_s3_access_key: deepflow
        deepflow_s3_secret_key: DeepFlow.S3.ChangeMe
        deepflow_s3_region: us-east-1

        # role: deepflow_server
        deepflow_server_grpc_port: 20035
        deepflow_server_http_port: 20417
        deepflow_app_port: 20880
        deepflow_clickhouse_addr: host.docker.internal:19000
        deepflow_s3_endpoint: http://host.docker.internal:19090
        deepflow_mysql_addr: host.docker.internal:13306
        deepflow_l4_log_ttl_hour: 24
        deepflow_l7_log_ttl_hour: 24
        deepflow_flow_metrics_ttl_hour: 24
        deepflow_metrics_ttl_hour: 24
        deepflow_prometheus_ttl_hour: 24

        # role: deepflow_connector
        deepflow_connector_source_endpoint: http://127.0.0.1:20417/metrics
        deepflow_connector_remote_write_url: http://127.0.0.1:8428/api/v1/write

        # role: deepflow_agent
        deepflow_agent_mode: binary
        deepflow_agent_profile: lite
        deepflow_agent_disable_pcap: false
        deepflow_agent_disable_vector: true
        deepflow_agent_grpc_endpoint: "{{ deepflow_grpc_domain }}:443"

    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

  vars:
    version: v4.0.0
    admin_ip: 10.10.10.10
    region: default
    node_tune: oltp
    pg_conf: oltp.yml
    docker_enabled: true

    # Caddy gRPC ingress for deepflow-agent:
    caddy_enabled: true
    deepflow_grpc_enabled: true
    deepflow_grpc_domain: deepflow-agent.pigsty
    deepflow_grpc_upstream: 127.0.0.1:20035

    infra_portal:
      home : { domain: svc.plus }
      deepflow : { domain: deepflow.pigsty ,endpoint: "10.10.10.10:20880" }

    proxy_env:
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"

    repo_enabled: false
    node_repo_modules: node,infra,pgsql

    grafana_admin_password: pigsty
    grafana_view_password: DBUser.Viewer
    pg_admin_password: DBUser.DBA
    pg_monitor_password: DBUser.Monitor
    pg_replication_password: DBUser.Replicator
    patroni_password: Patroni.API
    haproxy_admin_password: pigsty
    minio_secret_key: S3User.MinIO
    etcd_root_password: Etcd.Root
||||
28
deepflow.yml
@ -1,28 +0,0 @@
|
||||
#!/usr/bin/env ansible-playbook
---
#==============================================================#
# File : deepflow.yml
# Desc : deploy deepflow stack with three dedicated roles
# Ctime : 2026-02-04
# Mtime : 2026-02-04
# Path : deepflow.yml
# License : Apache-2.0 @ https://pigsty.io/docs/about/license/
#==============================================================#

- name: DEEPFLOW STACK
  become: true
  hosts: all
  gather_facts: no

  roles:
    - { role: node_id , tags: node-id, when: deepflow_enabled | default(true) | bool }
    - { role: deepflow_mysql , tags: deepflow_mysql, when: deepflow_mysql_enabled | default(true) | bool }
    - { role: deepflow_clickhouse_s3, tags: deepflow_clickhouse_s3, when: deepflow_clickhouse_s3_enabled | default(true) | bool }
    - { role: deepflow_server , tags: deepflow_server, when: deepflow_enabled | default(true) | bool }
    - { role: deepflow_connector , tags: deepflow_connector, when: deepflow_connector_enabled | default(false) | bool }
    - { role: deepflow_agent , tags: deepflow_agent, when: deepflow_agent_enabled | default(false) | bool }

# Usage:
# 1. Define deepflow group in pigsty.yml
# 2. Ensure docker is installed: ./docker.yml
# 3. Run ./deepflow.yml -l <deepflow_group>
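The `when:` gates in this playbook can be read as a small table of (role, variable, default). The Python sketch below mirrors that table to show which roles would run for a given set of variables. It is an illustration only: `as_bool` is a rough approximation of Ansible's `bool` filter, and `enabled_roles` is a hypothetical helper, not something Ansible provides.

```python
# (role, gating variable, default when the variable is undefined)
ROLE_GATES = [
    ("node_id", "deepflow_enabled", True),
    ("deepflow_mysql", "deepflow_mysql_enabled", True),
    ("deepflow_clickhouse_s3", "deepflow_clickhouse_s3_enabled", True),
    ("deepflow_server", "deepflow_enabled", True),
    ("deepflow_connector", "deepflow_connector_enabled", False),
    ("deepflow_agent", "deepflow_agent_enabled", False),
]


def as_bool(value) -> bool:
    """Approximate Ansible's `bool` filter for common truthy strings."""
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on", "y", "t"}
    return bool(value)


def enabled_roles(inventory_vars: dict) -> list:
    """Return the roles whose `when:` condition would evaluate to true."""
    return [role for role, var, default in ROLE_GATES
            if as_bool(inventory_vars.get(var, default))]
```

With no variables set, only the connector and agent roles are skipped, which matches the `default(false)` on those two gates.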
|
||||
@ -1,147 +0,0 @@
|
||||
---
- name: Update Cloudflare DNS for observability.svc.plus
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    cloudflare_zone_name: svc.plus
    cloudflare_api_base: https://api.cloudflare.com/client/v4
    observability_domain: observability.svc.plus
    observability_dns_target: us-xhttp.svc.plus
    observability_dns_type: CNAME
    observability_dns_ttl: 1
    observability_dns_proxied: false
    dns_wait_retries: 30
    dns_wait_delay: 10
  tasks:
    - name: Validate Cloudflare token is present in environment
      ansible.builtin.assert:
        that:
          - lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') | length > 0
        fail_msg: "CLOUDFLARE_API_TOKEN must be exported before running this playbook."

    - name: Resolve Cloudflare zone id
      ansible.builtin.uri:
        url: "{{ cloudflare_api_base }}/zones?name={{ cloudflare_zone_name }}"
        method: GET
        headers:
          Authorization: "Bearer {{ lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') }}"
          Content-Type: application/json
        return_content: true
      register: cloudflare_zone_lookup

    - name: Validate zone lookup result
      ansible.builtin.assert:
        that:
          - cloudflare_zone_lookup.json.success
          - cloudflare_zone_lookup.json.result | length > 0
        fail_msg: "Unable to resolve Cloudflare zone id for {{ cloudflare_zone_name }}."

    - name: Set Cloudflare zone id
      ansible.builtin.set_fact:
        cloudflare_zone_id: "{{ cloudflare_zone_lookup.json.result[0].id }}"

    - name: Query existing observability DNS records
      ansible.builtin.uri:
        url: "{{ cloudflare_api_base }}/zones/{{ cloudflare_zone_id }}/dns_records?name={{ observability_domain }}"
        method: GET
        headers:
          Authorization: "Bearer {{ lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') }}"
          Content-Type: application/json
        return_content: true
      register: observability_dns_records

    - name: Remove conflicting observability DNS records with different type
      ansible.builtin.uri:
        url: "{{ cloudflare_api_base }}/zones/{{ cloudflare_zone_id }}/dns_records/{{ item.id }}"
        method: DELETE
        headers:
          Authorization: "Bearer {{ lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') }}"
          Content-Type: application/json
      loop: "{{ observability_dns_records.json.result | default([]) }}"
      loop_control:
        label: "{{ item.type }} {{ item.name }}"
      when: item.type != observability_dns_type

    - name: Create observability DNS record when missing
      ansible.builtin.uri:
        url: "{{ cloudflare_api_base }}/zones/{{ cloudflare_zone_id }}/dns_records"
        method: POST
        headers:
          Authorization: "Bearer {{ lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') }}"
          Content-Type: application/json
        body_format: raw
        body: >-
          {{
            {
              'type': observability_dns_type,
              'name': observability_domain,
              'content': observability_dns_target,
              'ttl': (observability_dns_ttl | int),
              'proxied': (observability_dns_proxied | bool)
            } | to_json
          }}
      when: (observability_dns_records.json.result | selectattr('type', 'equalto', observability_dns_type) | list | length) == 0

    - name: Update observability DNS record when target changes
      ansible.builtin.uri:
        url: "{{ cloudflare_api_base }}/zones/{{ cloudflare_zone_id }}/dns_records/{{ (observability_dns_records.json.result | selectattr('type', 'equalto', observability_dns_type) | list | first).id }}"
        method: PUT
        headers:
          Authorization: "Bearer {{ lookup('ansible.builtin.env', 'CLOUDFLARE_API_TOKEN') }}"
          Content-Type: application/json
        body_format: raw
        body: >-
          {{
            {
              'type': observability_dns_type,
              'name': observability_domain,
              'content': observability_dns_target,
              'ttl': (observability_dns_ttl | int),
              'proxied': (observability_dns_proxied | bool)
            } | to_json
          }}
      when:
        - (observability_dns_records.json.result | selectattr('type', 'equalto', observability_dns_type) | list | length) > 0
        - >
          ((observability_dns_records.json.result | selectattr('type', 'equalto', observability_dns_type) | list | first).content != observability_dns_target)
          or
          (((observability_dns_records.json.result | selectattr('type', 'equalto', observability_dns_type) | list | first).proxied | default(false)) != observability_dns_proxied)

    - name: Wait for public DNS to expose observability CNAME
      ansible.builtin.uri:
        url: "https://cloudflare-dns.com/dns-query?name={{ observability_domain }}&type=CNAME"
        method: GET
        headers:
          Accept: application/dns-json
        return_content: true
      register: observability_dns_public
      until:
        - observability_dns_public.status == 200
        - >
          (
            observability_dns_public.json.Status
            if (observability_dns_public.json is defined)
            else ((observability_dns_public.content | from_json).Status | default(1))
          ) == 0
        - >
          (
            observability_dns_public.json.Answer
            if (observability_dns_public.json is defined)
            else ((observability_dns_public.content | from_json).Answer | default([]))
          ) | selectattr('data', 'equalto', observability_dns_target ~ '.')
          | list | length > 0
      retries: "{{ dns_wait_retries }}"
      delay: "{{ dns_wait_delay }}"

    - name: Show effective observability DNS target
      ansible.builtin.debug:
        msg: "{{ observability_domain }} -> {{ observability_dns_target }} proxied={{ observability_dns_proxied }}"

- import_playbook: infra.yml
  vars:
    infra_domain: observability.svc.plus
    infra_portal:
      home: { domain: observability.svc.plus }
    caddy_enabled: true
    nginx_enabled: false
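The delete / create / update decisions made by the DNS tasks in this playbook can be summarised as one pure function. The Python sketch below is a reading aid under stated assumptions: records are dicts with `id`, `type`, `content` and an optional `proxied` key, as in the Cloudflare API responses used above, and `plan_dns_changes` is a hypothetical name. It plans actions only and makes no API calls.

```python
def plan_dns_changes(records: list, want_type: str,
                     want_target: str, want_proxied: bool) -> list:
    """Return the (action, record_id) steps the playbook would take."""
    # 1. Records of any other type conflict with the desired one: delete them.
    actions = [("delete", r["id"]) for r in records if r["type"] != want_type]

    same_type = [r for r in records if r["type"] == want_type]
    if not same_type:
        # 2. No record of the desired type exists yet: create one.
        actions.append(("create", None))
    else:
        # 3. Only the first matching record is compared, as in the playbook.
        first = same_type[0]
        if (first["content"] != want_target
                or first.get("proxied", False) != want_proxied):
            actions.append(("update", first["id"]))
    return actions
```

An empty result means the record already matches, which is what makes a repeated run of the playbook a no-op for DNS.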
|
||||
@ -1,14 +0,0 @@
|
||||
# Documentation Coverage Matrix
|
||||
|
||||
This matrix tracks the bilingual canonical documentation set for `observability.svc.plus` and maps it back to the current codebase and older docs.
|
||||
|
||||
该矩阵用于跟踪 `observability.svc.plus` 的双语规范文档,并将其与当前代码状态和历史文档对应起来。
|
||||
|
||||
| Category | EN | ZH | Current status | Existing references | Next check |
|
||||
| --- | --- | --- | --- | --- | --- |
|
||||
| Architecture | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Keep diagrams and ownership notes synchronized with actual directories, services, and integration dependencies. |
|
||||
| Design | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Promote one-off implementation notes into reusable design records when behavior, APIs, or deployment contracts change. |
|
||||
| Deployment | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Verify deployment steps against current scripts, manifests, CI/CD flow, and environment contracts before each release. |
|
||||
| User Guide | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Prefer workflow-oriented examples and keep screenshots or terminal snippets aligned with the latest UI or CLI behavior. |
|
||||
| Developer Guide | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Keep setup and test commands tied to actual package scripts, Make targets, or language toolchains in this repository. |
|
||||
| Vibe Coding Reference | Yes | Yes | Seeded from current codebase; deeper legacy consolidation is still needed. | None yet; use the new canonical page as the starting point. | Review prompt templates and repo rules whenever the project adds new subsystems, protected areas, or mandatory verification steps. |
|
||||
@ -1,31 +0,0 @@
|
||||
# Observability Service Plus / 可观测性服务
|
||||
|
||||
This `docs/` directory now has a bilingual canonical layer for the current repository state.
|
||||
|
||||
本 `docs/` 目录现已补齐双语规范层,用于承接当前仓库状态下的核心文档。
|
||||
|
||||
## Quick Entry / 快速入口
|
||||
|
||||
- Coverage checklist / 覆盖检查矩阵: `docs/DOC_COVERAGE.md`
|
||||
- English index / 英文入口: `docs/en/README.md`
|
||||
- 中文入口 / Chinese index: `docs/zh/README.md`
|
||||
|
||||
## Canonical Bilingual Pages / 双语规范页
|
||||
|
||||
- `docs/en/architecture.md` / `docs/zh/architecture.md`
|
||||
- `docs/en/design.md` / `docs/zh/design.md`
|
||||
- `docs/en/deployment.md` / `docs/zh/deployment.md`
|
||||
- `docs/en/user-guide.md` / `docs/zh/user-guide.md`
|
||||
- `docs/en/developer-guide.md` / `docs/zh/developer-guide.md`
|
||||
- `docs/en/vibe-coding-reference.md` / `docs/zh/vibe-coding-reference.md`
|
||||
|
||||
## Current Repo Context / 当前仓库背景
|
||||
|
||||
- Root README: `Observability.svc.plus`
|
||||
- Previous docs index: `Documentation`
|
||||
- Manifest evidence / 构建清单: repository structure and scripts only
|
||||
- Active code and ops directories / 当前主要目录: `app/`, `api/`, `scripts/`
|
||||
|
||||
## Existing Docs To Reconcile / 需要继续归并的现有文档
|
||||
|
||||
- No pre-existing markdown docs were detected in this repository.
|
||||
@ -1,23 +0,0 @@
|
||||
# Observability Service Plus Documentation
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
## Current state snapshot
|
||||
|
||||
- Root README title: `Observability.svc.plus`
|
||||
- Build/runtime evidence: repository structure and scripts only
|
||||
- Primary directories detected: `app/`, `api/`, `scripts/`
|
||||
- Existing docs count: 0
|
||||
|
||||
## Canonical pages
|
||||
|
||||
- [Architecture](architecture.md)
|
||||
- [Design](design.md)
|
||||
- [Deployment](deployment.md)
|
||||
- [User Guide](user-guide.md)
|
||||
- [Developer Guide](developer-guide.md)
|
||||
- [Vibe Coding Reference](vibe-coding-reference.md)
|
||||
|
||||
## Legacy docs to fold in
|
||||
|
||||
- No pre-existing markdown docs were detected in this repository.
|
||||
@ -1,24 +0,0 @@
|
||||
# Architecture
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
Use this page as the canonical bilingual overview of system boundaries, major components, and repo ownership.
|
||||
|
||||
## Current code-aligned notes
|
||||
|
||||
- Documentation target: `observability.svc.plus`
|
||||
- Repo kind: `infra-observability`
|
||||
- Manifest and build evidence: repository structure and scripts only
|
||||
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
|
||||
- Package scripts snapshot: No package.json scripts were detected.
|
||||
|
||||
## Existing docs to reconcile
|
||||
|
||||
- No directly matching legacy docs were detected; this page is currently the canonical seed.
|
||||
|
||||
## What this page should cover next
|
||||
|
||||
- Describe the current implementation rather than an aspirational future-only design.
|
||||
- Keep terminology aligned with the repository root README, manifests, and actual directories.
|
||||
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
|
||||
- Keep diagrams and ownership notes synchronized with actual directories, services, and integration dependencies.
|
||||
@ -1,24 +0,0 @@
|
||||
# Deployment
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
Use this page to standardize deployment prerequisites, supported topologies, operational checks, and rollback notes.
|
||||
|
||||
## Current code-aligned notes
|
||||
|
||||
- Documentation target: `observability.svc.plus`
|
||||
- Repo kind: `infra-observability`
|
||||
- Manifest and build evidence: repository structure and scripts only
|
||||
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
|
||||
- Package scripts snapshot: No package.json scripts were detected.
|
||||
|
||||
## Existing docs to reconcile
|
||||
|
||||
- No directly matching legacy docs were detected; this page is currently the canonical seed.
|
||||
|
||||
## What this page should cover next
|
||||
|
||||
- Describe the current implementation rather than an aspirational future-only design.
|
||||
- Keep terminology aligned with the repository root README, manifests, and actual directories.
|
||||
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
|
||||
- Verify deployment steps against current scripts, manifests, CI/CD flow, and environment contracts before each release.
|
||||
@ -1,24 +0,0 @@
|
||||
# Design
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
Use this page to consolidate design decisions, ADR-style tradeoffs, and roadmap-sensitive implementation notes.
|
||||
|
||||
## Current code-aligned notes
|
||||
|
||||
- Documentation target: `observability.svc.plus`
|
||||
- Repo kind: `infra-observability`
|
||||
- Manifest and build evidence: repository structure and scripts only
|
||||
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
|
||||
- Package scripts snapshot: No package.json scripts were detected.
|
||||
|
||||
## Existing docs to reconcile
|
||||
|
||||
- No directly matching legacy docs were detected; this page is currently the canonical seed.
|
||||
|
||||
## What this page should cover next
|
||||
|
||||
- Describe the current implementation rather than an aspirational future-only design.
|
||||
- Keep terminology aligned with the repository root README, manifests, and actual directories.
|
||||
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
|
||||
- Promote one-off implementation notes into reusable design records when behavior, APIs, or deployment contracts change.
|
||||
@ -1,24 +0,0 @@
|
||||
# Developer Guide
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
Use this page to document local setup, project structure, test surfaces, and contribution conventions tied to the current codebase.
|
||||
|
||||
## Current code-aligned notes
|
||||
|
||||
- Documentation target: `observability.svc.plus`
|
||||
- Repo kind: `infra-observability`
|
||||
- Manifest and build evidence: repository structure and scripts only
|
||||
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
|
||||
- Package scripts snapshot: No package.json scripts were detected.
|
||||
|
||||
## Existing docs to reconcile
|
||||
|
||||
- No directly matching legacy docs were detected; this page is currently the canonical seed.
|
||||
|
||||
## What this page should cover next
|
||||
|
||||
- Describe the current implementation rather than an aspirational future-only design.
|
||||
- Keep terminology aligned with the repository root README, manifests, and actual directories.
|
||||
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
|
||||
- Keep setup and test commands tied to actual package scripts, Make targets, or language toolchains in this repository.
|
||||
@ -1,24 +0,0 @@
|
||||
# User Guide
|
||||
|
||||
This repository documents infrastructure orchestration and observability composition rather than a single application binary.
|
||||
|
||||
Use this page to document primary user/operator tasks, everyday workflows, and navigation to existing how-to material.
|
||||
|
||||
## Current code-aligned notes
|
||||
|
||||
- Documentation target: `observability.svc.plus`
|
||||
- Repo kind: `infra-observability`
|
||||
- Manifest and build evidence: repository structure and scripts only
|
||||
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
|
||||
- Package scripts snapshot: No package.json scripts were detected.
|
||||
|
||||
## Existing docs to reconcile
|
||||
|
||||
- No directly matching legacy docs were detected; this page is currently the canonical seed.
|
||||
|
||||
## What this page should cover next
|
||||
|
||||
- Describe the current implementation rather than an aspirational future-only design.
|
||||
- Keep terminology aligned with the repository root README, manifests, and actual directories.
|
||||
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
|
||||
- Prefer workflow-oriented examples and keep screenshots or terminal snippets aligned with the latest UI or CLI behavior.
|
||||
@ -1,24 +0,0 @@
|
||||
# Vibe Coding Reference

This repository documents infrastructure orchestration and observability composition rather than a single application binary.

Use this page to align AI-assisted coding prompts, repo boundaries, safe edit rules, and documentation update expectations.

## Current code-aligned notes

- Documentation target: `observability.svc.plus`
- Repo kind: `infra-observability`
- Manifest and build evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- Package scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical seed.

## What this page should cover next

- Describe the current implementation rather than an aspirational future-only design.
- Keep terminology aligned with the repository root README, manifests, and actual directories.
- Link deeper runbooks, specs, or subsystem notes from the legacy docs listed above.
- Review prompt templates and repo rules whenever the project adds new subsystems, protected areas, or mandatory verification steps.
@ -1,23 +0,0 @@
|
||||
# Observability Service Documentation

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

## Current status snapshot

- Root README title: `Observability.svc.plus`
- Build and runtime evidence: repository structure and scripts only
- Automatically detected primary directories: `app/`, `api/`, `scripts/`
- Existing docs count: 0

## Core bilingual documents

- [Architecture](architecture.md)
- [Design](design.md)
- [Deployment](deployment.md)
- [User Guide](user-guide.md)
- [Developer Guide](developer-guide.md)
- [Vibe Coding Reference](vibe-coding-reference.md)

## Legacy docs to consolidate

- No pre-existing markdown docs were detected in this repository.
@ -1,24 +0,0 @@
|
||||
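# Architecture

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

This page is the bilingual overview entry point for system boundaries, core components, and repository responsibilities.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- Keep diagrams and responsibility descriptions in sync as the directory structure, service relationships, and integration dependencies change.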
@ -1,24 +0,0 @@
|
||||
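# Deployment

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

Use this page to consolidate deployment prerequisites, supported topologies, operational checks, and rollback notes.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- Before each release, re-verify the deployment steps against the current scripts, manifests, CI/CD pipelines, and environment contracts.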
@ -1,24 +0,0 @@
|
||||
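# Design

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

Use this page to collect design decisions, ADR-style trade-off records, and roadmap-related implementation notes.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- When behavior, APIs, or deployment contracts change, promote one-off implementation notes into reusable design records.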
@ -1,24 +0,0 @@
|
||||
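# Developer Guide

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

Use this page to document the local development environment, project structure, test surfaces, and contribution conventions that fit the current codebase.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- Keep setup and test commands tied to scripts, Make targets, or language toolchains that actually exist.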
@ -1,24 +0,0 @@
|
||||
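# User Guide

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

Use this page to document the everyday tasks and common workflows of primary users or operators, and to point to existing how-to material.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- Prefer workflow-oriented examples, and keep screenshots or terminal snippets aligned with the latest UI/CLI behavior.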
@ -1,24 +0,0 @@
|
||||
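# Vibe Coding Reference

This repository leans toward infrastructure orchestration and observability composition rather than a single application binary.

Use this page to align AI-assisted development prompts, repo boundaries, safe edit rules, and documentation sync requirements.

## Current code-aligned notes

- Documentation target repository: `observability.svc.plus`
- Repo kind: `infra-observability`
- Build and runtime evidence: repository structure and scripts only
- Primary implementation and ops directories: `app/`, `api/`, `scripts/`
- `package.json` scripts snapshot: No package.json scripts were detected.

## Existing docs to reconcile

- No directly matching legacy docs were detected; this page is currently the canonical starting point for this category.

## What this page should cover next

- Describe the implementation that has already shipped first, then add future plans; avoid writing only the vision without the present state.
- Keep terminology aligned with the repository root README, build manifests, and actual directories.
- Gradually link and merge the legacy runbooks, specs, and subsystem notes listed above into this page.
- Update prompt templates and repo rules whenever the project adds new subsystems, protected directories, or mandatory verification steps.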
@ -1,31 +1,28 @@
|
||||
# Grafana Dashboards
|
||||
|
||||
This directory contains Grafana dashboard definitions for the observability stack.
|
||||
This directory contains Grafana dashboard definitions for Pigsty monitoring system.
|
||||
|
||||
## Overview
|
||||
|
||||
The repository currently provides **61 domain dashboards + 1 homepage dashboard**.
|
||||
Dashboards are organized by platform-engineering resource domains:
|
||||
Pigsty provides **57 built-in dashboards** organized by module:
|
||||
|
||||
| Folder | Count | Description |
|--------|-------|-------------|
| [01-iaas-compute](01-iaas-compute/) | 5 | IAAS compute: node overview, cluster, instance, alert, compatibility summary |
| [02-iaas-storage](02-iaas-storage/) | 4 | IAAS storage: disk, JuiceFS, MinIO overview and instance |
| [03-iaas-network](03-iaas-network/) | 1 | IAAS network: VIP and node-network entry |
| [11-paas-control-plane](11-paas-control-plane/) | 10 | PaaS control plane: Pigsty, Grafana, Victoria stack, Alertmanager, etcd, CMDB |
| [12-paas-cluster](12-paas-cluster/) | 1 | PaaS cluster: Kubernetes overview |
| [13-paas-db](13-paas-db/) | 29 | PaaS DB: PostgreSQL, PGRDS, PGCAT, Mongo/FerretDB |
| [14-paas-cache](14-paas-cache/) | 3 | PaaS cache: Redis overview, cluster, instance |
| [22-bu-proxy](22-bu-proxy/) | 2 | Business unit proxy: Nginx and HAProxy |
| [24-bu-request](24-bu-request/) | 5 | Business unit request: logs, sessions, vector, request-side tooling |
| - | 1 | [homepage.json](homepage.json) - Platform engineering entry dashboard |

| Directory | Count | Description |
|-----------------|-------|-------------------------------------------------------------------------|
| [pgsql](pgsql/) | 29 | PostgreSQL cluster, instance, database, and query monitoring |
| [infra](infra/) | 11 | Infrastructure components (VictoriaMetrics, Grafana, Nginx, etcd, etc.) |
| [node](node/) | 8 | Host-level metrics (CPU, memory, disk, network, HAProxy, VIP) |
| [redis](redis/) | 3 | Redis cluster and instance monitoring |
| [app](app/) | 2 | Application dashboards (PostgreSQL logs analysis) |
| [minio](minio/) | 2 | MinIO S3-compatible storage monitoring |
| [mongo](mongo/) | 1 | MongoDB/FerretDB monitoring |
| - | 1 | [pigsty.json](pigsty.json) - Main home dashboard |
|
||||
|
||||
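Hand-maintained counts like the ones in these tables tend to drift: the folder rows of the first table add up to 60, while the sentence above it mentions 61 domain dashboards, so one of the two may be out of date. The following is a minimal sketch of how the counts could be recomputed from the directory tree. It assumes the same file filter the loader script uses (`.json` suffix, no leading dot); the function name `count_dashboards` and the demo file names are hypothetical and not part of the repository.

```python
import os
import tempfile


def count_dashboards(dashboard_dir):
    """Count dashboard JSON files per first-level folder.

    Top-level JSON files are reported under the key '-', like the
    last row of the tables above.
    """
    counts = {}
    for entry in sorted(os.listdir(dashboard_dir)):
        path = os.path.join(dashboard_dir, entry)
        if os.path.isdir(path):
            counts[entry] = sum(
                1 for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))
                and f.endswith('.json') and not f.startswith('.')
            )
        elif os.path.isfile(path) and entry.endswith('.json') and not entry.startswith('.'):
            counts['-'] = counts.get('-', 0) + 1
    return counts


# Demo on a throwaway tree (hypothetical file names, not the real repository).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, '14-paas-cache'))
    for name in ('redis-overview.json', 'redis-cluster.json', '.hidden.json'):
        open(os.path.join(root, '14-paas-cache', name), 'w').close()
    open(os.path.join(root, 'homepage.json'), 'w').close()
    demo_counts = count_dashboards(root)

print(demo_counts)
```

Pointing `count_dashboards` at the real dashboard directory would give figures to compare against the README before editing the table by hand.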
## Dashboard Catalog
|
||||
|
||||
### Home
|
||||
|
||||
- **[homepage.json](homepage.json)** - Platform engineering entry dashboard with domain summaries and navigation
|
||||
- **[pigsty.json](pigsty.json)** - Pigsty home dashboard with global overview
|
||||
|
||||
### PGSQL Dashboards
|
||||
|
||||
|
||||
@ -10,51 +10,11 @@
|
||||
#==============================================================#
|
||||
import os, sys, json, requests
|
||||
|
||||
|
||||
def env_flag(name, default):
|
||||
value = os.environ.get(name)
|
||||
if value is None:
|
||||
return default
|
||||
return value.lower() in ('1', 'true', 'yes', 'on')
|
||||
|
||||
# grafana access info
|
||||
ENDPOINT = os.environ.get("GRAFANA_ENDPOINT", 'http://i.pigsty/ui')
|
||||
USERNAME = os.environ.get("GRAFANA_USERNAME", 'admin')
|
||||
PASSWORD = os.environ.get("GRAFANA_PASSWORD", 'pigsty')
|
||||
CREATE_FOLDERS = env_flag('GRAFANA_CREATE_FOLDERS', True)
|
||||
SKIP_SUBFOLDERS = env_flag('GRAFANA_SKIP_SUBFOLDERS', False)
|
||||
|
||||
FOLDER_TITLES = {
|
||||
'01-iaas-compute': 'IAAS / 计算',
|
||||
'02-iaas-storage': 'IAAS / 存储',
|
||||
'03-iaas-network': 'IAAS / 网络',
|
||||
'11-paas-control-plane': 'PaaS / 平台控制面',
|
||||
'12-paas-cluster': 'PaaS / 集群',
|
||||
'13-paas-db': 'PaaS / DB',
|
||||
'14-paas-cache': 'PaaS / 缓存',
|
||||
'15-paas-queue': 'PaaS / 队列',
|
||||
'21-bu-dns': '业务单元 / DNS',
|
||||
'22-bu-proxy': '业务单元 / 代理',
|
||||
'23-bu-gateway': '业务单元 / 网关',
|
||||
'24-bu-request': '业务单元 / 请求',
|
||||
'25-bu-throughput': '业务单元 / 吞吐',
|
||||
}
|
||||
|
||||
FOLDER_TAGS = {
|
||||
'01-iaas-compute': ['IAAS', 'IAAS-COMPUTE'],
|
||||
'02-iaas-storage': ['IAAS', 'IAAS-STORAGE'],
|
||||
'03-iaas-network': ['IAAS', 'IAAS-NETWORK'],
|
||||
'11-paas-control-plane': ['PAAS', 'PAAS-CONTROL-PLANE'],
|
||||
'12-paas-cluster': ['PAAS', 'PAAS-CLUSTER'],
|
||||
'13-paas-db': ['PAAS', 'PAAS-DB'],
|
||||
'14-paas-cache': ['PAAS', 'PAAS-CACHE'],
|
||||
'15-paas-queue': ['PAAS', 'PAAS-QUEUE'],
|
||||
'21-bu-dns': ['BU', 'BU-DNS'],
|
||||
'22-bu-proxy': ['BU', 'BU-PROXY'],
|
||||
'23-bu-gateway': ['BU', 'BU-GATEWAY'],
|
||||
'24-bu-request': ['BU', 'BU-REQUEST'],
|
||||
'25-bu-throughput': ['BU', 'BU-THROUGHPUT'],
|
||||
}
|
||||
CREATE_FOLDERS = True
|
||||
|
||||
METADB_PASSWORD = 'DBUser.Viewer'
|
||||
DEFAULT_DATASOURCES = {
|
||||
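The `env_flag` helper in this hunk is easy to misread: the default applies only when the variable is unset. Any value that is set but not in the truthy list, including an empty string, yields `False` even if the default is `True`. The sketch below copies the helper as it appears in the diff and exercises it with made-up values; the variable `GRAFANA_EXAMPLE_UNSET` is hypothetical and used only for illustration.

```python
import os


def env_flag(name, default):
    """Read a boolean flag from the environment (same logic as in the diff)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.lower() in ('1', 'true', 'yes', 'on')


# Illustrative values only; these are set here just for the demo.
os.environ['GRAFANA_SKIP_SUBFOLDERS'] = 'YES'    # case-insensitive truthy value
os.environ['GRAFANA_CREATE_FOLDERS'] = ''        # set but empty
os.environ.pop('GRAFANA_EXAMPLE_UNSET', None)    # hypothetical name, guaranteed unset

skip_subfolders = env_flag('GRAFANA_SKIP_SUBFOLDERS', False)
create_folders = env_flag('GRAFANA_CREATE_FOLDERS', True)
unset_flag = env_flag('GRAFANA_EXAMPLE_UNSET', True)

print(skip_subfolders, create_folders, unset_flag)
```

In practice this means exporting `GRAFANA_CREATE_FOLDERS=` (empty) disables folder creation rather than falling back to the default.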
@ -158,7 +118,7 @@ def add_folder(uid, title=""):
|
||||
if not CREATE_FOLDERS:
|
||||
return
|
||||
if title == "":
|
||||
title = resolve_folder_title(uid)
|
||||
title = uid.upper()
|
||||
post('folders', {"uid": uid, "title": title})
|
||||
return put('folders/%s' % uid, {"title": title, "overwrite": True})
|
||||
|
||||
@ -252,30 +212,6 @@ def load_dashboard(path, substitute=False):
|
||||
else:
|
||||
return json.load(open(path))
|
||||
|
||||
|
||||
def resolve_folder_title(uid):
|
||||
return FOLDER_TITLES.get(uid, uid.upper())
|
||||
|
||||
|
||||
def enrich_dashboard(dashboard, folder=None):
|
||||
if not folder:
|
||||
return dashboard
|
||||
extra_tags = FOLDER_TAGS.get(folder, [])
|
||||
if not extra_tags:
|
||||
return dashboard
|
||||
existing_tags = dashboard.get("tags", [])
|
||||
if not isinstance(existing_tags, list):
|
||||
existing_tags = []
|
||||
merged_tags = []
|
||||
seen = set()
|
||||
for tag in existing_tags + extra_tags:
|
||||
if not tag or tag in seen:
|
||||
continue
|
||||
seen.add(tag)
|
||||
merged_tags.append(tag)
|
||||
dashboard["tags"] = merged_tags
|
||||
return dashboard
|
||||
|
||||
# json serializer: use compact_json if available, fallback to standard json
|
||||
try:
|
||||
from compact_json import Formatter
|
||||
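The tag handling in `enrich_dashboard` is an order-preserving de-duplication: existing dashboard tags come first, folder tags are appended, and empty or repeated tags are dropped. A minimal standalone sketch of that merge step follows. The helper name `merge_tags` and the sample tags are assumptions for illustration; the function in the diff performs the same loop inline and mutates the dashboard dict.

```python
def merge_tags(existing, extra):
    """Merge two tag lists, keeping first-seen order and dropping empties/duplicates."""
    if not isinstance(existing, list):
        existing = []  # mirror the diff: a non-list "tags" value is treated as empty
    merged = []
    seen = set()
    for tag in existing + list(extra):
        if not tag or tag in seen:
            continue
        seen.add(tag)
        merged.append(tag)
    return merged


# Hypothetical dashboard that already carries one of the folder tags.
dashboard = {"title": "Example", "tags": ["PGSQL", "PAAS", ""]}
dashboard["tags"] = merge_tags(dashboard.get("tags", []), ["PAAS", "PAAS-DB"])

print(dashboard["tags"])
```

Because the merge is idempotent, loading the same dashboard into the same folder repeatedly should not keep growing its tag list.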
@ -335,7 +271,7 @@ def init_all(dashboard_dir):
|
||||
if os.path.isfile(abs_path) and f.endswith('.json') and not f.startswith('.'):
|
||||
print("init dashboard : %s" % f)
|
||||
add_dashboard(load_dashboard(abs_path, True))
|
||||
if os.path.isdir(abs_path) and not SKIP_SUBFOLDERS:
|
||||
if os.path.isdir(abs_path):
|
||||
folders.append((f, abs_path)) # folder name, abs path
|
||||
|
||||
home_uid = "home"
|
||||
@ -347,13 +283,13 @@ def init_all(dashboard_dir):
|
||||
# load other second-layer dashboards
|
||||
for folder_name, folder_path in folders:
|
||||
print("init folder %s" % folder_name)
|
||||
add_folder(folder_name, resolve_folder_title(folder_name))
|
||||
add_folder(folder_name, folder_name.upper())
|
||||
|
||||
for f in os.listdir(folder_path):
|
||||
abs_path = os.path.join(dashboard_dir, folder_name, f)
|
||||
if os.path.isfile(abs_path) and f.endswith('.json') and not f.startswith('.'):
|
||||
print("init dashboard: %s / %s" % (folder_name, f))
|
||||
add_dashboard(enrich_dashboard(load_dashboard(abs_path, True), folder_name), folder_name)
|
||||
add_dashboard(load_dashboard(abs_path, True), folder_name)
|
||||
|
||||
|
||||
def load_all(dashboard_dir):
|
||||
@ -364,18 +300,18 @@ def load_all(dashboard_dir):
|
||||
if os.path.isfile(abs_path) and f.endswith('.json') and not f.startswith('.'):
|
||||
print("load dashboard : %s" % f)
|
||||
add_dashboard(load_dashboard(abs_path))
|
||||
if os.path.isdir(abs_path) and not SKIP_SUBFOLDERS:
|
||||
if os.path.isdir(abs_path):
|
||||
folders.append((f, abs_path)) # folder name, abs path
|
||||
|
||||
for folder_name, folder_path in folders:
|
||||
print("add folder %s" % folder_name)
|
||||
add_folder(folder_name, resolve_folder_title(folder_name))
|
||||
add_folder(folder_name, folder_name.upper())
|
||||
|
||||
for f in os.listdir(folder_path):
|
||||
abs_path = os.path.join(dashboard_dir, folder_name, f)
|
||||
if os.path.isfile(abs_path) and f.endswith('.json') and not f.startswith('.'):
|
||||
print("load dashboard: %s / %s" % (folder_name, f))
|
||||
add_dashboard(enrich_dashboard(load_dashboard(abs_path), folder_name), folder_name)
|
||||
add_dashboard(load_dashboard(abs_path), folder_name)
|
||||
|
||||
|
||||
def dump_all(dashboard_dir):
|
||||
|
||||
File diff suppressed because one or more lines are too long
11
infra.yml
@ -103,13 +103,4 @@
|
||||
# - add_logs : register infra as vector logging source
|
||||
# - add_ds : register infra victoria stack as grafana datasource
|
||||
#--------------------------------------------------------------#
|
||||
# Mixed Existing-Host Deployment
|
||||
#--------------------------------------------------------------#
|
||||
# Center service example:
|
||||
# ./infra.yml -l us-xhttp.svc.plus \
|
||||
# -e infra_domain=observability.svc.plus \
|
||||
# -e 'infra_portal={\"home\":{\"domain\":\"observability.svc.plus\"}}' \
|
||||
# -e caddy_enabled=true \
|
||||
# -e nginx_enabled=false
|
||||
#--------------------------------------------------------------#
|
||||
...
|
||||
...
|
||||
433
merge_dashboards.py
Normal file → Executable file
@ -1,357 +1,122 @@
|
||||
import copy
|
||||
import json
|
||||
|
||||
|
||||
CONTROL_PLANE_PATH = "files/grafana/11-paas-control-plane/pigsty.json"
|
||||
OUTPUT_PATH = "files/grafana/homepage.json"
|
||||
|
||||
VISIBLE_VARS = [
|
||||
{
|
||||
"name": "version",
|
||||
"type": "constant",
|
||||
"query": "v4.0.0",
|
||||
"hide": 2,
|
||||
},
|
||||
{
|
||||
"name": "origin_prometheus",
|
||||
"label": "数据源",
|
||||
"type": "query",
|
||||
"datasource": {"uid": "ds-prometheus"},
|
||||
"query": "label_values(kube_node_info,origin_prometheus)",
|
||||
"refresh": 1,
|
||||
},
|
||||
{
|
||||
"name": "interval",
|
||||
"label": "采样间隔",
|
||||
"type": "interval",
|
||||
"query": "3m,5m,10m,30m,1h,6h,12h,1d",
|
||||
},
|
||||
]
|
||||
|
||||
DOMAIN_SECTIONS = [
|
||||
{
|
||||
"title": "IAAS资源",
|
||||
"items": [
|
||||
{
|
||||
"title": "计算",
|
||||
"description": "主机容量、节点健康、实例告警",
|
||||
"folder_uid": "01-iaas-compute",
|
||||
"folder_title": "IAAS / 计算",
|
||||
"tag": "IAAS-COMPUTE",
|
||||
"highlights": ["Node Overview", "Node Instance", "Node Alert"],
|
||||
"dash_height": 9,
|
||||
},
|
||||
{
|
||||
"title": "存储",
|
||||
"description": "磁盘、卷、对象存储、JuiceFS",
|
||||
"folder_uid": "02-iaas-storage",
|
||||
"folder_title": "IAAS / 存储",
|
||||
"tag": "IAAS-STORAGE",
|
||||
"highlights": ["Node Disk", "MinIO Overview", "Node JuiceFS"],
|
||||
"dash_height": 9,
|
||||
},
|
||||
{
|
||||
"title": "网络",
|
||||
"description": "VIP、节点网络、底层连通性",
|
||||
"folder_uid": "03-iaas-network",
|
||||
"folder_title": "IAAS / 网络",
|
||||
"tag": "IAAS-NETWORK",
|
||||
"highlights": ["Node VIP"],
|
||||
"dash_height": 8,
|
||||
},
|
||||
],
|
||||
},
|
||||
{
|
||||
"title": "PaaS服务",
|
||||
"items": [
|
||||
{
|
||||
"title": "平台控制面",
|
||||
"description": "Grafana、Victoria、Alertmanager、Etcd、CMDB",
|
||||
"folder_uid": "11-paas-control-plane",
|
||||
"folder_title": "PaaS / 平台控制面",
|
||||
"tag": "PAAS-CONTROL-PLANE",
|
||||
"highlights": ["Infra Overview", "Victoria Metrics", "Alert Manager"],
|
||||
"dash_height": 10,
|
||||
},
|
||||
{
|
||||
"title": "集群",
|
||||
"description": "K8S 集群资源、命名空间与工作负载入口",
|
||||
"folder_uid": "12-paas-cluster",
|
||||
"folder_title": "PaaS / 集群",
|
||||
"tag": "PAAS-CLUSTER",
|
||||
"highlights": ["K8S Dashboard"],
|
||||
"dash_height": 8,
|
||||
},
|
||||
{
|
||||
"title": "DB",
|
||||
"description": "PGSQL、PGRDS、PGCAT、Ferret",
|
||||
"folder_uid": "13-paas-db",
|
||||
"folder_title": "PaaS / DB",
|
||||
"tag": "PAAS-DB",
|
||||
"highlights": ["PGSQL Overview", "PGSQL Cluster", "PGCAT Instance"],
|
||||
"dash_height": 14,
|
||||
},
|
||||
{
|
||||
"title": "缓存",
|
||||
"description": "Redis 集群、实例与缓存服务运行面",
|
||||
"folder_uid": "14-paas-cache",
|
||||
"folder_title": "PaaS / 缓存",
|
||||
"tag": "PAAS-CACHE",
|
||||
"highlights": ["Redis Overview", "Redis Cluster"],
|
||||
"dash_height": 9,
|
||||
},
|
||||
],
|
||||
},
|
||||
{
|
||||
"title": "业务监控",
|
||||
"items": [
|
||||
{
|
||||
"title": "代理",
|
||||
"description": "Nginx、HAProxy 与流量接入层",
|
||||
"folder_uid": "22-bu-proxy",
|
||||
"folder_title": "业务单元 / 代理",
|
||||
"tag": "BU-PROXY",
|
||||
"highlights": ["Nginx Instance", "Node HAProxy"],
|
||||
"dash_height": 8,
|
||||
},
|
||||
{
|
||||
"title": "请求",
|
||||
"description": "请求日志、会话、链路与请求级观测",
|
||||
"folder_uid": "24-bu-request",
|
||||
"folder_title": "业务单元 / 请求",
|
||||
"tag": "BU-REQUEST",
|
||||
"highlights": ["PGLOG Overview", "Logs Instance", "Node Vector"],
|
||||
"dash_height": 9,
|
||||
},
|
||||
],
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
def shift_panel(panel, delta_y):
|
||||
panel["gridPos"]["y"] += delta_y
|
||||
for nested in panel.get("panels", []):
|
||||
shift_panel(nested, delta_y)
|
||||
|
||||
|
||||
def clone_panel(panel, x, y, w=None, h=None):
|
||||
cloned = copy.deepcopy(panel)
|
||||
cloned["gridPos"] = {
|
||||
"x": x,
|
||||
"y": y,
|
||||
"w": w if w is not None else panel["gridPos"]["w"],
|
||||
"h": h if h is not None else panel["gridPos"]["h"],
|
||||
}
|
||||
return cloned
|
||||
|
||||
|
||||
def make_text_panel(panel_id, title, html, x, y, w, h, transparent=True):
|
||||
return {
|
||||
"id": panel_id,
|
||||
"type": "text",
|
||||
"title": title,
|
||||
"gridPos": {"h": h, "w": w, "x": x, "y": y},
|
||||
"transparent": transparent,
|
||||
"options": {"content": html, "mode": "html"},
|
||||
}
|
||||
|
||||
|
||||
def make_row_panel(panel_id, title, y):
|
||||
return {
|
||||
"id": panel_id,
|
||||
"type": "row",
|
||||
"title": title,
|
||||
"collapsed": False,
|
||||
"panels": [],
|
||||
"gridPos": {"h": 1, "w": 24, "x": 0, "y": y},
|
||||
}
|
||||
|
||||
|
||||
def make_dashlist_panel(panel_id, title, tags, x, y, w, h, max_items=12):
|
||||
return {
|
||||
"id": panel_id,
|
||||
"type": "dashlist",
|
||||
"title": title,
|
||||
"pluginVersion": "12.3.0",
|
||||
"gridPos": {"h": h, "w": w, "x": x, "y": y},
|
||||
"options": {
|
||||
"includeVars": True,
|
||||
"keepTime": True,
|
||||
"maxItems": max_items,
|
||||
"query": "",
|
||||
"showFolderNames": False,
|
||||
"showHeadings": False,
|
||||
"showRecentlyViewed": False,
|
||||
"showSearch": False,
|
||||
"showStarred": False,
|
||||
"tags": tags,
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def summary_card_html(item):
|
||||
highlights = "".join(
|
||||
f"<li style='margin:0 0 4px 18px;'>{highlight}</li>"
|
||||
for highlight in item["highlights"]
|
||||
)
|
||||
return f"""
|
||||
<div style="border:1px solid #d1d5db;border-radius:16px;padding:14px 16px;background:#fbfdff;height:100%;">
|
||||
<div style="font-size:12px;color:#6b7280;margin-bottom:6px;">{item['folder_title']}</div>
|
||||
<div style="font-size:20px;font-weight:800;color:#111827;margin-bottom:8px;">{item['title']}</div>
|
||||
<div style="font-size:13px;line-height:1.5;color:#4b5563;">{item['description']}</div>
|
||||
<ul style="margin:10px 0 12px 0;padding:0;color:#111827;font-size:13px;line-height:1.45;">{highlights}</ul>
|
||||
<div style="display:inline-block;padding:8px 12px;border-radius:999px;background:#e5e7eb;color:#374151;font-size:12px;font-weight:700;">
|
||||
右侧保留可跳转目录
|
||||
</div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
def homepage_nav_html():
|
||||
return """
|
||||
<div style="padding:6px 2px 0 2px;">
|
||||
<div style="display:flex;justify-content:space-between;align-items:flex-end;gap:14px;flex-wrap:wrap;margin-bottom:10px;">
|
||||
<div>
|
||||
<div style="font-size:11px;color:#6b7280;margin-bottom:4px;">Platform Engineering Home</div>
|
||||
<div style="font-size:24px;font-weight:800;color:#111827;line-height:1.15;">平台工程总览入口</div>
|
||||
<div style="font-size:12px;color:#4b5563;margin-top:4px;line-height:1.45;">按 IaaS、PaaS、SaaS 逐层下钻,首页只保留入口与全局脉搏。</div>
|
||||
</div>
|
||||
<div style="font-size:11px;color:#94a3b8;font-weight:700;letter-spacing:0.04em;">IaaS → PaaS → SaaS</div>
|
||||
</div>
|
||||
<div style="display:grid;grid-template-columns:repeat(3,minmax(0,1fr));gap:10px;">
|
||||
<div style="border:1px solid #c7d2fe;border-radius:999px;padding:12px 18px;background:#eef4ff;min-height:0;display:flex;align-items:center;justify-content:center;">
|
||||
<div style="text-align:center;">
|
||||
<div style="font-size:26px;color:#1d4ed8;font-weight:800;line-height:1.1;">IaaS资源</div>
|
||||
<div style="font-size:12px;color:#5b6b91;margin-top:4px;">计算 / 存储 / 网络</div>
|
||||
</div>
|
||||
</div>
|
||||
<div style="border:1px solid #bbf7d0;border-radius:999px;padding:12px 18px;background:#effdf4;min-height:0;display:flex;align-items:center;justify-content:center;">
|
||||
<div style="text-align:center;">
|
||||
<div style="font-size:26px;color:#047857;font-weight:800;line-height:1.1;">PaaS服务</div>
|
||||
<div style="font-size:12px;color:#537566;margin-top:4px;">控制面 / 集群 / DB / 缓存</div>
|
||||
</div>
|
||||
</div>
|
||||
<div style="border:1px solid #fed7aa;border-radius:999px;padding:12px 18px;background:#fff7ed;min-height:0;display:flex;align-items:center;justify-content:center;">
|
||||
<div style="text-align:center;">
|
||||
<div style="font-size:26px;color:#c2410c;font-weight:800;line-height:1.1;">业务监控</div>
|
||||
<div style="font-size:12px;color:#8a6b53;margin-top:4px;">代理 / 请求</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
def select_platform_summary_panels(control_plane):
|
||||
wanted = ["Pigsty ${version}", "Modules", "Instances", "Firing Alerts"]
|
||||
by_title = {panel.get("title"): panel for panel in control_plane.get("panels", [])}
|
||||
return [by_title[title] for title in wanted if title in by_title]
|
||||
|
||||
|
||||
def add_domain_section(homepage, start_id, current_y, section):
|
||||
panel_id = start_id
|
||||
homepage["panels"].append(make_row_panel(panel_id, section["title"], current_y))
|
||||
panel_id += 1
|
||||
current_y += 1
|
||||
|
||||
width = 24 // len(section["items"])
|
||||
summary_height = 5
|
||||
max_dash_height = max(item["dash_height"] for item in section["items"])
|
||||
|
||||
for index, item in enumerate(section["items"]):
|
||||
x = width * index
|
||||
homepage["panels"].append(
|
||||
make_text_panel(
|
||||
panel_id,
|
||||
f"{item['title']}摘要",
|
||||
summary_card_html(item),
|
||||
x,
|
||||
current_y,
|
||||
width,
|
||||
summary_height,
|
||||
)
|
||||
)
|
||||
panel_id += 1
|
||||
|
||||
current_y += summary_height
|
||||
|
||||
for index, item in enumerate(section["items"]):
|
||||
x = width * index
|
||||
homepage["panels"].append(
|
||||
make_dashlist_panel(
|
||||
panel_id,
|
||||
f"{item['title']}目录",
|
||||
[item["tag"]],
|
||||
x,
|
||||
current_y,
|
||||
width,
|
||||
item["dash_height"],
|
||||
max_items=20,
|
||||
)
|
||||
)
|
||||
panel_id += 1
|
||||
|
||||
current_y += max_dash_height
|
||||
return panel_id, current_y
|
||||
|
||||
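`add_domain_section` places one column per item by dividing the 24-column Grafana grid evenly and then advancing `current_y` by the tallest panel in the row. The sketch below isolates that arithmetic under the assumption that only grid positions matter; `layout_columns` is a hypothetical helper, not a function from the script.

```python
GRID_COLUMNS = 24  # Grafana's dashboard grid is 24 columns wide


def layout_columns(heights, y):
    """Place one panel per column; return (grid positions, y of the next free row)."""
    width = GRID_COLUMNS // len(heights)
    positions = [
        {"x": width * index, "y": y, "w": width, "h": height}
        for index, height in enumerate(heights)
    ]
    return positions, y + max(heights)


# Three items with the dash heights used by the first section in the diff.
positions, next_y = layout_columns([9, 9, 8], 12)
print(positions, next_y)

# With a count that does not divide 24, integer division leaves columns unused.
five, _ = layout_columns([8, 8, 8, 8, 8], 0)
used_columns = five[-1]["x"] + five[-1]["w"]
print(used_columns)
```

The sections in this diff have 3, 4, and 2 items, which all divide 24 evenly; a section with 5 or 7 items would leave a gap at the right edge unless the width calculation is adjusted.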
import re
|
||||
import os
|
||||
|
||||
def merge_dashboards():
|
||||
with open(CONTROL_PLANE_PATH, "r") as handle:
|
||||
control_plane = json.load(handle)
|
||||
# Paths to source dashboards
|
||||
pig_path = 'files/grafana/pigsty.json'
|
||||
node_path = 'files/grafana/node.json'
|
||||
k8s_path = 'files/grafana/k8s.json'
|
||||
output_path = 'files/grafana/homepage.json'
|
||||
|
||||
# Read raw contents
|
||||
with open(pig_path, 'r') as f:
|
||||
pig_raw = f.read()
|
||||
with open(node_path, 'r') as f:
|
||||
node_raw = f.read()
|
||||
with open(k8s_path, 'r') as f:
|
||||
k8s_raw = f.read()
|
||||
|
||||
# Perform fixed variable mapping for node.json
|
||||
# $name -> $hostname, $instance -> $node, $show_name -> $show_hostname
|
||||
node_raw = re.sub(r'\$name\b', '$hostname', node_raw)
|
||||
node_raw = re.sub(r'\$\{name\}', '${hostname}', node_raw)
|
||||
node_raw = re.sub(r'\$instance\b', '$node', node_raw)
|
||||
node_raw = re.sub(r'\$\{instance\}', '${node}', node_raw)
|
||||
node_raw = re.sub(r'\$show_name\b', '$show_hostname', node_raw)
|
||||
node_raw = re.sub(r'\$\{show_name\}', '${show_hostname}', node_raw)
|
||||
|
||||
pig = json.loads(pig_raw)
|
||||
node = json.loads(node_raw)
|
||||
k8s = json.loads(k8s_raw)
|
||||
|
||||
# Base dashboard
|
||||
homepage = {
|
||||
"annotations": control_plane.get("annotations", {"list": []}),
|
||||
"description": "Platform engineering entry dashboard",
|
||||
"annotations": pig.get("annotations", {"list": []}),
|
||||
"description": "Pigsty Consolidated Homepage",
|
||||
"editable": True,
|
||||
"graphTooltip": 0,
|
||||
"id": None,
|
||||
"links": control_plane.get("links", []),
|
||||
"links": pig.get("links", []),
|
||||
"panels": [],
|
||||
"schemaVersion": 39,
|
||||
"tags": ["HOME", "Platform"],
|
||||
"templating": {"list": VISIBLE_VARS},
|
||||
"time": control_plane.get("time", {"from": "now-1h", "to": "now"}),
|
||||
"timepicker": control_plane.get("timepicker", {}),
|
||||
"tags": ["HOME", "Pigsty"],
|
||||
"templating": {"list": []},
|
||||
"time": pig.get("time", {"from": "now-1h", "to": "now"}),
|
||||
"timepicker": pig.get("timepicker", {}),
|
||||
"timezone": "browser",
|
||||
"title": "Homepage",
|
||||
"uid": "home",
|
||||
"version": 1,
|
||||
"version": 1
|
||||
}
|
||||
|
||||
panel_id = 1
|
||||
homepage["panels"].append(
|
||||
make_text_panel(panel_id, "总览导航", homepage_nav_html(), 0, 0, 24, 5)
|
||||
)
|
||||
panel_id += 1
|
||||
|
||||
current_y = 5
|
||||
homepage["panels"].append(make_row_panel(panel_id, "平台脉搏", current_y))
|
||||
panel_id += 1
|
||||
current_y += 1
|
||||
|
||||
summary_layout = [
|
||||
("Pigsty ${version}", 0, 6, 4, 6),
|
||||
("Modules", 4, 6, 4, 6),
|
||||
("Instances", 8, 6, 8, 6),
|
||||
("Firing Alerts", 16, 6, 8, 6),
|
||||
# Unified Variables
|
||||
unified_vars = [
|
||||
{"name": "version", "type": "constant", "query": "v4.0.0", "hide": 2},
|
||||
{"name": "origin_prometheus", "label": "数据源", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(kube_node_info,origin_prometheus)", "refresh": 1},
|
||||
{"name": "Node", "label": "节点", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(kube_node_info{origin_prometheus=~\"$origin_prometheus\"},node)"},
|
||||
{"name": "NameSpace", "label": "命名空间", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(kube_namespace_created{origin_prometheus=~\"$origin_prometheus\"},namespace)"},
|
||||
{"name": "Container", "label": "微服务(容器名)", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(kube_pod_container_info{origin_prometheus=~\"$origin_prometheus\",namespace=~\"$NameSpace\"},container)"},
|
||||
{"name": "Pod", "label": "Pod", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(kube_pod_container_info{origin_prometheus=~\"$origin_prometheus\",namespace=~\"$NameSpace\",container=~\"$Container\"},pod)"},
|
||||
{"name": "job", "label": "JOB", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(node_uname_info{origin_prometheus=~\"$origin_prometheus\"},job)"},
|
||||
{"name": "hostname", "label": "名称", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(node_uname_info{origin_prometheus=~\"$origin_prometheus\", job=~\"$job\"},nodename)"},
|
||||
{"name": "node", "label": "IP", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(node_uname_info{origin_prometheus=~\"$origin_prometheus\", job=~\"$job\", nodename=~\"$hostname\"},instance)"},
|
||||
{"name": "device", "label": "网卡", "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(node_network_info{origin_prometheus=~\"$origin_prometheus\", job=~\"$job\", instance=~\"$node\", device!~\"'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*|cni.*'\"},device)"},
|
||||
{"name": "interval", "label": "间隔", "type": "interval", "query": "3m,5m,10m,30m,1h,6h,12h,1d"},
|
||||
{"name": "maxmount", "hide": 2, "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "query_result(topk(1,sort_desc(max(node_filesystem_size_bytes{origin_prometheus=~\"$origin_prometheus\",instance=~\"$node\",fstype=~\"ext.?|xfs\",mountpoint!~\".*pods.*\"}) by (mountpoint))))"},
|
||||
{"name": "show_hostname", "hide": 2, "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "label_values(node_uname_info{origin_prometheus=~\"$origin_prometheus\", job=~\"$job\", nodename=~\"$hostname\", instance=~\"$node\"},nodename)"},
|
||||
{"name": "total", "hide": 2, "type": "query", "datasource": {"uid": "ds-prometheus"}, "query": "query_result(count(node_uname_info{origin_prometheus=~\"$origin_prometheus\",job=~\"$job\"}))"}
|
||||
]
|
||||
summary_panels = {panel.get("title"): panel for panel in select_platform_summary_panels(control_plane)}
|
||||
for title, x, y, w, h in summary_layout:
|
||||
if title not in summary_panels:
|
||||
continue
|
||||
homepage["panels"].append(clone_panel(summary_panels[title], x, y, w, h))
|
||||
panel_id += 1
|
||||
current_y += 6
|
||||
homepage["templating"]["list"] = unified_vars
|
||||
|
||||
for section in DOMAIN_SECTIONS:
|
||||
panel_id, current_y = add_domain_section(homepage, panel_id, current_y, section)
|
||||
current_y = 0
|
||||
# 1. Infra
|
||||
homepage["panels"].append({"collapsed": False, "gridPos": {"h": 1, "w": 24, "x": 0, "y": current_y}, "title": "Infra Overview", "type": "row", "panels": []})
|
||||
current_y += 1
|
||||
|
||||
infra_max_y = current_y
|
||||
for p in pig.get("panels", []):
|
||||
if p.get("type") == "row": continue
|
||||
|
||||
# Replace "Apps" panel with "insight Overview" link
|
||||
if p.get("title") == "Apps":
|
||||
p["title"] = "insight Overview"
|
||||
p["type"] = "text"
|
||||
p["options"] = {
|
||||
"content": "<div style='text-align: center; padding-top: 10px;'><a href='https://observability.svc.plus/insight/' style='font-size: 18px; color: #58a6ff; font-weight: bold;'>insight Overview</a></div>",
|
||||
"mode": "html"
|
||||
}
|
||||
|
||||
p["gridPos"]["y"] += current_y
|
||||
homepage["panels"].append(p)
|
||||
infra_max_y = max(infra_max_y, p["gridPos"]["y"] + p["gridPos"]["h"])
|
||||
current_y = infra_max_y
|
||||
|
||||
for index, panel in enumerate(homepage["panels"], 1):
|
||||
panel["id"] = index
|
||||
# 2. Node
|
||||
homepage["panels"].append({"collapsed": False, "gridPos": {"h": 1, "w": 24, "x": 0, "y": current_y}, "title": "Node", "type": "row", "panels": []})
|
||||
current_y += 1
|
||||
node_max_y = current_y
|
||||
for p in node.get("panels", []):
|
||||
p["gridPos"]["y"] += current_y
|
||||
homepage["panels"].append(p)
|
||||
node_max_y = max(node_max_y, p["gridPos"]["y"] + p["gridPos"]["h"])
|
||||
current_y = node_max_y
|
||||
|
||||
with open(OUTPUT_PATH, "w") as handle:
|
||||
json.dump(homepage, handle, indent=2)
|
||||
# 3. K8S
|
||||
homepage["panels"].append({"collapsed": False, "gridPos": {"h": 1, "w": 24, "x": 0, "y": current_y}, "title": "K8S Cluster", "type": "row", "panels": []})
|
||||
current_y += 1
|
||||
k8s_max_y = current_y
|
||||
for p in k8s.get("panels", []):
|
||||
p["gridPos"]["y"] += current_y
|
||||
homepage["panels"].append(p)
|
||||
k8s_max_y = max(k8s_max_y, p["gridPos"]["y"] + p["gridPos"]["h"])
|
||||
current_y = k8s_max_y
|
||||
|
||||
for i, p in enumerate(homepage["panels"]):
|
||||
p["id"] = i + 1
|
||||
|
||||
with open(output_path, 'w') as f:
|
||||
json.dump(homepage, f, indent=2)
|
||||
|
||||
if __name__ == "__main__":
|
||||
merge_dashboards()
|
||||
|
||||
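The Infra, Node, and K8S blocks in the script above all follow one pattern: append a row header, shift each source panel's `gridPos.y` by the running offset, and track the lowest occupied edge. A minimal sketch of that pattern as one helper is shown below. The names `stack_section` and `renumber` are hypothetical and not part of the repository. Copying each panel before shifting it, and skipping `row` panels in every section rather than only in Infra, are assumptions made for illustration.

```python
import copy


def stack_section(homepage_panels, section_panels, title, current_y):
    """Append a row header and the section's panels below current_y.

    Returns the next free y coordinate. Hypothetical helper, not repository code.
    """
    homepage_panels.append({
        "collapsed": False,
        "gridPos": {"h": 1, "w": 24, "x": 0, "y": current_y},
        "title": title,
        "type": "row",
        "panels": [],
    })
    current_y += 1  # the row header occupies one grid unit
    max_y = current_y
    for panel in section_panels:
        if panel.get("type") == "row":
            continue  # assumption: nested rows from the source dashboard are dropped
        placed = copy.deepcopy(panel)  # avoid mutating the source dashboard
        placed["gridPos"]["y"] += current_y
        homepage_panels.append(placed)
        max_y = max(max_y, placed["gridPos"]["y"] + placed["gridPos"]["h"])
    return max_y


def renumber(panels):
    """Assign sequential panel ids starting at 1."""
    for index, panel in enumerate(panels, 1):
        panel["id"] = index


# Usage with two made-up sections
panels = []
node_panels = [{"title": "CPU", "type": "stat", "gridPos": {"h": 4, "w": 6, "x": 0, "y": 0}}]
k8s_panels = [{"title": "Pods", "type": "stat", "gridPos": {"h": 3, "w": 6, "x": 0, "y": 2}}]
y = stack_section(panels, node_panels, "Node", 0)
y = stack_section(panels, k8s_panels, "K8S Cluster", y)
renumber(panels)
```

Returning the next free `y` from the helper keeps the running offset in one place, so each section is appended with a single call instead of a repeated block.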
7
node.yml
@ -32,11 +32,6 @@
|
||||
# node.yml -l <cls> # add groups
|
||||
# node.yml -l <ip> # add single node
|
||||
#
|
||||
# Observability push-agent mode:
|
||||
# ./node.yml -l openclaw.svc.plus,jp-xhttp.svc.plus \
|
||||
# -e node_monitor_mode=push \
|
||||
# -e observability_endpoint=https://observability.svc.plus/ \
|
||||
#
|
||||
# Bootstrap with another admin user: (Create admin with another admin)
|
||||
# node.yml -t node_admin # create admin user for nodes
|
||||
# node.yml -t node_admin -k -K -e ansible_user=<another admin>
|
||||
@ -117,4 +112,4 @@
|
||||
# - vector_config
|
||||
# - vector_launch
|
||||
#---------------------------------------------------------------
|
||||
...
|
||||
...
|
||||
@ -1,27 +0,0 @@
|
||||
# Role: deepflow_agent
|
||||
|
||||
Deploy DeepFlow agent in one of three modes:
|
||||
|
||||
- `binary + systemd`
|
||||
- `docker`
|
||||
- `k8s` manifest rendering
|
||||
|
||||
## Key Variables
|
||||
|
||||
- `deepflow_agent_mode` (`binary`, `docker`, `k8s`)
|
||||
- `deepflow_agent_profile` (`lite`, `full`)
|
||||
- `deepflow_agent_grpc_endpoint`
|
||||
- `deepflow_agent_download_url`
|
||||
- `deepflow_agent_binary_path`
|
||||
|
||||
## Default Lightweight Profile
|
||||
|
||||
The default `lite` profile keeps `pcap` enabled and disables:
|
||||
|
||||
- built-in `vector`
|
||||
- other optional non-core plugins
|
||||
|
||||
## Notes
|
||||
|
||||
- `k8s` mode renders a DaemonSet manifest and only applies it when `deepflow_agent_k8s_apply: true`
|
||||
- `docker` mode requires `docker_enabled: true`
|
||||
@ -1,41 +0,0 @@
|
||||
---
|
||||
#-----------------------------------------------------------------
|
||||
# DEEPFLOW AGENT
|
||||
#-----------------------------------------------------------------
|
||||
deepflow_agent_enabled: false
|
||||
deepflow_agent_mode: binary # binary|docker|k8s
|
||||
deepflow_agent_profile: lite # lite|full
|
||||
|
||||
deepflow_agent_stack_dir: /opt/deepflow-agent
|
||||
deepflow_agent_env_file: /etc/default/deepflow-agent
|
||||
deepflow_agent_compose_file: "{{ deepflow_agent_stack_dir }}/docker-compose.yml"
|
||||
deepflow_agent_k8s_file: "{{ deepflow_agent_stack_dir }}/deepflow-agent.yaml"
|
||||
deepflow_agent_run_script: /usr/local/bin/run-deepflow-agent.sh
|
||||
deepflow_agent_binary_path: /usr/local/bin/deepflow-agent
|
||||
deepflow_agent_download_url: ''
|
||||
|
||||
deepflow_agent_image: deepflowio/deepflow-agent-ce:latest
|
||||
deepflow_agent_grpc_endpoint: "{{ deepflow_grpc_domain | default('deepflow-agent.svc.plus') }}:443"
|
||||
deepflow_agent_endpoint_arg: --controller-ips
|
||||
deepflow_agent_extra_args: []
|
||||
deepflow_agent_disable_pcap: false
|
||||
deepflow_agent_disable_vector: true
|
||||
deepflow_agent_disable_plugins: true
|
||||
deepflow_agent_extra_env: {}
|
||||
|
||||
deepflow_agent_host_network: true
|
||||
deepflow_agent_container_name: deepflow-agent
|
||||
deepflow_agent_k8s_namespace: deepflow
|
||||
deepflow_agent_k8s_apply: false
|
||||
deepflow_agent_binary_install: true
|
||||
deepflow_agent_docker_enabled: true
|
||||
|
||||
deepflow_agent_cap_add:
|
||||
- NET_ADMIN
|
||||
- NET_RAW
|
||||
- SYS_ADMIN
|
||||
|
||||
deepflow_agent_volume_mounts:
|
||||
- /:/host:ro
|
||||
- /sys:/sys:ro
|
||||
- /var/run/docker.sock:/var/run/docker.sock
|
||||
@ -1,7 +0,0 @@
|
||||
galaxy_info:
|
||||
author: observability.svc.plus
|
||||
description: Deploy DeepFlow agent via binary/systemd, Docker, or Kubernetes manifests
|
||||
license: Apache-2.0
|
||||
min_ansible_version: '2.10'
|
||||
|
||||
dependencies: []
|
||||
Some files were not shown because too many files have changed in this diff.