Files
awoooi/apps/api
OG T ceb61c3c8e
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 13m32s
feat(asset_scanner): Gap 1 修 — Prometheus targets 補齊 host-install services
Audit 發現 asset_inventory 只涵蓋 K8s (mon=120, mon1=121 共 2 node+78 pods),
完全漏 110 (Harbor/Gitea/監控) + 112 (security) + 188 (PG/Redis/Ollama) +
125 (mon backup/standby) 這 4 主機的 host-install services.

用戶 4 主機架構 (110/112/120/121/188) 只覆蓋 2/5 = 40%.

新增 _collect_prometheus_targets:
  GET /api/v1/targets?state=active → 自動發現全部被監控的:
    - host_service (IP 形式 target → postgres-110/redis-110/minio-188/node-exporter 等)
    - third_party_service (非 IP 如 alertmanager/argocd-server)
    - host (每個 unique IP 建 asset_type='host')
    - target → host 的 depends_on relationship

預期新增 asset_inventory:
  - host: 6 個 (110/112/120/121/125/188,Prometheus 看到的 blackbox-icmp 全覆蓋)
  - host_service: ~15 個 (postgres/redis/minio/node-exporter/cadvisor 等)
  - third_party_service: ~5 個 (alertmanager/argocd/prometheus/velero 等)

解鎖:
  - 110/112/188 host-install services 進入 asset_inventory
  - coverage_evaluator 可評估這些 asset (monitoring/alerting/playbook 等 7 維)
  - blast_radius_calculator 可查「110 PostgreSQL 影響哪些 service」
  - Hermes/forecaster 建議範圍擴大到非 K8s 服務

對齊統帥鐵律: 朝 AI 自主化 — 不硬編主機清單,動態從 Prometheus 發現

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:06:34 +08:00
..