IwoooS Monitoring / Alerting / Observability repo-only inventory
| Item | Value |
|---|---|
| Date | 2026-06-12 |
| Status | repo_only_inventory_ready |
| Tool | scripts/security/monitoring-alerting-observability-inventory.py |
| Snapshot | docs/security/monitoring-alerting-observability-inventory.snapshot.json |
| Schema | docs/schemas/monitoring_alerting_observability_inventory_v1.schema.json |
| runtime gate | 0 |
1. Purpose
This inventory brings Prometheus, Alertmanager, Grafana, SigNoz, Sentry, Langfuse, OTEL, Telegram / notification policy, deploy / reload scripts, and alert chain smoke scripts under IwoooS high-value configuration control.
This stage is still a repo-only source inventory: it only reads committed files, computes SHA256, and organizes owner gates and live evidence gaps. It does not connect to live Prometheus, reload Alertmanager, modify Grafana, apply SigNoz rules, deploy Sentry, send Telegram messages, create silences, use SSH, run kubectl, or read secret values.
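
The read-only step described above (read committed files, compute SHA256, record whether each source exists) could be sketched roughly as follows. This is a minimal illustration, not the actual contents of the inventory script; the function names and the record keys `source_path`, `source_exists`, and `sha256` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory_sources(repo_root: Path, source_paths: list[str]) -> list[dict]:
    """Build one record per source path using local file reads only.

    Record keys are hypothetical; the real snapshot schema may differ.
    """
    records = []
    for rel_path in source_paths:
        full_path = repo_root / rel_path
        exists = full_path.is_file()
        records.append({
            "source_path": rel_path,
            "source_exists": exists,
            # A missing file gets no hash rather than a placeholder value.
            "sha256": sha256_of_file(full_path) if exists else None,
        })
    return records
```

The sketch touches nothing but the local working tree, which matches the repo-only constraint: no network client, no SSH, no kubectl.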
2. Coverage summary
| Metric | Current value | Notes |
|---|---|---|
| surface | 60 | All are committed repo sources |
| source exists | 60 | Every source path exists and has a SHA256 |
| Prometheus config surface | 8 | Base config, remote write, generated targets, service registry, exporter queries |
| alert rule surface | 13 | Prometheus, K8s, SLO, Ollama, and app alert rules, plus Grafana alert rules |
| Alertmanager receiver surface | 1 | route / receiver / grouping source |
| Grafana surface | 6 | Alert rules and dashboard JSON |
| SigNoz surface | 3 | alert rule, log rule, API client |
| Sentry surface | 4 | compose, deploy, webhook receiver, API client |
| Langfuse surface | 3 | compose, runbook, API client |
| notification policy surface | 4 | failure-only policy, backup policy, observability contract, notification matrix |
| Telegram surface | 3 | digest policy, receipt package, gateway service |
| OTEL surface | 1 | SigNoz OTEL collector |
| deploy / reload surface | 6 | Alertmanager / Prometheus / Sentry / exporter deploy or reload-capable scripts |
| drift guard surface | 1 | Prometheus rule drift guard |
| smoke surface | 4 | live / test alert and alert chain smoke scripts |
| write-capable surface | 11 | May reload, deploy, send notifications, fire alerts, or restart exporters |
| owner response received / accepted | 0 / 0 | Must not be artificially inflated |
| live evidence received | 0 | Live monitoring truth has not been verified yet |
| reload owner / receiver owner / route smoke accepted | 0 / 0 / 0 | Reload, route change, and live smoke are not yet authorized |
| runtime gate / action button | 0 / 0 | No frontend execution entry point may be created |
| P1-4 maturity | 56% -> 62% | Reflects only completion of the repo-only inventory, not a passing live alert chain |
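
A guard against inflating the summary could look roughly like the sketch below. It checks two things taken from the table: every surface has an existing source, and the counters that depend on owner responses or live evidence stay at 0. The dictionary keys are hypothetical names chosen for this example and are not taken from the snapshot schema.

```python
# Counters that must stay at 0 during the repo-only stage.
# Key names are assumptions for illustration only.
ZERO_COUNTERS = (
    "owner_response_received",
    "owner_response_accepted",
    "live_evidence_received",
    "runtime_gate",
    "action_button",
)


def check_summary(summary: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the summary is consistent."""
    problems = []
    if summary["source_exists"] != summary["surface"]:
        problems.append(
            f"source_exists={summary['source_exists']} "
            f"does not match surface={summary['surface']}"
        )
    for name in ZERO_COUNTERS:
        # A missing counter is treated as 0, i.e. nothing received or granted.
        if summary.get(name, 0) != 0:
            problems.append(f"{name} must stay 0 at the repo-only stage")
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every inconsistency in a single run.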
3. Write-capable surface
| surface | Risk | Required gate |
|---|---|---|
| deploy_alertmanager_config_script | Can affect receiver / route / reload | owner response, maintenance window, rollback owner, receiver smoke |
| deploy_prometheus_alerts_script | Can affect alert rules / reload | rule test, receiver mapping, reload owner, rollback owner |
| k8s_deploy_prometheus_config_script | Can trigger K8s apply / reload | K8s owner, ArgoCD / kubectl boundary, rollback owner |
| api_apply_prometheus_config_script | Can apply Prometheus config from the API side | API deploy owner, config source owner, reload proof |
| monitoring_exporter_deploy_script | Can affect exporter / restart through host deploy | host owner, SSH boundary, restart window |
| sentry_self_hosted_deploy | Can deploy the Sentry self-hosted stack | deploy owner, backup owner, migration rollback |
| telegram_gateway_service | Can send Telegram messages / write the delivery path | token injection owner, receipt owner, send approval gate |
| notification_manager_service | Can change notification channels and routing | channel owner, receipt owner, failure-only policy |
| converged_alert_recurrence_notifier | Can cause recurrence escalation or noise | noise budget owner, silence boundary, receipt proof |
| fire_live_alert_script | Can fire alerts into the live alert chain | allowed receiver, test window, cleanup owner |
| fire_test_alert_script | Can trigger test alerts and notifications | dedup proof, receiver owner, noise guard |
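
The surface-to-gate mapping in the table above can be expressed as data and checked mechanically. The sketch below covers only two of the eleven rows and uses hypothetical function names. A passing check means only that the listed gates are present; it does not by itself authorize execution.

```python
# Two rows from the write-capable surface table, as an illustration.
REQUIRED_GATES: dict[str, list[str]] = {
    "fire_live_alert_script": ["allowed receiver", "test window", "cleanup owner"],
    "fire_test_alert_script": ["dedup proof", "receiver owner", "noise guard"],
}


def missing_gates(surface: str, accepted: set[str]) -> list[str]:
    """List the required gates that have not been accepted for a surface."""
    if surface not in REQUIRED_GATES:
        # Fail closed: an unregistered surface is never treated as gate-free.
        raise KeyError(f"unregistered write-capable surface: {surface}")
    return [gate for gate in REQUIRED_GATES[surface] if gate not in accepted]


def gates_satisfied(surface: str, accepted: set[str]) -> bool:
    """True only when every required gate for the surface has been accepted."""
    return not missing_gates(surface, accepted)
```

With zero owner responses accepted, as in the current snapshot, `missing_gates("fire_live_alert_script", set())` returns all three gates and `gates_satisfied` is false for every surface.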
4. Fixed 0 / false boundaries
runtime_execution_authorized=false
host_write_authorized=false
prometheus_reload_authorized=false
alertmanager_reload_authorized=false
grafana_dashboard_apply_authorized=false
signoz_rule_apply_authorized=false
sentry_deploy_authorized=false
langfuse_config_change_authorized=false
otel_collector_reload_authorized=false
receiver_route_change_authorized=false
silence_policy_change_authorized=false
telegram_send_authorized=false
notification_route_change_authorized=false
webhook_receiver_change_authorized=false
remote_write_change_authorized=false
exporter_deploy_authorized=false
live_alert_fire_authorized=false
alert_chain_smoke_authorized=false
ssh_read_authorized=false
ssh_write_authorized=false
kubectl_action_authorized=false
secret_value_collection_allowed=false
active_scan_authorized=false
action_buttons_allowed=false
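
A guard over the `key=value` flags listed above could be sketched as follows. It assumes the flags are available as plain text in exactly that form, which is an assumption about storage, not something this inventory specifies; the function names are hypothetical.

```python
def parse_boundary_flags(text: str) -> dict[str, bool]:
    """Parse lines of the form name=true or name=false into a dict."""
    flags: dict[str, bool] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if value not in ("true", "false"):
            # Reject anything ambiguous instead of guessing a default.
            raise ValueError(f"unexpected flag value in line: {line!r}")
        flags[key] = value == "true"
    return flags


def violated_flags(flags: dict[str, bool]) -> list[str]:
    """Return, sorted, the names of any flags that are not false."""
    return sorted(name for name, enabled in flags.items() if enabled)
```

For the list above, `violated_flags` should return an empty list; any non-empty result means a boundary has been opened and needs review before the snapshot is accepted.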
5. Next-step owner responses
- Prometheus owner: provide the live config hash, rule diff, reload owner, rollback owner, and false-green guard.
- Alertmanager owner: provide the receiver diff, route owner, silence policy owner, reload owner, and failure-only notification proof.
- Grafana owner: provide the dashboard UID / folder owner, import owner, rollback ref, and smoke plan.
- SigNoz / OTEL owner: provide the pipeline diff, rule apply owner, data export boundary, and rollback owner.
- Sentry / Langfuse owner: provide the compose live hash, secret injection owner, upgrade / restart window, backup owner, and rollback owner.
- Telegram / notification owner: provide the receiver owner, receipt owner, redaction proof, retry boundary, and no-secret-value evidence.
- Alert chain smoke owner: provide the allowed receiver, execution window, expected receipt, noise budget, and cleanup owner.
6. Completion
| Work item | Completion | Notes |
|---|---|---|
| P1-4 repo-only surface registration | 100% | 60 surfaces are included in the snapshot |
| source existence / SHA256 | 100% | 60 / 60 source paths exist |
| monitoring / alerting high-value configuration maturity | 56% -> 62% | Reflects only inventory and guard readiness |
| owner response received / accepted | 0% | No owner response has been received or accepted yet |
| live evidence collection | 0% | The live monitoring stack has not been read |
| reload / receiver / route smoke gate | 0% | Not authorized, not executed |
| runtime gate | 0% | No frontend execution button exists |
7. Boundaries
Completing this inventory does not mean that live Prometheus / Alertmanager / Grafana / SigNoz / Sentry / Langfuse are consistent with the repo source, nor that alert routes deliver or that alerts are guaranteed to arrive. Repo source visibility, the snapshot, the IwoooS UI, AwoooP approval, and the LOGBOOK must not be treated as authorization for Prometheus reload, Alertmanager reload, Grafana import, SigNoz apply, Sentry deploy, Langfuse change, OTEL reload, remote write change, silence change, Telegram send, live alert fire, alert chain smoke, SSH, kubectl, active scan, or secret collection.