# IwoooS SSH / network / firewall post-incident readback plan

| Item | Value |
|------|-------|
| Date | 2026-06-15 |
| Status | `post_incident_readback_plan_ready_no_runtime_action` |
| Tool | `scripts/security/ssh-network-post-incident-readback-plan.py` |
| Snapshot | `docs/security/ssh-network-post-incident-readback-plan.snapshot.json` |
| Source acceptance | `docs/security/port-firewall-change-evidence-acceptance.snapshot.json` |
| runtime gate | `0` |

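## 1. Purpose

This document follows on from the port / firewall change evidence acceptance ledger and adds the post-incident readback plan. If a port is closed again, a firewall / NetworkPolicy / NodePort / WireGuard policy is changed, deploy SSH disconnects, an AI provider route misbehaves, or a public route or monitoring path is affected, IwoooS must first collect complete redacted evidence covering: who made the change, when it was made, the before and after state, which services were affected, whether related products were synced, how service was restored, and how recurrence will be prevented.

This is not an SSH authorization, a live firewall read, a firewall / port change, a route smoke, a host restart, or a runtime gate. It only defines the post-incident readback fields, reviewer checks, triage lanes, and rejection conditions, so that "the service later recovered" is not mistaken for "the incident's cause, responsibility, impact, and recurrence prevention have all been accepted".
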
## 2. Summary

| Metric | Current value | Notes |
|--------|---------------|-------|
| readback candidate | `14` | Surfaces carried over: port / firewall / NodePort / NetworkPolicy / WireGuard / deploy SSH / sudo / alert action |
| write-capable readback candidate | `6` | Surfaces that may affect deploy SSH, monitoring deploy, sudoers, or the alert action catalog |
| policy / exposure readback candidate | `5` | NetworkPolicy, NodePort, WireGuard, and exposure-related surfaces |
| health impact review required | `14` | All candidates must account for service / AI provider / monitoring / product impact |
| cross-project sync required | `14` | All candidates must provide a cross-product / owner / Session sync ref |
| recurrence guard required | `14` | All candidates must propose a recurrence guard or change freeze rule |
| readback field | `30` | Total number of readback fields |
| required readback field | `24` | Fields the owner / reviewer must fill in |
| reviewer check | `24` | actor, before / after, health impact, notification, sync, restoration, recurrence prevention, and no-false-green checks |
| outcome lane | `10` | waiting, actor supplement, before-after supplement, health impact supplement, quarantine, reject, review, ledger-only, recurrence guard backfill, runtime gate |
| blocked action | `34` | SSH, firewall, port, route smoke, reload, restart, secret, active scan, provider switch, prompt send, production write, and so on |
| post-incident readback received / accepted | `0 / 0` | None received or accepted yet |
| no-false-green accepted | `0` | route 200, service up, or UI visible does not count as incident acceptance |
| runtime gate / action button | `0 / 0` | No operational entry point is provided |

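The zero-valued rows above behave like invariants: until a readback package is actually received and reviewed, none of them should move. A minimal sketch of such a check follows, assuming the summary is available as a flat dictionary. The key names are hypothetical and are not taken from the snapshot schema.

```python
# Hypothetical key names; the real snapshot schema may differ.
MUST_STAY_ZERO = (
    "post_incident_readback_received",
    "post_incident_readback_accepted",
    "no_false_green_accepted",
    "runtime_gate",
    "action_button",
)


def nonzero_counters(summary: dict) -> list:
    """Return the counters that are missing or have drifted away from zero."""
    # A missing key is reported as well, so the check fails closed.
    return [key for key in MUST_STAY_ZERO if summary.get(key) != 0]


summary = {key: 0 for key in MUST_STAY_ZERO}
print(nonzero_counters(summary))  # []

summary["runtime_gate"] = 1
print(nonzero_counters(summary))  # ['runtime_gate']
```

Reporting missing keys alongside non-zero ones keeps the sketch conservative: a summary that silently drops a counter is treated the same as one that raised it.
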
## 3. Readback Candidate Scope

| Candidate | Acceptance focus |
|-----------|------------------|
| `ssh_network_post_incident_readback:ansible_inventory_ssh_targets` | Host access changes, port impact, maintenance window, rollback, and post-check |
| `ssh_network_post_incident_readback:gitea_cd_deploy_ssh` | deploy SSH reachability, restoration evidence, rollback owner, and cross-project notification |
| `ssh_network_post_incident_readback:gitea_cd_dev_ssh` | dev / prod boundary, port policy, owner decision, and recurrence prevention |
| `ssh_network_post_incident_readback:deploy_alerts_ssh_path` | alert deploy path, notification chain, affected products, and restoration readback |
| `ssh_network_post_incident_readback:monitoring_discover_docker_ssh` | monitoring discovery reachability, read-only window, and false-green risk |
| `ssh_network_post_incident_readback:monitoring_exporter_deploy_ssh` | exporter deploy access, firewall owner, post-check, and rollback |
| `ssh_network_post_incident_readback:backup_config_ssh_capture` | backup access, restore validation, service dependency, and notification |
| `ssh_network_post_incident_readback:host_ops_sudoers_wrapper` | sudo authorization boundary, break-glass, restoration responsibility, and forbidden command proof |
| `ssh_network_post_incident_readback:k8s_prod_network_policy` | ingress / egress policy, route impact, metrics / alert, and rollback |
| `ssh_network_post_incident_readback:argocd_metrics_network_policy` | metrics scrape, NodePort exposure, source whitelist, and monitoring impact |
| `ssh_network_post_incident_readback:argocd_metrics_nodeport` | NodePort exposure, firewall owner, rollback, and public/admin route impact |
| `ssh_network_post_incident_readback:velero_metrics_nodeport` | backup metrics exposure, access policy, and restore readiness impact |
| `ssh_network_post_incident_readback:wireguard_mesh_runbook` | mesh cutover, firewall rule owner, canary / rollback, and maintenance window |
| `ssh_network_post_incident_readback:alert_rules_ssh_actions` | alert action catalog, read/write/admin tiering, cooldown, and post-check |

## 4. Required Readback Fields

1. `change_or_incident_ref`
2. `actor_attribution_ref`
3. `incident_detected_at_ref`
4. `change_window_ref`
5. `affected_port_or_policy_ref`
6. `before_state_ref`
7. `after_state_ref`
8. `service_dependency_ref`
9. `public_route_impact_ref`
10. `ai_provider_impact_ref`
11. `monitoring_alert_impact_ref`
12. `customer_or_product_impact_ref`
13. `operator_notification_ref`
14. `cross_project_sync_ref`
15. `restoration_evidence_ref`
16. `postcheck_readback_ref`
17. `recurrence_guard_ref`
18. `maintenance_window`
19. `rollback_owner`
20. `followup_owner`
21. `redacted_evidence_refs`
22. `no_secret_value_attestation`
23. `no_raw_firewall_dump_attestation`
24. `no_false_green_attestation`

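The required fields listed above lend themselves to a simple completeness check before a package reaches a reviewer. The sketch below is illustrative only: it assumes a readback package is a plain dictionary keyed by field name, and the sample value `INC-EXAMPLE` is a placeholder, not a real reference.

```python
# The 24 required field names, in the order listed in this section.
REQUIRED_FIELDS = tuple("""
change_or_incident_ref actor_attribution_ref incident_detected_at_ref
change_window_ref affected_port_or_policy_ref before_state_ref
after_state_ref service_dependency_ref public_route_impact_ref
ai_provider_impact_ref monitoring_alert_impact_ref
customer_or_product_impact_ref operator_notification_ref
cross_project_sync_ref restoration_evidence_ref postcheck_readback_ref
recurrence_guard_ref maintenance_window rollback_owner followup_owner
redacted_evidence_refs no_secret_value_attestation
no_raw_firewall_dump_attestation no_false_green_attestation
""".split())


def missing_required_fields(package: dict) -> list:
    """List required fields that are absent or empty in a readback package."""
    # Empty strings, empty lists, None and False all count as "not provided".
    return [name for name in REQUIRED_FIELDS if not package.get(name)]


# Placeholder package: one field filled, one present but empty.
package = {"change_or_incident_ref": "INC-EXAMPLE", "actor_attribution_ref": ""}
print(len(missing_required_fields(package)))  # 23
```

Treating an empty value the same as a missing key means a package cannot pass by submitting blank refs.
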
## 5. Reviewer Checks

1. `source_change_evidence_current`
2. `incident_ref_present`
3. `actor_not_anonymous`
4. `before_after_state_present`
5. `port_policy_redacted`
6. `service_dependency_present`
7. `public_route_impact_present`
8. `ai_provider_impact_present`
9. `monitoring_alert_impact_present`
10. `customer_product_impact_present`
11. `operator_notification_present`
12. `cross_project_sync_present`
13. `restoration_evidence_present`
14. `postcheck_independent`
15. `recurrence_guard_present`
16. `emergency_classification_present`
17. `maintenance_window_present`
18. `rollback_owner_present`
19. `no_false_green_route_200`
20. `raw_firewall_dump_absent`
21. `secret_or_key_value_absent`
22. `hidden_impact_absent`
23. `counts_transition_safe`
24. `runtime_stays_zero`

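One way to read the checks above is as a fail-closed checklist: a check that was never recorded is treated the same as one that failed. The sketch below illustrates that reading. It assumes reviewer results arrive as a name-to-boolean mapping, which is a hypothetical shape and not something the plan prescribes.

```python
# The 24 reviewer check names, in the order listed in this section.
REVIEWER_CHECKS = tuple("""
source_change_evidence_current incident_ref_present actor_not_anonymous
before_after_state_present port_policy_redacted service_dependency_present
public_route_impact_present ai_provider_impact_present
monitoring_alert_impact_present customer_product_impact_present
operator_notification_present cross_project_sync_present
restoration_evidence_present postcheck_independent recurrence_guard_present
emergency_classification_present maintenance_window_present
rollback_owner_present no_false_green_route_200 raw_firewall_dump_absent
secret_or_key_value_absent hidden_impact_absent counts_transition_safe
runtime_stays_zero
""".split())


def failed_checks(results: dict) -> list:
    """Checks that did not explicitly pass; unrecorded checks count as failed."""
    return [name for name in REVIEWER_CHECKS if results.get(name) is not True]


results = {name: True for name in REVIEWER_CHECKS}
results["postcheck_independent"] = False   # explicitly failed
del results["runtime_stays_zero"]          # never recorded
print(failed_checks(results))  # ['postcheck_independent', 'runtime_stays_zero']
```
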
## 6. Outcome Lanes

| Lane | Description |
|------|-------------|
| `waiting_post_incident_readback` | No incident readback package received yet; all accepted / runtime counts stay at 0 |
| `request_actor_supplement` | Request a supplement when actor / owner / decision is missing |
| `request_before_after_supplement` | Request a supplement when before / after or restoration evidence is missing |
| `request_health_impact_supplement` | Request a supplement when service / AI provider / monitoring / product impact is missing |
| `quarantine_raw_payload` | A raw firewall dump, secret, or key material, once received, can only be quarantined |
| `reject_unattributed_incident` | An incident readback with no actor, no affected scope, no rollback, or no notification must not be accepted |
| `ready_for_post_incident_review` | Once the metadata qualifies, the package may only proceed to reviewer review |
| `incident_readback_only_update` | Only the read-only ledger may be updated; the update must not be read back as an approved operation |
| `recurrence_guard_backfill_required` | A recurrence guard, owner review, and change freeze must be backfilled |
| `waiting_runtime_gate` | Even after the readback is accepted, the runtime gate still requires separate manual approval |

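The lanes above can be pictured as a triage function over a submitted package. The sketch below covers only a subset of the ten lanes, and the precedence order is an assumption made for illustration, as is the `contains_raw_payload` flag; the plan itself does not specify either.

```python
from typing import Optional

# Impact refs that the health-impact supplement lane asks for.
IMPACT_FIELDS = (
    "service_dependency_ref",
    "ai_provider_impact_ref",
    "monitoring_alert_impact_ref",
    "customer_or_product_impact_ref",
)


def choose_lane(package: Optional[dict]) -> str:
    """Pick one lane for a readback package (precedence is an assumption)."""
    if package is None:
        return "waiting_post_incident_readback"
    # Hypothetical flag set by an upstream scanner for raw dumps or secrets.
    if package.get("contains_raw_payload"):
        return "quarantine_raw_payload"
    if not package.get("actor_attribution_ref"):
        return "request_actor_supplement"
    state_fields = ("before_state_ref", "after_state_ref", "restoration_evidence_ref")
    if not all(package.get(name) for name in state_fields):
        return "request_before_after_supplement"
    if not all(package.get(name) for name in IMPACT_FIELDS):
        return "request_health_impact_supplement"
    if not package.get("recurrence_guard_ref"):
        return "recurrence_guard_backfill_required"
    # Qualifying metadata only earns a review slot, never an approval.
    return "ready_for_post_incident_review"


print(choose_lane(None))                            # waiting_post_incident_readback
print(choose_lane({"contains_raw_payload": True}))  # quarantine_raw_payload
```

Quarantine is checked before anything else in this sketch so that a package carrying raw material is never evaluated on its other fields.
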
## 7. Blocked Actions

1. `ssh_read`
2. `ssh_write`
3. `live_firewall_read`
4. `firewall_change`
5. `port_change`
6. `port_close`
7. `port_open`
8. `network_policy_apply`
9. `nodeport_change`
10. `wireguard_change`
11. `sudo_action`
12. `deploy_ssh_action`
13. `route_smoke`
14. `public_gateway_reload`
15. `nginx_reload`
16. `host_restart`
17. `docker_restart`
18. `systemd_restart`
19. `secret_value_collection`
20. `ssh_key_collection`
21. `raw_firewall_dump_storage`
22. `raw_key_material_storage`
23. `mark_readback_accepted_without_reviewer_record`
24. `mark_incident_resolved_without_postcheck`
25. `hide_cross_project_impact`
26. `treat_route_200_as_all_green`
27. `treat_break_glass_as_approval`
28. `close_management_port_without_owner`
29. `open_runtime_gate`
30. `add_action_button`
31. `production_write`
32. `active_scan`
33. `provider_switch`
34. `prompt_send`

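A deny list like the one above is straightforward to enforce in tooling. The following is a minimal sketch, not the project's actual guard: `BlockedActionError`, `assert_not_blocked`, and the sample action name `ledger_read` are hypothetical names introduced for illustration.

```python
# The 34 blocked action names from this section.
BLOCKED_ACTIONS = frozenset("""
ssh_read ssh_write live_firewall_read firewall_change port_change port_close
port_open network_policy_apply nodeport_change wireguard_change sudo_action
deploy_ssh_action route_smoke public_gateway_reload nginx_reload host_restart
docker_restart systemd_restart secret_value_collection ssh_key_collection
raw_firewall_dump_storage raw_key_material_storage
mark_readback_accepted_without_reviewer_record
mark_incident_resolved_without_postcheck hide_cross_project_impact
treat_route_200_as_all_green treat_break_glass_as_approval
close_management_port_without_owner open_runtime_gate add_action_button
production_write active_scan provider_switch prompt_send
""".split())


class BlockedActionError(RuntimeError):
    """Raised when a requested action is on the blocked list."""


def assert_not_blocked(action: str) -> None:
    """Raise if the action name appears in the blocked list."""
    if action in BLOCKED_ACTIONS:
        raise BlockedActionError(f"blocked action: {action}")


try:
    assert_not_blocked("route_smoke")
except BlockedActionError as exc:
    print(exc)

assert_not_blocked("ledger_read")  # hypothetical name, not on the list
```

An action name that is absent from the list is not thereby approved; the sketch only demonstrates the deny check itself.
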
## 8. Commands

Generate the committed snapshot:

```bash
python3 scripts/security/ssh-network-post-incident-readback-plan.py \
  --root . \
  --source-change-evidence-report docs/security/port-firewall-change-evidence-acceptance.snapshot.json \
  --output docs/security/ssh-network-post-incident-readback-plan.snapshot.json \
  --generated-at 2026-06-15T19:16:00+08:00
```

Run the guard check:

```bash
python3 scripts/security/security-mirror-progress-guard.py --root .
```

## 9. Completion Status

| Work item | Completion | Notes |
|-----------|------------|-------|
| post-incident readback plan artifact | `100%` | The 14 candidates, snapshot, document, and guard are fixed in place |
| post-incident readback received / accepted | `0%` | Not yet received, not yet accepted |
| actor / before-after / impact evidence | `0%` | No owner-provided evidence received yet |
| service / AI provider / monitoring impact | `0%` | No redacted impact refs received yet |
| cross-project sync / notification evidence | `0%` | No sync or notification evidence received yet |
| recurrence guard / no-false-green accepted | `0%` | Recurrence prevention and no-false-green not yet accepted |
| SSH / firewall / port / route / restart action | `0%` | Not authorized and not executed |
| runtime gate / production write | `0%` | Not authorized and not executed |