diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index 727c766e..69e3dfef 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -45596,3 +45596,40 @@ production browser smoke:
 - DR credential escrow evidence 仍缺 `5`:不得宣稱 `DR_COMPLETE`。
 - Wazuh manager registry accepted 仍為 `0`:不得宣稱 Wazuh 全主機納管恢復。
 - certbot formal renewal 尚未完成 readback;本輪完成的是 HTTP-01 route / timer hygiene / failed-unit 清除,正式 renew 成功需等 snap certbot timer 或獨立 ACME window。
+
+## 2026-06-26 — 13:01 post-reboot owner response preflight / SOP v1.74
+
+**時間與來源**:
+- 2026-06-26 13:01-13:23 Asia/Taipei。
+- 來源:`scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color`、新增 `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color`、placeholder template `docs/templates/post-reboot-next-gate-owner-response.json`、SOP / workplan 文件同步。
+
+**完成內容**:
+- 新增 post-reboot owner response preflight,驗收未來 owner response JSON 是否符合目前 `awoooi_post_reboot_next_gate_owner_packets_v1` 的動態 gate set。
+- 新增 placeholder response template,刻意保留 `owner_role_here`、`non_secret_evidence_ref_here`、`registry_export_ref_here` 等 placeholder,作為 fail-closed 測試樣本;直接套用模板不得被算成已收件或已接受。
+- `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` 升至 v1.14,固定流程改為 summary → declaration guard → next-gate dispatch → owner packet → contract guard → owner response preflight。
+- `docs/runbooks/FULL-STACK-COLD-START-SOP.md` 升至 v1.74,將 owner response preflight 納入完整開機 / 關機 / 重啟 SOP。
+- `docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md` 更新為 `DONE_WITH_OWNER_RESPONSE_PREFLIGHT_V174`。
+
+**live / preflight 證據**:
+- 13:23 owner packet live generation 讀回 `next_gate_count=2`,只剩 `credential_escrow_evidence` 與 `wazuh_manager_registry_export`;`request_sent_count=0`、`owner_response_received_count=0`、`owner_response_accepted_count=0`、`runtime_action_authorized_count=0`。
+- 12:58 post-start summary 已恢復為 `POST_START_RESULT=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`、`POST_START_PASS=38`、`POST_START_WARN=4`、`POST_START_BLOCKED=0`、`SERVICE_GREEN=1`、`PRODUCT_DATA_GREEN=1`、`BACKUP_CORE_GREEN=1`、`DR_ESCROW_BLOCKED=1`、`ESCROW_MISSING_COUNT=5`、`HOST_188_HYGIENE_BLOCKED=0`、`HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`、`WAZUH_ROUTE_CODE=200`、`WAZUH_TRANSPORT_COUNT=6`、`WAZUH_COVERAGE_SCOPE=6`、`WAZUH_DIRECT_ACTIVE=2`、`WAZUH_NO_TRANSPORT=1`、`WAZUH_SSH_BLOCKED=3`、`WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`、`WAZUH_DASHBOARD_INDEX_OK=3`、`WAZUH_MANAGER_REGISTRY_ACCEPTED=0`、`RUNTIME_ACTION_AUTHORIZED=0`、`OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`、`NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`。
+- 12:55 首輪 owner-packet generation 曾因 110 transient `stockplatform-review-bulk-ux` active process / service warning 使 summary 暫時落入 service warning;未 kill、未 restart、未取消 CI;12:58 重跑後自動恢復,證明 SOP 會把 transient / active CI process 與真正 orphan / service blocker 分開。
+- 無 response file 預期輸出:`POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0 blockers=1`。
+- placeholder template 輸出:`POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0 blockers=41`。
+
+**做過的命令類型**:
+- 只讀:post-reboot owner packet generation、owner response preflight、contract / declaration / source guards。
+- 寫入:repo script / docs / template only。
+- 未做:沒有 host / Docker / systemd / Nginx / firewall / K8s / DB / Wazuh runtime 寫操作;沒有讀 secret 明文;沒有寫 credential marker;沒有送 owner request;沒有 Wazuh active response / agent re-enroll / restart;沒有 Kali active scan。
+
+**目前判定**:
+- Owner response preflight automation:`0% -> 100%`。
+- Reboot service / product data / backup / 188 host hygiene:`GREEN`。
+- Overall recovery declaration:`FULL_STACK_GREEN_DR_ESCROW_BLOCKED`。
+- SOP / quick-check / owner-packet / owner-response preflight:v1.74。
+
+**仍 blocked / 不得宣稱**:
+- DR credential escrow evidence 仍缺 `5`:不得宣稱 `DR_COMPLETE` 或 credential escrow complete。
+- Wazuh manager registry accepted 仍為 `0`:不得宣稱 Wazuh 全主機納管恢復。
+- Owner response received / accepted 仍為 `0 / 0`;不得把「批准繼續」、空模板、UI 可見、route `200`、transport `6`、Dashboard index pattern `3` 或 owner-packet JSON 當成 evidence accepted。
+- Runtime action / host write / credential marker write / Wazuh active response / Kali active scan 仍全部 `0 / false`。
diff --git a/docs/runbooks/FULL-STACK-COLD-START-SOP.md b/docs/runbooks/FULL-STACK-COLD-START-SOP.md
index bc75f398..3629dd20 100644
--- a/docs/runbooks/FULL-STACK-COLD-START-SOP.md
+++ b/docs/runbooks/FULL-STACK-COLD-START-SOP.md
@@ -1,6 +1,6 @@
 # AWOOOI 全棧冷啟動與主機重啟 SOP
 
-> Version: v1.73
+> Version: v1.74
 > Last updated: 2026-06-26 Asia/Taipei
 > Scope: 110 / 120 / 121 / 188 full-stack reboot recovery. 112 Kali is recorded as P3 optional and is not part of this recovery path.
 
@@ -10,10 +10,12 @@
 
 本節是每次接手、開機、關機、重啟後的第一個判定錨點。若日期不是今天,必須先重跑 live check,再更新本節與 `docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md`。
 
-若只是重啟後要快速判斷能不能宣稱恢復,先跑機器可讀摘要:`scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color`。此腳本會呼叫一頁式總檢查、188 host hygiene checklist 與 Wazuh no-false-green repo gates,並把 delegated logs 留在 `/tmp/awoooi-post-reboot-readiness-*`。接著跑 `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color`,把 summary 轉成 allowed / forbidden declaration,避免把服務綠誤報成 DR complete、188 host hygiene、Wazuh registry recovered 或 runtime authorized。若 summary 顯示 `SERVICE_GREEN=1` 但 `NEXT_REQUIRED_GATES` 仍非空,再跑 `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color`,把 live summary 內尚未完成的 blocker 轉成 owner / evidence / forbidden-action dispatch checklist;需要機器可讀 intake 時,再跑 `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color --output /tmp/awoooi-post-reboot-owner-packets.json` 產生 `awoooi_post_reboot_next_gate_owner_packets_v1` JSON,並立刻跑
`scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json`。dispatch / packet / guard 均固定 `DISPATCH_AUTHORIZED=0`、`REQUEST_SENT_COUNT=0`、`OWNER_RESPONSE_ACCEPTED=0`、`HOST_WRITE_AUTHORIZED=0`、`SECRET_VALUE_COLLECTION_ALLOWED=0`、`RUNTIME_GATE=0`;guard 未通過時不得送 owner request、不得寫 escrow marker、不得進維護窗口、不得宣稱 DR / Wazuh registry complete。需要人工展開時,再跑 `scripts/reboot-recovery/post-start-quick-check.sh --no-color` 並以 `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` 作為 fallback。長 SOP 保留完整背景、例外處理與 Plan B;短版 wrapper / checklist 負責每次 T+10 分鐘內的固定判定。 +若只是重啟後要快速判斷能不能宣稱恢復,先跑機器可讀摘要:`scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color`。此腳本會呼叫一頁式總檢查、188 host hygiene checklist 與 Wazuh no-false-green repo gates,並把 delegated logs 留在 `/tmp/awoooi-post-reboot-readiness-*`。接著跑 `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color`,把 summary 轉成 allowed / forbidden declaration,避免把服務綠誤報成 DR complete、188 host hygiene、Wazuh registry recovered 或 runtime authorized。若 summary 顯示 `SERVICE_GREEN=1` 但 `NEXT_REQUIRED_GATES` 仍非空,再跑 `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color`,把 live summary 內尚未完成的 blocker 轉成 owner / evidence / forbidden-action dispatch checklist;需要機器可讀 intake 時,再跑 `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color --output /tmp/awoooi-post-reboot-owner-packets.json` 產生 `awoooi_post_reboot_next_gate_owner_packets_v1` JSON,並立刻跑 `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json`。dispatch / packet / guard 均固定 `DISPATCH_AUTHORIZED=0`、`REQUEST_SENT_COUNT=0`、`OWNER_RESPONSE_ACCEPTED=0`、`HOST_WRITE_AUTHORIZED=0`、`SECRET_VALUE_COLLECTION_ALLOWED=0`、`RUNTIME_GATE=0`;guard 未通過時不得送 owner request、不得寫 escrow marker、不得進維護窗口、不得宣稱 DR / Wazuh registry complete。v1.74 起,任何 owner response JSON 還必須經過 `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color --response-file 
`:空模板、placeholder、secret payload、runtime action request、credential marker write、Wazuh active response / re-enroll / restart、Kali active scan 或缺少 Dashboard API / manager registry evidence 都必須 fail-closed;preflight 通過也只表示可進入獨立 reviewer acceptance,不是 runtime 授權。需要人工展開時,再跑 `scripts/reboot-recovery/post-start-quick-check.sh --no-color` 並以 `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` 作為 fallback。長 SOP 保留完整背景、例外處理與 Plan B;短版 wrapper / checklist 負責每次 T+10 分鐘內的固定判定。 2026-06-26 12:13 latest live summary supersedes the 08:59 gate set:`scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` 回傳 `POST_START_RESULT=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`、`POST_START_PASS=38`、`POST_START_WARN=4`、`POST_START_BLOCKED=0`、`SERVICE_GREEN=1`、`PRODUCT_DATA_GREEN=1`、`BACKUP_CORE_GREEN=1`、`DR_ESCROW_BLOCKED=1`、`ESCROW_MISSING_COUNT=5`、`HOST_188_SERVICE_GREEN=1`、`HOST_188_HYGIENE_BLOCKED=0`、`HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`、`WAZUH_ROUTE_CODE=200`、`WAZUH_TRANSPORT_COUNT=6`、`WAZUH_MANAGER_REGISTRY_ACCEPTED=0`、`WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`、`WAZUH_DASHBOARD_INDEX_OK=3`、`RUNTIME_ACTION_AUTHORIZED=0`、`OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`、`NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`。188 host hygiene 已從 blocker 移除;目前不可宣稱完成的只剩 DR credential escrow 與 Wazuh manager registry。ACME HTTP-01 route 與 certbot timer hygiene 已修復,但不得宣稱憑證已正式 renew,需等 snap certbot timer / ACME window readback。 +2026-06-26 13:01 owner response preflight baseline:新增 `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` 與 `docs/templates/post-reboot-next-gate-owner-response.json`。無 response file 時必須輸出 `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0`;直接使用模板時必須輸出 `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0`。此 gate 只驗收 
`credential_escrow_evidence` 與 `wazuh_manager_registry_export` 的脫敏 owner evidence,不送 request、不寫 escrow marker、不讀 secret、不做 Wazuh / host / Kali runtime action,也不把一般批准訊息轉成 owner accepted。
+
 2026-06-26 07:47 machine-readable readiness summary retained as historical pre-repair evidence:當時 `HOST_188_HYGIENE_BLOCKED=1`、`NEXT_REQUIRED_GATES=credential_escrow_evidence,host_188_hygiene_maintenance_window,wazuh_manager_registry_export`。此段只用來比對 188 修復前後差異;現行 gate set 必須使用 12:13 baseline。
 
 2026-06-26 08:12 next-gate dispatch baseline retained as historical pre-repair evidence:當時 output 固定三個 P0 checklist。12:13 起 dispatch 依 live summary 動態輸出,目前 expected `NEXT_GATE_COUNT=2`,只剩 credential escrow 與 Wazuh registry。
diff --git a/docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md b/docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md
index ec4c6f3f..456091c7 100644
--- a/docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md
+++ b/docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md
@@ -1,6 +1,6 @@
 # 主機重啟後一頁式總檢查
 
-> Version: v1.13
+> Version: v1.14
 > Last updated: 2026-06-26 Asia/Taipei
 > Scope: 110 / 120 / 121 / 188 post-reboot service recovery.
112 Kali / Wazuh / active scan 不屬於本流程。 @@ -10,7 +10,7 @@ 每次 110 / 120 / 121 / 188 任一台主機開機、關機、重啟、斷電恢復、VMware console fsck、Docker / K3s 大量重排後,都先跑本頁,再決定是否宣稱恢復。 -最新基準:2026-06-26 12:13 post-reboot summary / declaration guard。`scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` 回傳 `SERVICE_GREEN=1`、`PRODUCT_DATA_GREEN=1`、`BACKUP_CORE_GREEN=1`、`DR_ESCROW_BLOCKED=1`、`ESCROW_MISSING_COUNT=5`、`HOST_188_HYGIENE_BLOCKED=0`、`HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`、`WAZUH_MANAGER_REGISTRY_ACCEPTED=0`、`WAZUH_COVERAGE_SCOPE=6`、`WAZUH_DIRECT_ACTIVE=2`、`WAZUH_NO_TRANSPORT=1`、`WAZUH_SSH_BLOCKED=3`、`WAZUH_ROUTE_CODE=200`、`WAZUH_TRANSPORT_COUNT=6`、`WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`、`WAZUH_DASHBOARD_INDEX_OK=3`、`RUNTIME_ACTION_AUTHORIZED=0`、`OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`。`scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` 會把 summary 轉成 allowed / forbidden declaration:目前允許宣稱服務、產品資料、備份核心、188 host hygiene green 與 `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`;禁止宣稱 `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED`、`RUNTIME_ACTION_AUTHORIZED`。接著 `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` 將 `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export` 展成 owner / evidence / forbidden-action checklist;Wazuh checklist 的 `CURRENT_EVIDENCE` 會保留 registry accepted、coverage scope、direct active、no transport、SSH blocked、route、transport、Dashboard API 與 index pattern 狀態,避免把 route `200` 或 transport `6` 誤報成 registry recovered。`scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` 進一步轉成 `awoooi_post_reboot_next_gate_owner_packets_v1` JSON,固定 `dispatch_authorized=0`、`request_sent_count=0`、`owner_response_accepted_count=0`、`host_write_authorized=0`、`secret_value_collection_allowed=0`、`runtime_gate_count=0`;`scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` 依 live `next_required_gates` 動態鎖定 P0 gate、所有 `0 / 
false` 邊界、禁用 secret payload / runtime action 與 no-false-green 規則。DR 仍因 `escrow_missing=5` 不可宣稱 complete;Wazuh manager registry 仍是 service green 之外的獨立 blocker。ACME HTTP-01 route / certbot timer hygiene 已修復,但憑證正式 renew 成功需等 snap certbot timer 或獨立 ACME window readback。 +最新基準:2026-06-26 13:01 post-reboot owner response preflight。`scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` 回傳 `SERVICE_GREEN=1`、`PRODUCT_DATA_GREEN=1`、`BACKUP_CORE_GREEN=1`、`DR_ESCROW_BLOCKED=1`、`ESCROW_MISSING_COUNT=5`、`HOST_188_HYGIENE_BLOCKED=0`、`HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`、`WAZUH_MANAGER_REGISTRY_ACCEPTED=0`、`WAZUH_COVERAGE_SCOPE=6`、`WAZUH_DIRECT_ACTIVE=2`、`WAZUH_NO_TRANSPORT=1`、`WAZUH_SSH_BLOCKED=3`、`WAZUH_ROUTE_CODE=200`、`WAZUH_TRANSPORT_COUNT=6`、`WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`、`WAZUH_DASHBOARD_INDEX_OK=3`、`RUNTIME_ACTION_AUTHORIZED=0`、`OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`。`scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` 會把 summary 轉成 allowed / forbidden declaration:目前允許宣稱服務、產品資料、備份核心、188 host hygiene green 與 `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`;禁止宣稱 `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED`、`RUNTIME_ACTION_AUTHORIZED`。接著 `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` 將 `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export` 展成 owner / evidence / forbidden-action checklist;Wazuh checklist 的 `CURRENT_EVIDENCE` 會保留 registry accepted、coverage scope、direct active、no transport、SSH blocked、route、transport、Dashboard API 與 index pattern 狀態,避免把 route `200` 或 transport `6` 誤報成 registry recovered。`scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` 進一步轉成 `awoooi_post_reboot_next_gate_owner_packets_v1` JSON,固定 `dispatch_authorized=0`、`request_sent_count=0`、`owner_response_accepted_count=0`、`host_write_authorized=0`、`secret_value_collection_allowed=0`、`runtime_gate_count=0`;`scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py 
--packet-file /tmp/awoooi-post-reboot-owner-packets.json` 依 live `next_required_gates` 動態鎖定 P0 gate、所有 `0 / false` 邊界、禁用 secret payload / runtime action 與 no-false-green 規則。新增 `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` 作為 owner response 收件預檢:沒有 response file 必須是 `blocked_waiting_owner_response_file`;直接套用 `docs/templates/post-reboot-next-gate-owner-response.json` 必須是 `blocked_waiting_owner_response_content`;只有具備遮罩 evidence refs、完整 owner 欄位、Wazuh registry / Dashboard API 狀態、五個 credential escrow 非 secret evidence refs,且沒有 secret value / runtime action request 的 response 才能進入下一層 reviewer acceptance。DR 仍因 `escrow_missing=5` 不可宣稱 complete;Wazuh manager registry 仍是 service green 之外的獨立 blocker。ACME HTTP-01 route / certbot timer hygiene 已修復,但憑證正式 renew 成功需等 snap certbot timer 或獨立 ACME window readback。
 
 本頁只回答四件事:
 
@@ -100,6 +100,15 @@ scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file
 
 guard 必須輸出 `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates= request_sent=0 accepted=0 runtime_gate=0`。目前預期 `gates=2`;若 188 hygiene 回到 blocked,才會是 `gates=3`。若 gate 數量、P0 gate id、`0 / false` 欄位、禁用 secret payload、Wazuh 禁用 active response / host write,或 no-false-green 規則任何一項漂移,視為 `BLOCKED`,不得送 owner request、不得寫 escrow marker、不得進維護窗口、不得宣稱 DR / Wazuh 完成。
 
+收到 owner response 檔案前,或收到任何聲稱已補證據的 JSON 前,必須跑 owner response preflight:
+
+```bash
+scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color
+scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color --response-file docs/templates/post-reboot-next-gate-owner-response.json
+```
+
+第一個命令必須輸出 `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0`。第二個命令必須輸出 `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0`,證明空模板不能被算成已收件或已接受。合格 response 只能包含脫敏 evidence refs、owner role / team / decision / reviewer / followup owner、五個 escrow item 的 non-secret evidence ref,以及 Wazuh manager registry / Dashboard API readback;不得包含密碼、token、secret value、hash、prefix/suffix、raw Wazuh payload、agent 原名、內網 IP、`client.keys`、active response、host write、agent re-enroll、Wazuh restart、Kali active scan 或 credential marker write。preflight 通過也只代表可進入獨立 reviewer acceptance,不代表 `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED` 或任何 runtime action 授權。
+
 需要展開細節時,再使用 repo-side wrapper:
 
 ```bash
diff --git a/docs/templates/post-reboot-next-gate-owner-response.json b/docs/templates/post-reboot-next-gate-owner-response.json
new file mode 100644
index 00000000..23f0f5ef
--- /dev/null
+++ b/docs/templates/post-reboot-next-gate-owner-response.json
@@ -0,0 +1,98 @@
+{
+  "schema_version": "awoooi_post_reboot_next_gate_owner_response_v1",
+  "responses": [
+    {
+      "gate_id": "credential_escrow_evidence",
+      "owner_role": "owner_role_here",
+      "owner_team": "owner_team_here",
+      "decision": "pending",
+      "decision_reason": "decision_reason_here",
+      "affected_scope": "AWOOOI DR credential escrow non-secret evidence",
+      "redacted_evidence_refs": [
+        "redacted_evidence_ref_here"
+      ],
+      "followup_owner": "followup_owner_here",
+      "runtime_action_requested": false,
+      "host_write_requested": false,
+      "secret_value_included": false,
+      "secret_value_collection_allowed": false,
+      "credential_marker_write_requested": false,
+      "escrow_items": [
+        {
+          "item_id": "restic_repository_password",
+          "non_secret_evidence_ref": "non_secret_evidence_ref_here",
+          "recovery_owner": "owner_role_here",
+          "reviewer": "reviewer_here",
+          "last_reviewed_at": "pending",
+          "contains_secret_value": false
+        },
+        {
+          "item_id": "offsite_provider_credentials",
+          "non_secret_evidence_ref": "non_secret_evidence_ref_here",
+          "recovery_owner": "owner_role_here",
+          "reviewer": "reviewer_here",
+          "last_reviewed_at": "pending",
+          "contains_secret_value": false
+        },
+        {
+          "item_id": "break_glass_admin_credentials",
+          "non_secret_evidence_ref": "non_secret_evidence_ref_here",
+          "recovery_owner": "owner_role_here",
+          "reviewer": "reviewer_here",
+          "last_reviewed_at": "pending",
+          "contains_secret_value": false
+        },
+        {
+          "item_id": "dns_registrar_recovery",
+          "non_secret_evidence_ref": "non_secret_evidence_ref_here",
+          "recovery_owner": "owner_role_here",
+          "reviewer": "reviewer_here",
+          "last_reviewed_at": "pending",
+          "contains_secret_value": false
+        },
+        {
+          "item_id": "oauth_ai_provider_recovery",
+          "non_secret_evidence_ref": "non_secret_evidence_ref_here",
+          "recovery_owner": "owner_role_here",
+          "reviewer": "reviewer_here",
+          "last_reviewed_at": "pending",
+          "contains_secret_value": false
+        }
+      ]
+    },
+    {
+      "gate_id": "wazuh_manager_registry_export",
+      "owner_role": "owner_role_here",
+      "owner_team": "owner_team_here",
+      "decision": "pending",
+      "decision_reason": "decision_reason_here",
+      "affected_scope": "Wazuh manager registry redacted export",
+      "redacted_evidence_refs": [
+        "redacted_evidence_ref_here"
+      ],
+      "followup_owner": "followup_owner_here",
+      "runtime_action_requested": false,
+      "host_write_requested": false,
+      "secret_value_included": false,
+      "secret_value_collection_allowed": false,
+      "wazuh_active_response_requested": false,
+      "agent_reenroll_requested": false,
+      "wazuh_restart_requested": false,
+      "kali_active_scan_requested": false,
+      "registry_export_ref": "registry_export_ref_here",
+      "registry_time_window": "pending",
+      "expected_host_aliases": [
+        "core-110",
+        "gateway-188",
+        "k3s-control-120",
+        "k3s-control-121",
+        "security-observer-112",
+        "dev-workstation-111"
+      ],
+      "manager_registry_count": 0,
+      "dashboard_api_connection_status": "pending",
+      "dashboard_api_version_status": "pending",
+      "reviewer": "reviewer_here"
+    }
+  ]
+}
diff --git a/docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md b/docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md
index 34e68637..d755febc 100644
---
a/docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md +++ b/docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md @@ -15,7 +15,7 @@ | P0 host / K3s recovery | DONE | 100% | 120 booted after console fsck at `2026-06-12 15:13`; latest 2026-06-26 07:19 readback shows 120 and 121 reachable, K3s active, `mon` and `mon1` both `Ready control-plane`, AWOOOI API/Web replicas split across both nodes, ArgoCD `awoooi-prod Synced / Healthy` at revision `1fd5e2a8b0f18d24eed16aa2a44286bcbf230603`, and `km-vectorize` official 03:00 台北時間 run succeeded with `lastSuccess=2026-06-25T19:00:14Z`. | | P1 backup / alert / escrow | BLOCKED_DR_ESCROW | 97% | 2026-06-26 06:58 backup readback shows 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `integrity_stale=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `escrow_missing=5`, last aggregate `2026-06-26 02:31:02`。DR remains blocked on real non-secret credential escrow evidence IDs; do not write placeholder markers or paste secret values. | | P2 service / data truth | DONE | 100% | Service routes and core runtime are available, 110 current CPU pressure is attributable to active AWOOOI Web `turbo build` / Docker buildx, and previous orphan Chrome groups remain cleared. 2026-06-26 07:19 StockPlatform `/api/v1/system/freshness` returned `200`; 07:01 freshness payload was `status=ok`, `latest_trading_date=2026-06-25`, blockers `[]`; price / chips / margin / AI recommendations are all on `2026-06-25`. `ai.recommendations` row count is `2868`; `core.margin_short_daily` row count is `1976`. MOMO health `V10.699`, current-month parity `15383|15383|2026-06-01|2026-06-24|2026-06-01|2026-06-24`, and `MOMO_DAILY_FRESHNESS 1|2026-06-24` are green; expanded public routes are green. 
| -| P3 docs / automation contracts | DONE_WITH_DECLARATION_GUARD_V173 | 100% | Workplan, SOP v1.73, post-reboot declaration guard, machine-readable post-reboot readiness summary with Wazuh registry detail fields, post-reboot next-gate dispatch checklist, owner-packet JSON generator, dynamic owner-packet contract guard, one-page post-start quick check v1.13, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, 188 PostgreSQL runtime-ready source-of-truth, 188 ACME route/timer hygiene, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / Drive-auth fail-closed deploys, product version/readback matrix, and stricter 
product-data / route retry gates are updated. Declaration guard now machine-checks allowed / forbidden recovery statements: service/data/backup/188 host hygiene green may be declared when live summary says so, while `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED` and `RUNTIME_ACTION_AUTHORIZED` remain forbidden until evidence gates close. Live 110 script sync remains a separate approved live-write gate; do not claim it here. | +| P3 docs / automation contracts | DONE_WITH_OWNER_RESPONSE_PREFLIGHT_V174 | 100% | Workplan, SOP v1.74, post-reboot declaration guard, machine-readable post-reboot readiness summary with Wazuh registry detail fields, post-reboot next-gate dispatch checklist, owner-packet JSON generator, dynamic owner-packet contract guard, post-reboot owner response preflight, owner response placeholder template, one-page post-start quick check v1.14, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, 188 PostgreSQL runtime-ready source-of-truth, 188 ACME route/timer hygiene, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation 
workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / Drive-auth fail-closed deploys, product version/readback matrix, and stricter product-data / route retry gates are updated. Declaration guard now machine-checks allowed / forbidden recovery statements: service/data/backup/188 host hygiene green may be declared when live summary says so, while `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED` and `RUNTIME_ACTION_AUTHORIZED` remain forbidden until evidence gates close. Owner response preflight blocks missing files, placeholder templates, secret payloads, credential marker writes, Wazuh active response / re-enroll / restart, host write, and Kali active scan before any evidence can be counted as received or accepted. Live 110 script sync remains a separate approved live-write gate; do not claim it here. 
| 2026-06-26 12:13 machine-readable summary baseline supersedes the 07:47 / 08:59 gate set: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` stores delegated logs under `/tmp/awoooi-post-reboot-readiness-20260626-121303` and returns `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_SERVICE_GREEN=1`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_CHECK_RC=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_RUNTIME_GATE=0`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, and `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`. This is now the preferred first operator/AI-agent entrypoint after reboot because it separates service health from DR and security registry evidence; 188 host hygiene is no longer a next gate unless the live checklist regresses. @@ -27,6 +27,8 @@ 2026-06-26 12:13 owner-packet contract guard baseline: `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` validates the generated JSON before any owner review intake. It requires the packet gates to equal the live `source.next_required_gates`, preserves `request_sent=0`、`owner_response_received=0`、`owner_response_accepted=0`、`runtime_action_authorized=0`、`host_write_authorized=0`、`secret_value_collection_allowed=0`、`runtime_gate=0`, and rejects missing forbidden payload/action controls for active gates. Current expected success line: `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates=2 request_sent=0 accepted=0 runtime_gate=0`. 
+2026-06-26 13:01 owner response preflight baseline: `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` validates future owner responses against the dynamic owner-packet gate set without sending requests, writing markers, reading secrets, or changing runtime. Missing response file must remain `blocked_waiting_owner_response_file`; the placeholder template `docs/templates/post-reboot-next-gate-owner-response.json` must remain `blocked_waiting_owner_response_content` with `received=0`, `accepted=0`, and `runtime_gate=0`. The only acceptable payload class is redacted owner evidence for credential escrow and Wazuh manager registry export; secret values, hash / prefix / suffix, raw Wazuh payload, agent real names, internal IPs, `client.keys`, credential marker write, host write, Wazuh active response / re-enroll / restart, and Kali active scan are rejected. + 2026-06-26 08:47 Wazuh registry detail summary baseline: post-reboot readiness summary now emits `WAZUH_COVERAGE_SCOPE`, `WAZUH_DIRECT_ACTIVE`, `WAZUH_NO_TRANSPORT`, `WAZUH_SSH_BLOCKED`, `WAZUH_DASHBOARD_API_CONNECTION`, and `WAZUH_DASHBOARD_INDEX_OK` alongside existing route / transport / registry fields. Current read-only truth is coverage scope `6`, direct active `2`, no transport `1`, SSH blocked `3`, route `200`, transport `6`, Dashboard API `pending_or_spinning`, index OK `3`, manager registry accepted `0`, runtime gate `0`. This is a security evidence blocker, not a reboot service blocker. 2026-06-26 12:13 declaration guard baseline: `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` emits `schema_version=awoooi_post_reboot_declaration_guard_v1`, status `allowed_with_boundary_blockers`, allowed declarations including service / product data / backup / 188 host hygiene green for this evidence set, and forbidden declarations `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED`、`RUNTIME_ACTION_AUTHORIZED`. 
Proposed false-green declarations are rejected before they can enter LOGBOOK / owner packets / external status updates.
diff --git a/scripts/reboot-recovery/post-reboot-owner-response-preflight.py b/scripts/reboot-recovery/post-reboot-owner-response-preflight.py
new file mode 100755
index 00000000..4ad7a5f3
--- /dev/null
+++ b/scripts/reboot-recovery/post-reboot-owner-response-preflight.py
@@ -0,0 +1,401 @@
+#!/usr/bin/env python3
+"""Preflight owner responses for post-reboot next gates.
+
+Read-only by design. This script validates an owner response JSON file against
+the current post-reboot owner packets. It never sends requests, reads secrets,
+writes credential markers, or modifies host/runtime state.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+from typing import Any
+
+
+ROOT = Path(__file__).resolve().parents[2]
+OWNER_PACKET_GENERATOR = (
+    ROOT / "scripts" / "reboot-recovery" / "post-reboot-next-gate-owner-packets.py"
+)
+
+EXPECTED_SCHEMA = "awoooi_post_reboot_next_gate_owner_response_v1"
+EXPECTED_OWNER_PACKET_SCHEMA = "awoooi_post_reboot_next_gate_owner_packets_v1"
+PLACEHOLDER_VALUES = {
+    "",
+    "pending",
+    "todo",
+    "tbd",
+    "n/a",
+    "na",
+    "owner_role_here",
+    "owner_team_here",
+    "decision_reason_here",
+    "redacted_evidence_ref_here",
+    "non_secret_evidence_ref_here",
+    "registry_export_ref_here",
+    "followup_owner_here",
+    "reviewer_here",
+}
+
+ESCROW_ITEM_IDS = {
+    "restic_repository_password",
+    "offsite_provider_credentials",
+    "break_glass_admin_credentials",
+    "dns_registrar_recovery",
+    "oauth_ai_provider_recovery",
+}
+
+EXPECTED_HOST_ALIASES = {
+    "core-110",
+    "gateway-188",
+    "k3s-control-120",
+    "k3s-control-121",
+    "security-observer-112",
+    "dev-workstation-111",
+}
+
+FORBIDDEN_BOOLEAN_FIELDS = {
+    "runtime_action_requested",
+    "runtime_action_authorized",
+    "host_write_requested",
+    "host_write_authorized",
+    "secret_value_included",
+    "secret_value_collection_allowed",
+    "credential_marker_write_requested",
+    "credential_marker_write_authorized",
+    "wazuh_active_response_requested",
+    "wazuh_active_response_authorized",
+    "agent_reenroll_requested",
+    "wazuh_restart_requested",
+    "kali_active_scan_requested",
+}
+
+SECRET_VALUE_PATTERNS = [
+    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
+    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{12,}", re.IGNORECASE),
+    re.compile(r"\bAuthorization\s*:\s*", re.IGNORECASE),
+    re.compile(r"\bgh[pousr]_[A-Za-z0-9_]{20,}"),
+    re.compile(r"\bsk-[A-Za-z0-9]{20,}"),
+    re.compile(r"\bAIza[0-9A-Za-z_-]{20,}"),
+    re.compile(r"\b[0-9]{8,10}:[A-Za-z0-9_-]{20,}\b"),
+    re.compile(r"\b(password|token|secret)\s*[:=]\s*[^,\s]+", re.IGNORECASE),
+    re.compile(r"\bclient\.keys\b", re.IGNORECASE),
+]
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        description="Validate post-reboot owner response evidence without opening runtime gates.",
+    )
+    parser.add_argument("--response-file", type=Path, help="Owner response JSON to validate.")
+    parser.add_argument(
+        "--owner-packet-file",
+        type=Path,
+        help="Use an existing owner packet JSON instead of generating one.",
+    )
+    parser.add_argument("--json", action="store_true", help="Print machine-readable JSON.")
+    parser.add_argument(
+        "--no-color",
+        action="store_true",
+        help="Pass --no-color when generating owner packets.",
+    )
+    return parser.parse_args()
+
+
+def load_json(path: Path) -> dict[str, Any]:
+    try:
+        payload = json.loads(path.read_text(encoding="utf-8"))
+    except FileNotFoundError as exc:
+        raise SystemExit(f"response_file_not_found={path}") from exc
+    except json.JSONDecodeError as exc:
+        raise SystemExit(f"response_json_invalid={exc}") from exc
+    if not isinstance(payload, dict):
+        raise SystemExit("response_json_not_object")
+    return payload
+
+
+def generate_owner_packet(no_color: bool) -> dict[str, Any]:
+    cmd = [str(OWNER_PACKET_GENERATOR)]
+    if no_color:
+        cmd.append("--no-color")
+    completed = subprocess.run(
+        cmd,
+        cwd=ROOT,
+        check=False,
+        text=True,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.STDOUT,
+    )
+    if completed.returncode != 0:
+        raise SystemExit(
+            "owner_packet_generation_failed "
+            f"rc={completed.returncode}\n{completed.stdout}"
+        )
+    try:
+        packet = json.loads(completed.stdout)
+    except json.JSONDecodeError as exc:
+        raise SystemExit(f"owner_packet_json_invalid={exc}") from exc
+    if not isinstance(packet, dict):
+        raise SystemExit("owner_packet_json_not_object")
+    return packet
+
+
+def load_owner_packet(args: argparse.Namespace) -> dict[str, Any]:
+    if args.owner_packet_file:
+        return load_json(args.owner_packet_file)
+    return generate_owner_packet(no_color=args.no_color)
+
+
+def as_list(value: Any) -> list[Any]:
+    if value is None:
+        return []
+    if isinstance(value, list):
+        return value
+    return [value]
+
+
+def normalized(value: Any) -> str:
+    if value is None:
+        return ""
+    return str(value).strip()
+
+
+def is_placeholder(value: Any) -> bool:
+    return normalized(value).lower() in PLACEHOLDER_VALUES
+
+
+def collect_strings(value: Any, path: str = "$") -> list[tuple[str, str]]:
+    strings: list[tuple[str, str]] = []
+    if isinstance(value, str):
+        strings.append((path, value))
+    elif isinstance(value, dict):
+        for key, child in value.items():
+            strings.extend(collect_strings(child, f"{path}.{key}"))
+    elif isinstance(value, list):
+        for index, child in enumerate(value):
+            strings.extend(collect_strings(child, f"{path}[{index}]"))
+    return strings
+
+
+def find_forbidden_strings(response: dict[str, Any]) -> list[str]:
+    failures: list[str] = []
+    for path, value in collect_strings(response):
+        if path.endswith(".item_id") and value in ESCROW_ITEM_IDS:
+            continue
+        if path.endswith(".gate_id") and value in {
+            "credential_escrow_evidence",
+            "wazuh_manager_registry_export",
+            "host_188_hygiene_maintenance_window",
+        }:
+            continue
+        for pattern in SECRET_VALUE_PATTERNS:
+            if
pattern.search(value): + failures.append(f"forbidden_payload_at={path}") + break + return failures + + +def find_forbidden_booleans(value: Any, path: str = "$") -> list[str]: + failures: list[str] = [] + if isinstance(value, dict): + for key, child in value.items(): + child_path = f"{path}.{key}" + if key in FORBIDDEN_BOOLEAN_FIELDS and child is not False: + failures.append(f"{child_path}={child!r}") + failures.extend(find_forbidden_booleans(child, child_path)) + elif isinstance(value, list): + for index, child in enumerate(value): + failures.extend(find_forbidden_booleans(child, f"{path}[{index}]")) + return failures + + +def owner_packet_gate_ids(packet: dict[str, Any]) -> set[str]: + if packet.get("schema_version") != EXPECTED_OWNER_PACKET_SCHEMA: + raise SystemExit(f"owner_packet_schema={packet.get('schema_version')!r}") + return { + str(item.get("packet_id")) + for item in as_list(packet.get("owner_packets")) + if isinstance(item, dict) and item.get("packet_id") + } + + +def response_by_gate(response: dict[str, Any]) -> dict[str, dict[str, Any]]: + responses = as_list(response.get("responses")) + by_gate: dict[str, dict[str, Any]] = {} + for item in responses: + if not isinstance(item, dict): + continue + gate_id = normalized(item.get("gate_id")) + if gate_id: + by_gate[gate_id] = item + return by_gate + + +def validate_common(gate_id: str, item: dict[str, Any]) -> list[str]: + failures: list[str] = [] + for key in ( + "owner_role", + "owner_team", + "decision", + "decision_reason", + "affected_scope", + "followup_owner", + ): + if is_placeholder(item.get(key)): + failures.append(f"{gate_id}.{key}_missing") + decision = normalized(item.get("decision")).lower() + if decision not in {"accepted", "rejected", "needs_supplement"}: + failures.append(f"{gate_id}.decision_invalid={decision!r}") + evidence_refs = [ + ref for ref in as_list(item.get("redacted_evidence_refs")) if not is_placeholder(ref) + ] + if not evidence_refs: + 
failures.append(f"{gate_id}.redacted_evidence_refs_missing") + return failures + + +def validate_credential_escrow(item: dict[str, Any]) -> list[str]: + failures: list[str] = [] + escrow_items = as_list(item.get("escrow_items")) + seen = { + normalized(entry.get("item_id")) + for entry in escrow_items + if isinstance(entry, dict) + } + missing = sorted(ESCROW_ITEM_IDS - seen) + if missing: + failures.append(f"credential_escrow_evidence.missing_items={missing}") + for entry in escrow_items: + if not isinstance(entry, dict): + failures.append("credential_escrow_evidence.escrow_item_not_object") + continue + item_id = normalized(entry.get("item_id")) + if item_id not in ESCROW_ITEM_IDS: + failures.append(f"credential_escrow_evidence.unknown_item={item_id!r}") + for key in ("non_secret_evidence_ref", "recovery_owner", "reviewer", "last_reviewed_at"): + if is_placeholder(entry.get(key)): + failures.append(f"credential_escrow_evidence.{item_id}.{key}_missing") + if entry.get("contains_secret_value") is not False: + failures.append(f"credential_escrow_evidence.{item_id}.contains_secret_value_not_false") + return failures + + +def validate_wazuh_registry(item: dict[str, Any]) -> list[str]: + failures: list[str] = [] + for key in ( + "registry_export_ref", + "registry_time_window", + "dashboard_api_connection_status", + "dashboard_api_version_status", + "reviewer", + ): + if is_placeholder(item.get(key)): + failures.append(f"wazuh_manager_registry_export.{key}_missing") + aliases = {normalized(alias) for alias in as_list(item.get("expected_host_aliases"))} + missing_aliases = sorted(EXPECTED_HOST_ALIASES - aliases) + if missing_aliases: + failures.append(f"wazuh_manager_registry_export.missing_aliases={missing_aliases}") + if not isinstance(item.get("manager_registry_count"), int): + failures.append("wazuh_manager_registry_export.manager_registry_count_not_int") + if normalized(item.get("dashboard_api_connection_status")).lower() != "ok": + 
failures.append("wazuh_manager_registry_export.dashboard_api_connection_not_ok") + if normalized(item.get("dashboard_api_version_status")).lower() != "ok": + failures.append("wazuh_manager_registry_export.dashboard_api_version_not_ok") + return failures + + +def evaluate(packet: dict[str, Any], response: dict[str, Any] | None) -> dict[str, Any]: + expected_gates = owner_packet_gate_ids(packet) + result: dict[str, Any] = { + "schema_version": "awoooi_post_reboot_owner_response_preflight_v1", + "expected_gate_count": len(expected_gates), + "expected_gates": sorted(expected_gates), + "owner_response_received_count": 0, + "owner_response_accepted_count": 0, + "runtime_gate_count": 0, + "runtime_action_authorized": False, + "host_write_authorized": False, + "secret_value_collection_allowed": False, + "status": "blocked_waiting_owner_response_file", + "blockers": [], + } + if response is None: + result["blockers"] = ["owner_response_file_missing"] + return result + + failures: list[str] = [] + if response.get("schema_version") != EXPECTED_SCHEMA: + failures.append(f"schema_version={response.get('schema_version')!r}") + failures.extend(find_forbidden_strings(response)) + failures.extend(find_forbidden_booleans(response)) + + by_gate = response_by_gate(response) + gate_ids = set(by_gate) + unknown_gates = sorted(gate_ids - expected_gates) + missing_gates = sorted(expected_gates - gate_ids) + if unknown_gates: + failures.append(f"unknown_gate_ids={unknown_gates}") + if missing_gates: + failures.append(f"missing_gate_responses={missing_gates}") + + received = 0 + accepted = 0 + for gate_id in sorted(expected_gates & gate_ids): + item = by_gate[gate_id] + gate_failures = validate_common(gate_id, item) + if gate_id == "credential_escrow_evidence": + gate_failures.extend(validate_credential_escrow(item)) + elif gate_id == "wazuh_manager_registry_export": + gate_failures.extend(validate_wazuh_registry(item)) + else: + 
gate_failures.append(f"{gate_id}.unsupported_for_response_preflight") + if gate_failures: + failures.extend(gate_failures) + else: + received += 1 + if normalized(item.get("decision")).lower() == "accepted": + accepted += 1 + + result["owner_response_received_count"] = received + result["owner_response_accepted_count"] = accepted + result["blockers"] = failures + if failures: + result["status"] = "blocked_waiting_owner_response_content" + elif accepted == len(expected_gates): + result["status"] = "ready_for_independent_reviewer_acceptance" + else: + result["status"] = "blocked_waiting_owner_acceptance" + return result + + +def main() -> int: + args = parse_args() + packet = load_owner_packet(args) + response = load_json(args.response_file) if args.response_file else None + result = evaluate(packet, response) + + if args.json: + print(json.dumps(result, ensure_ascii=False, indent=2, sort_keys=True)) + else: + prefix = ( + "POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_OK" + if result["status"] == "ready_for_independent_reviewer_acceptance" + else "POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED" + ) + print( + f"{prefix} status={result['status']} " + f"expected_gates={result['expected_gate_count']} " + f"received={result['owner_response_received_count']} " + f"accepted={result['owner_response_accepted_count']} " + f"runtime_gate={result['runtime_gate_count']} " + f"blockers={len(result['blockers'])}" + ) + return 0 + + +if __name__ == "__main__": + sys.exit(main())
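
Reviewer note: the fail-closed behaviour of the script above can be exercised in isolation. The following is a minimal sketch, not part of the diff. It copies `is_placeholder` and `find_forbidden_booleans` with a reduced subset of the constants so it runs on its own, and the `sample` fragment is a hypothetical response invented for illustration, not the real template.

```python
from typing import Any

# Reduced copies of the script's constants so this sketch is self-contained.
PLACEHOLDER_VALUES = {"", "pending", "todo", "tbd", "owner_role_here"}
FORBIDDEN_BOOLEAN_FIELDS = {"runtime_action_authorized", "host_write_authorized"}


def is_placeholder(value: Any) -> bool:
    """Treat None, blank strings, and template placeholders as missing."""
    text = "" if value is None else str(value).strip()
    return text.lower() in PLACEHOLDER_VALUES


def find_forbidden_booleans(value: Any, path: str = "$") -> list[str]:
    """Report every forbidden field whose value is not exactly False."""
    failures: list[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if key in FORBIDDEN_BOOLEAN_FIELDS and child is not False:
                failures.append(f"{child_path}={child!r}")
            failures.extend(find_forbidden_booleans(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            failures.extend(find_forbidden_booleans(child, f"{path}[{index}]"))
    return failures


# Hypothetical response fragment (assumption: shape mirrors responses[].gate_id).
sample = {
    "responses": [
        {
            "gate_id": "credential_escrow_evidence",
            "owner_role": "owner_role_here",
            "runtime_action_authorized": True,
            "host_write_authorized": 0,
        }
    ]
}

boolean_failures = find_forbidden_booleans(sample)
placeholder_hit = is_placeholder(sample["responses"][0]["owner_role"])
print(boolean_failures)
print(placeholder_hit)
```

Because the check is `child is not False`, a falsy value such as `0` or `None` is still reported; only the literal `False` passes. That identity comparison is what keeps a loosely typed response from slipping through the runtime-action guard.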