ops(reboot): add post-reboot owner response preflight
@@ -45596,3 +45596,40 @@ production browser smoke:
|
||||
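- DR credential escrow evidence is still missing `5` items: do not declare `DR_COMPLETE`.
- Wazuh manager registry accepted is still `0`: do not declare that Wazuh management of all hosts has been restored.
- certbot formal renewal has not completed readback; this round completed only HTTP-01 route / timer hygiene / failed-unit cleanup. A successful formal renewal must wait for the snap certbot timer or a separate ACME window.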
|
||||
|
||||
## 2026-06-26 — 13:01 post-reboot owner response preflight / SOP v1.74
|
||||
|
||||
**Time and sources**:

- 2026-06-26 13:01-13:23 Asia/Taipei.
- Sources: `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color`, the new `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color`, the placeholder template `docs/templates/post-reboot-next-gate-owner-response.json`, and the synchronized SOP / workplan documents.
|
||||
|
||||
**Completed work**:

- Added the post-reboot owner response preflight, which checks whether a future owner response JSON matches the dynamic gate set of the current `awoooi_post_reboot_next_gate_owner_packets_v1`.
- Added the placeholder response template. It deliberately keeps placeholders such as `owner_role_here`, `non_secret_evidence_ref_here`, and `registry_export_ref_here` as a fail-closed test sample; submitting the template as-is must not count as received or accepted.
- `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` bumped to v1.14; the fixed flow is now summary → declaration guard → next-gate dispatch → owner packet → contract guard → owner response preflight.
- `docs/runbooks/FULL-STACK-COLD-START-SOP.md` bumped to v1.74, bringing the owner response preflight into the full boot / shutdown / reboot SOP.
- `docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md` updated to `DONE_WITH_OWNER_RESPONSE_PREFLIGHT_V174`.
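The gate-set comparison described in the first bullet can be sketched as follows. This is a minimal illustration, not the script's actual logic (the excerpt here does not show it); the function name `gate_set_mismatch` is hypothetical, and only the `responses` / `gate_id` keys are taken from the template in this commit.

```python
from __future__ import annotations

from typing import Any


def gate_set_mismatch(expected_gates: list[str], response: dict[str, Any]) -> dict[str, list[str]]:
    """Compare gate ids in an owner response with the live expected gates.

    Hypothetical helper: returns the gates that are missing from the
    response and the gates that the live packet set does not expect.
    """
    seen = {
        entry.get("gate_id")
        for entry in response.get("responses", [])
        if isinstance(entry, dict) and isinstance(entry.get("gate_id"), str)
    }
    expected = set(expected_gates)
    return {
        "missing": sorted(expected - seen),
        "unexpected": sorted(seen - expected),
    }


# Example: the response answers one live gate and one stale gate.
live_gates = ["credential_escrow_evidence", "wazuh_manager_registry_export"]
sample_response = {
    "responses": [
        {"gate_id": "credential_escrow_evidence"},
        {"gate_id": "host_188_hygiene_maintenance_window"},
    ]
}
mismatch = gate_set_mismatch(live_gates, sample_response)
```

Any non-empty `missing` or `unexpected` list would be treated as a blocker under the fail-closed rule described above.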
|
||||
|
||||
**Live / preflight evidence**:

- The 13:23 owner packet live generation read back `next_gate_count=2`, leaving only `credential_escrow_evidence` and `wazuh_manager_registry_export`; `request_sent_count=0`, `owner_response_received_count=0`, `owner_response_accepted_count=0`, `runtime_action_authorized_count=0`.
- The 12:58 post-start summary had recovered to `POST_START_RESULT=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, `POST_START_PASS=38`, `POST_START_WARN=4`, `POST_START_BLOCKED=0`, `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`.
- At 12:55 the first owner-packet generation run briefly pushed the summary into a service warning because of a transient `stockplatform-review-bulk-ux` active process / service warning on 110. Nothing was killed or restarted, and no CI run was cancelled. The 12:58 rerun recovered on its own, which shows that the SOP separates transient / active CI processes from true orphan / service blockers.
- Expected output with no response file: `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0 blockers=1`.
- Output for the placeholder template: `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0 blockers=41`.
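The two status lines above follow a `MARKER key=value ...` shape, so a wrapper could check them mechanically. A minimal sketch, assuming that shape holds for every line the preflight prints; `parse_preflight_line` and `is_fail_closed` are hypothetical helper names, not part of the repo.

```python
from __future__ import annotations


def parse_preflight_line(line: str) -> tuple[str, dict[str, str]]:
    """Split a status line into its leading marker and its key=value fields."""
    marker, *pairs = line.split()
    fields = dict(pair.split("=", 1) for pair in pairs)
    return marker, fields


def is_fail_closed(fields: dict[str, str]) -> bool:
    """True only when nothing was counted as received, accepted, or authorized."""
    return all(fields.get(key) == "0" for key in ("received", "accepted", "runtime_gate"))


# The no-response-file line quoted in the evidence list above.
line = (
    "POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED "
    "status=blocked_waiting_owner_response_file expected_gates=2 "
    "received=0 accepted=0 runtime_gate=0 blockers=1"
)
marker, fields = parse_preflight_line(line)
```

Values stay as strings on purpose: a missing key or a non-`"0"` value makes `is_fail_closed` return `False` rather than being coerced.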
|
||||
|
||||
**Types of commands run**:

- Read-only: post-reboot owner packet generation, owner response preflight, contract / declaration / source guards.
- Writes: repo script / docs / template only.
- Not done: no host / Docker / systemd / Nginx / firewall / K8s / DB / Wazuh runtime writes; no plaintext secrets read; no credential marker written; no owner request sent; no Wazuh active response / agent re-enroll / restart; no Kali active scan.
|
||||
|
||||
**Current assessment**:

- Owner response preflight automation: `0% -> 100%`.
- Reboot service / product data / backup / 188 host hygiene: `GREEN`.
- Overall recovery declaration: `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`.
- SOP / quick-check / owner-packet / owner-response preflight: v1.74.
|
||||
|
||||
**Still blocked / must not be declared**:

- DR credential escrow evidence is still missing `5` items: do not declare `DR_COMPLETE` or credential escrow complete.
- Wazuh manager registry accepted is still `0`: do not declare that Wazuh management of all hosts has been restored.
- Owner response received / accepted is still `0 / 0`; do not treat an "approved to continue" message, an empty template, UI visibility, route `200`, transport `6`, Dashboard index pattern `3`, or the owner-packet JSON as accepted evidence.
- Runtime action / host write / credential marker write / Wazuh active response / Kali active scan all remain `0 / false`.
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# AWOOOI Full-Stack Cold Start and Host Reboot SOP
|
||||
|
||||
-> Version: v1.73
+> Version: v1.74
|
||||
> Last updated: 2026-06-26 Asia/Taipei
|
||||
> Scope: 110 / 120 / 121 / 188 full-stack reboot recovery. 112 Kali is recorded as P3 optional and is not part of this recovery path.
|
||||
|
||||
@@ -10,10 +10,12 @@
|
||||
|
||||
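This section is the first decision anchor after every handover, boot, shutdown, or reboot. If its date is not today, rerun the live check first, then update this section and `docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md`.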
|
||||
|
||||
-If you only need to judge quickly after a reboot whether recovery can be declared, first run the machine-readable summary: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color`. This script calls the one-page check, the 188 host hygiene checklist, and the Wazuh no-false-green repo gates, and leaves delegated logs under `/tmp/awoooi-post-reboot-readiness-*`. Next run `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` to turn the summary into allowed / forbidden declarations, so that service green is not misreported as DR complete, 188 host hygiene, Wazuh registry recovered, or runtime authorized. If the summary shows `SERVICE_GREEN=1` but `NEXT_REQUIRED_GATES` is still non-empty, run `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` to turn the unfinished blockers in the live summary into an owner / evidence / forbidden-action dispatch checklist. When machine-readable intake is needed, run `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color --output /tmp/awoooi-post-reboot-owner-packets.json` to produce the `awoooi_post_reboot_next_gate_owner_packets_v1` JSON, then immediately run `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json`. Dispatch / packet / guard all pin `DISPATCH_AUTHORIZED=0`, `REQUEST_SENT_COUNT=0`, `OWNER_RESPONSE_ACCEPTED=0`, `HOST_WRITE_AUTHORIZED=0`, `SECRET_VALUE_COLLECTION_ALLOWED=0`, `RUNTIME_GATE=0`; if the guard does not pass, do not send an owner request, write an escrow marker, enter a maintenance window, or declare DR / Wazuh registry complete. When manual expansion is needed, run `scripts/reboot-recovery/post-start-quick-check.sh --no-color` and use `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` as the fallback. The long SOP keeps the full background, exception handling, and Plan B; the short wrapper / checklist handles the fixed decision within T+10 minutes each time.

+If you only need to judge quickly after a reboot whether recovery can be declared, first run the machine-readable summary: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color`. This script calls the one-page check, the 188 host hygiene checklist, and the Wazuh no-false-green repo gates, and leaves delegated logs under `/tmp/awoooi-post-reboot-readiness-*`. Next run `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` to turn the summary into allowed / forbidden declarations, so that service green is not misreported as DR complete, 188 host hygiene, Wazuh registry recovered, or runtime authorized. If the summary shows `SERVICE_GREEN=1` but `NEXT_REQUIRED_GATES` is still non-empty, run `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` to turn the unfinished blockers in the live summary into an owner / evidence / forbidden-action dispatch checklist. When machine-readable intake is needed, run `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color --output /tmp/awoooi-post-reboot-owner-packets.json` to produce the `awoooi_post_reboot_next_gate_owner_packets_v1` JSON, then immediately run `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json`. Dispatch / packet / guard all pin `DISPATCH_AUTHORIZED=0`, `REQUEST_SENT_COUNT=0`, `OWNER_RESPONSE_ACCEPTED=0`, `HOST_WRITE_AUTHORIZED=0`, `SECRET_VALUE_COLLECTION_ALLOWED=0`, `RUNTIME_GATE=0`; if the guard does not pass, do not send an owner request, write an escrow marker, enter a maintenance window, or declare DR / Wazuh registry complete. From v1.74 on, every owner response JSON must also pass `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color --response-file <file>`: an empty template, placeholders, a secret payload, a runtime action request, a credential marker write, Wazuh active response / re-enroll / restart, a Kali active scan, or missing Dashboard API / manager registry evidence must all fail closed. Passing preflight only means the response may proceed to independent reviewer acceptance; it is not runtime authorization. When manual expansion is needed, run `scripts/reboot-recovery/post-start-quick-check.sh --no-color` and use `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` as the fallback. The long SOP keeps the full background, exception handling, and Plan B; the short wrapper / checklist handles the fixed decision within T+10 minutes each time.
|
||||
|
||||
2026-06-26 12:13 latest live summary supersedes the 08:59 gate set: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` returned `POST_START_RESULT=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, `POST_START_PASS=38`, `POST_START_WARN=4`, `POST_START_BLOCKED=0`, `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_SERVICE_GREEN=1`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`. 188 host hygiene has been removed from the blockers; the only items that still cannot be declared complete are DR credential escrow and the Wazuh manager registry. The ACME HTTP-01 route and certbot timer hygiene are repaired, but do not declare the certificate formally renewed until the snap certbot timer / ACME window readback.

2026-06-26 13:01 owner response preflight baseline: added `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` and `docs/templates/post-reboot-next-gate-owner-response.json`. With no response file, the output must be `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0`; with the template used as-is, it must be `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0`. This gate only checks redacted owner evidence for `credential_escrow_evidence` and `wazuh_manager_registry_export`. It sends no request, writes no escrow marker, reads no secret, performs no Wazuh / host / Kali runtime action, and does not turn a general approval message into owner accepted.
|
||||
|
||||
2026-06-26 07:47 machine-readable readiness summary retained as historical pre-repair evidence: at that time `HOST_188_HYGIENE_BLOCKED=1` and `NEXT_REQUIRED_GATES=credential_escrow_evidence,host_188_hygiene_maintenance_window,wazuh_manager_registry_export`. This paragraph exists only to compare the state before and after the 188 repair; the current gate set must use the 12:13 baseline.

2026-06-26 08:12 next-gate dispatch baseline retained as historical pre-repair evidence: at that time the output was fixed at three P0 checklists. Since 12:13, dispatch output is generated dynamically from the live summary; the current expected value is `NEXT_GATE_COUNT=2`, leaving only credential escrow and Wazuh registry.
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# One-Page Post-Reboot Host Check
|
||||
|
||||
-> Version: v1.13
+> Version: v1.14
|
||||
> Last updated: 2026-06-26 Asia/Taipei
|
||||
> Scope: 110 / 120 / 121 / 188 post-reboot service recovery. 112 Kali / Wazuh / active scan are not part of this flow.
|
||||
|
||||
@@ -10,7 +10,7 @@
|
||||
|
||||
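Whenever any of the 110 / 120 / 121 / 188 hosts is booted, shut down, rebooted, recovered from a power loss, run through a VMware console fsck, or sees a large Docker / K3s reschedule, run this page first and only then decide whether to declare recovery.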
|
||||
|
||||
-Latest baseline: 2026-06-26 12:13 post-reboot summary / declaration guard. `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` returned `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` turns the summary into allowed / forbidden declarations: currently allowed are service, product data, backup core, 188 host hygiene green, and `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`; forbidden are `DR_COMPLETE`, `WAZUH_REGISTRY_RECOVERED`, and `RUNTIME_ACTION_AUTHORIZED`. Next, `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` expands `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export` into an owner / evidence / forbidden-action checklist; the Wazuh checklist's `CURRENT_EVIDENCE` keeps the registry accepted, coverage scope, direct active, no transport, SSH blocked, route, transport, Dashboard API, and index pattern states, so that route `200` or transport `6` is not misreported as registry recovered. `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` then converts this into the `awoooi_post_reboot_next_gate_owner_packets_v1` JSON, pinning `dispatch_authorized=0`, `request_sent_count=0`, `owner_response_accepted_count=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, `runtime_gate_count=0`; `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` uses the live `next_required_gates` to lock the P0 gates, all `0 / false` boundaries, the ban on secret payloads / runtime actions, and the no-false-green rules. DR still cannot be declared complete because `escrow_missing=5`; the Wazuh manager registry remains an independent blocker outside service green. ACME HTTP-01 route / certbot timer hygiene are repaired, but a successful formal certificate renewal must wait for the snap certbot timer or a separate ACME window readback.

+Latest baseline: 2026-06-26 13:01 post-reboot owner response preflight. `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` returned `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` turns the summary into allowed / forbidden declarations: currently allowed are service, product data, backup core, 188 host hygiene green, and `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`; forbidden are `DR_COMPLETE`, `WAZUH_REGISTRY_RECOVERED`, and `RUNTIME_ACTION_AUTHORIZED`. Next, `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` expands `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export` into an owner / evidence / forbidden-action checklist; the Wazuh checklist's `CURRENT_EVIDENCE` keeps the registry accepted, coverage scope, direct active, no transport, SSH blocked, route, transport, Dashboard API, and index pattern states, so that route `200` or transport `6` is not misreported as registry recovered. `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` then converts this into the `awoooi_post_reboot_next_gate_owner_packets_v1` JSON, pinning `dispatch_authorized=0`, `request_sent_count=0`, `owner_response_accepted_count=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, `runtime_gate_count=0`; `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` uses the live `next_required_gates` to lock the P0 gates, all `0 / false` boundaries, the ban on secret payloads / runtime actions, and the no-false-green rules. The new `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` acts as the owner response intake preflight: no response file must yield `blocked_waiting_owner_response_file`; using `docs/templates/post-reboot-next-gate-owner-response.json` as-is must yield `blocked_waiting_owner_response_content`; only a response with redacted evidence refs, complete owner fields, Wazuh registry / Dashboard API status, five non-secret credential escrow evidence refs, and no secret value / runtime action request may proceed to the next layer of reviewer acceptance. DR still cannot be declared complete because `escrow_missing=5`; the Wazuh manager registry remains an independent blocker outside service green. ACME HTTP-01 route / certbot timer hygiene are repaired, but a successful formal certificate renewal must wait for the snap certbot timer or a separate ACME window readback.
|
||||
|
||||
This page answers only four things:
|
||||
|
||||
@@ -100,6 +100,15 @@ scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file
|
||||
|
||||
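The guard must output `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates=<live_next_gate_count> request_sent=0 accepted=0 runtime_gate=0`. The current expectation is `gates=2`; it becomes `gates=3` only if 188 hygiene returns to blocked. If the gate count, the P0 gate ids, the `0 / false` fields, the ban on secret payloads, the Wazuh ban on active response / host write, or any no-false-green rule drifts, treat the result as `BLOCKED`: do not send an owner request, write an escrow marker, enter a maintenance window, or declare DR / Wazuh complete.

Before an owner response file is received, or before any JSON claiming to supply the missing evidence is taken in, the owner response preflight must be run: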
|
||||
|
||||
```bash
|
||||
scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color
|
||||
scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color --response-file docs/templates/post-reboot-next-gate-owner-response.json
|
||||
```
|
||||
|
||||
The first command must output `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_file expected_gates=2 received=0 accepted=0 runtime_gate=0`. The second command must output `POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED status=blocked_waiting_owner_response_content expected_gates=2 received=0 accepted=0 runtime_gate=0`, proving that an empty template cannot be counted as received or accepted. A valid response may contain only redacted evidence refs, owner role / team / decision / reviewer / followup owner, a non-secret evidence ref for each of the five escrow items, and the Wazuh manager registry / Dashboard API readback. It must not contain passwords, tokens, secret values, hashes, prefixes/suffixes, raw Wazuh payloads, real agent names, internal IPs, `client.keys`, active response, host write, agent re-enroll, Wazuh restart, Kali active scan, or credential marker write. Passing preflight only means the response may proceed to independent reviewer acceptance; it does not mean `DR_COMPLETE`, `WAZUH_REGISTRY_RECOVERED`, or any runtime action authorization.
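The placeholder rejection described above can be illustrated with a small recursive walk over the response JSON. This is a sketch under assumptions: the `PLACEHOLDER_VALUES` entries are copied from the script in this commit, but the matching rule (strip, lowercase, exact match) and the helper name `find_placeholders` are hypothetical, since the script's own traversal is not shown here.

```python
from __future__ import annotations

from typing import Any

# Values copied from the PLACEHOLDER_VALUES constant in the preflight script.
PLACEHOLDER_VALUES = {
    "", "pending", "todo", "tbd", "n/a", "na",
    "owner_role_here", "owner_team_here", "decision_reason_here",
    "redacted_evidence_ref_here", "non_secret_evidence_ref_here",
    "registry_export_ref_here", "followup_owner_here", "reviewer_here",
}


def find_placeholders(node: Any, path: str = "$") -> list[str]:
    """Return the JSON paths whose string value is still a placeholder."""
    hits: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_placeholders(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.strip().lower() in PLACEHOLDER_VALUES:
        hits.append(path)
    return hits


# A fragment shaped like one gate entry of the template.
fragment = {
    "owner_role": "owner_role_here",
    "decision": "pending",
    "affected_scope": "Wazuh manager registry redacted export",
    "redacted_evidence_refs": ["redacted_evidence_ref_here"],
}
placeholder_paths = find_placeholders(fragment)
```

Each returned path would count as one blocker under this sketch; the real script may count blockers differently.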
|
||||
|
||||
When details need to be expanded, use the repo-side wrapper:
|
||||
|
||||
```bash
|
||||
|
||||
98
docs/templates/post-reboot-next-gate-owner-response.json
vendored
Normal file
@@ -0,0 +1,98 @@
|
||||
{
|
||||
"schema_version": "awoooi_post_reboot_next_gate_owner_response_v1",
|
||||
"responses": [
|
||||
{
|
||||
"gate_id": "credential_escrow_evidence",
|
||||
"owner_role": "owner_role_here",
|
||||
"owner_team": "owner_team_here",
|
||||
"decision": "pending",
|
||||
"decision_reason": "decision_reason_here",
|
||||
"affected_scope": "AWOOOI DR credential escrow non-secret evidence",
|
||||
"redacted_evidence_refs": [
|
||||
"redacted_evidence_ref_here"
|
||||
],
|
||||
"followup_owner": "followup_owner_here",
|
||||
"runtime_action_requested": false,
|
||||
"host_write_requested": false,
|
||||
"secret_value_included": false,
|
||||
"secret_value_collection_allowed": false,
|
||||
"credential_marker_write_requested": false,
|
||||
"escrow_items": [
|
||||
{
|
||||
"item_id": "restic_repository_password",
|
||||
"non_secret_evidence_ref": "non_secret_evidence_ref_here",
|
||||
"recovery_owner": "owner_role_here",
|
||||
"reviewer": "reviewer_here",
|
||||
"last_reviewed_at": "pending",
|
||||
"contains_secret_value": false
|
||||
},
|
||||
{
|
||||
"item_id": "offsite_provider_credentials",
|
||||
"non_secret_evidence_ref": "non_secret_evidence_ref_here",
|
||||
"recovery_owner": "owner_role_here",
|
||||
"reviewer": "reviewer_here",
|
||||
"last_reviewed_at": "pending",
|
||||
"contains_secret_value": false
|
||||
},
|
||||
{
|
||||
"item_id": "break_glass_admin_credentials",
|
||||
"non_secret_evidence_ref": "non_secret_evidence_ref_here",
|
||||
"recovery_owner": "owner_role_here",
|
||||
"reviewer": "reviewer_here",
|
||||
"last_reviewed_at": "pending",
|
||||
"contains_secret_value": false
|
||||
},
|
||||
{
|
||||
"item_id": "dns_registrar_recovery",
|
||||
"non_secret_evidence_ref": "non_secret_evidence_ref_here",
|
||||
"recovery_owner": "owner_role_here",
|
||||
"reviewer": "reviewer_here",
|
||||
"last_reviewed_at": "pending",
|
||||
"contains_secret_value": false
|
||||
},
|
||||
{
|
||||
"item_id": "oauth_ai_provider_recovery",
|
||||
"non_secret_evidence_ref": "non_secret_evidence_ref_here",
|
||||
"recovery_owner": "owner_role_here",
|
||||
"reviewer": "reviewer_here",
|
||||
"last_reviewed_at": "pending",
|
||||
"contains_secret_value": false
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"gate_id": "wazuh_manager_registry_export",
|
||||
"owner_role": "owner_role_here",
|
||||
"owner_team": "owner_team_here",
|
||||
"decision": "pending",
|
||||
"decision_reason": "decision_reason_here",
|
||||
"affected_scope": "Wazuh manager registry redacted export",
|
||||
"redacted_evidence_refs": [
|
||||
"redacted_evidence_ref_here"
|
||||
],
|
||||
"followup_owner": "followup_owner_here",
|
||||
"runtime_action_requested": false,
|
||||
"host_write_requested": false,
|
||||
"secret_value_included": false,
|
||||
"secret_value_collection_allowed": false,
|
||||
"wazuh_active_response_requested": false,
|
||||
"agent_reenroll_requested": false,
|
||||
"wazuh_restart_requested": false,
|
||||
"kali_active_scan_requested": false,
|
||||
"registry_export_ref": "registry_export_ref_here",
|
||||
"registry_time_window": "pending",
|
||||
"expected_host_aliases": [
|
||||
"core-110",
|
||||
"gateway-188",
|
||||
"k3s-control-120",
|
||||
"k3s-control-121",
|
||||
"security-observer-112",
|
||||
"dev-workstation-111"
|
||||
],
|
||||
"manager_registry_count": 0,
|
||||
"dashboard_api_connection_status": "pending",
|
||||
"dashboard_api_version_status": "pending",
|
||||
"reviewer": "reviewer_here"
|
||||
}
|
||||
]
|
||||
}
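The template lists exactly five escrow items, and the preflight script defines the same five ids in `ESCROW_ITEM_IDS`. A minimal sketch of a coverage check over the `escrow_items` array, assuming a missing id should block; the helper name `missing_escrow_items` is hypothetical and this is not the script's own implementation.

```python
from __future__ import annotations

from typing import Any

# Ids copied from the ESCROW_ITEM_IDS constant in the preflight script.
ESCROW_ITEM_IDS = {
    "restic_repository_password",
    "offsite_provider_credentials",
    "break_glass_admin_credentials",
    "dns_registrar_recovery",
    "oauth_ai_provider_recovery",
}


def missing_escrow_items(escrow_items: list[Any]) -> list[str]:
    """Return the expected escrow item ids that the response does not cover."""
    seen = {
        item.get("item_id")
        for item in escrow_items
        if isinstance(item, dict)
    }
    return sorted(ESCROW_ITEM_IDS - seen)


# Example: a response that covers only three of the five items.
partial_items = [
    {"item_id": "restic_repository_password"},
    {"item_id": "offsite_provider_credentials"},
    {"item_id": "break_glass_admin_credentials"},
]
gaps = missing_escrow_items(partial_items)
```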
|
||||
@@ -15,7 +15,7 @@
|
||||
| P0 host / K3s recovery | DONE | 100% | 120 booted after console fsck at `2026-06-12 15:13`; latest 2026-06-26 07:19 readback shows 120 and 121 reachable, K3s active, `mon` and `mon1` both `Ready control-plane`, AWOOOI API/Web replicas split across both nodes, ArgoCD `awoooi-prod Synced / Healthy` at revision `1fd5e2a8b0f18d24eed16aa2a44286bcbf230603`, and the `km-vectorize` official 03:00 Taipei-time run succeeded with `lastSuccess=2026-06-25T19:00:14Z`. |
|
||||
| P1 backup / alert / escrow | BLOCKED_DR_ESCROW | 97% | 2026-06-26 06:58 backup readback shows 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `integrity_stale=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `escrow_missing=5`, last aggregate `2026-06-26 02:31:02`. DR remains blocked on real non-secret credential escrow evidence IDs; do not write placeholder markers or paste secret values. |
|
||||
| P2 service / data truth | DONE | 100% | Service routes and core runtime are available, 110 current CPU pressure is attributable to active AWOOOI Web `turbo build` / Docker buildx, and previous orphan Chrome groups remain cleared. 2026-06-26 07:19 StockPlatform `/api/v1/system/freshness` returned `200`; 07:01 freshness payload was `status=ok`, `latest_trading_date=2026-06-25`, blockers `[]`; price / chips / margin / AI recommendations are all on `2026-06-25`. `ai.recommendations` row count is `2868`; `core.margin_short_daily` row count is `1976`. MOMO health `V10.699`, current-month parity `15383|15383|2026-06-01|2026-06-24|2026-06-01|2026-06-24`, and `MOMO_DAILY_FRESHNESS 1|2026-06-24` are green; expanded public routes are green. |
|
||||
| P3 docs / automation contracts | DONE_WITH_DECLARATION_GUARD_V173 | 100% | Workplan, SOP v1.73, post-reboot declaration guard, machine-readable post-reboot readiness summary with Wazuh registry detail fields, post-reboot next-gate dispatch checklist, owner-packet JSON generator, dynamic owner-packet contract guard, one-page post-start quick check v1.13, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, 188 PostgreSQL runtime-ready source-of-truth, 188 ACME route/timer hygiene, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / Drive-auth fail-closed deploys, product version/readback matrix, and stricter product-data 
/ route retry gates are updated. Declaration guard now machine-checks allowed / forbidden recovery statements: service/data/backup/188 host hygiene green may be declared when live summary says so, while `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED` and `RUNTIME_ACTION_AUTHORIZED` remain forbidden until evidence gates close. Live 110 script sync remains a separate approved live-write gate; do not claim it here. |
|
||||
| P3 docs / automation contracts | DONE_WITH_OWNER_RESPONSE_PREFLIGHT_V174 | 100% | Workplan, SOP v1.74, post-reboot declaration guard, machine-readable post-reboot readiness summary with Wazuh registry detail fields, post-reboot next-gate dispatch checklist, owner-packet JSON generator, dynamic owner-packet contract guard, post-reboot owner response preflight, owner response placeholder template, one-page post-start quick check v1.14, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, 188 PostgreSQL runtime-ready source-of-truth, 188 ACME route/timer hygiene, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / 
Drive-auth fail-closed deploys, product version/readback matrix, and stricter product-data / route retry gates are updated. Declaration guard now machine-checks allowed / forbidden recovery statements: service/data/backup/188 host hygiene green may be declared when live summary says so, while `DR_COMPLETE`、`WAZUH_REGISTRY_RECOVERED` and `RUNTIME_ACTION_AUTHORIZED` remain forbidden until evidence gates close. Owner response preflight blocks missing files, placeholder templates, secret payloads, credential marker writes, Wazuh active response / re-enroll / restart, host write, and Kali active scan before any evidence can be counted as received or accepted. Live 110 script sync remains a separate approved live-write gate; do not claim it here. |
|
||||
|
||||
2026-06-26 12:13 machine-readable summary baseline supersedes the 07:47 / 08:59 gate set: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` stores delegated logs under `/tmp/awoooi-post-reboot-readiness-20260626-121303` and returns `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_SERVICE_GREEN=1`, `HOST_188_HYGIENE_BLOCKED=0`, `HOST_188_CHECK_RC=0`, `HOST_188_RESULT=HOST_188_HYGIENE_GREEN.`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_RUNTIME_GATE=0`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, and `NEXT_REQUIRED_GATES=credential_escrow_evidence,wazuh_manager_registry_export`. This is now the preferred first operator/AI-agent entrypoint after reboot because it separates service health from DR and security registry evidence; 188 host hygiene is no longer a next gate unless the live checklist regresses.
|
||||
|
||||
@@ -27,6 +27,8 @@
|
||||
|
||||
2026-06-26 12:13 owner-packet contract guard baseline: `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` validates the generated JSON before any owner review intake. It requires the packet gates to equal the live `source.next_required_gates`, preserves `request_sent=0`, `owner_response_received=0`, `owner_response_accepted=0`, `runtime_action_authorized=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, `runtime_gate=0`, and rejects missing forbidden payload/action controls for active gates. Current expected success line: `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates=2 request_sent=0 accepted=0 runtime_gate=0`.
|
||||
|
||||
2026-06-26 13:01 owner response preflight baseline: `scripts/reboot-recovery/post-reboot-owner-response-preflight.py --no-color` validates future owner responses against the dynamic owner-packet gate set without sending requests, writing markers, reading secrets, or changing runtime. Missing response file must remain `blocked_waiting_owner_response_file`; the placeholder template `docs/templates/post-reboot-next-gate-owner-response.json` must remain `blocked_waiting_owner_response_content` with `received=0`, `accepted=0`, and `runtime_gate=0`. The only acceptable payload class is redacted owner evidence for credential escrow and Wazuh manager registry export; secret values, hash / prefix / suffix, raw Wazuh payload, agent real names, internal IPs, `client.keys`, credential marker write, host write, Wazuh active response / re-enroll / restart, and Kali active scan are rejected.
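The rejection of runtime and write requests can be pictured as a strict check on the boolean request flags that the template carries. A minimal sketch, assuming that any flag which is present and not exactly `False` must block; the flag names come from the template, while the tuple name `FORBIDDEN_REQUEST_FLAGS` and the helper `raised_flags` are hypothetical.

```python
from __future__ import annotations

from typing import Any

# Flag names as they appear in the placeholder response template.
FORBIDDEN_REQUEST_FLAGS = (
    "runtime_action_requested",
    "host_write_requested",
    "secret_value_included",
    "secret_value_collection_allowed",
    "credential_marker_write_requested",
    "wazuh_active_response_requested",
    "agent_reenroll_requested",
    "wazuh_restart_requested",
    "kali_active_scan_requested",
)


def raised_flags(gate_response: dict[str, Any]) -> list[str]:
    """Return flags that are present and not exactly False (fail-closed).

    A string such as "false" or a 0 is also reported, because only the
    JSON boolean false is treated as a clean value in this sketch.
    """
    return [
        name
        for name in FORBIDDEN_REQUEST_FLAGS
        if name in gate_response and gate_response[name] is not False
    ]


sample = {
    "runtime_action_requested": False,
    "host_write_requested": "false",   # wrong type, so it is reported
    "wazuh_restart_requested": True,   # explicit request, so it is reported
}
flags = raised_flags(sample)
```

Whether an absent flag should also block is a separate design choice that this sketch leaves open.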
|
||||
|
||||
2026-06-26 08:47 Wazuh registry detail summary baseline: post-reboot readiness summary now emits `WAZUH_COVERAGE_SCOPE`, `WAZUH_DIRECT_ACTIVE`, `WAZUH_NO_TRANSPORT`, `WAZUH_SSH_BLOCKED`, `WAZUH_DASHBOARD_API_CONNECTION`, and `WAZUH_DASHBOARD_INDEX_OK` alongside existing route / transport / registry fields. Current read-only truth is coverage scope `6`, direct active `2`, no transport `1`, SSH blocked `3`, route `200`, transport `6`, Dashboard API `pending_or_spinning`, index OK `3`, manager registry accepted `0`, runtime gate `0`. This is a security evidence blocker, not a reboot service blocker.
|
||||
|
||||
2026-06-26 12:13 declaration guard baseline: `scripts/reboot-recovery/post-reboot-declaration-guard.py --no-color` emits `schema_version=awoooi_post_reboot_declaration_guard_v1`, status `allowed_with_boundary_blockers`, allowed declarations including service / product data / backup / 188 host hygiene green for this evidence set, and forbidden declarations `DR_COMPLETE`, `WAZUH_REGISTRY_RECOVERED`, and `RUNTIME_ACTION_AUTHORIZED`. Proposed false-green declarations are rejected before they can enter LOGBOOK / owner packets / external status updates.
401
scripts/reboot-recovery/post-reboot-owner-response-preflight.py
Executable file
@@ -0,0 +1,401 @@
#!/usr/bin/env python3
"""Preflight owner responses for post-reboot next gates.

Read-only by design. This script validates an owner response JSON file against
the current post-reboot owner packets. It never sends requests, reads secrets,
writes credential markers, or modifies host/runtime state.
"""

from __future__ import annotations

import argparse
import json
import re
import subprocess
import sys
from pathlib import Path
from typing import Any


ROOT = Path(__file__).resolve().parents[2]
OWNER_PACKET_GENERATOR = (
    ROOT / "scripts" / "reboot-recovery" / "post-reboot-next-gate-owner-packets.py"
)

EXPECTED_SCHEMA = "awoooi_post_reboot_next_gate_owner_response_v1"
EXPECTED_OWNER_PACKET_SCHEMA = "awoooi_post_reboot_next_gate_owner_packets_v1"
PLACEHOLDER_VALUES = {
    "",
    "pending",
    "todo",
    "tbd",
    "n/a",
    "na",
    "owner_role_here",
    "owner_team_here",
    "decision_reason_here",
    "redacted_evidence_ref_here",
    "non_secret_evidence_ref_here",
    "registry_export_ref_here",
    "followup_owner_here",
    "reviewer_here",
}

ESCROW_ITEM_IDS = {
    "restic_repository_password",
    "offsite_provider_credentials",
    "break_glass_admin_credentials",
    "dns_registrar_recovery",
    "oauth_ai_provider_recovery",
}

EXPECTED_HOST_ALIASES = {
    "core-110",
    "gateway-188",
    "k3s-control-120",
    "k3s-control-121",
    "security-observer-112",
    "dev-workstation-111",
}

FORBIDDEN_BOOLEAN_FIELDS = {
    "runtime_action_requested",
    "runtime_action_authorized",
    "host_write_requested",
    "host_write_authorized",
    "secret_value_included",
    "secret_value_collection_allowed",
    "credential_marker_write_requested",
    "credential_marker_write_authorized",
    "wazuh_active_response_requested",
    "wazuh_active_response_authorized",
    "agent_reenroll_requested",
    "wazuh_restart_requested",
    "kali_active_scan_requested",
}

SECRET_VALUE_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{12,}", re.IGNORECASE),
    re.compile(r"\bAuthorization\s*:\s*", re.IGNORECASE),
    re.compile(r"\bgh[pousr]_[A-Za-z0-9_]{20,}"),
    re.compile(r"\bsk-[A-Za-z0-9]{20,}"),
    re.compile(r"\bAIza[0-9A-Za-z_-]{20,}"),
    re.compile(r"\b[0-9]{8,10}:[A-Za-z0-9_-]{20,}\b"),
    re.compile(r"\b(password|token|secret)\s*[:=]\s*[^,\s]+", re.IGNORECASE),
    re.compile(r"\bclient\.keys\b", re.IGNORECASE),
]


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Validate post-reboot owner response evidence without opening runtime gates.",
    )
    parser.add_argument("--response-file", type=Path, help="Owner response JSON to validate.")
    parser.add_argument(
        "--owner-packet-file",
        type=Path,
        help="Use an existing owner packet JSON instead of generating one.",
    )
    parser.add_argument("--json", action="store_true", help="Print machine-readable JSON.")
    parser.add_argument(
        "--no-color",
        action="store_true",
        help="Pass --no-color when generating owner packets.",
    )
    return parser.parse_args()


def load_json(path: Path) -> dict[str, Any]:
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise SystemExit(f"response_file_not_found={path}") from exc
    except json.JSONDecodeError as exc:
        raise SystemExit(f"response_json_invalid={exc}") from exc
    if not isinstance(payload, dict):
        raise SystemExit("response_json_not_object")
    return payload


def generate_owner_packet(no_color: bool) -> dict[str, Any]:
    cmd = [str(OWNER_PACKET_GENERATOR)]
    if no_color:
        cmd.append("--no-color")
    completed = subprocess.run(
        cmd,
        cwd=ROOT,
        check=False,
        text=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    if completed.returncode != 0:
        raise SystemExit(
            "owner_packet_generation_failed "
            f"rc={completed.returncode}\n{completed.stdout}"
        )
    try:
        packet = json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        raise SystemExit(f"owner_packet_json_invalid={exc}") from exc
    if not isinstance(packet, dict):
        raise SystemExit("owner_packet_json_not_object")
    return packet


def load_owner_packet(args: argparse.Namespace) -> dict[str, Any]:
    if args.owner_packet_file:
        return load_json(args.owner_packet_file)
    return generate_owner_packet(no_color=args.no_color)


def as_list(value: Any) -> list[Any]:
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def normalized(value: Any) -> str:
    if value is None:
        return ""
    return str(value).strip()


def is_placeholder(value: Any) -> bool:
    return normalized(value).lower() in PLACEHOLDER_VALUES


def collect_strings(value: Any, path: str = "$") -> list[tuple[str, str]]:
    strings: list[tuple[str, str]] = []
    if isinstance(value, str):
        strings.append((path, value))
    elif isinstance(value, dict):
        for key, child in value.items():
            strings.extend(collect_strings(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            strings.extend(collect_strings(child, f"{path}[{index}]"))
    return strings


def find_forbidden_strings(response: dict[str, Any]) -> list[str]:
    failures: list[str] = []
    for path, value in collect_strings(response):
        if path.endswith(".item_id") and value in ESCROW_ITEM_IDS:
            continue
        if path.endswith(".gate_id") and value in {
            "credential_escrow_evidence",
            "wazuh_manager_registry_export",
            "host_188_hygiene_maintenance_window",
        }:
            continue
        for pattern in SECRET_VALUE_PATTERNS:
            if pattern.search(value):
                failures.append(f"forbidden_payload_at={path}")
                break
    return failures


def find_forbidden_booleans(value: Any, path: str = "$") -> list[str]:
    failures: list[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if key in FORBIDDEN_BOOLEAN_FIELDS and child is not False:
                failures.append(f"{child_path}={child!r}")
            failures.extend(find_forbidden_booleans(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            failures.extend(find_forbidden_booleans(child, f"{path}[{index}]"))
    return failures


def owner_packet_gate_ids(packet: dict[str, Any]) -> set[str]:
    if packet.get("schema_version") != EXPECTED_OWNER_PACKET_SCHEMA:
        raise SystemExit(f"owner_packet_schema={packet.get('schema_version')!r}")
    return {
        str(item.get("packet_id"))
        for item in as_list(packet.get("owner_packets"))
        if isinstance(item, dict) and item.get("packet_id")
    }


def response_by_gate(response: dict[str, Any]) -> dict[str, dict[str, Any]]:
    responses = as_list(response.get("responses"))
    by_gate: dict[str, dict[str, Any]] = {}
    for item in responses:
        if not isinstance(item, dict):
            continue
        gate_id = normalized(item.get("gate_id"))
        if gate_id:
            by_gate[gate_id] = item
    return by_gate


def validate_common(gate_id: str, item: dict[str, Any]) -> list[str]:
    failures: list[str] = []
    for key in (
        "owner_role",
        "owner_team",
        "decision",
        "decision_reason",
        "affected_scope",
        "followup_owner",
    ):
        if is_placeholder(item.get(key)):
            failures.append(f"{gate_id}.{key}_missing")
    decision = normalized(item.get("decision")).lower()
    if decision not in {"accepted", "rejected", "needs_supplement"}:
        failures.append(f"{gate_id}.decision_invalid={decision!r}")
    evidence_refs = [
        ref for ref in as_list(item.get("redacted_evidence_refs")) if not is_placeholder(ref)
    ]
    if not evidence_refs:
        failures.append(f"{gate_id}.redacted_evidence_refs_missing")
    return failures


def validate_credential_escrow(item: dict[str, Any]) -> list[str]:
    failures: list[str] = []
    escrow_items = as_list(item.get("escrow_items"))
    seen = {
        normalized(entry.get("item_id"))
        for entry in escrow_items
        if isinstance(entry, dict)
    }
    missing = sorted(ESCROW_ITEM_IDS - seen)
    if missing:
        failures.append(f"credential_escrow_evidence.missing_items={missing}")
    for entry in escrow_items:
        if not isinstance(entry, dict):
            failures.append("credential_escrow_evidence.escrow_item_not_object")
            continue
        item_id = normalized(entry.get("item_id"))
        if item_id not in ESCROW_ITEM_IDS:
            failures.append(f"credential_escrow_evidence.unknown_item={item_id!r}")
        for key in ("non_secret_evidence_ref", "recovery_owner", "reviewer", "last_reviewed_at"):
            if is_placeholder(entry.get(key)):
                failures.append(f"credential_escrow_evidence.{item_id}.{key}_missing")
        if entry.get("contains_secret_value") is not False:
            failures.append(f"credential_escrow_evidence.{item_id}.contains_secret_value_not_false")
    return failures


def validate_wazuh_registry(item: dict[str, Any]) -> list[str]:
    failures: list[str] = []
    for key in (
        "registry_export_ref",
        "registry_time_window",
        "dashboard_api_connection_status",
        "dashboard_api_version_status",
        "reviewer",
    ):
        if is_placeholder(item.get(key)):
            failures.append(f"wazuh_manager_registry_export.{key}_missing")
    aliases = {normalized(alias) for alias in as_list(item.get("expected_host_aliases"))}
    missing_aliases = sorted(EXPECTED_HOST_ALIASES - aliases)
    if missing_aliases:
        failures.append(f"wazuh_manager_registry_export.missing_aliases={missing_aliases}")
    if not isinstance(item.get("manager_registry_count"), int):
        failures.append("wazuh_manager_registry_export.manager_registry_count_not_int")
    if normalized(item.get("dashboard_api_connection_status")).lower() != "ok":
        failures.append("wazuh_manager_registry_export.dashboard_api_connection_not_ok")
    if normalized(item.get("dashboard_api_version_status")).lower() != "ok":
        failures.append("wazuh_manager_registry_export.dashboard_api_version_not_ok")
    return failures


def evaluate(packet: dict[str, Any], response: dict[str, Any] | None) -> dict[str, Any]:
    expected_gates = owner_packet_gate_ids(packet)
    result: dict[str, Any] = {
        "schema_version": "awoooi_post_reboot_owner_response_preflight_v1",
        "expected_gate_count": len(expected_gates),
        "expected_gates": sorted(expected_gates),
        "owner_response_received_count": 0,
        "owner_response_accepted_count": 0,
        "runtime_gate_count": 0,
        "runtime_action_authorized": False,
        "host_write_authorized": False,
        "secret_value_collection_allowed": False,
        "status": "blocked_waiting_owner_response_file",
        "blockers": [],
    }
    if response is None:
        result["blockers"] = ["owner_response_file_missing"]
        return result

    failures: list[str] = []
    if response.get("schema_version") != EXPECTED_SCHEMA:
        failures.append(f"schema_version={response.get('schema_version')!r}")
    failures.extend(find_forbidden_strings(response))
    failures.extend(find_forbidden_booleans(response))

    by_gate = response_by_gate(response)
    gate_ids = set(by_gate)
    unknown_gates = sorted(gate_ids - expected_gates)
    missing_gates = sorted(expected_gates - gate_ids)
    if unknown_gates:
        failures.append(f"unknown_gate_ids={unknown_gates}")
    if missing_gates:
        failures.append(f"missing_gate_responses={missing_gates}")

    received = 0
    accepted = 0
    for gate_id in sorted(expected_gates & gate_ids):
        item = by_gate[gate_id]
        gate_failures = validate_common(gate_id, item)
        if gate_id == "credential_escrow_evidence":
            gate_failures.extend(validate_credential_escrow(item))
        elif gate_id == "wazuh_manager_registry_export":
            gate_failures.extend(validate_wazuh_registry(item))
        else:
            gate_failures.append(f"{gate_id}.unsupported_for_response_preflight")
        if gate_failures:
            failures.extend(gate_failures)
        else:
            received += 1
            if normalized(item.get("decision")).lower() == "accepted":
                accepted += 1

    result["owner_response_received_count"] = received
    result["owner_response_accepted_count"] = accepted
    result["blockers"] = failures
    if failures:
        result["status"] = "blocked_waiting_owner_response_content"
    elif accepted == len(expected_gates):
        result["status"] = "ready_for_independent_reviewer_acceptance"
    else:
        result["status"] = "blocked_waiting_owner_acceptance"
    return result


def main() -> int:
    args = parse_args()
    packet = load_owner_packet(args)
    response = load_json(args.response_file) if args.response_file else None
    result = evaluate(packet, response)

    if args.json:
        print(json.dumps(result, ensure_ascii=False, indent=2, sort_keys=True))
    else:
        prefix = (
            "POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_OK"
            if result["status"] == "ready_for_independent_reviewer_acceptance"
            else "POST_REBOOT_OWNER_RESPONSE_PREFLIGHT_BLOCKED"
        )
        print(
            f"{prefix} status={result['status']} "
            f"expected_gates={result['expected_gate_count']} "
            f"received={result['owner_response_received_count']} "
            f"accepted={result['owner_response_accepted_count']} "
            f"runtime_gate={result['runtime_gate_count']} "
            f"blockers={len(result['blockers'])}"
        )
    return 0


if __name__ == "__main__":
    sys.exit(main())
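As a reviewer-side illustration of how the string scan behaves, the following standalone sketch copies four of the `SECRET_VALUE_PATTERNS` regexes from the script above and applies them to sample strings. The helper name `looks_forbidden` and all sample strings are hypothetical; none of them is a real credential.

```python
import re

# Subset of SECRET_VALUE_PATTERNS copied from the script above.
PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{12,}", re.IGNORECASE),
    re.compile(r"\b(password|token|secret)\s*[:=]\s*[^,\s]+", re.IGNORECASE),
    re.compile(r"\bclient\.keys\b", re.IGNORECASE),
]


def looks_forbidden(text: str) -> bool:
    """Return True if any copied pattern matches anywhere in the text."""
    return any(pattern.search(text) for pattern in PATTERNS)


# Made-up sample values for illustration only.
SAMPLES = [
    "escrow review ticket reference, redacted",
    "password=example-not-real",
    "token: rotated",
    "see /var/ossec/etc/client.keys",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{looks_forbidden(sample)!s:5} {sample}")
```

Note that the key/value pattern is deliberately broad: a harmless status note such as `token: rotated` is also flagged, so owners writing evidence references would need to avoid `password:` / `token=` style phrasing even when no secret value is present.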