ops(reboot): include Wazuh detail in post-reboot summary [skip ci]

ogt
2026-06-26 08:54:00 +08:00
parent c45f274d5e
commit 75c9314528
6 changed files with 68 additions and 5 deletions


@@ -45191,6 +45191,38 @@ production browser smoke:
- Wazuh manager registry accepted remains `0`
- Do not claim that the owner request has been sent, that an owner response has been received / accepted, that runtime writes have been approved, `DR_COMPLETE`, 188 host fully green, or Wazuh registry recovered.
## 2026-06-26 — 08:47 Wazuh registry detail in post-reboot summary / SOP v1.71
**Time and source**
- 2026-06-26 08:47 Asia/Taipei.
- Source: repo-side read-only output of `scripts/security/wazuh-managed-host-coverage-gate.py --root .` and `scripts/security/wazuh-agent-visibility-runtime-gate.py --root .`.
**Completed work**
- `scripts/reboot-recovery/post-reboot-readiness-summary.sh` adds Wazuh detail fields: `WAZUH_COVERAGE_SCOPE`, `WAZUH_DIRECT_ACTIVE`, `WAZUH_NO_TRANSPORT`, `WAZUH_SSH_BLOCKED`, `WAZUH_DASHBOARD_API_CONNECTION`, `WAZUH_DASHBOARD_INDEX_OK`.
- The `wazuh_manager_registry_export` gate in `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh` now keeps the registry accepted, coverage scope, direct active, no transport, SSH blocked, route, transport, Dashboard API, and index pattern status in `CURRENT_EVIDENCE`.
- `docs/runbooks/REBOOT-POST-START-QUICK-CHECK.md` is bumped to v1.11 and explicitly states that route / transport / index pattern cannot substitute for manager registry accepted.
- `docs/runbooks/FULL-STACK-COLD-START-SOP.md` is bumped to v1.71 and makes Wazuh evidence detail part of the standard post-reboot reading.
- `docs/workplans/2026-06-04-reboot-cold-start-backup-recovery-workplan.md` is updated to `DONE_WITH_WAZUH_DETAIL_SUMMARY_V171`.
**Read-only verification results**
- `WAZUH_MANAGED_HOST_COVERAGE_GATE_OK scope=6 direct_active=2 no_transport=1 ssh_blocked=3 registry=0 runtime_gate=0`
- `WAZUH_AGENT_VISIBILITY_RUNTIME_GATE_OK registry=0 route=200 transport=6 dashboard_degraded=1 api_connection=pending_or_spinning index_ok=3 runtime_gate=0`
**Command types run**
- Read-only: Wazuh repo-side coverage / runtime gates, post-reboot summary / dispatch / owner-packet / contract guard, source guards.
- Writes: repo scripts / docs only.
- Not done: Wazuh agent re-enroll / restart, Wazuh manager / dashboard / indexer restart, active response, host write, Nginx reload, firewall change, Kali active scan, secret collection.
**Current assessment**
- Wazuh detail in post-reboot summary: `0% -> 100%`
- Wazuh manager registry accepted: `0`
- Reboot service / data / backup readiness remains `GREEN`
- Overall declaration remains `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`
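**Still blocked / must not be claimed**
- Do not claim Wazuh route `200`, transport `6`, Dashboard index pattern `3`, a Dashboard that opens, or visible UI cards as recovery of full-host managed coverage.
- Do not claim that active response, host write, agent re-enroll, restart, secret patch, Kali active scan, or the runtime gate has been approved.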
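
The coverage gate line in the read-only results can be sanity-checked with a few lines of bash. The sketch below is illustrative only: `token_value` is a hypothetical helper, not a function from the repo, and the idea that the three buckets (`direct_active`, `no_transport`, `ssh_blocked`) add up to `scope` is an assumption drawn from the reported numbers rather than a documented contract.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the repo): print the value of a key=value
# token in a space-separated gate line, or "unknown" if the key is absent.
token_value() {
  local key="$1" line="$2" tok
  local -a toks
  read -r -a toks <<< "$line"
  for tok in "${toks[@]}"; do
    if [[ "$tok" == "$key="* ]]; then
      printf '%s\n' "${tok#*=}"
      return 0
    fi
  done
  printf 'unknown\n'
}

# Gate line copied from the read-only verification results.
coverage_line='WAZUH_MANAGED_HOST_COVERAGE_GATE_OK scope=6 direct_active=2 no_transport=1 ssh_blocked=3 registry=0 runtime_gate=0'

scope="$(token_value scope "$coverage_line")"
direct_active="$(token_value direct_active "$coverage_line")"
no_transport="$(token_value no_transport "$coverage_line")"
ssh_blocked="$(token_value ssh_blocked "$coverage_line")"

# Assumption: the three buckets partition the coverage scope.
bucket_sum=$(( direct_active + no_transport + ssh_blocked ))
if [[ "$bucket_sum" -eq "$scope" ]]; then
  echo "COVERAGE_BUCKETS_CONSISTENT scope=$scope sum=$bucket_sum"
else
  echo "COVERAGE_BUCKETS_MISMATCH scope=$scope sum=$bucket_sum"
fi
```

A consistent sum says nothing about registry acceptance; `registry=0` in the same line still blocks any recovery claim.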
## 2026-06-26 — 08:40 post-reboot owner-packet contract guard / SOP v1.70
**Time and source**


@@ -1,6 +1,6 @@
# AWOOOI Full-Stack Cold-Start and Host Reboot SOP
> Version: v1.70
> Version: v1.71
> Last updated: 2026-06-26 Asia/Taipei
> Scope: 110 / 120 / 121 / 188 full-stack reboot recovery. 112 Kali is recorded as P3 optional and is not part of this recovery path.
@@ -20,6 +20,8 @@
2026-06-26 08:40 owner-packet contract guard baseline: `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` locks `schema_version=awoooi_post_reboot_next_gate_owner_packets_v1`, the three P0 gate ids, `next_gate_count=3`, `p0_gate_count=3`, `request_sent_count=0`, `owner_response_received_count=0`, `owner_response_accepted_count=0`, `runtime_action_authorized_count=0`, `dispatch_authorized=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, and `runtime_gate_count=0`. The guard also verifies that escrow forbids password / token / secret value / hash / prefix / suffix / raw credential; that 188 forbids `pg_resetwal` / certbot renew / Nginx reload / DB restore / Docker restart / host file write; and that Wazuh forbids raw payload / internal IP / active response / re-enroll / restart / secret patch / host write / Kali active scan. It also requires the four no-false-green rules to be present. The output must be `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates=3 request_sent=0 accepted=0 runtime_gate=0`.
2026-06-26 08:47 Wazuh registry detail baseline: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` now includes the details of the Wazuh repo-side coverage / runtime gates as fixed key/value pairs: `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_RUNTIME_GATE=0`. The `wazuh_manager_registry_export` gate in `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` puts these states into `CURRENT_EVIDENCE`. Iron rule for interpretation: route `200`, transport `6`, and Dashboard index pattern `3` are not manager registry accepted; full-host managed coverage and the Dashboard API fix still require owner evidence / registry export / acceptance record.
2026-06-26 07:39 live quick-check refresh: `scripts/reboot-recovery/post-start-quick-check.sh --no-color` ran to completion; ping / SSH were OK on all four hosts; delegated cold-start was `PASS=89 WARN=0 BLOCKED=0`; the wrapper summary was `POST_START_QUICK_CHECK PASS=38 WARN=3 BLOCKED=0`, warning split `SERVICE=0 BOUNDARY=1 EVIDENCE=2`, `RESULT=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. MOMO health `V10.701`; daily snapshot `109061` rows / `2025-07-01..2026-06-24`; current-month parity `15383|15383|2026-06-01|2026-06-24|2026-06-01|2026-06-24`; latest import job `57 completed`. StockPlatform freshness `status=ok`, latest trading date `2026-06-25`; price / chips / margin / AI recommendations are all at `2026-06-25`. Backup-status at 07:39 shows 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, offsite/rclone fresh, `last_backup_all=2026-06-26 02:31:02`, `escrow_missing=5`. The extended public routes list all returned expected 2xx/3xx. 110 CPU attribution shows load around `5.19 / 4.66 / 4.91` with CPU idle `80%+` in most samples; the current load comes from normal platform work such as Gitea / ClickHouse / Docker / Kafka / StockPlatform / AWOOOI API / Sentry, not orphan Chrome. Allowed declaration for this round: hosts, K3s, services, websites, product data freshness, backup core, and offsite freshness are green. Forbidden declaration: DR complete, credential escrow complete, 188 host fully green, Wazuh registry recovered.
2026-06-26 07:19 follow-up: `gitea/main` already contains the previous round's SOP doc commit `1fd5e2a8`; ArgoCD `awoooi-prod` reads back `Synced / Healthy`, revision `1fd5e2a8b0f18d24eed16aa2a44286bcbf230603`; API `2/2`, Web `2/2`, Worker `1/1`, pods `restart=0`. Rerunning the full cold-start still gives `PASS=87 WARN=0 BLOCKED=0`, result `GREEN`. Direct public route readback: AWOOOI API `200`, AWOOOI Web `307`, VibeWork `200`, AwoooGo `200`, MOMO health `200`, Stock freshness `200`, Bitan `200`, Gitea `200`, Harbor `200`, Registry `/v2/` expected `401`, Sentry expected `302`, SigNoz `200`, Langfuse `200`. 188 blockers, classified precisely: `pg_lsclusters` shows host PostgreSQL `14/main` down; `systemctl status postgresql@14-main` shows `invalid primary checkpoint record` and `PANIC: could not locate a valid checkpoint record`; `certbot.service` shows the `sentry.wooo.work` renew was rate-limited; `snap.certbot.renew.service` shows challenge failed; `awoooi-startup.service` had attempted to run `pg_resetwal` as root and failed. This round does not run `pg_resetwal`, does not `reset-failed`, and does not restart services; 188 must be handled in a separate maintenance window with a rollback owner and a restore/source-of-truth plan. See `docs/runbooks/HOST-188-HYGIENE-MAINTENANCE-RUNBOOK.md`; `scripts/reboot-recovery/188-host-hygiene-maintenance-checklist.sh --no-color` can be run first to obtain a read-only preflight. 110 load has dropped to around `4.83 / 4.82 / 5.52`; top CPU is the active AWOOOI Web `turbo build` / Docker buildx; swap is still full but memory available is about `41Gi`, so swap is not cleared manually this round. The overall declaration remains `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`.


@@ -1,6 +1,6 @@
# One-Page Post-Reboot Host Check
> Version: v1.10
> Version: v1.11
> Last updated: 2026-06-26 Asia/Taipei
> Scope: 110 / 120 / 121 / 188 post-reboot service recovery. 112 Kali / Wazuh / active scan are not part of this procedure.
@@ -10,7 +10,7 @@
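After any of the 110 / 120 / 121 / 188 hosts is powered on, shut down, rebooted, recovered from a power loss, goes through a VMware console fsck, or sees a large Docker / K3s reschedule, run this page first and only then decide whether to declare recovery.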
Latest baseline: 2026-06-26 08:40 next-gate owner packet contract guard. `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` returns `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_HYGIENE_BLOCKED=1`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. Next, `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` expands `NEXT_REQUIRED_GATES=credential_escrow_evidence,host_188_hygiene_maintenance_window,wazuh_manager_registry_export` into three owner / evidence / forbidden-action checklists; `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` further converts them into `awoooi_post_reboot_next_gate_owner_packets_v1` JSON with fixed `dispatch_authorized=0`, `request_sent_count=0`, `owner_response_accepted_count=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, `runtime_gate_count=0`; `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` locks the three P0 gates, all `0 / false` boundaries, the ban on secret payloads / runtime actions, and the no-false-green rules. Cold-start `PASS=89 WARN=0 BLOCKED=0`; MOMO `V10.701`, latest import job `57 completed`, `DB_DAILY_FRESHNESS 1|2026-06-24`; StockPlatform `/api/v1/system/freshness` is `status=ok`, `latest_trading_date=2026-06-25`, blockers `[]`; backup-status 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `last_backup_all=2026-06-26 02:31:02`. DR still cannot be declared complete because `escrow_missing=5`. 188 host hygiene and the Wazuh manager registry remain independent blockers outside service green.
Latest baseline: 2026-06-26 08:47 Wazuh registry detail in post-reboot summary. `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` returns `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_HYGIENE_BLOCKED=1`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_COVERAGE_SCOPE=6`, `WAZUH_DIRECT_ACTIVE=2`, `WAZUH_NO_TRANSPORT=1`, `WAZUH_SSH_BLOCKED=3`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning`, `WAZUH_DASHBOARD_INDEX_OK=3`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. Next, `scripts/reboot-recovery/post-reboot-next-gate-dispatch.sh --no-color` expands `NEXT_REQUIRED_GATES=credential_escrow_evidence,host_188_hygiene_maintenance_window,wazuh_manager_registry_export` into three owner / evidence / forbidden-action checklists; the Wazuh checklist's `CURRENT_EVIDENCE` keeps the registry accepted, coverage scope, direct active, no transport, SSH blocked, route, transport, Dashboard API, and index pattern status, so that route `200` or transport `6` is not misreported as registry recovered. `scripts/reboot-recovery/post-reboot-next-gate-owner-packets.py --no-color` further converts them into `awoooi_post_reboot_next_gate_owner_packets_v1` JSON with fixed `dispatch_authorized=0`, `request_sent_count=0`, `owner_response_accepted_count=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, `runtime_gate_count=0`; `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` locks the three P0 gates, all `0 / false` boundaries, the ban on secret payloads / runtime actions, and the no-false-green rules. Cold-start `PASS=89 WARN=0 BLOCKED=0`; MOMO `V10.701`, latest import job `57 completed`, `DB_DAILY_FRESHNESS 1|2026-06-24`; StockPlatform `/api/v1/system/freshness` is `status=ok`, `latest_trading_date=2026-06-25`, blockers `[]`; backup-status 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `last_backup_all=2026-06-26 02:31:02`. DR still cannot be declared complete because `escrow_missing=5`. 188 host hygiene and the Wazuh manager registry remain independent blockers outside service green.
This page answers only four things:
@@ -57,6 +57,7 @@ scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color
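- `DR_ESCROW_BLOCKED=1` / `ESCROW_MISSING_COUNT>0`: DR complete must not be declared.
- `HOST_188_HYGIENE_BLOCKED=1`: 188 host hygiene needs a maintenance window; this does not mean product services are down.
- `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`: recovery of Wazuh full-host managed coverage must not be declared.
- `WAZUH_ROUTE_CODE=200` / `WAZUH_TRANSPORT_COUNT>0` only represent route / transport evidence; they must still be read together with `WAZUH_COVERAGE_SCOPE`, `WAZUH_DIRECT_ACTIVE`, `WAZUH_NO_TRANSPORT`, `WAZUH_SSH_BLOCKED`, `WAZUH_DASHBOARD_API_CONNECTION`, and `WAZUH_MANAGER_REGISTRY_ACCEPTED`.
- `RUNTIME_ACTION_AUTHORIZED=0`: this procedure does not authorize runtime write operations.
- `OVERALL_DECLARATION`: the highest declaration that may be used this round.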
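
The Wazuh reading rules in this list can be expressed as a small guard. This is a minimal sketch under stated assumptions: the function name `wazuh_claim` and the `WAZUH_CLAIM=...` labels are hypothetical and do not exist in the repo, and the branch for a non-zero registry count only says that the other fields still need review, since the runbook does not define what a sufficient count is.

```shell
#!/usr/bin/env bash

# Hypothetical guard (names are illustrative, not from the repo).
# Prints the strongest Wazuh statement the summary values would allow.
wazuh_claim() {
  local registry_accepted="$1" route_code="$2" transport_count="$3"

  # Only a non-zero manager registry count can move past "evidence only".
  if [[ "$registry_accepted" =~ ^[0-9]+$ ]] && (( 10#$registry_accepted > 0 )); then
    echo "WAZUH_CLAIM=registry_accepted_review_other_fields"
    return 0
  fi

  # Route 200 or a positive transport count is evidence, never recovery.
  if [[ "$route_code" == "200" ]] ||
     { [[ "$transport_count" =~ ^[0-9]+$ ]] && (( 10#$transport_count > 0 )); }; then
    echo "WAZUH_CLAIM=route_transport_evidence_only"
  else
    echo "WAZUH_CLAIM=no_evidence"
  fi
}

# Values from the 08:47 baseline: registry accepted 0, route 200, transport 6.
wazuh_claim 0 200 6
```

With the 08:47 values the guard stays at evidence only, which matches the rule that route and transport cannot stand in for manager registry accepted.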


@@ -15,7 +15,7 @@
| P0 host / K3s recovery | DONE | 100% | 120 booted after console fsck at `2026-06-12 15:13`; latest 2026-06-26 07:19 readback shows 120 and 121 reachable, K3s active, `mon` and `mon1` both `Ready control-plane`, AWOOOI API/Web replicas split across both nodes, ArgoCD `awoooi-prod Synced / Healthy` at revision `1fd5e2a8b0f18d24eed16aa2a44286bcbf230603`, and `km-vectorize` official 03:00 Taipei time run succeeded with `lastSuccess=2026-06-25T19:00:14Z`. |
| P1 backup / alert / escrow | BLOCKED_DR_ESCROW | 97% | 2026-06-26 06:58 backup readback shows 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `integrity_stale=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `escrow_missing=5`, last aggregate `2026-06-26 02:31:02`. DR remains blocked on real non-secret credential escrow evidence IDs; do not write placeholder markers or paste secret values. |
| P2 service / data truth | DONE | 100% | Service routes and core runtime are available, 110 current CPU pressure is attributable to active AWOOOI Web `turbo build` / Docker buildx, and previous orphan Chrome groups remain cleared. 2026-06-26 07:19 StockPlatform `/api/v1/system/freshness` returned `200`; 07:01 freshness payload was `status=ok`, `latest_trading_date=2026-06-25`, blockers `[]`; price / chips / margin / AI recommendations are all on `2026-06-25`. `ai.recommendations` row count is `2868`; `core.margin_short_daily` row count is `1976`. MOMO health `V10.699`, current-month parity `15383|15383|2026-06-01|2026-06-24|2026-06-01|2026-06-24`, and `MOMO_DAILY_FRESHNESS 1|2026-06-24` are green; expanded public routes are green. |
| P3 docs / automation contracts | DONE_WITH_OWNER_PACKET_CONTRACT_GUARD_V170 | 100% | Workplan, SOP v1.70, machine-readable post-reboot readiness summary, post-reboot next-gate dispatch checklist, owner-packet JSON generator, owner-packet contract guard, one-page post-start quick check v1.10, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / Drive-auth fail-closed deploys, product version/readback matrix, and stricter product-data / route retry gates are updated. 
Owner-packet JSON turns `credential_escrow_evidence`, `host_188_hygiene_maintenance_window`, and `wazuh_manager_registry_export` into structured review packets while keeping request sent / owner accepted / host write / secret collection / runtime action at `0`; contract guard now rejects packet drift before owner review intake. Live 110 script sync remains a separate approved live-write gate; do not claim it here. |
| P3 docs / automation contracts | DONE_WITH_WAZUH_DETAIL_SUMMARY_V171 | 100% | Workplan, SOP v1.71, machine-readable post-reboot readiness summary with Wazuh registry detail fields, post-reboot next-gate dispatch checklist, owner-packet JSON generator, owner-packet contract guard, one-page post-start quick check v1.11, route retry gate, deploy warmup classification, expanded public route list, StockPlatform freshness gate, StockPlatform cron-source recovery evidence, StockPlatform natural schedule green evidence, 110 orphan Chrome recurrence cleanup evidence, 188 fail-closed startup data recovery gate, 188 host hygiene read-only checklist, baseline `stockplatform_system_freshness_ok`, BACKUP-STATUS, LOGBOOK, 120 console/fsck recovery, Gitea backup stale-dump hardening, reboot ledger/version-comparison SOP, escrow evidence audit, 188 nginx Ansible baseline, 110 cold-start detector script, startup judgment layers, GO/NO-GO tree, host recovery cards, explicit Plan B degraded-operation path, machine-readable `plan_b` baseline, readiness-audit Plan B guard, B0-B5 service levels, T+0/T+120 fallback timeline checks, host role / load-balancing assessment, CD `known_hosts` guardrail, `fwupd-refresh.timer` rollback note, K3s filesystem event blocker, AWOOOI backup no-direct-offsite-sync contract, 110/188 Ansible source-of-truth, Gitea self-hosted readiness validation workflow, post-CD no-regression readbacks, stale-vs-active K8s failed Job classification, 110 runaway browser / CI load AIOps exporter + alert + gated remediation PlayBook, Telegram / AI event packet mapping, healthy heartbeat Telegram suppression, MOMO scheduler / current-month detector fix, exporter restore helpers, 110 Docker disk pressure cleanup boundary, notification-noise readback, MOMO import-boundary / Drive-auth fail-closed deploys, product version/readback matrix, and stricter product-data / route retry gates are updated. 
Summary / dispatch now carry Wazuh coverage scope, direct active, no transport, SSH blocked, route, transport, Dashboard API connection and index OK fields while keeping manager registry accepted / runtime gate at `0`; route `200` and transport `6` are explicitly not accepted as full Wazuh recovery. Live 110 script sync remains a separate approved live-write gate; do not claim it here. |
2026-06-26 07:47 machine-readable summary baseline: `scripts/reboot-recovery/post-reboot-readiness-summary.sh --no-color` stores delegated logs under `/tmp/awoooi-post-reboot-readiness-20260626-074702` and returns `SERVICE_GREEN=1`, `PRODUCT_DATA_GREEN=1`, `BACKUP_CORE_GREEN=1`, `DR_ESCROW_BLOCKED=1`, `ESCROW_MISSING_COUNT=5`, `HOST_188_SERVICE_GREEN=1`, `HOST_188_HYGIENE_BLOCKED=1`, `WAZUH_ROUTE_CODE=200`, `WAZUH_TRANSPORT_COUNT=6`, `WAZUH_MANAGER_REGISTRY_ACCEPTED=0`, `WAZUH_RUNTIME_GATE=0`, `RUNTIME_ACTION_AUTHORIZED=0`, `OVERALL_DECLARATION=FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, and `NEXT_REQUIRED_GATES=credential_escrow_evidence,host_188_hygiene_maintenance_window,wazuh_manager_registry_export`. This is now the preferred first operator/AI-agent entrypoint after reboot because it separates service health from DR, host hygiene, and security registry evidence.
@@ -25,6 +25,8 @@
2026-06-26 08:40 owner-packet contract guard baseline: `scripts/reboot-recovery/post-reboot-owner-packet-contract-guard.py --packet-file /tmp/awoooi-post-reboot-owner-packets.json` validates the generated JSON before any owner review intake. It requires exactly three P0 gates, preserves `request_sent=0`, `owner_response_received=0`, `owner_response_accepted=0`, `runtime_action_authorized=0`, `host_write_authorized=0`, `secret_value_collection_allowed=0`, and `runtime_gate=0`, and rejects missing forbidden payload/action controls for credential escrow, 188 host hygiene, and Wazuh registry export. Expected success line: `POST_REBOOT_OWNER_PACKET_CONTRACT_GUARD_OK gates=3 request_sent=0 accepted=0 runtime_gate=0`.
2026-06-26 08:47 Wazuh registry detail summary baseline: post-reboot readiness summary now emits `WAZUH_COVERAGE_SCOPE`, `WAZUH_DIRECT_ACTIVE`, `WAZUH_NO_TRANSPORT`, `WAZUH_SSH_BLOCKED`, `WAZUH_DASHBOARD_API_CONNECTION`, and `WAZUH_DASHBOARD_INDEX_OK` alongside existing route / transport / registry fields. Current read-only truth is coverage scope `6`, direct active `2`, no transport `1`, SSH blocked `3`, route `200`, transport `6`, Dashboard API `pending_or_spinning`, index OK `3`, manager registry accepted `0`, runtime gate `0`. This is a security evidence blocker, not a reboot service blocker.
2026-06-26 07:39 live quick-check refresh supersedes the 07:19 row for current operator status. `scripts/reboot-recovery/post-start-quick-check.sh --no-color` returned `POST_START_QUICK_CHECK PASS=38 WARN=3 BLOCKED=0`, warning split `SERVICE=0 BOUNDARY=1 EVIDENCE=2`, result `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`. Delegated cold-start returned `PASS=89 WARN=0 BLOCKED=0`; four reboot-scope hosts ping/SSH were OK; AWOOOI / VibeWork / AwoooGo / 2026FIFA / Agent Bounty / MOMO / Stock / Bitan / TsenYang / VTuber / Gitea / Harbor / Registry / Sentry / SigNoz / Langfuse / AIOps routes returned expected 2xx/3xx. MOMO `V10.701` has job `57 completed`, daily freshness `1|2026-06-24`, and current-month parity `15383|15383|2026-06-01|2026-06-24|2026-06-01|2026-06-24`. StockPlatform freshness is `ok` through `2026-06-25` with price / chips / margin / AI recommendations current. Backup core remains green: 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, offsite/rclone fresh, `last_backup_all=2026-06-26 02:31:02`; DR still has `escrow_missing=5`. 110 load around `5.19 / 4.66 / 4.91` is attributable to normal platform processes, not orphan Chrome. 188 host hygiene remains blocked by failed host PostgreSQL / certbot / startup units and must use the dedicated maintenance runbook and read-only checklist.
2026-06-25 19:06 post-CD wrapper readback supersedes the 18:53 wording: consecutive main pushes created a deploy storm where older deploy markers were superseded by later commits. Latest production truth is deploy marker `d8ca8224 chore(cd): deploy 9dbe044 [skip ci]`, ArgoCD `Synced / Healthy`, API/Web/Worker image tag `9dbe044ea1e8e3894ccbeb5ed760bb124b87f7be`, direct route smoke 200 for AWOOOI API / IwoooS / VibeWork / AwoooGo / MOMO health / Stock / Bitan and expected route-gate statuses for MOMO / Gitea / Harbor / Registry / Sentry / SigNoz / Langfuse / AIOps, and wrapper `POST_START_QUICK_CHECK PASS=18 WARN=3 BLOCKED=0`. Repo-side cold-start returns `PASS=89 WARN=0 BLOCKED=0`; `/backup/scripts/backup-status.sh --no-notify --no-refresh` reports 110 `13/13 fresh failed=0`, 188 `2/2 fresh failed=0`, `core_blockers=0`, `integrity_stale=0`, `offsite_fresh=1`, `rclone_gdrive_fresh=1`, `escrow_missing=5`; MOMO dedicated preflight returns `PASS=19 WARN=2 BLOCKED=0`; MOMO health is `V10.690`; AwoooGo / Stock transient 502 reads cleared after upstream warmup and five consecutive route reads returned `200`; 110 load is around `14.51 / 12.34 / 11.42`, with Gitea Actions cache save / `zstdmt` / `tar`, StockPlatform headless Chrome smoke / CI, Gitea, AWOOOI API, ClickHouse, Docker, and platform services visible, not an AWOOOI service blocker. Wrapper result is `FULL_STACK_GREEN_DR_ESCROW_BLOCKED`, not `DEGRADED`, because service warnings are `0` and only DR boundary / evidence warnings remain. Wazuh route readback is now `200 disabled_waiting_iwooos_wazuh_owner_gate`, but manager registry accepted remains `0`, so Wazuh is a security registry evidence blocker rather than a reboot service blocker.


@@ -81,6 +81,14 @@ next_required_gates="$(value_for NEXT_REQUIRED_GATES)"
escrow_missing_count="$(value_for ESCROW_MISSING_COUNT)"
host_188_hygiene_blocked="$(value_for HOST_188_HYGIENE_BLOCKED)"
wazuh_registry_accepted="$(value_for WAZUH_MANAGER_REGISTRY_ACCEPTED)"
wazuh_coverage_scope="$(value_for WAZUH_COVERAGE_SCOPE)"
wazuh_direct_active="$(value_for WAZUH_DIRECT_ACTIVE)"
wazuh_no_transport="$(value_for WAZUH_NO_TRANSPORT)"
wazuh_ssh_blocked="$(value_for WAZUH_SSH_BLOCKED)"
wazuh_route_code="$(value_for WAZUH_ROUTE_CODE)"
wazuh_transport_count="$(value_for WAZUH_TRANSPORT_COUNT)"
wazuh_dashboard_api_connection="$(value_for WAZUH_DASHBOARD_API_CONNECTION)"
wazuh_dashboard_index_ok="$(value_for WAZUH_DASHBOARD_INDEX_OK)"
runtime_action_authorized="$(value_for RUNTIME_ACTION_AUTHORIZED)"
summary_artifact_dir="$(value_for ARTIFACT_DIR)"
@@ -153,7 +161,7 @@ if contains_gate "wazuh_manager_registry_export"; then
print_gate_header "wazuh_manager_registry_export" "Wazuh manager registry redacted export"
echo "GATE_PRIORITY=P0"
echo "GATE_STATUS=readonly_registry_export_required"
echo "CURRENT_EVIDENCE=wazuh_manager_registry_accepted:${wazuh_registry_accepted:-unknown}"
echo "CURRENT_EVIDENCE=wazuh_manager_registry_accepted:${wazuh_registry_accepted:-unknown};coverage_scope:${wazuh_coverage_scope:-unknown};direct_active:${wazuh_direct_active:-unknown};no_transport:${wazuh_no_transport:-unknown};ssh_blocked:${wazuh_ssh_blocked:-unknown};route:${wazuh_route_code:-unknown};transport:${wazuh_transport_count:-unknown};dashboard_api:${wazuh_dashboard_api_connection:-unknown};index_ok:${wazuh_dashboard_index_ok:-unknown}"
echo "OWNER_GROUP=iwooos_soc_owner,wazuh_owner,host_owner"
echo "REQUIRED_EXPORT=redacted_manager_registry_counts,per_host_alias_status,dashboard_api_connection_status,dashboard_api_version_status,collection_time_window,reviewer"
echo "FORBIDDEN_PAYLOADS=agent_real_name,internal_ip,client_keys,raw_wazuh_payload,token,password,authorization_header"
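
The widened `CURRENT_EVIDENCE` value is a `;`-separated list of `key:value` pairs, so a consumer can pull a single field back out without a full parser. The sketch below assumes that layout exactly as echoed above; `evidence_field` is a hypothetical helper, not part of the dispatch script, and the sample string is assembled from the 08:47 read-only results.

```shell
#!/usr/bin/env bash

# Hypothetical reader (not from the repo) for the ';'-separated
# key:value evidence string. Returns 1 if the key is not present.
evidence_field() {
  local key="$1" evidence="$2" pair
  local -a pairs
  IFS=';' read -r -a pairs <<< "$evidence"
  for pair in "${pairs[@]}"; do
    # Compare the text before the first ':' with the requested key.
    if [[ "${pair%%:*}" == "$key" ]]; then
      printf '%s\n' "${pair#*:}"
      return 0
    fi
  done
  return 1
}

# Example value assembled from the 08:47 read-only results.
evidence='wazuh_manager_registry_accepted:0;coverage_scope:6;direct_active:2;no_transport:1;ssh_blocked:3;route:200;transport:6;dashboard_api:pending_or_spinning;index_ok:3'

evidence_field dashboard_api "$evidence"
evidence_field wazuh_manager_registry_accepted "$evidence"
```

Splitting on the first `:` keeps values that themselves contain a colon intact, but a value containing `;` would break this format; none of the current fields do.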


@@ -138,9 +138,15 @@ if [[ "$RUN_188_HYGIENE" -eq 1 ]]; then
fi
wazuh_registry_accepted="unknown"
wazuh_coverage_scope="unknown"
wazuh_direct_active="unknown"
wazuh_no_transport="unknown"
wazuh_ssh_blocked="unknown"
wazuh_route_code="unknown"
wazuh_transport_count="unknown"
wazuh_dashboard_degraded="unknown"
wazuh_dashboard_api_connection="unknown"
wazuh_dashboard_index_ok="unknown"
wazuh_runtime_gate="0"
if [[ "$RUN_WAZUH_GATES" -eq 1 ]]; then
wazuh_coverage_log="$ARTIFACT_DIR/wazuh-managed-host-coverage.log"
@@ -150,9 +156,15 @@ if [[ "$RUN_WAZUH_GATES" -eq 1 ]]; then
coverage_line="$(tail -n 1 "$wazuh_coverage_log" || true)"
runtime_line="$(tail -n 1 "$wazuh_runtime_log" || true)"
wazuh_registry_accepted="$(extract_named_token registry "$coverage_line")"
wazuh_coverage_scope="$(extract_named_token scope "$coverage_line")"
wazuh_direct_active="$(extract_named_token direct_active "$coverage_line")"
wazuh_no_transport="$(extract_named_token no_transport "$coverage_line")"
wazuh_ssh_blocked="$(extract_named_token ssh_blocked "$coverage_line")"
wazuh_route_code="$(extract_named_token route "$runtime_line")"
wazuh_transport_count="$(extract_named_token transport "$runtime_line")"
wazuh_dashboard_degraded="$(extract_named_token dashboard_degraded "$runtime_line")"
wazuh_dashboard_api_connection="$(extract_named_token api_connection "$runtime_line")"
wazuh_dashboard_index_ok="$(extract_named_token index_ok "$runtime_line")"
wazuh_runtime_gate="$(extract_named_token runtime_gate "$runtime_line")"
fi
@@ -207,7 +219,13 @@ HOST_188_CHECK_RC=$host_188_rc
HOST_188_RESULT=$host_188_result
WAZUH_ROUTE_CODE=$wazuh_route_code
WAZUH_TRANSPORT_COUNT=$wazuh_transport_count
WAZUH_COVERAGE_SCOPE=$wazuh_coverage_scope
WAZUH_DIRECT_ACTIVE=$wazuh_direct_active
WAZUH_NO_TRANSPORT=$wazuh_no_transport
WAZUH_SSH_BLOCKED=$wazuh_ssh_blocked
WAZUH_DASHBOARD_DEGRADED=$wazuh_dashboard_degraded
WAZUH_DASHBOARD_API_CONNECTION=$wazuh_dashboard_api_connection
WAZUH_DASHBOARD_INDEX_OK=$wazuh_dashboard_index_ok
WAZUH_MANAGER_REGISTRY_ACCEPTED=$wazuh_registry_accepted
WAZUH_RUNTIME_GATE=$wazuh_runtime_gate
RUNTIME_ACTION_AUTHORIZED=$runtime_action_authorized
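
Downstream tooling reads these `KEY=VALUE` lines back by name. The dispatch script does this through `value_for`, whose definition is not shown in this diff, so the sketch below uses a hypothetical stand-in called `summary_value`. It assumes one `KEY=VALUE` pair per line and falls back to `unknown` for a key that was not emitted, mirroring the defaults used in the summary script.

```shell
#!/usr/bin/env bash

# Hypothetical reader (not the repo's value_for): print the value of the
# last KEY=VALUE line matching the key, or "unknown" if it is absent.
summary_value() {
  local key="$1" file="$2"
  awk -F= -v k="$key" '
    $1 == k { v = substr($0, length(k) + 2); found = 1 }
    END { if (found) print v; else print "unknown" }
  ' "$file"
}

# Sample summary output using the 08:47 baseline values.
summary_file="$(mktemp)"
trap 'rm -f "$summary_file"' EXIT
cat > "$summary_file" <<'EOF'
WAZUH_ROUTE_CODE=200
WAZUH_TRANSPORT_COUNT=6
WAZUH_COVERAGE_SCOPE=6
WAZUH_DIRECT_ACTIVE=2
WAZUH_NO_TRANSPORT=1
WAZUH_SSH_BLOCKED=3
WAZUH_DASHBOARD_API_CONNECTION=pending_or_spinning
WAZUH_DASHBOARD_INDEX_OK=3
WAZUH_MANAGER_REGISTRY_ACCEPTED=0
WAZUH_RUNTIME_GATE=0
EOF

summary_value WAZUH_COVERAGE_SCOPE "$summary_file"
summary_value WAZUH_NOT_EMITTED "$summary_file"
```

Taking everything after the first `=` keeps values that contain `=` intact, and the `unknown` fallback keeps a missing field from being read as an empty (and possibly falsely green) value.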