From 3200f9af976a6b70426da6c53fadff71c7adaecb Mon Sep 17 00:00:00 2001
From: Your Name
Date: Sun, 28 Jun 2026 09:00:26 +0800
Subject: [PATCH 1/3] docs(runner): add direct runner pressure exception [skip ci]

---
 AGENTS.md            |  2 +-
 docs/HARD_RULES.md   |  4 ++--
 docs/LOGBOOK.md      |  4 ++++
 ops/runner/README.md | 14 ++++++++++++--
 4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 750ee049..791feadf 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -46,7 +46,7 @@

 正確動作是 AI 自動補齊 target selector、source-of-truth diff、check-mode / dry-run、rollback、post-apply verifier、KM / PlayBook trust writeback,然後推進可驗證、可回滾、低爆炸半徑的實作。

-**110 runner 壓力事故例外**:Gitea / act-runner 對 110 造成 CPU / headless smoke 壓力時,屬事故級容量保護,不得用「全面授權」直接重開 runner、移除 mask、還原 runner binary 或把 host pressure gate 改成 warn-only。正確動作是先做 runner 搬遷 / 限流 / label isolation / smoke 排程,再以 check-mode、rollback 與 post-apply verifier 受控恢復。
+**110 runner 壓力事故例外**:Gitea / act-runner / direct transient runner 對 110 造成 CPU / headless smoke 壓力時,屬事故級容量保護,不得用「全面授權」直接重開 runner、移除 mask、還原 runner binary、用 `systemd-run` 直啟 `.real` binary,或把 host pressure gate 改成 warn-only。正確動作是先做 runner 搬遷 / 限流 / label isolation / smoke 排程,再以 check-mode、rollback 與 post-apply verifier 受控恢復。

 ---

diff --git a/docs/HARD_RULES.md b/docs/HARD_RULES.md
index 39580d3c..236fffc2 100644
--- a/docs/HARD_RULES.md
+++ b/docs/HARD_RULES.md
@@ -289,9 +289,9 @@ force push / 刪 repo / 刪 refs / 改 repo visibility / raw runtime secret volu

 ### 110 runner 壓力事故例外

-2026-06-28 事故後,110 上的 Gitea / act-runner、StockPlatform headless smoke、host-side Next build 與 Docker / BuildKit 壓力屬容量事故保護面。即使收到「批准 / 繼續 / 全面授權」,也不得直接重開 runner、解除 service mask、還原 live runner binary、恢復泛用 `ubuntu-latest` label,或把 host pressure gate 改成 warn-only 作為預設。
+2026-06-28 事故後,110 上的 Gitea / act-runner / direct transient runner、StockPlatform headless smoke、host-side Next build 與 Docker / BuildKit 壓力屬容量事故保護面。即使收到「批准 / 繼續 / 全面授權」,也不得直接重開 runner、解除 service mask、還原 live runner binary、用 `systemd-run` 直啟 `.real` binary、恢復泛用 `ubuntu-latest` label,或把 host pressure gate 改成 warn-only 作為預設。

-允許的 controlled apply 是降壓與防再發:停止 / disable / mask runner、quarantine runner binary、收斂 labels、補 source fail-closed guard、搬遷 runner、限制 concurrency、把 smoke 改成排程 / 非 110 runner,以及執行只讀 pressure / cold-start verifier。
+允許的 controlled apply 是降壓與防再發:停止 / disable / mask runner、mask direct transient unit、quarantine runner binary、收斂 labels、補 source fail-closed guard、搬遷 runner、限制 concurrency、把 smoke 改成排程 / 非 110 runner,以及執行只讀 pressure / cold-start verifier。

 恢復 runner 必須同時具備:

diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index 02d38bf3..81f1b7a1 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -30,6 +30,7 @@
 - Live 110 `/usr/local/bin/awoooi-wait-host-web-build-pressure.sh` 與 `/usr/local/bin/awoooi-startup-110.sh` 已同步 fail-closed 並加 immutable。
 - Live 110 四條 runner 入口改為 immutable fail-closed stub,原 ELF 僅 quarantine 不讀內容:`/home/wooo/act-runner/act_runner`、`/home/wooo/act-runner/act_runner.real-20260628-runner-pressure-guard`、`/home/wooo/act-runner-controlled/act_runner`、`/home/wooo/awoooi-controlled-runner/awoooi_controlled_runner`。
 - System-level 與 user-level `gitea-act-runner-host.service`、`gitea-act-runner-awoooi-controlled.service`、`gitea-awoooi-controlled-runner.service`、`gitea-act-runner-awoooi-open.service` 皆讀回 inactive / masked;原 unit file 僅 quarantine,並以 `/dev/null` mask symlink 防止直啟。
+- 08:54 又抓到 transient `awoooi-direct-runner-open.service` 直啟 `.real` binary,且把 `/home/wooo/act-runner/act_runner` / `.real` 還原成 ELF;已強制 kill、將 `awoooi-direct-runner-open.service` 與 `awoooi-direct-runner.service` 建立 `/dev/null` mask,並再次 quarantine ELF / 還原 immutable fail-closed stub。
 - `AGENTS.md`、`docs/HARD_RULES.md`、`ops/runner/README.md` 與 MASTER runner 章節補上 110 runner 壓力事故例外:全面授權不等於可重開 production host runner。

 **Live readback**:
@@ -37,6 +38,9 @@
 - 08:40 曾抓到 parent=1 的孤兒 `/home/wooo/act-runner/act_runner.real-20260628-runner-pressure-guard daemon --config config.yaml`,已 terminate;後續改為 stub 以防任何 direct exec 繞過 systemd。
 - 08:47 延遲讀回:四個 system runner unit 皆 `LoadState=masked`、`ActiveState=inactive`、`UnitFileState=masked`;user-level runner units 亦為 masked;精準 runner process scan 無命中。
 - 08:52 以乾淨 SSH command 重跑 live pressure gate:`/usr/local/bin/awoooi-wait-host-web-build-pressure.sh` 回 `GATE_RC=0`、`no host web/build/smoke pressure detected`。
+- 08:56 讀回:`awoooi-direct-runner-open.service`、`awoooi-direct-runner.service` 與四個 Gitea runner units 全部 `masked / inactive`;四條 runner binary 皆為 163-byte immutable shell stub;`pgrep` 無 runner process;pressure gate 回 `GATE_RC=0`。
+- 08:58 延遲讀回曾顯示 transient `awoooi-direct-runner-open.service` 已消失成 `not-found`;已立即補 `systemctl mask awoooi-direct-runner-open.service`,確認 `/etc/systemd/system/awoooi-direct-runner-open.service -> /dev/null` 且 `LoadState=masked` / `UnitFileState=masked`。
+- 08:59 最終短延遲讀回:兩個 direct runner units 與四個 Gitea runner units 全部 `masked / inactive`;runner process scan 無命中;pressure gate 回 `GATE_RC=0`。
 - `post-start-quick-check.sh --no-color` 回 `SCORECARD_RC=2`、`POST_START_QUICK_CHECK PASS=35 WARN=4 BLOCKED=4`、`RESULT=BLOCKED`;public routes 全部 HTTP OK,StockPlatform freshness `ok` / latest trading date `2026-06-26`,但 MOMO daily sales data stale `4|2026-06-24`、backup heartbeat core blocker、credential escrow missing `5` 仍阻擋全主機 green。
 - 08:49 曾觀察到 `/home/wooo/awoooi-manual-deploy` 的手動 Web image build;parent 是另一條 SSH session,不是 Gitea runner。08:52 乾淨 gate 已恢復 `GATE_RC=0`;此 live-only manual deploy path 仍需後續納入 pressure lock / 非 110 build path。

diff --git a/ops/runner/README.md b/ops/runner/README.md
index bc9773f9..2a85bf9b 100644
--- a/ops/runner/README.md
+++ b/ops/runner/README.md
@@ -388,14 +388,24 @@ runner registration / service:

 2026-06-28 live update:110 runner 壓力事故確認有直呼
 `/home/wooo/act-runner/act_runner.real-20260628-runner-pressure-guard` 的孤兒
-daemon。四條 live runner 入口已改為 immutable fail-closed stub,原 ELF 僅
-quarantine 不讀內容;相關 systemd units 維持 inactive / disabled / masked:
+daemon,且曾透過 transient `awoooi-direct-runner-open.service` 繞過既有
+Gitea service 名稱。四條 live runner 入口已改為 immutable fail-closed stub,
+原 ELF 僅 quarantine 不讀內容;相關 systemd units 維持 inactive / masked:

 - `/home/wooo/act-runner/act_runner`
 - `/home/wooo/act-runner/act_runner.real-20260628-runner-pressure-guard`
 - `/home/wooo/act-runner-controlled/act_runner`
 - `/home/wooo/awoooi-controlled-runner/awoooi_controlled_runner`

+必須一併維持 masked 的 unit 名稱:
+
+- `awoooi-direct-runner-open.service`
+- `awoooi-direct-runner.service`
+- `gitea-act-runner-host.service`
+- `gitea-act-runner-awoooi-controlled.service`
+- `gitea-awoooi-controlled-runner.service`
+- `gitea-act-runner-awoooi-open.service`
+
 未完成 runner 搬遷 / 限流 / smoke 排程前,不得解除 mask、還原 ELF、恢復
 泛用 runner label,或把 host pressure gate 預設改成 warn-only。


From 264b8e0a70a7b2fad70afede4b0d7a1c08d1aef8 Mon Sep 17 00:00:00 2001
From: Your Name
Date: Sun, 28 Jun 2026 09:06:17 +0800
Subject: [PATCH 2/3] fix(iwooos): add wazuh accepted status message [skip ci]

---
 apps/web/messages/en.json                 | 3 ++-
 apps/web/messages/zh-TW.json              | 3 ++-
 apps/web/src/app/[locale]/iwooos/page.tsx | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/apps/web/messages/en.json b/apps/web/messages/en.json
index 1c08ff0e..d2660aa4 100644
--- a/apps/web/messages/en.json
+++ b/apps/web/messages/en.json
@@ -20568,7 +20568,7 @@
   },
   "wazuhAccepted": {
     "label": "Wazuh accepted",
-    "detail": "Manager registry accepted 仍為 0。"
+    "detail": "Manager registry accepted readback is 6; runtime gate remains 0."
   }
 },
 "domainMetric": {
@@ -20583,6 +20583,7 @@
   "waiting_actor_before_after_and_recurrence_guard": "等待 actor、before / after 與防再發證據",
   "manifest_mapped_read_only_runtime_gate_closed": "Manifest 已映射,runtime gate 仍關閉",
   "waiting_manager_registry_readback": "等待 Wazuh manager registry 全量讀回",
+  "manager_registry_readback_accepted_runtime_gate_closed": "Manager registry accepted readback is present; runtime gate remains closed",
   "draft_waiting_owner_review_runtime_gate_closed": "等待 owner evidence review,runtime gate 仍關閉",
   "read_only_inventory_runtime_write_gate_closed": "只讀盤點完成,AI runtime write gate 仍關閉"
 },
diff --git a/apps/web/messages/zh-TW.json b/apps/web/messages/zh-TW.json
index 1c08ff0e..7628e3a4 100644
--- a/apps/web/messages/zh-TW.json
+++ b/apps/web/messages/zh-TW.json
@@ -20568,7 +20568,7 @@
   },
   "wazuhAccepted": {
     "label": "Wazuh accepted",
-    "detail": "Manager registry accepted 仍為 0。"
+    "detail": "Manager registry accepted 已讀回 6;runtime gate 仍為 0。"
   }
 },
 "domainMetric": {
@@ -20583,6 +20583,7 @@
   "waiting_actor_before_after_and_recurrence_guard": "等待 actor、before / after 與防再發證據",
   "manifest_mapped_read_only_runtime_gate_closed": "Manifest 已映射,runtime gate 仍關閉",
   "waiting_manager_registry_readback": "等待 Wazuh manager registry 全量讀回",
+  "manager_registry_readback_accepted_runtime_gate_closed": "Manager registry accepted 已讀回,runtime gate 仍關閉",
   "draft_waiting_owner_review_runtime_gate_closed": "等待 owner evidence review,runtime gate 仍關閉",
   "read_only_inventory_runtime_write_gate_closed": "只讀盤點完成,AI runtime write gate 仍關閉"
 },
diff --git a/apps/web/src/app/[locale]/iwooos/page.tsx b/apps/web/src/app/[locale]/iwooos/page.tsx
index cb8d8c54..bcd36b34 100644
--- a/apps/web/src/app/[locale]/iwooos/page.tsx
+++ b/apps/web/src/app/[locale]/iwooos/page.tsx
@@ -8573,7 +8573,7 @@ function IwoooSSecurityControlCoverageBoard() {
       key: 'wazuhAccepted',
       value: summary ? String(summary.wazuh_manager_registry_accepted_count) : '...',
       icon: Radar,
-      tone: 'locked',
+      tone: summary && summary.wazuh_manager_registry_accepted_count > 0 ? 'steady' : 'locked',
     },
   ]
   const domains = data?.domains ?? []

From 27b54dfe75221ce0b92e2c81ade0d5ead3832a4a Mon Sep 17 00:00:00 2001
From: AWOOOI CD
Date: Sun, 28 Jun 2026 09:08:36 +0800
Subject: [PATCH 3/3] chore(cd): deploy a1f5935 [skip ci]

---
 k8s/awoooi-prod/kustomization.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/k8s/awoooi-prod/kustomization.yaml b/k8s/awoooi-prod/kustomization.yaml
index da429a27..de53cdcf 100644
--- a/k8s/awoooi-prod/kustomization.yaml
+++ b/k8s/awoooi-prod/kustomization.yaml
@@ -41,7 +41,7 @@ resources:
 images:
   - name: 192.168.0.110:5000/library/api:IMAGE_TAG_PLACEHOLDER
     newName: 192.168.0.110:5000/awoooi/api
-    newTag: d4c2cc6e200fc00e07d179ebb9a4a156cef2c6d5
+    newTag: a1f5935481ad01cc3f73ebb4354726d57e7a2e41
   - name: 192.168.0.110:5000/library/web:IMAGE_TAG_PLACEHOLDER
     newName: 192.168.0.110:5000/awoooi/web
-    newTag: d4c2cc6e200fc00e07d179ebb9a4a156cef2c6d5
+    newTag: a1f5935481ad01cc3f73ebb4354726d57e7a2e41
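
Reviewer note on patch 1/3: the LOGBOOK readback entries treat a unit as safe only when `LoadState=masked`, `ActiveState=inactive` and `UnitFileState=masked` all hold, and the 08:58 entry shows that a transient unit which has merely vanished (`not-found`) does not count. A minimal sketch of that check follows, assuming the input is the `KEY=VALUE` text printed by `systemctl show <unit> -p LoadState -p ActiveState -p UnitFileState`. The names `parseSystemctlShow` and `isMaskedAndInactive` are hypothetical and do not exist in the repository, and the sample strings are illustrative rather than captured from host 110.

```typescript
// Hypothetical helpers for illustration only; not part of this patch series.
type UnitReadback = Record<string, string>;

// Parse KEY=VALUE lines; only the first "=" separates key from value.
function parseSystemctlShow(output: string): UnitReadback {
  const fields: UnitReadback = {};
  for (const line of output.split("\n")) {
    const idx = line.indexOf("=");
    if (idx <= 0) continue; // skip blank or malformed lines
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

// A unit passes only when all three fields match the values recorded in
// the LOGBOOK readback; a missing or "not-found" unit fails closed.
function isMaskedAndInactive(fields: UnitReadback): boolean {
  return (
    fields["LoadState"] === "masked" &&
    fields["ActiveState"] === "inactive" &&
    fields["UnitFileState"] === "masked"
  );
}

// Illustrative inputs (assumed shapes, not real captures).
const masked = parseSystemctlShow(
  "LoadState=masked\nActiveState=inactive\nUnitFileState=masked\n",
);
const vanished = parseSystemctlShow(
  "LoadState=not-found\nActiveState=inactive\nUnitFileState=\n",
);
console.log(isMaskedAndInactive(masked), isMaskedAndInactive(vanished));
```

Requiring all three fields, rather than only checking that the unit is not running, matches the fail-closed stance of the runner pressure exception: a unit that has disappeared can be re-created with `systemd-run`, whereas a `/dev/null` mask blocks that name.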