diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index 7abd7be5..1ff6ea4b 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -1,3 +1,64 @@
+## 2026-05-13 | T5 Incident / Approval / Execution reconciliation released
+
+**Background**: Incidents of the B6C589 class can end up in a contradictory state: Telegram shows that approval / handling is required, the DB shows the approval is already `APPROVED` with action `NO_ACTION`, yet the incident is still `INVESTIGATING` and there is no successful automation execution / verification record. Operators should no longer have to guess manually whether "the AI actually fixed it".
+
+**Fix**:
+- `awooop_truth_chain_service.py` adds a read-only `incident_reconciliation_v1`.
+- It does not auto-close incidents, backfill approvals, or rerun executions; it only emits cross-table state consistency in machine-readable form.
+- Reconciliation checks:
+  - whether the incident is closed.
+  - whether the latest approval has reached a terminal state.
+  - whether the approval is approved but has no `automation_operation_log`.
+  - whether `NO_ACTION` has no successful executor operation.
+  - whether all evidence sensors failed.
+  - whether the timeline is missing lifecycle entries.
+- Truth-chain returns:
+  - `consistency_status=consistent|degraded|blocked|not_applicable`
+  - `operator_next_state=continue|investigate|manual_required|not_applicable`
+  - `facts`
+  - `mismatches[]`
+
+**Verification and release**:
+- Local:
+  - `py_compile`: pass.
+  - `ruff --select F,E9`: pass.
+  - `pytest tests/test_awooop_truth_chain_service.py tests/test_phase25_drift_detection.py tests/test_drift_interpreter_ollama_first.py tests/test_platform_router_order.py tests/test_awooop_operator_auth.py -q`: 39 passed.
+  - `git diff --check`: pass.
+- Gitea:
+  - `1003fa42 feat(awooop): expose incident reconciliation state` pushed to `gitea main`.
+  - Code Review run `1940`: success.
+  - CD run `1939`: success.
+  - Deploy marker: `631fc220 chore(cd): deploy 1003fa4 [skip ci]`.
+- Production:
+  - API/Web/Worker images are all `1003fa4246290bec2bec4cd04caae9b8221996d9`.
+  - K3s rollout status: API/Web/Worker success.
+  - Health: host-local NodePort `127.0.0.1:32334` healthy / mock_mode=false; a direct connection from the local workstation to `192.168.0.120:32334` still timed out at the time and needs a separate check of the host/network path.
+  - Truth-chain smoke `INC-20260512-B6C589`:
+    - `source_type=incident`
+    - `current_stage=manual_required`
+    - `stage_status=blocked`
+    - `needs_human=true`
+    - `reconciliation_schema=incident_reconciliation_v1`
+    - `consistency_status=blocked`
+    - `operator_next_state=manual_required`
+    - mismatch codes:
+      - `incident_open_after_approval_resolved`
+      - `approval_approved_without_execution_record`
+      - `approval_no_action_without_execution`
+      - `evidence_all_sensors_failed`
+    - `automation_records=0`
+    - `timeline_events=1`
+
+**Overall progress**:
+- Wave 0: MOMO PostgreSQL backup → AwoooP failure-notification wiring completed and released.
+- T0: Truth-chain read-only API completed, deployed, production smoke completed.
+- T1: Channel Event hardening completed, deployed, production smoke completed.
+- T2: legacy MCP audit bridge / backfill / truth-chain visibility completed, deployed, production smoke completed; the first-class Gateway enforced path is still pending a later wave.
+- T3: Ansible audit contract + decision candidate dry-run audit completed, deployed, production smoke completed.
+- T4: Config Drift stable fingerprint / repeat-state / Telegram stage visibility completed, deployed, production smoke completed.
+- T5: Incident / Approval / Execution reconciliation completed, deployed, production smoke completed.
+- Still not done: first-class MCP Gateway enforcement; a real Ansible check-mode executor / diff / apply / rollback; the display layer that pushes reconciliation results back to Telegram / the Operator Console UI.
+
 ## 2026-05-13 | T4 Config Drift fingerprint repeat-state released
 
 **Background**: The Config Drift Telegram card only showed a single `report_id` and the HIGH/MEDIUM/INFO counts, so Operators could not tell whether the same drift kept repeating, which pipeline stage it had reached, or whether manual action was needed. The old truth-chain repeat logic grouped only by namespace/status/counts, so reports that happened to have the same counts but different items were mistaken for the same drift.
diff --git a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
index 5b4be5f9..f83d53c9 100644
--- a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
+++ b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
@@ -1958,6 +1958,21 @@ After Phase 6 completes
 - Important correction: the old count-based repeat logic saw 12 occurrences, while the new stable item fingerprint confirms that the same drift fingerprint occurred only 2 times; the 12 can only be called same-count candidates, not the same drift.
 - Boundary: T4 only adds observability and repeat detection; it does not do auto-adopt / rollback / ignore.
 
+**T5 Incident / Approval / Execution reconciliation production verified (2026-05-13, Taipei)**:
+- `1003fa42 feat(awooop): expose incident reconciliation state` pushed to Gitea main; Code Review run `1940` success, CD run `1939` success.
+- Deploy marker: `631fc220 chore(cd): deploy 1003fa4 [skip ci]`.
+- Truth-chain adds a read-only `incident_reconciliation_v1`; it does not auto-close incidents, backfill approvals, or rerun executions, and only outputs cross-table consistency.
+- Reconciliation returns `consistency_status`, `operator_next_state`, `facts`, and `mismatches[]`, which the Operator Console / Telegram can use to show whether the AI really finished handling the incident or a human must step in.
+- Production `INC-20260512-B6C589` smoke:
+  - `current_stage=manual_required`
+  - `stage_status=blocked`
+  - `consistency_status=blocked`
+  - `operator_next_state=manual_required`
+  - mismatches: `incident_open_after_approval_resolved`, `approval_approved_without_execution_record`, `approval_no_action_without_execution`, `evidence_all_sensors_failed`
+  - `automation_records=0`
+- Health: K3s rollout success; host-local NodePort health 200 / `mock_mode=false`. A direct connection from the local workstation to `192.168.0.120:32334` timed out at the time, so the workstation-to-node path needs a separate check; the in-cluster and host-local APIs are healthy.
+- Boundary: T5 only makes the contradictory state visible; the next step still needs to push reconciliation results back to Telegram / the Operator Console UI and address the root cause (execution / incident closure).
+
 ---
 
 ### 2026-04-20 evening (Taipei): C1-C4 end-to-end integration, Playbook chain protection (commit de2d34d)
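
The T5 reconciliation checks listed in the LOGBOOK hunk can be illustrated with a minimal read-only sketch. This is not the code in `awooop_truth_chain_service.py`. The `IncidentSnapshot` shape, the `reconcile` function, the closed/terminal status sets, and the rule that approval-related mismatches map to `blocked` are all assumptions made for illustration. Only the mismatch codes and the output field names come from the entry itself. The timeline check is left out because the entry does not name a mismatch code for it.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed status vocabularies; the real service may use different values.
CLOSED_INCIDENT = {"RESOLVED", "CLOSED"}
TERMINAL_APPROVAL = {"APPROVED", "REJECTED", "EXPIRED"}
# Assumption: approval/execution contradictions require a human.
BLOCKING_CODES = {
    "approval_approved_without_execution_record",
    "approval_no_action_without_execution",
}


@dataclass
class IncidentSnapshot:
    """Hypothetical flattened view of the rows read for one incident."""
    incident_status: str
    approval_status: Optional[str]
    approval_action: Optional[str]
    automation_records: int
    successful_executions: int
    sensors_total: int
    sensors_failed: int


def reconcile(snap: IncidentSnapshot) -> dict:
    """Compare cross-table facts and report mismatches; never writes anything."""
    result = {
        "reconciliation_schema": "incident_reconciliation_v1",
        "facts": asdict(snap),
        "mismatches": [],
    }
    if snap.approval_status is None:
        # Nothing to reconcile without an approval row.
        result["consistency_status"] = "not_applicable"
        result["operator_next_state"] = "not_applicable"
        return result

    mismatches = result["mismatches"]
    incident_closed = snap.incident_status in CLOSED_INCIDENT
    if snap.approval_status in TERMINAL_APPROVAL and not incident_closed:
        mismatches.append("incident_open_after_approval_resolved")
    if snap.approval_status == "APPROVED" and snap.automation_records == 0:
        mismatches.append("approval_approved_without_execution_record")
    if snap.approval_action == "NO_ACTION" and snap.successful_executions == 0:
        mismatches.append("approval_no_action_without_execution")
    if snap.sensors_total > 0 and snap.sensors_failed == snap.sensors_total:
        mismatches.append("evidence_all_sensors_failed")

    if any(code in BLOCKING_CODES for code in mismatches):
        status, next_state = "blocked", "manual_required"
    elif mismatches:
        status, next_state = "degraded", "investigate"
    else:
        status, next_state = "consistent", "continue"
    result["consistency_status"] = status
    result["operator_next_state"] = next_state
    return result


if __name__ == "__main__":
    # Facts modeled on the INC-20260512-B6C589 smoke; the sensor count is made up.
    b6c589 = IncidentSnapshot("INVESTIGATING", "APPROVED", "NO_ACTION", 0, 0, 3, 3)
    print(reconcile(b6c589))
```

Keeping the function pure (a snapshot in, a dict out) matches the stated boundary that T5 only makes the contradiction visible and leaves closing incidents or rerunning executions to a later step.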