docs(awooop): record t5 reconciliation deployment
@@ -1,3 +1,64 @@
## 2026-05-13 | T5 Incident / Approval / Execution reconciliation deployed

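**Background**: Incidents of the B6C589 class can end up in a contradictory state: Telegram shows that approval or handling is required, the DB already has the approval as `APPROVED` with action `NO_ACTION`, yet the incident is still `INVESTIGATING` and there is no successful automation execution or verification record. Operators should no longer have to guess by hand whether the AI actually fixed anything.
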
**Fix**:

- `awooop_truth_chain_service.py` adds a read-only `incident_reconciliation_v1`.
- It does not auto-close incidents, backfill approvals, or re-run executions; it only emits cross-table state consistency in machine-readable form.
- Reconciliation checks:
  - whether the incident is closed.
  - whether the latest approval has reached a terminal state.
  - whether the approval is approved but has no `automation_operation_log`.
  - whether `NO_ACTION` has no successful executor operation.
  - whether all evidence sensors failed.
  - whether the timeline is missing lifecycle entries.
- Truth-chain returns:
  - `consistency_status=consistent|degraded|blocked|not_applicable`
  - `operator_next_state=continue|investigate|manual_required|not_applicable`
  - `facts`
  - `mismatches[]`

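The checks above can be pictured as a pure function over already-fetched facts. The following is a minimal sketch, not the code in `awooop_truth_chain_service.py`: the `IncidentFacts` fields, the set of terminal approval states, the rule that decides `blocked` versus `degraded`, and the `timeline_missing_lifecycle_entries` code are all assumptions made for illustration. Only the four other mismatch codes and the status values come from this record.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Assumption: which approval states count as terminal.
TERMINAL_APPROVAL_STATES = {"APPROVED", "REJECTED"}

# Assumption: mismatches that require a human decision rather than a closer look.
BLOCKING_CODES = {
    "approval_approved_without_execution_record",
    "approval_no_action_without_execution",
}


@dataclass(frozen=True)
class IncidentFacts:
    """Hypothetical snapshot of the cross-table state for one incident."""
    incident_closed: bool
    approval_status: Optional[str]
    approval_action: Optional[str]
    automation_records: int
    successful_executor_ops: int
    sensors_total: int
    sensors_failed: int
    timeline_events: int


def reconcile(facts: IncidentFacts) -> dict:
    """Compare the facts and list mismatches. Read-only: nothing is written back."""
    mismatches = []
    if facts.approval_status is None:
        # No approval to reconcile against.
        status, next_state = "not_applicable", "not_applicable"
    else:
        terminal = facts.approval_status in TERMINAL_APPROVAL_STATES
        if terminal and not facts.incident_closed:
            mismatches.append("incident_open_after_approval_resolved")
        if facts.approval_status == "APPROVED" and facts.automation_records == 0:
            mismatches.append("approval_approved_without_execution_record")
        if facts.approval_action == "NO_ACTION" and facts.successful_executor_ops == 0:
            mismatches.append("approval_no_action_without_execution")
        if facts.sensors_total > 0 and facts.sensors_failed == facts.sensors_total:
            mismatches.append("evidence_all_sensors_failed")
        if facts.timeline_events == 0:
            # Hypothetical code name; the record does not name this mismatch.
            mismatches.append("timeline_missing_lifecycle_entries")

        if not mismatches:
            status, next_state = "consistent", "continue"
        elif BLOCKING_CODES.intersection(mismatches):
            status, next_state = "blocked", "manual_required"
        else:
            status, next_state = "degraded", "investigate"

    return {
        "reconciliation_schema": "incident_reconciliation_v1",
        "consistency_status": status,
        "operator_next_state": next_state,
        "facts": asdict(facts),
        "mismatches": mismatches,
    }
```

Keeping the function free of database access is one way to make the read-only guarantee easy to test: the caller gathers the facts, and the reconciliation step cannot close, backfill, or re-run anything.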
**Verification and deployment**:

- Local:
  - `py_compile`: pass.
  - `ruff --select F,E9`: pass.
  - `pytest tests/test_awooop_truth_chain_service.py tests/test_phase25_drift_detection.py tests/test_drift_interpreter_ollama_first.py tests/test_platform_router_order.py tests/test_awooop_operator_auth.py -q`: 39 passed.
  - `git diff --check`: pass.
- Gitea:
  - `1003fa42 feat(awooop): expose incident reconciliation state` pushed to `gitea main`.
  - Code Review run `1940`: success.
  - CD run `1939`: success.
  - Deploy marker: `631fc220 chore(cd): deploy 1003fa4 [skip ci]`.
- Production:
  - API/Web/Worker images are all at `1003fa4246290bec2bec4cd04caae9b8221996d9`.
  - K3s rollout status: API/Web/Worker success.
  - Health: host-local NodePort `127.0.0.1:32334` healthy / `mock_mode=false`; a direct connection from the local workstation to `192.168.0.120:32334` still times out at the moment, so the host/network path needs a separate investigation.
  - Truth-chain smoke `INC-20260512-B6C589`:
    - `source_type=incident`
    - `current_stage=manual_required`
    - `stage_status=blocked`
    - `needs_human=true`
    - `reconciliation_schema=incident_reconciliation_v1`
    - `consistency_status=blocked`
    - `operator_next_state=manual_required`
    - mismatch codes:
      - `incident_open_after_approval_resolved`
      - `approval_approved_without_execution_record`
      - `approval_no_action_without_execution`
      - `evidence_all_sensors_failed`
    - `automation_records=0`
    - `timeline_events=1`

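A smoke result like the one above can be turned into a repeatable check. This is a minimal sketch under stated assumptions: it treats the truth-chain response as a flat dict and reads the mismatch list from a key named `mismatch_codes`; both are hypothetical and the real response layout may differ. The expected values are copied from the smoke result in this record.

```python
# Expected values copied from the INC-20260512-B6C589 smoke result.
EXPECTED_FIELDS = {
    "source_type": "incident",
    "current_stage": "manual_required",
    "stage_status": "blocked",
    "needs_human": True,
    "reconciliation_schema": "incident_reconciliation_v1",
    "consistency_status": "blocked",
    "operator_next_state": "manual_required",
    "automation_records": 0,
    "timeline_events": 1,
}

EXPECTED_MISMATCHES = {
    "incident_open_after_approval_resolved",
    "approval_approved_without_execution_record",
    "approval_no_action_without_execution",
    "evidence_all_sensors_failed",
}


def smoke_failures(payload: dict) -> list:
    """Return human-readable differences between payload and the expected smoke state."""
    failures = []
    for key, want in EXPECTED_FIELDS.items():
        got = payload.get(key)
        if got != want:
            failures.append(f"{key}: expected {want!r}, got {got!r}")
    # Assumption: mismatch codes arrive as a list under "mismatch_codes".
    missing = sorted(EXPECTED_MISMATCHES - set(payload.get("mismatch_codes", [])))
    if missing:
        failures.append("missing mismatch codes: " + ", ".join(missing))
    return failures
```

An empty list means the response still matches the recorded state. Once the root cause is fixed, the expected values would need updating, since a healthy incident should no longer report `blocked`.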
**Overall progress**:

- Wave 0: MOMO PostgreSQL backup → AwoooP failure-notification wiring complete and deployed.
- T0: Truth-chain read-only API complete, deployed, production smoke done.
- T1: Channel Event hardening complete, deployed, production smoke done.
- T2: legacy MCP audit bridge / backfill / truth-chain visibility complete, deployed, production smoke done; the first-class Gateway enforced path is left for a later wave.
- T3: Ansible audit contract + decision candidate dry-run audit complete, deployed, production smoke done.
- T4: Config Drift stable fingerprint / repeat-state / Telegram stage visibility complete, deployed, production smoke done.
- T5: Incident / Approval / Execution reconciliation complete, deployed, production smoke done.
- Still outstanding: first-class MCP Gateway enforcement; a real Ansible check-mode executor / diff / apply / rollback; the display layer that pushes reconciliation results back to Telegram / Operator Console UI.

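## 2026-05-13 | T4 Config Drift fingerprint repeat-state deployed

**Background**: The Config Drift Telegram card only showed a single `report_id` and the HIGH/MEDIUM/INFO counts, so operators could not tell whether the same drift kept repeating, which pipeline stage it had reached, or whether a human was needed. The old truth-chain repeat detection grouped only by namespace/status/counts, so drifts that happened to share the same counts but had different items were mistaken for the same drift.
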
@@ -1958,6 +1958,21 @@ After Phase 6 is complete
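- Important correction: the old count-based repeat detection reported 12 occurrences, while the new stable item fingerprint confirms that the same drift fingerprint occurred only 2 times; the 12 can only be called same-count candidates, not the same drift.
- Boundary: T4 only adds observability and repeat detection; it does not do auto-adopt / rollback / ignore.
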
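The difference between the two grouping strategies can be sketched as follows. This is an illustration only, assuming a report is a dict with `namespace`, `status`, `counts`, and `items`, and that each item is a small JSON-serializable dict; those field names and the choice of SHA-256 are assumptions, not the deployed fingerprint format.

```python
import hashlib
import json


def count_key(report: dict) -> tuple:
    """Old-style grouping key: namespace, status, and severity counts only."""
    counts = report["counts"]
    return (
        report["namespace"],
        report["status"],
        counts["HIGH"],
        counts["MEDIUM"],
        counts["INFO"],
    )


def stable_fingerprint(report: dict) -> str:
    """Item-based fingerprint that ignores report_id and item order."""
    # Serialize each item canonically, then sort so list order does not matter.
    canonical_items = sorted(
        json.dumps(item, sort_keys=True, separators=(",", ":"))
        for item in report["items"]
    )
    material = "\n".join([report["namespace"], *canonical_items])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Two reports with one HIGH item each share a `count_key` even when the drifted objects differ, whereas `stable_fingerprint` only matches when the item sets are identical. That is the distinction between "same-count candidates" and "the same drift" drawn above.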
**T5 Incident / Approval / Execution reconciliation production verified (2026-05-13, Taipei)**:

- `1003fa42 feat(awooop): expose incident reconciliation state` pushed to Gitea main; Code Review run `1940` success, CD run `1939` success.
- Deploy marker: `631fc220 chore(cd): deploy 1003fa4 [skip ci]`.
- Truth-chain adds a read-only `incident_reconciliation_v1`; it does not auto-close incidents, backfill approvals, or re-run executions, and only outputs cross-table consistency.
- Reconciliation returns `consistency_status`, `operator_next_state`, `facts`, and `mismatches[]`, which Operator Console / Telegram can use to show whether the AI really finished handling the incident or a human must step in.
- Production `INC-20260512-B6C589` smoke:
  - `current_stage=manual_required`
  - `stage_status=blocked`
  - `consistency_status=blocked`
  - `operator_next_state=manual_required`
  - mismatches: `incident_open_after_approval_resolved`, `approval_approved_without_execution_record`, `approval_no_action_without_execution`, `evidence_all_sensors_failed`
  - `automation_records=0`
- Health: K3s rollout success; host-local NodePort health 200 / `mock_mode=false`. A direct connection from the workstation to `192.168.0.120:32334` currently times out, so the workstation-to-node path needs a separate investigation; the in-cluster and host-local API are healthy.
- Boundary: T5 only makes the contradictory state visible; the next step still has to push reconciliation results back to Telegram / Operator Console UI and address the root cause (execution / incident closure).

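The display layer is still pending, but the returned fields are enough to draft an operator-facing summary. The sketch below is purely illustrative: the function name, the wording of each label, and the message layout are assumptions, and only the field names and state values come from the reconciliation output described above.

```python
# Assumption: operator-facing wording for each operator_next_state value.
NEXT_STATE_LABELS = {
    "continue": "automation is consistent, no action needed",
    "investigate": "state is degraded, please investigate",
    "manual_required": "automation did not complete, manual handling required",
    "not_applicable": "reconciliation does not apply",
}


def operator_summary(incident_id: str, reconciliation: dict) -> str:
    """Render a short plain-text summary of one reconciliation result."""
    next_state = reconciliation["operator_next_state"]
    status = reconciliation["consistency_status"]
    # Fall back to the raw value if a new state appears that has no label yet.
    label = NEXT_STATE_LABELS.get(next_state, next_state)
    lines = [f"{incident_id}: {label} (consistency_status={status})"]
    for code in reconciliation.get("mismatches", []):
        lines.append(f"- {code}")
    return "\n".join(lines)
```

Listing the raw mismatch codes under the headline keeps the message traceable to the truth-chain output, so an operator can search for the same code in the API response.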
---

### 2026-04-20 evening (Taipei) | C1-C4 end-to-end integration | Playbook chain protection (commit de2d34d)
