diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index ee391d89..65f0f122 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -1,3 +1,68 @@
+## 2026-05-31|Telegram / AwoooP historical result backfill and residual-state triage
+
+**Background**:
+
+- T154c / T154d already added `operator_outcome_v1`, Telegram result delivery, and non-execution terminal semantics for rejected / expired approvals on new alerts. In production, however, some historical approvals from the last 7 days had reached `EXECUTION_SUCCESS` / `EXECUTION_FAILED` but lacked `TELEGRAM_RESULT_SENT`, and some also lacked `EXECUTION_COMPLETED` durable evidence.
+- These historical gaps make Telegram and AwoooP look as if an alert was "approved / executing / handled" with no final result afterwards, and leave it unclear whether manual intervention is still needed.
+
+**Production backfill in this change**:
+
+- Backfill id: `operator_outcome_result_backfill_20260531_t154e`.
+- Scope: approvals from the last 7 days with `approval_records.status in (EXECUTION_SUCCESS, EXECUTION_FAILED)` that are missing `alert_operation_log.TELEGRAM_RESULT_SENT`. `PENDING` / `EXPIRED` / `REJECTED` are not processed, so approval or expiry states are never disguised as execution results.
+- Writes:
+  - `approval_records.extra_metadata`: added `execution_kind`, `repair_attempted`, `repair_executed`, and the backfill id, 67 rows in total.
+  - `alert_operation_log.EXECUTION_COMPLETED`: 25 rows added.
+  - Telegram SRE group: one historical digest message sent instead of flooding the group with per-item messages; message id `19602`.
+  - `alert_operation_log.TELEGRAM_RESULT_SENT`: 67 rows added; the context keeps the backfill id, Telegram digest message id, operator outcome, execution_kind, and repair flags.
+
+**Verification**:
+
+```text
+dry-run:
+  candidate_total=67
+  EXECUTION_SUCCESS=57
+  EXECUTION_FAILED=10
+  metadata_updates=67
+  execution_completed_inserts=25
+  result_event_inserts_after_digest=67
+
+actual:
+  metadata_updates=67
+  execution_completed_inserted=25
+  execution_completed_failures=0
+  telegram_result_inserted=67
+  telegram_result_failures=0
+  telegram_digest_message_id=19602
+  by_operator_outcome:
+    verification_degraded_manual_required=40
+    diagnostic_only_manual_review=17
+    execution_failed_manual_required=10
+
+post-verify 7d:
+  EXECUTION_SUCCESS total=61 missing_result=0 missing_completed=0
+  EXECUTION_FAILED total=16 missing_result=0 missing_completed=0
+  APPROVED / PENDING / EXPIRED / REJECTED are still not execution results; they must be presented through their own outcome / approval queue
+
+production smoke:
+  rejected sample INC-20260528-CD7B3A ->
+    state=approval_rejected_no_execution
+    needs_human=false
+    reason=approval_rejected
+  expired sample INC-20260529-746D4B ->
+    state=approval_expired_manual_review
+    needs_human=true
+    reason=approval_expired_without_operator_decision
+  execution failed sample INC-20260531-BE2B25 ->
+    state=execution_failed_manual_required
+    needs_human=true
+    next_action=manual_fix_or_rollback
+```
+
+**Scope boundary**:
+
+- This change only backfills durable audit / result notification evidence; it does not re-run any repair action.
+- `missing_result` will still be reported for `PENDING`, `APPROVED`, `EXPIRED`, and `REJECTED` approvals from the last 7 days. This is the intended triage: they are not execution terminal results and should be presented through the pending approval queue, expired manual review, and rejected no-execution outcome respectively.
+
 ## 2026-05-31|IwoooS first-screen security progress overview and site-wide Traditional Chinese convergence
 
 **Background**:
diff --git a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
index f20da337..9a5113ce 100644
--- a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
+++ b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
@@ -2698,6 +2698,12 @@ After Phase 6 completes
 - Verification: API `py_compile` pass; targeted `ruff --select E9,F401,F821,F841` pass; `test_approval_execution_no_action.py` + `test_operator_outcome.py` + `test_awooop_truth_chain_service.py` + `test_awooop_operator_timeline_labels.py` + `test_telegram_message_templates.py` + `test_incident_timeline_service.py` -> 169 passed; `git diff --check` pass.
 - Interpretation: T154d does not disguise rejected/expired as execution results. It adds the operator outcome for the "approval terminal state", so the frontend and the Telegram digest can distinguish "manually rejected" from "expired, needs re-review".
 
+**T154e Historical result notification backfill (2026-05-31 Taipei)**:
+- Trigger: after T154c/T154d went live, some historical `EXECUTION_SUCCESS` / `EXECUTION_FAILED` approvals from the last 7 days in production were still missing `TELEGRAM_RESULT_SENT`, and some of them also lacked `EXECUTION_COMPLETED`, so operators saw an approved or executing state with no traceable terminal result afterwards.
+- Backfill: `operator_outcome_result_backfill_20260531_t154e` only processes approvals from the last 7 days with `status in (EXECUTION_SUCCESS, EXECUTION_FAILED)` that are missing a result event, and does not touch `PENDING` / `EXPIRED` / `REJECTED`. It backfilled 67 `approval_records.extra_metadata` rows and 25 `alert_operation_log.EXECUTION_COMPLETED` rows, sent a single Telegram SRE historical digest (message id `19602`), and backfilled 67 `alert_operation_log.TELEGRAM_RESULT_SENT` rows. Each result context keeps the backfill id, digest message id, `operator_outcome`, `execution_kind`, `repair_attempted`, and `repair_executed`.
+- Verification: dry-run candidate `67` (`EXECUTION_SUCCESS=57`, `EXECUTION_FAILED=10`); actual `metadata_updates=67`, `execution_completed_inserted=25`, `telegram_result_inserted=67`, failures=0. Post-verify 7d: `EXECUTION_SUCCESS total=61 missing_result=0 missing_completed=0`, `EXECUTION_FAILED total=16 missing_result=0 missing_completed=0`. Production smoke: `INC-20260531-BE2B25 -> execution_failed_manual_required`; rejected/expired samples continue returning `approval_rejected_no_execution` / `approval_expired_manual_review`.
+- Interpretation: T154e fills historical audit/result gaps and sends one catch-up digest notification; it does not re-run any repair. A missing result on `APPROVED` / `PENDING` / `EXPIRED` / `REJECTED` is not an execution gap and must be presented separately through the approval queue / terminal outcome.
+
 **T152 Ansible runtime readiness surfaced (2026-05-24 Taipei)**:
 - Trigger: T151 made execution backend / Ansible attribution visible on the home page, but operators still could not see what the runtime side was missing, which made it easy to misread "Ansible has candidates" as "Ansible can already auto-repair".
 - Fix: the API image copies `infra/ansible/` as a read-only catalog; `truth-chain/quality/summary` adds `ansible_runtime`, which reports playbook binary, catalog, inventory, playbook_count, can_run_check_mode, and blockers. The home-page execution evidence shows the runtime status as well; production currently shows `runtime 未就緒:ansible_playbook_binary_missing` ("runtime not ready"). `ansible-core` is not installed, and check-mode / apply are not enabled.