docs(logbook): record telegram result backfill [skip ci]

Your Name
2026-05-31 16:49:24 +08:00
parent aee92bc7a3
commit 752de4e1b3
2 changed files with 71 additions and 0 deletions

@@ -1,3 +1,68 @@
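## 2026-05-31: Telegram / AwoooP historical result backfill and residual-state routing
**Background**
- T154c / T154d added `operator_outcome_v1`, Telegram result delivery, and non-execution terminal semantics for rejected / expired to new alerts. However, over the last 7 days production still has historical approvals that reached `EXECUTION_SUCCESS` / `EXECUTION_FAILED` but lack `TELEGRAM_RESULT_SENT` and, in some cases, `EXECUTION_COMPLETED` durable evidence.
- These historical gaps make Telegram and AwoooP look as if an item was "approved / executing / handled" with no final result afterwards, and leave it unclear whether manual intervention is still needed.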
**This production backfill**
- Backfill id: `operator_outcome_result_backfill_20260531_t154e`
- Scope: approvals from the last 7 days with `approval_records.status in (EXECUTION_SUCCESS, EXECUTION_FAILED)` that lack `alert_operation_log.TELEGRAM_RESULT_SENT`. `PENDING` / `EXPIRED` / `REJECTED` are not processed, to avoid disguising approval or expiry states as execution results.
- Writes:
  - `approval_records.extra_metadata`: `execution_kind`, `repair_attempted`, `repair_executed`, and the backfill id; 67 records in total.
  - `alert_operation_log.EXECUTION_COMPLETED`: 25 records backfilled.
  - One single historical digest sent to the Telegram SRE group rather than one message per record, to avoid flooding; message id `19602`.
  - `alert_operation_log.TELEGRAM_RESULT_SENT`: 67 records backfilled; the context keeps the backfill id, Telegram digest message id, operator outcome, execution_kind, and repair flags.
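The scope rule above can be sketched as a small filter. This is only an illustration under assumptions: the `Approval` dataclass, its fields, and the helper names are hypothetical in-memory stand-ins, not the actual `approval_records` / `alert_operation_log` schema or the backfill script.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Only these two statuses count as execution terminal results.
EXECUTION_TERMINAL = {"EXECUTION_SUCCESS", "EXECUTION_FAILED"}


@dataclass
class Approval:
    """Hypothetical stand-in for one approval_records row plus its log events."""
    approval_id: str
    status: str
    created_at: datetime
    logged_events: set  # event names already present in alert_operation_log


def select_candidates(approvals, now, window_days=7):
    """Keep recent execution-terminal approvals that have no result event yet."""
    cutoff = now - timedelta(days=window_days)
    return [
        a for a in approvals
        if a.status in EXECUTION_TERMINAL
        and a.created_at >= cutoff
        and "TELEGRAM_RESULT_SENT" not in a.logged_events
    ]


def needs_execution_completed(approval):
    """True when the EXECUTION_COMPLETED event must also be backfilled."""
    return "EXECUTION_COMPLETED" not in approval.logged_events
```

Under this sketch every candidate receives a result event, while only the subset for which `needs_execution_completed` is true receives an `EXECUTION_COMPLETED` insert, which matches the 67 versus 25 split recorded above.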
**Verification**
```text
dry-run:
  candidate_total=67
  EXECUTION_SUCCESS=57
  EXECUTION_FAILED=10
  metadata_updates=67
  execution_completed_inserts=25
  result_event_inserts_after_digest=67
actual:
  metadata_updates=67
  execution_completed_inserted=25
  execution_completed_failures=0
  telegram_result_inserted=67
  telegram_result_failures=0
  telegram_digest_message_id=19602
by_operator_outcome:
  verification_degraded_manual_required=40
  diagnostic_only_manual_review=17
  execution_failed_manual_required=10
post-verify 7d:
  EXECUTION_SUCCESS total=61 missing_result=0 missing_completed=0
  EXECUTION_FAILED total=16 missing_result=0 missing_completed=0
  APPROVED / PENDING / EXPIRED / REJECTED are still not execution results; they must be surfaced through their own outcome / approval queue
production smoke:
  rejected sample INC-20260528-CD7B3A ->
    state=approval_rejected_no_execution
    needs_human=false
    reason=approval_rejected
  expired sample INC-20260529-746D4B ->
    state=approval_expired_manual_review
    needs_human=true
    reason=approval_expired_without_operator_decision
  execution failed sample INC-20260531-BE2B25 ->
    state=execution_failed_manual_required
    needs_human=true
    next_action=manual_fix_or_rollback
```
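The figures above are internally consistent, and that can be checked mechanically. The sketch below copies the recorded numbers into plain dictionaries and compares them; the `reconcile` helper is hypothetical and is not part of the backfill tooling.

```python
# Numbers copied from the verification block above.
dry_run = {
    "candidate_total": 67,
    "EXECUTION_SUCCESS": 57,
    "EXECUTION_FAILED": 10,
    "metadata_updates": 67,
    "execution_completed_inserts": 25,
    "result_event_inserts_after_digest": 67,
}
actual = {
    "metadata_updates": 67,
    "execution_completed_inserted": 25,
    "execution_completed_failures": 0,
    "telegram_result_inserted": 67,
    "telegram_result_failures": 0,
}
by_operator_outcome = {
    "verification_degraded_manual_required": 40,
    "diagnostic_only_manual_review": 17,
    "execution_failed_manual_required": 10,
}


def reconcile(dry_run, actual, by_outcome):
    """Return a list of human-readable mismatches; an empty list means consistent."""
    problems = []
    if dry_run["EXECUTION_SUCCESS"] + dry_run["EXECUTION_FAILED"] != dry_run["candidate_total"]:
        problems.append("status split does not sum to candidate_total")
    pairs = [
        ("metadata_updates", "metadata_updates"),
        ("execution_completed_inserts", "execution_completed_inserted"),
        ("result_event_inserts_after_digest", "telegram_result_inserted"),
    ]
    for planned_key, actual_key in pairs:
        if dry_run[planned_key] != actual[actual_key]:
            problems.append(f"{actual_key} differs from dry-run {planned_key}")
    for key in ("execution_completed_failures", "telegram_result_failures"):
        if actual[key] != 0:
            problems.append(f"{key} is non-zero")
    if sum(by_outcome.values()) != actual["telegram_result_inserted"]:
        problems.append("operator outcome breakdown does not sum to inserted results")
    return problems
```

Calling `reconcile(dry_run, actual, by_operator_outcome)` on these values returns an empty list.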
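**Scope boundary**
- This run only backfills durable audit / result-notification evidence; it does not re-run any repair action.
- Over the last 7 days, `missing_result` will still appear for `PENDING`, `APPROVED`, `EXPIRED`, and `REJECTED` approvals. This is the correct routing: they are not execution terminal results and should be surfaced through the pending approval queue, expired manual review, and rejected no-execution outcome respectively.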
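The routing rule in this boundary can be written as a lookup. This is a minimal sketch: the surface names are illustrative labels rather than identifiers from the codebase, and the surface chosen for `APPROVED` is an assumption, since the entry names only three surfaces for four statuses.

```python
EXECUTION_TERMINAL = {"EXECUTION_SUCCESS", "EXECUTION_FAILED"}

# Illustrative surface labels, not real identifiers.
NON_EXECUTION_SURFACE = {
    "PENDING": "pending_approval_queue",
    "APPROVED": "pending_approval_queue",  # assumption: no surface is named for APPROVED
    "EXPIRED": "expired_manual_review",
    "REJECTED": "rejected_no_execution_outcome",
}


def classify_missing_result(status):
    """Say what a missing result event means for an approval in this status.

    Only execution-terminal statuses are real gaps that a backfill should fix.
    An unknown status raises KeyError instead of being silently routed.
    """
    if status in EXECUTION_TERMINAL:
        return "execution_gap"
    return NON_EXECUTION_SURFACE[status]
```

Raising on an unknown status is a deliberate choice in this sketch, so that a new status cannot be counted as an execution gap or hidden by default.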
## 2026-05-31: IwoooS first-screen security progress overview and site-wide Traditional Chinese consolidation
**Background**

@@ -2698,6 +2698,12 @@ After Phase 6 completes
- Verification: API `py_compile` pass; targeted `ruff --select E9,F401,F821,F841` pass; `test_approval_execution_no_action.py` + `test_operator_outcome.py` + `test_awooop_truth_chain_service.py` + `test_awooop_operator_timeline_labels.py` + `test_telegram_message_templates.py` + `test_incident_timeline_service.py` -> 169 passed; `git diff --check` pass.
- Interpretation: T154d does not disguise rejected/expired as execution results. What it adds is the operator outcome for the "approval terminal state", so that the frontend and the Telegram digest can distinguish "manually rejected" from "expired, needs re-review".
**T154e Historical result notification backfill (2026-05-31 Taipei)**
- Trigger: after T154c/T154d went live, production still had historical `EXECUTION_SUCCESS` / `EXECUTION_FAILED` approvals from the last 7 days lacking `TELEGRAM_RESULT_SENT`, some of them also lacking `EXECUTION_COMPLETED`, so operators saw an approved or executing state with no traceable terminal result.
- Backfill: `operator_outcome_result_backfill_20260531_t154e` only processes approvals from the last 7 days with `status in (EXECUTION_SUCCESS, EXECUTION_FAILED)` that lack a result event; it does not touch `PENDING` / `EXPIRED` / `REJECTED`. It backfills 67 `approval_records.extra_metadata` records and 25 `alert_operation_log.EXECUTION_COMPLETED` records, sends a single Telegram SRE historical digest (message id `19602`), and backfills 67 `alert_operation_log.TELEGRAM_RESULT_SENT` records. Each result context keeps the backfill id, digest message id, `operator_outcome`, `execution_kind`, `repair_attempted`, and `repair_executed`.
- Verification: dry-run candidates `67` (`EXECUTION_SUCCESS=57`, `EXECUTION_FAILED=10`); actual `metadata_updates=67`, `execution_completed_inserted=25`, `telegram_result_inserted=67`, failures=0. Post-verify 7d: `EXECUTION_SUCCESS total=61 missing_result=0 missing_completed=0`; `EXECUTION_FAILED total=16 missing_result=0 missing_completed=0`. Production smoke: `INC-20260531-BE2B25 -> execution_failed_manual_required`; rejected/expired samples continue returning `approval_rejected_no_execution` / `approval_expired_manual_review`.
- Interpretation: T154e fills historical audit/result gaps and sends one catch-up digest notification; it does not re-run any repair. A missing result on `APPROVED` / `PENDING` / `EXPIRED` / `REJECTED` is not an execution gap and must be surfaced separately through the approval queue / terminal outcome.
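One way to picture the "single digest, then per-approval result events" flow is sketched below. The digest wording, the dictionary keys, and both helper names are assumptions made for illustration; only the backfill id and the list of preserved context fields come from the entry above.

```python
from collections import Counter

BACKFILL_ID = "operator_outcome_result_backfill_20260531_t154e"


def build_digest(items):
    """Summarise all backfilled approvals in one message instead of one each."""
    counts = Counter(item["operator_outcome"] for item in items)
    lines = [f"Historical result backfill {BACKFILL_ID}: {len(items)} approvals"]
    lines += [f"- {outcome}: {n}" for outcome, n in sorted(counts.items())]
    return "\n".join(lines)


def result_context(item, digest_message_id):
    """Context stored with each result event, pointing back at the digest."""
    return {
        "backfill_id": BACKFILL_ID,
        "telegram_digest_message_id": digest_message_id,
        "operator_outcome": item["operator_outcome"],
        "execution_kind": item["execution_kind"],
        "repair_attempted": item["repair_attempted"],
        "repair_executed": item["repair_executed"],
    }
```

Because every result event carries the same digest message id, an auditor can trace each backfilled row to the one notification that covered it.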
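**T152 Ansible runtime readiness surfaced (2026-05-24 Taipei)**
- Trigger: T151 already let the home page show execution backend / Ansible attribution, but operators still could not see what the runtime side was missing, making it easy to misread "Ansible has candidates" as "Ansible can already auto-repair".
- Fix: the API image copies `infra/ansible/` as a read-only catalog; `truth-chain/quality/summary` adds `ansible_runtime`, reporting playbook binary, catalog, inventory, playbook_count, can_run_check_mode, and blockers. The home page execution evidence shows the runtime status as well; production currently shows `runtime 未就緒ansible_playbook_binary_missing` ("runtime not ready", blocker `ansible_playbook_binary_missing`). `ansible-core` is not installed, and check-mode / apply are not enabled.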
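A readiness report of this shape could be assembled roughly as follows. This is a sketch under assumptions, not the API's implementation: the function name, the `catalog_missing` / `inventory_missing` blocker names, and counting every `*.yml` file under the catalog as a playbook are all hypothetical. Only `ansible_playbook_binary_missing` and the reported field names come from the entry above.

```python
import shutil
from pathlib import Path


def ansible_runtime(catalog_dir, inventory_path, binary="ansible-playbook"):
    """Report whether the runtime could run a playbook in check mode."""
    catalog = Path(catalog_dir)
    inventory = Path(inventory_path)
    binary_path = shutil.which(binary)  # None when the binary is not on PATH

    # Assumption: every *.yml file under the catalog counts as a playbook.
    playbooks = sorted(catalog.rglob("*.yml")) if catalog.is_dir() else []

    blockers = []
    if binary_path is None:
        blockers.append("ansible_playbook_binary_missing")
    if not catalog.is_dir():
        blockers.append("catalog_missing")  # hypothetical blocker name
    if not inventory.is_file():
        blockers.append("inventory_missing")  # hypothetical blocker name

    return {
        "playbook_binary": binary_path,
        "catalog": catalog.is_dir(),
        "inventory": inventory.is_file(),
        "playbook_count": len(playbooks),
        "can_run_check_mode": not blockers,
        "blockers": blockers,
    }
```

The point of the sketch is that a populated catalog alone does not make the runtime ready: with the binary absent, `can_run_check_mode` stays false even when `playbook_count` is positive.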