awoooi/apps/api
OG T ca862c5575
fix(GAP-A4 Phase 2): rescue targets on the LLM path, clearing 12 flywheel blocks
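Diagnosis from the commander's panorama report (2026-04-14 20:00):
12 auto_execute_blocked_unresolved_placeholder events within 2 hours
All were caused by the LLM directly emitting `kubectl ... deployment HostHighCpuLoad` (the alert name used as the deployment name)
GAP-A4 Phase 1 only fixed alert_rule_engine._extract_vars
The LLM path in decision_manager had no equivalent check → 12 blocks → 0 KM, 0 flywheel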
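
A minimal sketch of the kind of check the LLM path was missing, assuming the three rejection criteria this commit names for a bad target (equals the alert name, `unknown`, or an IP address). The function `is_bad_target` and its signature are hypothetical stand-ins; the real `alert_rule_engine._is_bad_target()` may take different arguments or apply further rules.

```python
import ipaddress


def is_bad_target(target: str, alertname: str) -> bool:
    """Hypothetical stand-in for alert_rule_engine._is_bad_target()."""
    # Empty or literal "unknown" targets carry no information.
    if not target or target.lower() == "unknown":
        return True
    # The LLM echoed the alert name where a deployment name belongs.
    if target == alertname:
        return True
    # Strip a single ":port" suffix (assumed instance-label shape, e.g. 10.0.0.5:9100).
    host = target.rsplit(":", 1)[0] if target.count(":") == 1 else target
    try:
        ipaddress.ip_address(host)
        return True  # an IP address is not a deployment name
    except ValueError:
        return False
```

With this check, `HostHighCpuLoad` is rejected when it is also the alert name, while an ordinary name such as `payments-api` (an invented example) passes.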

Fix (in decision_manager._auto_execute, after placeholder substitution):
1. Extract the deployment name from the action with a regex (kubectl ... deployment XXX)
2. Validate it with alert_rule_engine._is_bad_target()
3. If it is garbage (== alertname / unknown / IP) → re-derive it from
   incident.signals[0].labels (same multi-layer logic as _extract_vars)
4. If a valid target is found → action.replace(llm_target, good_target)
5. If the labels cannot rescue it either → log target_rescue_failed → handled by the safety guard
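
The five steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: `rescue_target`, `target_from_labels`, the regex, the label keys, and every status string except `target_rescue_failed` are hypothetical, and `is_bad_target` is a simplified stand-in for `alert_rule_engine._is_bad_target()`. The `labels` argument plays the role of `incident.signals[0].labels`.

```python
import ipaddress
import re
from typing import Dict, Optional, Tuple

# Assumed pattern: the token after "deployment " or "deployment/" is the target.
_DEPLOY_RE = re.compile(r"\bdeployment[/\s]+([A-Za-z0-9][A-Za-z0-9._-]*)")


def is_bad_target(target: str, alertname: str) -> bool:
    """Simplified stand-in: reject empty, 'unknown', alertname, or IP targets."""
    if not target or target.lower() == "unknown" or target == alertname:
        return True
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return False


def target_from_labels(labels: Dict[str, str], alertname: str) -> Optional[str]:
    """Try label keys in priority order (the key list is an assumption)."""
    for key in ("deployment", "app", "app_kubernetes_io_name"):
        value = labels.get(key)
        if value and not is_bad_target(value, alertname):
            return value
    return None


def rescue_target(action: str, alertname: str,
                  labels: Dict[str, str]) -> Tuple[str, str]:
    """Return (possibly rewritten action, status string)."""
    match = _DEPLOY_RE.search(action)              # step 1: extract the target
    if match is None:
        return action, "no_deployment_target"
    llm_target = match.group(1)
    if not is_bad_target(llm_target, alertname):   # step 2: validate it
        return action, "target_ok"
    good_target = target_from_labels(labels, alertname)  # step 3: re-derive
    if good_target is None:
        return action, "target_rescue_failed"      # step 5: leave to the safety guard
    # Step 4: splice the good target over the matched span only.
    rewritten = action[:match.start(1)] + good_target + action[match.end(1):]
    return rewritten, "target_rescued"
```

The commit describes step 4 as `action.replace(llm_target, good_target)`. The sketch splices the matched span instead, so that other occurrences of the same string elsewhere in the command are left alone; that is a choice of this sketch, not something the commit states.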

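Effect:
- KubePodCrashLooping (has a deployment label) → rescued even when the LLM fills in the wrong target
- HostHighCpuLoad (host-only, no K8s labels) → still goes to the safety guard,
  but the log now reads target_rescue_failed instead of unresolved_placeholder
- The 12 flywheel blocks are expected to drop substantially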

Regression: 66/66 (GAP-A4 + kubectl validation) all pass

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-14 20:06:05 +08:00