[P0.4 patch] pre_decision_investigator: Prometheus query field missing
- _build_tool_params() now supplies the "query" field (a required parameter of the prometheus_query tool)
- Add _build_prometheus_query(): generates PromQL by alert type (CPU/Memory/Crash/Disk/HTTP/Pod/fallback)
- After the fix, the D3_METRICS sensory dimension actually retrieves data (previously 100% of calls returned missing_query_parameter)

[P1 Playbook learning loop: B1-B5 all fixed]
- B2 db/models.py: ApprovalRecord gains the matched_playbook_id column + ix_approval_matched_playbook index
- B2 db/models.py: TimelineEvent gains the incident_id column (for MCP auditing) + index
- B3 approval_db.py: the record→ApprovalRequest conversion restores incident_id + matched_playbook_id
- B4 approval_repository.py: same as B3 (the two conversion functions must stay in sync)
- B5 approval_db.py: approval_request_to_record_data now includes matched_playbook_id, so the value is actually persisted to the DB

[P1.5 KM write] approval_execution.py: fire-and-forget → await wait_for(30s)
- Root cause: the asyncio.create_task task was killed on Pod recycle, so the KM write was silently lost
- Fix: await asyncio.wait_for(..., timeout=30.0) + log on TimeoutError

[Migration file] adr092_p1_learning_chain_fix.sql
- ALTER TABLE approval_records ADD COLUMN matched_playbook_id VARCHAR(36)
- ALTER TABLE timeline_events ADD COLUMN incident_id VARCHAR(64)
- Run: psql $DATABASE_URL -f apps/api/migrations/adr092_p1_learning_chain_fix.sql

[Incidental Agent changes]
- decision_manager: Phase 2 YAML NO_ACTION priority gate (host-level/external-service alerts skip Agent Debate)
- alert_rules.yaml: new rules for Sentry/ClickHouse + HostDiskUsageHigh/Critical
- solver_agent: action_title semantic-synthesis fallback (replaces silent discard)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
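
Note on the P1.5 change: a minimal sketch of the fire-and-forget → bounded-await pattern, not the actual approval_execution.py code. The names write_km_entry and execute_with_km_write, the payload shape, and the returned status strings are hypothetical placeholders; only asyncio.wait_for with a timeout and the TimeoutError log come from the change described above.

```python
import asyncio
import logging

logger = logging.getLogger("approval_execution")


async def write_km_entry(payload: dict) -> str:
    """Hypothetical stand-in for the real KM write; sleeps to simulate I/O."""
    await asyncio.sleep(payload.get("delay", 0.0))
    return "written"


async def execute_with_km_write(payload: dict, timeout: float = 30.0) -> str:
    # Before (fire-and-forget): asyncio.create_task(write_km_entry(payload))
    # The task could be dropped if the process exited before it finished.
    # After: await the write, but bound it so a hung KM backend cannot
    # block the approval flow indefinitely.
    try:
        return await asyncio.wait_for(write_km_entry(payload), timeout=timeout)
    except asyncio.TimeoutError:
        logger.error("KM write timed out after %.1fs", timeout)
        return "timeout"
```

The trade-off in this sketch is that the caller now waits up to the timeout for the write, in exchange for the write either completing or leaving an explicit log line.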