awoooi/apps/api/migrations/adr092_p1_learning_chain_fix.sql
Your Name 04ff22563e
Some checks failed
run-migration / migrate (push) Failing after 14s
CD Pipeline / build-and-deploy (push) Failing after 2m7s
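fix(aiops-p1): fix all 5 breakpoints in the Playbook learning loop + DB migration (ADR-092 B4)
[P0.4 patch] pre_decision_investigator: Prometheus query field was missing
- _build_tool_params() now supplies the "query" field (a required parameter of the prometheus_query tool)
- Added _build_prometheus_query(), which generates PromQL by alert type (CPU/Memory/Crash/Disk/HTTP/Pod/fallback)
- After the fix, the D3_METRICS sensing dimension actually retrieves data (previously 100% of calls returned missing_query_parameter)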
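
To illustrate the alert-type dispatch described above, here is a minimal sketch. It is not the project's _build_prometheus_query(); the function name, the keyword matching on the alert name, the label names, and the chosen metrics (node_exporter and kube-state-metrics names, plus an application-specific http_requests_total) are all assumptions made for the example.

```python
def build_prometheus_query(alert_name: str, labels: dict) -> str:
    """Return a PromQL expression chosen by alert category (illustrative only)."""
    name = alert_name.lower()
    instance = labels.get("instance")
    # Exact match when the alert carries an instance label, otherwise any instance.
    sel = f'instance="{instance}"' if instance else 'instance=~".+"'
    pod = labels.get("pod", ".+")

    if "cpu" in name:
        return f'100 - avg(rate(node_cpu_seconds_total{{mode="idle",{sel}}}[5m])) * 100'
    if "memory" in name:
        return (f'(1 - node_memory_MemAvailable_bytes{{{sel}}}'
                f' / node_memory_MemTotal_bytes{{{sel}}}) * 100')
    if "disk" in name:
        return (f'(1 - node_filesystem_avail_bytes{{{sel}}}'
                f' / node_filesystem_size_bytes{{{sel}}}) * 100')
    if "crash" in name:
        # Checked before the generic "pod" branch so PodCrashLooping lands here.
        return f'increase(kube_pod_container_status_restarts_total{{pod=~"{pod}"}}[15m])'
    if "http" in name:
        # Assumed metric name; real services may expose a different counter.
        return 'sum(rate(http_requests_total{status=~"5.."}[5m]))'
    if "pod" in name:
        return f'kube_pod_status_phase{{pod=~"{pod}"}}'
    # Fallback: always return some query so the tool never sees a missing parameter.
    return f'up{{{sel}}}'
```

The point of the fallback branch is that every alert type yields a non-empty "query" value, which is what the missing_query_parameter error was about.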

[P1 Playbook learning loop: B1-B5 all fixed]
- B2 db/models.py: ApprovalRecord gains a matched_playbook_id column + ix_approval_matched_playbook index
- B2 db/models.py: TimelineEvent gains an incident_id column (for MCP auditing) + index
- B3 approval_db.py: the record→ApprovalRequest conversion restores incident_id + matched_playbook_id
- B4 approval_repository.py: same as B3 (the two conversion functions must stay in sync)
- B5 approval_db.py: approval_request_to_record_data now includes matched_playbook_id, so the value is actually stored in the DB
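
The B3-B5 items amount to carrying the same two fields through both directions of the record/request conversion. A rough sketch follows, assuming simplified dataclass stand-ins for the real ORM model and request type; the field lists and the record_to_request name are hypothetical, and only approval_request_to_record_data is a name taken from this commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApprovalRecord:  # simplified stand-in for the ORM row
    id: str
    incident_id: Optional[str] = None
    matched_playbook_id: Optional[str] = None


@dataclass
class ApprovalRequest:  # simplified stand-in for the domain object
    id: str
    incident_id: Optional[str] = None
    matched_playbook_id: Optional[str] = None


def record_to_request(record: ApprovalRecord) -> ApprovalRequest:
    """DB row -> request; dropping either field here breaks the read path."""
    return ApprovalRequest(
        id=record.id,
        incident_id=record.incident_id,
        matched_playbook_id=record.matched_playbook_id,
    )


def approval_request_to_record_data(request: ApprovalRequest) -> dict:
    """Request -> column dict; dropping the field here means NULL is stored."""
    return {
        "id": request.id,
        "incident_id": request.incident_id,
        "matched_playbook_id": request.matched_playbook_id,
    }
```

A round-trip check (request → record data → record → request) is a cheap way to catch the two converters drifting apart again.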

[P1.5 KM write] approval_execution.py: fire-and-forget → await wait_for(30s)
- Root cause: the task started with asyncio.create_task was killed when the Pod was recycled, so the KM write was silently lost
- Fix: await asyncio.wait_for(..., timeout=30.0) + log on TimeoutError
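
The change from fire-and-forget to a bounded await can be sketched as below. This is an illustration under assumptions, not the code in approval_execution.py: persist_km and the injected writer callable are hypothetical names, and the real KM write function is not shown in this commit.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)


async def persist_km(
    writer: Callable[[dict], Awaitable[Any]],  # hypothetical KM write coroutine
    payload: dict,
    timeout: float = 30.0,
) -> bool:
    """Await the KM write with an upper bound instead of detaching it.

    Returns True on success and False if the write exceeded the timeout.
    """
    try:
        # Unlike asyncio.create_task(), the caller does not return until the
        # write has finished or the timeout has cancelled it.
        await asyncio.wait_for(writer(payload), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        logger.error("KM write timed out after %.1fs", timeout)
        return False
```

The trade-off is latency: the approval path now waits up to the timeout for the write, in exchange for the write no longer disappearing without a log line.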

[Migration file] adr092_p1_learning_chain_fix.sql
- ALTER TABLE approval_records ADD COLUMN matched_playbook_id VARCHAR(36)
- ALTER TABLE timeline_events ADD COLUMN incident_id VARCHAR(64)
- Run: psql $DATABASE_URL -f apps/api/migrations/adr092_p1_learning_chain_fix.sql

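[Incidental Agent changes]
- decision_manager: Phase 2 YAML NO_ACTION priority gate (host-level/external-service alerts skip Agent Debate)
- alert_rules.yaml: new rules for Sentry/ClickHouse + HostDiskUsageHigh/Critical
- solver_agent: semantic-synthesis fallback for action_title (replaces silently dropping the action)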

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 15:41:35 +08:00

41 lines
2.2 KiB
PL/PgSQL

-- ADR-092 B4: DB schema fix for the broken Playbook learning loop
-- Root cause: approval_records lacks matched_playbook_id → after manual review, EWMA cannot update the Playbook trust score
--             timeline_events lacks incident_id → the pre_decision_investigator MCP call audit adds one silent error per day
--
-- How to run (must be run manually, once):
--   psql $DATABASE_URL -f apps/api/migrations/adr092_p1_learning_chain_fix.sql
--
-- 2026-04-24 ogt + Claude Sonnet 4.6 (Asia-Pacific)
BEGIN;
-- ─────────────────────────────────────────────────────────────────────────────
-- approval_records: add matched_playbook_id column (B2 fix)
-- ─────────────────────────────────────────────────────────────────────────────
ALTER TABLE approval_records
    ADD COLUMN IF NOT EXISTS matched_playbook_id VARCHAR(36) DEFAULT NULL;
CREATE INDEX IF NOT EXISTS ix_approval_matched_playbook
    ON approval_records (matched_playbook_id)
    WHERE matched_playbook_id IS NOT NULL;
COMMENT ON COLUMN approval_records.matched_playbook_id
    IS 'Recorded when a Playbook is matched; read by the learning service to update the EWMA trust score';
-- ─────────────────────────────────────────────────────────────────────────────
-- timeline_events: add incident_id column (P1.6 fix)
-- ─────────────────────────────────────────────────────────────────────────────
ALTER TABLE timeline_events
    ADD COLUMN IF NOT EXISTS incident_id VARCHAR(64) DEFAULT NULL;
CREATE INDEX IF NOT EXISTS ix_timeline_incident_id
    ON timeline_events (incident_id)
    WHERE incident_id IS NOT NULL;
COMMENT ON COLUMN timeline_events.incident_id
    IS 'Incident ID associated with an MCP tool call audit entry';
COMMIT;