fix(aiops-p2): P2.1 three LLM quality fixes: Evidence-First + consensus confidence + raw_evidence injection

Root cause:
- consensus_engine: all four ExpertAgents reported confidence=0.0 → weighted vote total=0 → always returned NO_ACTION
- prompts.py: no Evidence-First instruction → the LLM reasoned from memory, unconstrained by the real environment
- openclaw.py: analyze_alert built the prompt without injecting the MCP evidence (diagnosis_context)

Fix:
- consensus_engine: SRE/Security/Cost/Performance now set confidence to 0.45~0.80 according to signal strength
- consensus_engine: _normalize_action adds the alias「重新啟動」("restart") → RESTART
- consensus_engine: SecurityAgent drops the unused _target variable
- prompts.py: add the Evidence-First Protocol + Skepticism Rules blocks
- openclaw.py: analyze_alert extracts diagnosis_context → injects it into full_prompt as <raw_evidence>

Verification: consensus score went from 0.0 → 0.744 (CrashLoop test case)

P2.1 fix 2026-04-24 ogt + Claude Sonnet 4.6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
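
The consensus failure named in the root cause can be reproduced with a minimal sketch of confidence-weighted voting. This is not the code in consensus_engine: the `Vote` dataclass, the `weighted_consensus` function, and the sample confidences are hypothetical, and the numbers do not reproduce the 0.744 score reported above. It only shows why four votes at confidence 0.0 can never select an action.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Vote:
    """One expert's recommendation (hypothetical shape, not the repo's ExpertAgent)."""
    agent: str
    action: str
    confidence: float


def weighted_consensus(votes: list[Vote]) -> tuple[str, float]:
    """Return (action, score): the action with the largest summed confidence
    and its share of the total weight."""
    totals: dict[str, float] = defaultdict(float)
    for v in votes:
        totals[v.action] += v.confidence
    total = sum(totals.values())
    if total == 0:
        # No vote carries any weight, so no action can win.
        return "NO_ACTION", 0.0
    action = max(totals, key=totals.get)
    return action, totals[action] / total


agents = ("sre", "security", "cost", "performance")

# Before the fix: every agent votes with confidence 0.0.
before = [Vote(a, "RESTART", 0.0) for a in agents]

# After the fix: confidences in the 0.45~0.80 band (illustrative values).
after = [
    Vote("sre", "RESTART", 0.80),
    Vote("security", "NO_ACTION", 0.45),
    Vote("cost", "RESTART", 0.50),
    Vote("performance", "RESTART", 0.60),
]

print(weighted_consensus(before))  # ('NO_ACTION', 0.0)
print(weighted_consensus(after)[0])  # RESTART
```

With zero weights the agents may agree unanimously and the vote still falls through to NO_ACTION, which is why assigning non-zero confidences was enough to unblock the engine.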
2026-04-24 15:52:25 +08:00
parent 359a6ee495
commit bb5f16f8ef
3 changed files with 39 additions and 15 deletions
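
The `<raw_evidence>` injection described in the commit message could look roughly like the sketch below. The helper name `inject_raw_evidence`, the assumption that `diagnosis_context` is a mapping from a command or source label to its text output, and the bracketed label layout are all illustrative; the actual structure used by analyze_alert in openclaw.py is not shown in this commit view.

```python
from typing import Mapping, Optional


def inject_raw_evidence(
    base_prompt: str,
    diagnosis_context: Optional[Mapping[str, str]],
) -> str:
    """Append collected evidence as a <raw_evidence> block.

    Assumes diagnosis_context maps a source label (e.g. a kubectl command)
    to its captured output. Returns the prompt unchanged when there is no
    evidence, so the model sees no empty block.
    """
    if not diagnosis_context:
        return base_prompt
    sections = [
        f"[{source}]\n{output.strip()}"
        for source, output in diagnosis_context.items()
    ]
    evidence = "\n\n".join(sections)
    return f"{base_prompt}\n\n<raw_evidence>\n{evidence}\n</raw_evidence>"


full_prompt = inject_raw_evidence(
    "Analyze the alert below.",
    {"kubectl get pods": "web-1   0/1   CrashLoopBackOff\n"},
)
print(full_prompt)
```

Omitting the block entirely when no evidence was collected keeps the prompt-side rule simple: the model only has to check whether `<raw_evidence>` is present.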
--- a/apps/api/src/core/prompts.py
+++ b/apps/api/src/core/prompts.py
@@ -120,6 +120,24 @@ The `alertname` field is your PRIMARY signal. Use it to determine the problem ty
 **NEVER** use `kubectl rollout restart deployment/awoooi-prod` for database, storage, or network alerts.
 Make `action_title` describe the ACTUAL problem from alertname (not generic "自動修復 AWOOOI 服務").

+## 🧪 Evidence-First Protocol (CRITICAL — overrides intuition)
+
+If the prompt contains a `<raw_evidence>` block, you MUST:
+1. **Read it first** before forming any hypothesis.
+2. **Quote specific lines** from the evidence in your `reasoning` to show you used it.
+3. **Never contradict** the evidence — if kubectl shows 2 pods running, do NOT say pods are down.
+4. **Adjust confidence** based on evidence quality:
+   - Evidence clearly confirms root cause → 0.80–0.95
+   - Evidence partially supports → 0.60–0.79
+   - No evidence or contradictory → 0.30–0.59 (set `primary_responsibility = "COLLAB"`)
+
+## 🔍 Skepticism Rules
+
+- **Forbidden**: Recommending `kubectl rollout restart` when evidence shows the pod is healthy.
+- **Forbidden**: Claiming OOM without memory metrics proving it.
+- **Forbidden**: Setting `confidence > 0.75` when `<raw_evidence>` is absent or shows "error".
+- If you have no concrete evidence, set `suggested_action = "INVESTIGATE"` and provide a diagnostic `kubectl_command` (get/describe/logs/top only).
+
 ## 🔥 Short Example: High CPU -> SCALE_DEPLOYMENT, HPA, risk_level=medium
 Please carefully justify your confidence between 0.0 and 1.0 (e.g. 0.82) based on symptoms and metrics.