awoooi/k8s/awoooi-prod/04-configmap.yaml.patch-consensus
Your Name 1ab6786ce3
Some checks failed
run-migration / migrate (push) Failing after 13s
CD Pipeline / build-and-deploy (push) Failing after 2m1s
feat(ops): Ollama disaster-recovery runbook + Grafana dashboard + Consensus K8s ConfigMap patch
Wave 6 P2.3 ops support + tool-expert deployment files:

Added:
- docs/runbooks/RUNBOOK-OLLAMA-FAILOVER.md (240 lines)
  · Verification steps for the three iron rules (automatic failover to Gemini / automatic failback / quota circuit breaker)
  · Complete failover/recovery SOP
  · Troubleshooting checklist (Ollama 111/188 unreachable, Gemini quota overrun, etc.)
- ops/monitoring/grafana/dashboards/ollama_failover.json (295 lines)
  · 4 panels: current primary / failover events / quota usage / health status
  · Corresponding P2.3 metrics: OLLAMA_FAILOVER_TRIGGERED_TOTAL / GEMINI_DAILY_CALL_COUNT
- k8s/awoooi-prod/04-configmap.yaml.patch-consensus
  · ENABLE_12AGENT_CONSENSUS / ENABLE_AIOPS_P2_FUSION feature flag patch

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: tool-expert agent (Wave 6) <noreply@anthropic.com>
2026-04-27 08:11:40 +08:00
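
The three rules named in the commit message (automatic failover to Gemini, automatic failback, quota circuit breaker) can be illustrated with a small Python sketch. This is not taken from the runbook or the codebase: `FailoverState`, `choose_backend`, and the `daily_quota` parameter are hypothetical names, and the two counters only loosely mirror the OLLAMA_FAILOVER_TRIGGERED_TOTAL and GEMINI_DAILY_CALL_COUNT metrics mentioned above.

```python
from dataclasses import dataclass


@dataclass
class FailoverState:
    daily_quota: int             # hypothetical Gemini daily call budget
    gemini_calls_today: int = 0  # loosely mirrors GEMINI_DAILY_CALL_COUNT
    failover_events: int = 0     # loosely mirrors OLLAMA_FAILOVER_TRIGGERED_TOTAL
    primary: str = "ollama"


def choose_backend(state: FailoverState, ollama_healthy: bool) -> str:
    """Pick a backend for one call, updating the counters in `state`."""
    if ollama_healthy:
        # Rule 2: fail back to Ollama as soon as it is healthy again.
        state.primary = "ollama"
        return "ollama"
    if state.gemini_calls_today >= state.daily_quota:
        # Rule 3: quota circuit breaker; refuse rather than overrun the quota.
        raise RuntimeError("Gemini daily quota exhausted; circuit open")
    if state.primary != "gemini":
        # Rule 1: fail over to Gemini; count only the transition itself.
        state.failover_events += 1
        state.primary = "gemini"
    state.gemini_calls_today += 1
    return "gemini"
```

With `FailoverState(daily_quota=2)`, two unhealthy checks route to Gemini and record a single failover event; a third raises instead of exceeding the quota, and the next healthy check returns to Ollama.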

29 lines
1.5 KiB
Plaintext

# ============================================================================
# PATCH: P2.4 enable the 12-Agent ConsensusEngine
# Date: 2026-04-26 (Taipei time zone)
# Owner: ogt + Claude Sonnet 4.6
# ADR reference: ADR-095, plan_complete_v3.md P2.4
# Description:
# Once ENABLE_12AGENT_CONSENSUS is set to "true", the decision path for P0/P1 incidents
# calls ConsensusEngine, which combines the opinions of four experts: SRE, Security, Cost, and Performance.
# Consensus score ≥0.6 → READY (may execute automatically); <0.6 → fallback to expert_analyze
# Scope of impact:
# - incident_analysis_sweeper: P0/P1 incidents call get_or_create_decision_with_consensus
# - decision_manager: adds an ENABLE_12AGENT_CONSENSUS flag gate
# ⚠️ Note: ConsensusEngine needs to call Ollama/NIM; confirm the AI services are available before enabling
# ⚠️ This patch is for review only; apply it manually after the commander approves
# ============================================================================
#
# Add the following line to /Users/ogt/awoooi/k8s/awoooi-prod/04-configmap.yaml
# Suggested location: after the TG_GROUP_CUTOVER line
#
# --- Added content ---
# 2026-04-26 P2.4 ogt + Claude Sonnet 4.6: enable the 12-Agent ConsensusEngine (ADR-095)
# P0/P1 incidents go through ConsensusEngine → 4 experts vote in parallel → consensus ≥0.6 executes automatically
ENABLE_12AGENT_CONSENSUS: "true"
# --- End of added content ---
#
# Usage (apply manually after user review):
# kubectl -n awoooi-prod apply -f k8s/awoooi-prod/04-configmap.yaml
# kubectl -n awoooi-prod rollout restart deployment/awoooi-api
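
The gating described in the header comment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `consensus_enabled`, `route`, and the `"legacy_path"` label are hypothetical, and the sketch assumes the ConfigMap key reaches the process as an environment variable. Only the flag name, the P0/P1 scope, the 0.6 threshold, the READY state, and the expert_analyze fallback come from the patch.

```python
import os

CONSENSUS_THRESHOLD = 0.6  # from the patch header: score >= 0.6 -> READY


def consensus_enabled(env=None) -> bool:
    """Read the feature flag (assumed to be exposed as an environment variable)."""
    env = os.environ if env is None else env
    return env.get("ENABLE_12AGENT_CONSENSUS", "false").strip().lower() == "true"


def route(severity: str, consensus_score: float, env=None) -> str:
    """Hypothetical gate deciding which decision path an incident takes."""
    if severity not in ("P0", "P1") or not consensus_enabled(env):
        # The patch does not say what happens here; "legacy_path" is a placeholder.
        return "legacy_path"
    if consensus_score >= CONSENSUS_THRESHOLD:
        return "READY"  # eligible for automatic execution
    return "expert_analyze"  # below threshold: fall back, as the patch describes
```

For example, `route("P0", 0.6, {"ENABLE_12AGENT_CONSENSUS": "true"})` returns `"READY"`, while the same call with a score of 0.59 returns `"expert_analyze"`. Note that the flag value is the string `"true"`, since ConfigMap values are strings.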