OG T
19b00a1ca0
fix(api): remove fake confidence scores from Consensus Engine
...
🔴 Iron-rule violation: feedback_confidence_truthfulness.md
The Expert System must report confidence = 0.0; posing as AI arbitration is forbidden
Fixes:
- SREAgent: 0.85/0.80/0.75/0.60 → 0.0
- SecurityAgent: 0.70/0.85 → 0.0
- CostAgent: 0.75 → 0.0
- PerformanceAgent: 0.80/0.70 → 0.0
All rule matches now correctly display as "⚙️ Rule match" instead of "🤖 AI arbitration"
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:57:04 +08:00
OG T
89a2339796
feat(api): ADR-038 Circuit Breaker integration + Graceful Degradation
...
sentry_webhook.py:
- Integrate OpenClawGuard (Circuit Breaker + Semaphore)
- Fail fast while the circuit is open, without calling OpenClaw
- Concurrency control: at most 3 simultaneous LLM inferences
anomaly_counter.py:
- record_anomaly(): Graceful Degradation on Redis failure
- Return a default AnomalyFrequency (count=0) on failure
- Do not interrupt the main flow
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:55:51 +08:00
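The graceful-degradation behavior described for record_anomaly() can be sketched roughly as follows. This is a minimal illustration, not the code in anomaly_counter.py: the `AnomalyFrequency` shape (a single `count` field) and the injected `increment` callable standing in for the Redis call are assumptions.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class AnomalyFrequency:
    """Hypothetical result type; only `count` is named in the commit."""
    count: int = 0


def record_anomaly(increment, key):
    """Count an anomaly, falling back to count=0 if the store is down.

    `increment` is a hypothetical callable wrapping the Redis counter.
    """
    try:
        return AnomalyFrequency(count=increment(key))
    except Exception:
        # Degrade gracefully: log and keep the main flow running.
        logger.warning("anomaly counter unavailable for %s", key)
        return AnomalyFrequency()
```

The caller always receives an AnomalyFrequency, so a counter outage never surfaces as an exception in the webhook path.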
OG T
39396dc57a
feat(worker): Wave 1 Signal Worker XCLAIM + Graceful Shutdown
...
ADR-038/039 Wave 1 hardening:
- Add Active Sweeper: XPENDING + XCLAIM reclaim idle messages
- PENDING_IDLE_MS: messages with no ACK for 60 s become reclaimable
- SWEEP_INTERVAL_S: scan every 30 s
- Graceful Shutdown: 75 s timeout (paired with the K8s 90 s grace period)
- Force-ACK messages that exceed MAX_RETRIES
K8s Worker Deployment:
- Add terminationGracePeriodSeconds: 90
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:53:05 +08:00
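The sweeper's decision step can be sketched as a pure function over XPENDING-style rows. This is an illustration only: the `PendingEntry` shape and the `plan_sweep` name are hypothetical, and the MAX_RETRIES value is an assumption because the commit does not state it.

```python
from dataclasses import dataclass

PENDING_IDLE_MS = 60_000  # from the commit: reclaimable after 60 s without ACK
MAX_RETRIES = 3           # assumption: the commit does not give the value


@dataclass(frozen=True)
class PendingEntry:
    """One pending-message row (hypothetical shape)."""
    message_id: str
    idle_ms: int
    delivery_count: int


def plan_sweep(entries):
    """Return (ids_to_claim, ids_to_force_ack) for one sweep pass."""
    to_claim, to_force_ack = [], []
    for entry in entries:
        if entry.delivery_count > MAX_RETRIES:
            # Retried too often: acknowledge so it stops circulating.
            to_force_ack.append(entry.message_id)
        elif entry.idle_ms >= PENDING_IDLE_MS:
            # Idle long enough: another worker may take it over.
            to_claim.append(entry.message_id)
    return to_claim, to_force_ack
```

A worker would run this every SWEEP_INTERVAL_S seconds and then issue XCLAIM and XACK for the two lists respectively.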
OG T
27509db212
feat(api): Wave 1 safety net - Circuit Breaker + Global Repair Cooldown
...
ADR-038: two-layer protection for OpenClaw
- Layer 1: Circuit Breaker (5 failures → 60s cooldown)
- Layer 2: Concurrency Semaphore (max 3 concurrent)
- Add src/core/circuit_breaker.py
ADR-039: global repair circuit breaking
- Global Cooldown: 5 repairs/15min → freeze
- StatefulSet Blacklist: automatic restarts forbidden for postgres/redis/clickhouse
- Add src/services/global_repair_cooldown.py
- Integrate into auto_repair_service.py
Tests:
- test_circuit_breaker.py (state transitions + Semaphore)
- test_global_repair_cooldown.py (blacklist + count threshold)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:48:03 +08:00
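A minimal sketch of the Layer 1 breaker, assuming the 5-failure threshold and 60 s cooldown from the commit. The class layout, method names, and injectable clock are hypothetical and are not taken from src/core/circuit_breaker.py; the CLOSED/OPEN/HALF_OPEN names follow the state machine mentioned in commit ae21ba2cc6.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures, allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=5, cooldown_s=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._clock = clock          # injectable so tests need not sleep
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.cooldown_s:
            return "HALF_OPEN"
        return "OPEN"

    def allow_request(self):
        # OPEN fails fast; CLOSED and HALF_OPEN let the call through.
        return self.state != "OPEN"

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        if self.state == "HALF_OPEN":
            # The trial call failed: restart the cooldown.
            self._opened_at = self._clock()
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

Callers check allow_request() before invoking OpenClaw and report the outcome with record_success() or record_failure().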
OG T
2c79cba629
fix(api): fix the last 2 bare except errors
...
- scripts/test_nemotron_tool_calling.py: except -> except Exception
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:37:02 +08:00
OG T
d89f0520f9
fix(api): fix 34 Ruff lint errors
...
- Auto-fix import ordering and unused imports
- Manually fix raise from, isinstance union, unused variable
- Leave scripts/ as-is for now (does not block CI)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:27:49 +08:00
OG T
5f9a6a7e55
fix(ai): remove fake confidence scores + show AI model source
...
Problem: AI arbitration displayed hard-coded confidence scores (0.75/0.88/0.92/0.70)
Fixes:
- decision_manager: default confidence 0.75 → 0.0
- decision_manager: Expert System confidence=0.0 + is_rule_based
- openclaw: all Mock Response confidence → 0.0
- telegram_gateway: add ai_provider field
- telegram_gateway: dynamic source label (Ollama/Gemini/Claude/rule match)
Telegram card display:
- confidence > 0 + provider=ollama → 🤖 Ollama arbitration
- confidence > 0 + provider=gemini → 🤖 Gemini arbitration
- confidence > 0 + provider=claude → 🤖 Claude arbitration
- confidence == 0 → ⚙️ Rule match
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:19:51 +08:00
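The card-label mapping above can be sketched as a small lookup. The function name is hypothetical, the label strings are English renderings of the card text, and the fallback for an unrecognized provider is an assumption since the commit does not cover that case.

```python
RULE_LABEL = "⚙️ Rule match"
PROVIDER_NAMES = {"ollama": "Ollama", "gemini": "Gemini", "claude": "Claude"}


def source_label(confidence, provider):
    """Pick the card label from the confidence score and AI provider."""
    if confidence > 0 and provider in PROVIDER_NAMES:
        return f"🤖 {PROVIDER_NAMES[provider]} arbitration"
    # confidence == 0 means a rule matched; an unknown provider also
    # lands here (assumption, not stated in the commit).
    return RULE_LABEL
```

Keeping the decision in one function means a rule-based result can never be shown with an AI label by accident.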
OG T
c5db6520c8
perf(web): P1 frontend optimization - remove Polling + CSS Cursor Blink
...
Phase 8.0 #15-17 frontend performance optimization:
#15 Sidebar Polling → SSE:
- Remove 30s setInterval polling
- Use the SSE-driven pendingApprovals from useApprovalStore instead
- Add mounted check to prevent hydration mismatch
#16 Cursor Blink DOM Bypass:
- thinking-stream.tsx: setInterval → animate-pulse
- ai-thinking-panel.tsx: remove cursorVisible state
- clawbot-panel.tsx: remove cursorVisible state
- openclaw-panel.tsx: remove cursorVisible state
#17 Hydration Fix:
- sidebar.tsx badge: add mounted check
Result: -46 lines of code (unnecessary setState/setInterval removed)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:09:44 +08:00
OG T
49f21dc4e1
test(api): P1-3/P1-4 ApprovalRequestCreate + Telegram tests
...
P1-3: ApprovalRequestCreate field-alignment tests (13 tests)
- Required-field validation (action, description, requested_by)
- BlastRadius Model validation
- SignOz/Sentry/GitHub Webhook format validation
- Pydantic v2 extra-field behavior validation
P1-4: Telegram integration verification tests (19 tests)
- SignOzMetricsBlock formatting
- TelegramMessage structure
- Risk-level Emoji mapping
- Webhook → Telegram message flow
Follows: feedback_no_mock_testing.md (Mocks forbidden)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 11:28:33 +08:00
OG T
ac2715e541
fix(api): P1-2 ApprovalRequestCreate field alignment
...
Fix ApprovalRequestCreate in the SignOz + GitHub Webhooks:
Before (wrong fields):
- action_type, target_resource, source
- blast_radius=BlastRadius.SINGLE (enum does not exist)
- dry_run_check=DryRunCheck.SKIPPED (wrong format)
- Missing action, description, requested_by
After (correct fields):
- action, description (required)
- blast_radius=BlastRadius(...) (Pydantic Model)
- dry_run_checks=[] (list)
- requested_by (required)
- Other fields moved into metadata
Follows: ApprovalRequestBase schema (approval.py)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 11:17:27 +08:00
OG T
50c055b547
feat(api): Phase D-G P0 fixes - Learning Repository modularization
...
Added:
- ILearningRepository Protocol (interfaces.py)
- LearningRepository (Redis persistence layer)
- Learning API endpoints (/api/v1/learning/*)
- LearningService.get_recommended_fix() method
- LearningService.get_learning_summary() method
Fixed:
- Service no longer depends directly on the Redis Client (it goes through the Repository)
- Conforms to the leWOOOgo building-block principle
- Chief architect review: 74/100 → 92/100
Updated:
- ADR-030: add Phase D-G P0 fixes section
- Skill 02: v1.9 → v2.0
- Runner fix: serialized builds resolve the _runner_file_commands conflict
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 11:03:51 +08:00
OG T
ae21ba2cc6
feat(ai): Phase 20 P3 optimization - Circuit Breaker + exponential backoff + Prometheus
...
P3-1: Circuit Breaker state machine (CLOSED/OPEN/HALF_OPEN)
- 3 consecutive failures trip the breaker
- Automatically attempt recovery after 60 s
- Prevent cascading failures
P3-2: Exponential-backoff retry
- Base delay 1s, max 30s
- Includes 10% jitter to avoid a thundering herd
P3-3: Prometheus Metrics
- nvidia_tool_call_requests_total (status, tool_name)
- nvidia_tool_call_latency_seconds (histogram)
- nvidia_circuit_breaker_state_changes_total
Tests: 25 → 34 PASSED
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:49:08 +08:00
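The P3-2 retry delay can be sketched as below, using the 1 s base, 30 s cap, and 10% jitter from the commit. The doubling factor, the symmetric (plus or minus 10%) jitter, and the function name are assumptions; the commit only says the backoff is exponential.

```python
import random


def backoff_delay(attempt, base_s=1.0, max_s=30.0, jitter=0.10, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Double the delay on each attempt, capped at max_s.
    delay = min(max_s, base_s * (2 ** attempt))
    # Shift by up to +/- jitter so clients do not all retry in lockstep.
    # Note the jitter is applied after the cap, so the result can exceed
    # max_s by up to 10%.
    spread = delay * jitter * (2 * rng() - 1)
    return delay + spread
```

Passing a fixed `rng` makes the delay deterministic, which keeps retry tests free of real randomness.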
OG T
d9a6f9d066
feat(api): automated UX monitoring via Sentry Session Replay
...
Phase 19 UX monitoring - making good use of Sentry Session Replay:
- SentryService: add list_replays, get_ux_audit_summary
- Detect: Rage Clicks + Dead Clicks
- Detect: Session Replays that contain errors
- Detect: UI-related errors (TypeError/render)
- API: GET /api/v1/errors/ux-audit endpoint
- Script: audit_ux_sentry.py CLI tool
Commander feedback: "AI must be fully automated!" ✅
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:48:59 +08:00
OG T
8fa99209c3
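fix(web): OmniTerminal Escape-to-close + responsive bottom drawer
...
Phase 19.R - UX fixes:
- Add Escape key to close the Terminal (previously only CMD+J)
- Mobile: full screen changed to a 70vh bottom drawer
- Add a semi-transparent backdrop; clicking it closes the Terminal
- Responsive: three tiers, Mobile/Tablet/Desktop
Issue fixed: Terminal could not be closed once opened
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>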
2026-03-29 01:47:05 +08:00
OG T
d6dc80bcbc
fix(sentry): correct OpenClaw URL 8088→8089
...
ADR-028 unified the ports; the Sentry webhook was missed in that update
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:46:28 +08:00
OG T
b0b91a59e5
fix(telegram): fix unresponsive approval buttons - wrong method names
...
Root cause:
- telegram_gateway.py calls service.add_signature(), but that method does not exist
- telegram.py calls service.reject(), but that method does not exist
- The correct methods are sign_approval() and reject_approval()
Fix:
- _execute_approval_action: add_signature → sign_approval
- _execute_approval_action: reject → reject_approval
- telegram webhook: same fix applied
Impact:
- Telegram approve/reject/snooze/silence buttons now work
- Frontend Y/n buttons already used the correct API (unaffected)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:36:38 +08:00
OG T
179e659f14
chore: clean up Playwright artifacts + expand kube-state-metrics alerts
...
Cleanup:
- .gitignore: exclude playwright-report/ and test-results/
- Keep the phase19/ reference screenshot directory
kube-state-metrics alert expansion (P3):
- CronJobLastRunFailed: Job run failed
- DaemonSetMissingPods: DaemonSet is missing Pods
- StatefulSetReplicasMismatch: StatefulSet has too few replicas
- ContainerWaiting: detects ImagePullBackOff/CrashLoopBackOff
- PDBViolation: too few healthy Pods for the PDB
- NodeUnschedulable: node marked unschedulable
Added:
- apps/api/scripts/test_nemotron_tool_calling.py (E2E comparison test)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:28:35 +08:00
OG T
725392b578
fix(k8s): NetworkPolicy bypasses kustomize commonLabels
...
Problem: kustomize commonLabels are added to NetworkPolicy egress[].to[].podSelector,
so the DNS rule requires CoreDNS pods to carry system:awoooi + environment:prod,
but CoreDNS only has k8s-app:kube-dns, which breaks DNS resolution
Fix:
- kustomization.yaml: remove 02-network-policy.yaml
- cd.yaml: add an Apply NetworkPolicy step that applies it separately
2026-03-29 ogt: root-cause fix
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:27:29 +08:00
OG T
4f7282a97a
fix(ai): Phase 20 P2 fixes - Protocol + boundary tests + model_registry
...
P2-1: define INvidiaProvider Protocol (@runtime_checkable)
P2-2: extend boundary tests from 15 → 25 cases
P2-3: model_registry adds NVIDIA + tool_calling_fallback_order
Chief architect score: 82 → 86 → 90/100
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:24:17 +08:00
OG T
ee2bceefff
feat(monitoring): Phase 19.6 test documentation + P1-P3 improvements + chief architect review
...
Phase 19.6 test documentation wrap-up:
- E2E tests expanded to 18 (Terminal/GenUI verification)
- Add PHASE19-VERIFICATION-CHECKLIST.md (full verification checklist)
P1 verification:
- ArgoCD Metrics NodePort monitoring (30883/30884)
- TLS certificate monitoring (Blackbox Exporter 9115)
P2 improvements:
- waitForTimeout → waitForLoadState('networkidle')
- Cross-platform shortcut (Meta+J / Control+J)
- SKIP_MULTISIG_TESTS environment variable control
- Prometheus GitOps deployment script
P3 improvements:
- HPA maxReplicas 4 → 6 (API/Web)
Chief architect review: 47/50 OUTSTANDING (94%)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:19:26 +08:00
OG T
6de1c0ff3b
fix(ai): fix Pydantic validation error + tuple unpacking
...
1. kubectl_command allows None (the LLM may return null)
2. Add a field_validator that converts null to an empty string
3. generate_incident_proposal unpacks all 6 values (including ai_tokens/ai_cost)
2026-03-29 ogt: Gemini API validation fix
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 00:46:02 +08:00
OG T
fb643eb645
feat(ai): ADR-036 Nemotron E2E verification script
...
Add verify_nemotron_e2e.py:
- Test NVIDIA API connectivity
- Test AIRouter integration
- Test high-risk Tool detection
- Test Traditional Chinese Tool Calling
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 00:11:40 +08:00
OG T
7c905c4bf3
fix(ai): fix generate_incident_proposal tuple unpacking error
...
- _call_with_cache returns 6 values (including ai_tokens/ai_cost)
- generate_incident_proposal unpacked only 4 values, raising ValueError
- Fix: unpack all 6 values and pass ai_tokens/ai_cost into proposal_dict
2026-03-29 ogt: Token/Cost tracking follow-up
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 00:03:22 +08:00
OG T
b77e151387
feat(ai): ADR-036 NVIDIA Nemotron Tool Calling integration
...
Phase 20 - raise Tool Calling accuracy 50% → 83.3%
Added:
- src/models/nvidia.py: Pydantic Schema
- src/services/nvidia_provider.py: NvidiaProvider class
- tests/test_nvidia_provider.py: 15 unit tests (all passing)
Changed:
- ai_router.py: AIProvider.NVIDIA + route_tool_calling()
- ai_rate_limiter.py: NVIDIA limits (5 RPM, 100/day)
- models.json: NVIDIA configuration
- cd.yaml: inject NVIDIA_API_KEY via Secrets
Routing strategy:
- Tool Calling: Nemotron → Gemini → Claude
- General chat: Ollama → Gemini → Claude (unchanged)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 00:00:08 +08:00
OG T
e75e578547
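feat(monitoring): P1/P2 improvements - ArgoCD Metrics + TLS certificate alerts
...
## P1: ArgoCD Metrics
- Add ArgoCD Metrics NodePort (30882, 30883)
- Update NetworkPolicy to allow Prometheus (188) to scrape
- Provide a Prometheus scrape config template
## P1: NetworkPolicy AI API
- Document that K8s NetworkPolicy does not support FQDN restrictions
- Keep the existing configuration to avoid disrupting AI features
## P2: TLS certificate alerts
- Add TLSCertExpiringIn30Days (30-day early warning)
- Add TLSCertExpiringIn7Days (7-day urgent)
- Add TLSCertExpired (expired)
- Add TLSProbeFailure (probe failed)
## P2: Multi-Sig E2E tests
- Marked as conditional (skipped automatically when the API is unavailable)
- Avoid CI/CD failures caused by being unable to reach the production API
Chief architect review: 2026-03-29 (Taipei time)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>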
2026-03-28 23:48:57 +08:00
OG T
6ac0f8c0e5
chore: force API rebuild (runner temp file fix)
...
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 23:47:18 +08:00
OG T
ba521fa531
fix(ai): update Gemini model name 1.5-flash → 2.0-flash (2026-03-28 ogt)
...
Root cause: gemini-1.5-flash has been retired; the API returns 404
Solution: update to gemini-2.0-flash
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 23:23:52 +08:00
OG T
c76a10ad6e
feat(ai): $5 USD cost cap + automatic switch to Ollama (2026-03-29 ogt)
...
Commander requirements:
1. Cumulative cost exceeds $5 USD → automatically disable Gemini and switch back to Ollama
2. Send a Telegram alert to the Commander
3. Send a warning at $4 USD
Implementation:
- ai_rate_limiter.py: add COST_LIMITS, record_cost(), reset_cost()
- openclaw.py: record the cost after every successful call
- Cost stored in Redis (no expiry, reset manually)
- Reset command: redis-cli DEL ai_rate:total_cost:gemini
API endpoint: GET /api/v1/health/ai-usage
- Shows total_cost_usd.current/limit/remaining
- Shows cost_exceeded: true/false
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 22:34:51 +08:00
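The threshold logic described above can be sketched as follows, using the $5 cap and $4 warning level from the commit. The function names and the exact nesting of the usage summary are assumptions modeled on the field names listed for the ai-usage endpoint; the real code lives in ai_rate_limiter.py and keeps the running total in Redis.

```python
COST_LIMIT_USD = 5.0  # from the commit: Gemini is disabled above this
COST_WARN_USD = 4.0   # from the commit: a warning is sent at this level


def cost_status(total_cost_usd):
    """Classify the cumulative cost as 'ok', 'warning', or 'exceeded'."""
    if total_cost_usd > COST_LIMIT_USD:
        return "exceeded"
    if total_cost_usd >= COST_WARN_USD:
        return "warning"
    return "ok"


def pick_provider(total_cost_usd):
    """Fall back to the local Ollama model once the cap is exceeded."""
    return "ollama" if cost_status(total_cost_usd) == "exceeded" else "gemini"


def usage_summary(total_cost_usd):
    """Build a payload shaped like the ai-usage fields (hypothetical nesting)."""
    return {
        "total_cost_usd": {
            "current": total_cost_usd,
            "limit": COST_LIMIT_USD,
            "remaining": max(0.0, COST_LIMIT_USD - total_cost_usd),
        },
        "cost_exceeded": total_cost_usd > COST_LIMIT_USD,
    }
```

Because the stored total never expires, the provider stays on Ollama until the Redis key is deleted manually.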
OG T
d469a239af
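fix(ai): remove confidence default, force the LLM to compute a real value
...
Changes:
1. models/ai.py: confidence is now REQUIRED (default=0.8 removed)
2. openclaw.py: if the LLM outputs no confidence, set it to 0.5 + COLLAB
Root cause:
- The previous Pydantic default=0.8 made the confidence score always 80%
- The LLM is now forced to compute a real confidence score
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>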
2026-03-28 22:21:29 +08:00
OG T
984d31de0c
feat(ai): Gemini first + Token/Cost tracking (2026-03-29 ogt)
...
Changes:
1. ConfigMap: Gemini first ["gemini","ollama","claude"]
2. openclaw.py: capture Gemini usageMetadata (tokens/cost)
3. webhooks.py: pass ai_tokens/ai_cost to Telegram
4. telegram_gateway.py: display 💰 Tokens: X / $Y.YYYY
Gemini 1.5 Flash pricing:
- Input: $0.075/1M tokens
- Output: $0.30/1M tokens
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 22:18:24 +08:00
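The cost arithmetic implied by the pricing above can be sketched as follows. The function names are hypothetical, and treating "Tokens: X" as the sum of input and output tokens is an assumption; the per-million rates come from the commit.

```python
INPUT_USD_PER_M = 0.075   # from the commit: input price per 1M tokens
OUTPUT_USD_PER_M = 0.30   # from the commit: output price per 1M tokens


def gemini_cost_usd(input_tokens, output_tokens):
    """Cost of one call in USD at the per-million-token rates."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def format_cost_line(input_tokens, output_tokens):
    """Render the Telegram line in the '💰 Tokens: X / $Y.YYYY' format."""
    total_tokens = input_tokens + output_tokens  # assumption: X is the total
    cost = gemini_cost_usd(input_tokens, output_tokens)
    return f"💰 Tokens: {total_tokens} / ${cost:.4f}"
```

For example, 1,000 input tokens and 500 output tokens cost (75 + 150) / 1,000,000 = $0.000225, which the four-decimal format shows as $0.0002.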
OG T
26839227ff
fix(web): fix TypeScript errors
...
- useCSRF: correct import path @/lib/env → @/lib/config
- terminal-telemetry: add UNKNOWN_COMPONENT error code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 19:06:44 +08:00
OG T
6ca2efe27b
fix(ci): fix ESLint + spectral-cli installation errors
...
- Remove the nonexistent @typescript-eslint/no-deprecated rule
- Fix npm ENOTEMPTY error (clean up the old directory first)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 19:00:06 +08:00
OG T
59c9eff83a
fix(api): fix 10 Lint errors (import ordering + unused imports + set comprehension)
...
- F401: remove unused imports (TerminalSessionStatus, AutoApproveDecision, TerminalSession)
- I001: fix import block ordering
- C401: set(generator) → {set comprehension}
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:51:52 +08:00
OG T
c361153c67
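fix(ui): Phase 19 P1 fix Header "Disconnected" status
...
Problem: non-Dashboard pages showed "Disconnected" because SSE was started only on the Dashboard
Fix:
- AppLayout starts the SSE connection globally (shared by all pages)
- LiveDashboard: remove the duplicate SSE connection logic
- All pages now show the correct connection status
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>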
2026-03-28 18:45:26 +08:00
OG T
d206460751
feat(security): Phase 20 CSRF protection implementation
...
The Phase 19 chief architect review flagged that the nuclear-key UX lacked CSRF protection
Backend:
- Add src/core/csrf.py (Double Submit Cookie pattern)
- Add src/api/v1/csrf.py (GET /api/v1/csrf/token)
- Add src/models/csrf.py (CSRFTokenResponse)
- Update approvals.py sign/reject/bulk endpoints to validate CSRFToken
Frontend:
- Add hooks/useCSRF.ts (React Hook)
- Update approval.store.ts to pass the CSRF Token parameter
Security properties:
- 256-bit Token (secrets.token_hex)
- Timing-safe comparison (secrets.compare_digest)
- SameSite=Strict Cookie
- 1-hour Token lifetime
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:31:58 +08:00
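The Double Submit Cookie check can be sketched with the two `secrets` calls named in the commit. The function names, the way the issue time is tracked, and the injectable `now` parameter are assumptions for illustration; this is not the code in src/core/csrf.py.

```python
import secrets
import time

TOKEN_TTL_S = 3600  # from the commit: tokens are valid for 1 hour


def issue_token(now=None):
    """Create a 256-bit token (32 random bytes, 64 hex characters)."""
    issued_at = time.time() if now is None else now
    return secrets.token_hex(32), issued_at


def verify(cookie_token, header_token, issued_at, now=None):
    """Accept a request only if cookie and header carry the same fresh token."""
    current = time.time() if now is None else now
    if current - issued_at > TOKEN_TTL_S:
        return False
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison; bytes avoid a TypeError on non-ASCII input.
    return secrets.compare_digest(
        cookie_token.encode("utf-8"), header_token.encode("utf-8")
    )
```

The server sets the token in a SameSite=Strict cookie, and the client echoes it back in a request header; a cross-site page can trigger the request but cannot read the cookie to fill in the header.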
OG T
cd305a0baf
fix(test): fix wrong paths in Phase 19 E2E tests
...
- /incidents changed to /action-logs (correct route)
- 11/11 tests pass
- Update verification report
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:30:49 +08:00
OG T
7b9b0c490b
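feat(phase19): Omni-Terminal 100% complete + chief architect review 47/50
...
## Phase 19 Omni-Terminal (Wave 0-6 all complete)
### Core features
- SSE state machine (7-State design, scored 10/10)
- GenUI dynamic rendering (6 cards + Zod Schema validation)
- Nuclear-key UX (long-press authorization + risk tiers)
- Terminal Telemetry (Sentry integration)
### P0-P2 fixes
- P0: Singleton → FastAPI Depends dependency injection
- P1: Zod Schema upgrade (7 validation Schemas)
- P1: Error classification code aggregation (Sentry fingerprint)
- P2: Slow Query monitoring (5s warning / 10s critical)
### Tests
- test_terminal_service.py: all 54 tests pass
- Intent classification: 42 test cases (9 IntentTypes)
### Documentation
- ADR-031: SSE architecture implementation record
- ADR-032: GenUI rendering implementation record
- Skills: v1.9 (backend Terminal section)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>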
2026-03-28 18:04:12 +08:00
OG T
ecdcb6110e
fix(api): fix Sentry Approval creation parameters (P2)
...
ApprovalDBService.create_approval() does not accept an approval_id parameter
The ID is generated by the Service and read from ApprovalRequest.id after the call returns
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 00:18:12 +08:00
OG T
e5ded3b3f2
feat(phase19): OmniTerminal + GenUI + Hybrid SSE architecture implementation (Wave 0-2)
...
Phase 19 OmniTerminal MVP complete:
- Wave 0: Backend (Hybrid SSE POST→GET architecture)
- Wave 1: Frontend (OmniTerminal state machine + GenUI Registry)
- Wave 2: UI components (8 GenUI dynamic cards)
ADR documents:
- ADR-031: OmniTerminal SSE architecture
- ADR-032: GenUI dynamic rendering framework
- ADR-033: K3s HA architecture design
GenUI components:
- GenUIRenderer, K8sPodStatusCard, SentryErrorCard
- MetricsSummaryCard, IncidentTimelineCard
- TraceWaterfallCard, ApprovalCard, NuclearKeyButton
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 00:17:26 +08:00
OG T
a5ff57ddc3
fix(api): align Sentry Approval fields with ApprovalRequestBase
...
- ApprovalRequestCreate uses the correct fields (action, description, blast_radius...)
- BlastRadius uses a Model instance instead of a nonexistent enum
- Remove unused DryRunCheck import
- Original fields moved into metadata
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 23:14:24 +08:00
OG T
74734f5b8a
fix(api): fix SentryService.check_dedup Redis import
...
- get_redis_pool → get_redis (correct function name)
- Found by Phase 10.2.1 E2E tests
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 23:03:31 +08:00
OG T
7456492482
fix(api): register Sentry Webhook Router (Phase 10.2.1)
...
- Add sentry_webhook_v1 import
- include_router registers the /api/v1/webhooks/sentry/* routes
- Fix the Sentry Alert Rule → AWOOOI connection
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 16:13:04 +08:00
OG T
2b069818af
refactor(api): move Sentry dedup logic to the Service layer (leWOOOgo modularization)
...
Phase 10.2.1 - 2026-03-27 Taipei time zone
- Move check_sentry_dedup() from the Router to SentryService.check_dedup()
- The Router layer must not access Redis directly (per the leWOOOgo building-block principle)
- Keep the 10-minute TTL dedup window
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 15:04:53 +08:00
OG T
138ef0c2db
fix(api): fix 7 Lint errors (unused imports + zip strict + dict comprehension)
...
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 14:42:47 +08:00
OG T
177563f513
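fix(api): alert convergence no longer re-sends Telegram messages
...
Problem: when alerts with the same fingerprint converged, Telegram messages were still sent repeatedly
Fix: converged alerts only update hit_count and skip the Telegram push
Affects: both the /alerts and /alertmanager endpoints
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>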
2026-03-27 14:21:22 +08:00
OG T
7720551b8c
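fix(api): fix duplicated INC-INC- prefix in Telegram messages
...
Problem: the incident_id generation logic in TelegramMessage.format()
still added an "INC-" prefix when approval_id was already in "INC-xxx" format
Fix: check whether approval_id already has the INC- prefix to avoid duplication
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>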
2026-03-27 10:28:18 +08:00
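The prefix check described in this fix amounts to an idempotent helper along these lines. The function name is hypothetical; the actual logic sits inside TelegramMessage.format().

```python
INCIDENT_PREFIX = "INC-"


def incident_id(approval_id):
    """Return the ID with exactly one leading INC- prefix added if missing."""
    if approval_id.startswith(INCIDENT_PREFIX):
        return approval_id
    return INCIDENT_PREFIX + approval_id
```

Applying the helper twice gives the same result as applying it once, which is what prevents INC-INC- from reappearing when an ID passes through several layers.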
OG T
f1b037bb0c
refactor(api): playbook_rag.py modularization (P1 violation fixes)
...
P1 violations fixed:
- Line 29: Service imported Redis directly → Repository Pattern
- Line 156: self-built httpx.AsyncClient → DI injection
Changes:
- Add IEmbeddingCacheRepository Protocol (interfaces.py)
- Add EmbeddingCacheRepository implementation (embedding_repository.py)
- PlaybookRAGService now takes http_client + embedding_cache via DI
- get_playbook_rag_service() becomes an async factory
- PlaybookService switches to lazy initialization
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 10:07:30 +08:00
OG T
abc21c735e
feat(api): P1 Telegram button improvements - snooze/silence
...
New buttons:
- ⏰ Snooze (snooze): remind again after a 30-minute delay
- 🔕 Silence 1h (silence): silence alerts for the same kind of resource for 1 hour
Implementation details:
- telegram_gateway.py: add _handle_snooze/_handle_silence
- decision_manager.py: check silence state before sending
- Redis Key: telegram_snooze:{approval_id}, telegram_silence:{resource_name}
- Skill 03 v1.5 → v1.6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 09:50:28 +08:00
OG T
79b526b472
fix(api): P0 unify Stream Key as awoooi:signals
...
Fix Producer/Worker/Webhooks using different Stream Keys, which left messages unconsumed
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 09:33:13 +08:00
OG T
e34b0f2e9a
fix(api): Telegram dedup + fix duplicated INC-INC-INC- prefix
...
- Add Redis dedup mechanism (10-minute TTL)
- Fix approval_id having the INC- prefix added repeatedly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 09:27:40 +08:00
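The 10-minute dedup window can be illustrated with an in-memory stand-in for the Redis key. The class and method names are hypothetical, and the commit does not say which Redis command is used; a common way to get the same effect is an atomic `SET key value NX EX 600`.

```python
import time


class TTLDeduper:
    """Remembers keys for ttl_s seconds; in-memory stand-in for Redis."""

    def __init__(self, ttl_s=600.0, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock       # injectable so tests need not sleep
        self._expires = {}        # key -> expiry time

    def first_seen(self, key):
        """Return True if the key is new (or expired) and record it."""
        now = self._clock()
        expiry = self._expires.get(key)
        if expiry is not None and now < expiry:
            return False          # duplicate inside the window: skip sending
        self._expires[key] = now + self._ttl_s
        return True
```

The sender calls first_seen() with the alert's dedup key and pushes to Telegram only when it returns True. Unlike Redis, this sketch is per-process and never evicts old keys, so it is suitable for illustration only.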