55 Commits

Author SHA1 Message Date
OG T
c9c60c3a61 feat(mcp-integrations): Phase S 架構修復 + MCP 整合基礎建設
Some checks failed
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Type Sync Check / check-type-sync (push) Failing after 22s
Phase S 技術債修復 (首席架構師審查 82→完整):
- S-01: generate_alert_fingerprint 移至 AlertAnalyzer.generate_fingerprint() staticmethod
- S-04: 移除 Pydantic v2 deprecated json_encoders (直接用原生 datetime 序列化)

Sentry MCP 整合 (Phase 23):
- ADR-048: Sentry→OpenClaw AI Triage 架構決策
- sentry_webhook_service.py: parse/analyze/create_incident/build_message Service 層
- config.py: SENTRY_WEBHOOK_SECRET (Fail-Closed HMAC-SHA256)

Playwright MCP 整合 (短期):
- smoke.spec.ts: 5 頁面 E2E smoke test (home/dashboard/incidents/approvals/terminal)
- cd.yaml: E2E Smoke Test 步驟 + Telegram 🎭 Smoke 狀態通知

長期規劃 ADR:
- ADR-049: Figma Code Connect 設計系統同步
- ADR-050: Telegram 互動式 Incident 2.0 (6鍵 Inline Keyboard)
- ADR-051: Context7 依賴升級顧問 (Next.js 14→15, FastAPI 0.115→0.128)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 16:20:57 +08:00
OG T
22de22c989 refactor(phase-s): Phase S 技術債清理 - 五項架構改善
S-01: generate_alert_fingerprint() 移至 alert_analyzer_service (Router→Service)
S-02: 移除廢棄 USE_NEW_ENGINE config (Phase R 已完成歷史使命)
S-03: github_webhook.py linter 清理 (Field unused + delivery_id noqa)
S-04: Pydantic v2 遷移 - approval/incident models (class Config → ConfigDict)
S-05: Skill 09 v1.1 更新 (USE_NEW_ENGINE 廢棄說明)

測試: 393 passed, 零失敗

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 13:12:02 +08:00
OG T
6fed8be8c4 docs(adr): ADR-024 R4 Router 瘦身標記完成
Some checks failed
E2E Health Check / e2e-health (push) Successful in 17s
Type Sync Check / check-type-sync (push) Failing after 22s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 09:27:40 +08:00
OG T
5086bafa36 docs: ADR-045 Telegram Gateway 統一到 K8s AWOOOI API
記錄 2026-03-31 已實施的架構決策:
- 統一 Telegram 到 K8s AWOOOI API Webhook 模式
- 解決 OpenClaw (188) Long Polling 雙軌競爭問題

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 09:17:08 +08:00
OG T
a94bb57d8b feat(types): ADR-046 IncidentConverter + IncidentEngineAdapter
實作 ADR-046 Option B: IncidentConverter 轉換層,解決
BrainIncident (lewooogo-brain) 與 LocalIncident (apps/api) 型別邊界問題。

變更:
- 新增 src/utils/incident_converter.py
  - brain_to_local(): BrainIncident → LocalIncident
  - local_to_brain(): LocalIncident → BrainIncident
  - ESCALATED → MITIGATING 映射 (brain 無 ESCALATED)
- incident_engine.py: 新增 IncidentEngineAdapter 包裝層
  - process_signal() / get_incident() 輸出轉換為 LocalIncident
  - get_incident_engine() 返回 IncidentEngineAdapter
- incident_memory.py: 加入 brain_to_local import,更新 _record_to_incident 說明
- ADR-046: 標記三個轉換點全部完成

解鎖: #123 proposal_service.py 清理 (下一步)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 22:47:54 +08:00
OG T
2ba61acf72 fix(api): Phase R-R2.2 首席架構師 72/100 P2 修復
P2-01 signal_worker.py: persisted_to_pg 改用 getattr 防 BrainIncident AttributeError
P2-02 IIncidentEngine Protocol: update_incident_status → update_status 對齊 brain 實作
P2-03 config.py USE_NEW_ENGINE: 標記失效 + 回滾路徑更正 (git revert 而非 kubectl)
ADR-046: Option B (IncidentConverter) 決策完成,待實作清單更新
ADR-024: 審查結論 + 正式回滾指令更新
Skill 02: v2.5 版本記錄

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 22:33:08 +08:00
OG T
cd91560e0b docs: Phase R-R2 完成文件更新 + ADR-046 型別統一
- ADR-024: 更新執行進度 (R1 R2 R3 R4待執行)
- ADR-046: 新增跨套件 Incident 型別統一治理 (待決策)
  推薦 Option B: IncidentConverter 轉換層
- Skill 02: v2.5 記錄 Phase R-R2 + R-R2.1 + ADR-046
- LOGBOOK: 更新當前狀態

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 22:17:44 +08:00
OG T
dd526684ab feat(ai): Phase 22 OpenClaw + Nemotron 協作架構 (ADR-044)
All checks were successful
E2E Health Check / e2e-health (push) Successful in 17s
統帥批准實作「仲裁-執行分工」架構:
- OpenClaw = 仲裁者 (Why + Risk Level)
- Nemotron = 執行者 (How + kubectl Command)

新增功能:
- config.py: ENABLE_NEMOTRON_COLLABORATION Feature Flag
- openclaw.py: generate_incident_proposal_with_tools()
- openclaw.py: _call_nemotron_tools() Nemotron 呼叫
- telegram_gateway.py: TelegramMessage Nemotron 欄位
- telegram_gateway.py: format_with_nemotron() 雙區塊格式
- decision_manager.py: 整合協作方法
- proposal_service.py: 整合協作方法

觸發條件:
- LOW 風險 → 僅 OpenClaw
- MEDIUM/HIGH/CRITICAL → OpenClaw + Nemotron 雙軌

首席架構師審查: 83/100 條件通過

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 18:52:53 +08:00
OG T
52177d7096 docs: 更新 ARCHITECTURE_MEMORY + ADR-041
All checks were successful
E2E Health Check / e2e-health (push) Successful in 16s
- ARCHITECTURE_MEMORY: 新增 Service 層架構說明
- ADR-041: 更新 Phase 21 定期報告狀態

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 16:06:51 +08:00
OG T
268b944d5c docs: Phase 18 首席架構師審查 + ADR-043
All checks were successful
E2E Health Check / e2e-health (push) Successful in 24s
2026-03-31 Claude Code (首席架構師)

審查結果:
- 初審: 91/100 OUTSTANDING (條件通過)
- P0 修復後: 95/100 OUTSTANDING

ADR-043: FailureWatcher 與 DecisionManager 協調
- 優先級規則: DecisionManager > FailureWatcher
- 資源鎖定機制 (Redis)
- Approval 衝突檢查
- 來源標記審計

P1 待改進 (Phase 19):
- DI 注入模式統一
- 整合測試覆蓋
- Trace ID 記錄

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 12:24:54 +08:00
OG T
bcd33e854f docs: ADR-042 前端效能優化模式 (DOM Bypass + Optimistic Updates)
All checks were successful
E2E Health Check / e2e-health (push) Successful in 16s
新增 ADR-042:
- Pattern 1: DOM Bypass (繞過 React 渲染,100x 效能提升)
- Pattern 2: Optimistic Updates (0ms UI 延遲 + 失敗回滾)
- Pattern 3: SSE Incremental Updates (增量更新,減少 API 請求)
- Pattern 4: AbortController (防止記憶體洩漏)

更新 Skills 01:
- v1.6 版本更新
- 新增效能優化模式章節
- 參考 ADR-042

首席架構師審查: 96-98/100 OUTSTANDING

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 11:36:21 +08:00
OG T
ce6b1b1c64 docs: 更新 LOGBOOK - #17 i18n Hydration 完成
前端 P1 改進全部完成:
- #15 SSE + 樂觀更新 (8c8664c)
- #16 DOM Bypass (0b87018)
- #17 i18n Hydration (f25e94e)

首席架構師審查: 96/100 OUTSTANDING

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 11:23:38 +08:00
OG T
f25e94e8c4 fix(web): #17 i18n Hydration 防護 (NEXT_LOCALE Cookie)
Phase D #17: 修復 i18n 語系切換 Hydration 當機

問題: Client/Server 渲染語系落差導致 Hydration Mismatch
解法: Middleware 強制綁定 NEXT_LOCALE Cookie

實作內容:
- 從 URL 路徑提取當前語系
- 強制設定 NEXT_LOCALE cookie (1年 TTL)
- 確保 Server/Client 語系一致

@see QA Report 3.1 節

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 11:18:53 +08:00
OG T
b31e079e41 docs: 更新 LOGBOOK - Phase A/B/C P1 完成 (97/100)
Some checks failed
CD Pipeline / build-and-deploy (push) Successful in 3m42s
E2E Health Check / e2e-health (push) Has been cancelled
- LOGBOOK: Phase A/B/C 首席架構師審查 OUTSTANDING
- Skills: DevOps Commander 更新
- ADR-033: K3s HA 架構補充

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 11:03:10 +08:00
OG T
bf3a21d88e docs: 首席架構師審查 94/100 OUTSTANDING
Some checks failed
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- Skills v2.2: 新增 Phase 19.4 API 整合模式
- ADR-030: 補充 §5.3 Playbook 自動狀態轉換閾值
- LOGBOOK: 更新審查結果

審查範圍: 18 commits (Phase 19.4 + ADR-039 + AI 仲裁)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:38:41 +08:00
OG T
4f06115497 docs: 首席架構師審查 - 前端內網 IP 禁令 (90/100)
Some checks failed
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
審查結果:
- P0 安全修復: sudo 密碼改用 secret 
- P1 識別: Sentry DSN build-arg 待處理
- P2 識別: 3 項次要問題已記錄

已更新:
- Skills 01 v1.5: 前端建置禁止內網 IP
- Skills 04 v2.1: CD 安全規範 + 內網 IP 禁令
- ADR-022: 新增前端內網 IP 禁令章節
- MEMORY.md: 新增審查記錄索引

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:32:48 +08:00
OG T
998b5a7b5f docs: ADR-039 重編號為 ADR-040 + LOGBOOK 更新
ADR 變更:
- ADR-039 (gitea-cicd-migration) 保留給 Gitea CI/CD 遷移
- 原 ADR-039 (global-autorepair-governance) 改為 ADR-040

LOGBOOK:
- 新增 Phase 19.4 Terminal Service API 整合記錄
- 更新當前狀態

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:20:50 +08:00
OG T
97b2e059bc docs: ADR-039 完成 - Gitea CI/CD 遷移成功
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:07:51 +08:00
OG T
d6b8224942 feat(cicd): ADR-039 Gitea CI/CD 遷移
2026-03-29 Claude Code (統帥授權):
- 新增 .gitea/workflows/cd.yaml (Build → Harbor → K8s)
- 新增 .gitea/workflows/e2e-health.yaml (E2E 健康檢查)
- 新增 ADR-039 文檔記錄遷移決策

方案 B: GitHub → Gitea CI/CD 遷移
- Gitea 作為主倉和 CI/CD
- GitHub 降級為只讀備份

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 21:51:45 +08:00
OG T
89e05e6ea2 docs: ADR-037 + 監控架構提案 + Runbooks
- ADR-037 監控增強架構
- MONITORING_MASTER_PLAN 主計畫
- MASTER_EXECUTION_SCHEDULE 執行排程
- Phase D/E/Worker HPA Runbooks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 16:04:08 +08:00
OG T
95b46af986 docs: 新增稽核報告 + 靈感實驗室 + Runbook 更新
- AWOOOI_COMPREHENSIVE_AUDIT_2026Q1.md 全維度稽核
- INSPIRATION_LAB.md 靈感收集
- K3S-OPTIMIZATION-RUNBOOK.md 優化指南
- ADR-006 AI Fallback 策略更新

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 16:03:41 +08:00
OG T
bf06737eed docs: ADR-038/039 + LOGBOOK 更新
- ADR-038: OpenClaw 併發治理架構
- ADR-039: 全域自動修復熔斷
- LOGBOOK: 今日進度記錄

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 15:48:09 +08:00
OG T
50c055b547 feat(api): Phase D-G P0 修正 - Learning Repository 積木化
新增:
- ILearningRepository Protocol (interfaces.py)
- LearningRepository (Redis 持久化層)
- Learning API 端點 (/api/v1/learning/*)
- LearningService.get_recommended_fix() 方法
- LearningService.get_learning_summary() 方法

修正:
- Service 不直接依賴 Redis Client (透過 Repository)
- 符合 leWOOOgo 積木化原則
- 首席架構師審查: 74/100 → 92/100

更新:
- ADR-030: 新增 Phase D-G P0 修正章節
- Skill 02: v1.9 → v2.0
- Runner 修復: 序列建構解決 _runner_file_commands 衝突

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 11:03:51 +08:00
OG T
94d6a0c720 docs(ai): 更新 ADR-036 和 LOGBOOK - P3 優化記錄
- ADR-036 v1.4: P3 優化完成 (95/100)
- LOGBOOK: Phase 20 P1+P2+P3 全部完成
- 測試: 34/34 PASSED

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 01:51:35 +08:00
OG T
1a4be7b18a feat(k-mon): K3s monitoring integration (Phase K-MON)
- Add Velero metrics NodePort service (30885)
- Add K3s infrastructure alert rules:
  - VIP 6443 availability
  - Node ICMP checks
  - AWOOOI API/Web TCP checks
  - SignOz/Sentry availability
- Add Velero backup alerts (failed/missing)
- Add ADR-034 for ArgoCD GitOps adoption

Deployed to:
- K3s: velero-metrics service
- 188: Prometheus + Alertmanager configs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 21:57:57 +08:00
OG T
6a38c0c968 fix(cd): ADR-035 Telegram Secrets 自動注入三層防護
🔴 事故根因: K8s Secrets 未注入,Telegram 告警長時間失效
- kustomization.yaml 說「由 CI/CD 處理」但 CD 從未執行

🛡️ 三層防護機制:
- Layer 1: Pre-flight 檢查 GitHub Secrets 存在
- Layer 2: Deploy 時 kubectl patch secret 自動注入
- Layer 3: Post-Deploy E2E 測試告警驗證

📄 文件更新:
- ADR-035: docs/adr/ADR-035-telegram-alert-chain-enforcement.md
- DevOps Skill v1.9: 新增 Secrets 注入鐵律
- CLAUDE.md: 新增告警鏈路章節
- LOGBOOK: 事故記錄

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 21:47:49 +08:00
OG T
d206460751 feat(security): Phase 20 CSRF 防護實作
Phase 19 首席架構師審查指出: 核鑰 UX 安全性缺 CSRF 防護

後端:
- 新增 src/core/csrf.py (Double Submit Cookie 模式)
- 新增 src/api/v1/csrf.py (GET /api/v1/csrf/token)
- 新增 src/models/csrf.py (CSRFTokenResponse)
- 修改 approvals.py sign/reject/bulk 端點加入 CSRFToken 驗證

前端:
- 新增 hooks/useCSRF.ts (React Hook)
- 修改 approval.store.ts 整合 CSRF Token 參數

安全特性:
- 256-bit Token (secrets.token_hex)
- 時序安全比較 (secrets.compare_digest)
- SameSite=Strict Cookie
- 1 小時 Token 有效期

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:31:58 +08:00
OG T
7b9b0c490b feat(phase19): Omni-Terminal 100% 完成 + 首席架構師審查 47/50
## Phase 19 Omni-Terminal (Wave 0-6 全部完成)

### 核心功能
- SSE 狀態機 (7-State 設計,10/10 分)
- GenUI 動態渲染 (6 張卡片 + Zod Schema 驗證)
- 核鑰 UX (長按授權 + 風險分級)
- Terminal Telemetry (Sentry 整合)

### P0-P2 修復
- P0: Singleton → FastAPI Depends 依賴注入
- P1: Zod Schema 升級 (7 個驗證 Schema)
- P1: 錯誤分類碼聚合 (Sentry fingerprint)
- P2: Slow Query 監控 (5s 警告 / 10s 嚴重)

### 測試
- test_terminal_service.py: 54 項測試全通過
- 意圖分類: 42 個測試案例 (9 種 IntentType)

### 文檔
- ADR-031: SSE 架構實作紀錄
- ADR-032: GenUI 渲染實作紀錄
- Skills: v1.9 (後端 Terminal 章節)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:04:12 +08:00
OG T
3e5315aaf8 docs(k3s): 首席架構師審查完成 46/50 (92%)
K3s 優化工作審查完成:

- ADR-033: Phase K0 + K-NET 標記為已完成
- 09-pdb.yaml: Worker PDB 設計說明註釋
- DevOps Skill: 新增 keepalived 快速操作參考

審查結果:
- 架構合規性: 9/10
- Runbook 完整性: 10/10 
- 模組化合規: 9/10
- 風險控制: 9/10
- 文檔完整性: 9/10

P2 問題已修復,無 P0/P1 阻擋項

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 18:00:07 +08:00
OG T
e5ded3b3f2 feat(phase19): OmniTerminal + GenUI + Hybrid SSE 架構實作 (Wave 0-2)
Phase 19 OmniTerminal MVP 完成:
- Wave 0: Backend (Hybrid SSE POST→GET 架構)
- Wave 1: Frontend (OmniTerminal 狀態機 + GenUI Registry)
- Wave 2: UI 組件 (8 個 GenUI 動態卡片)

ADR 文檔:
- ADR-031: OmniTerminal SSE 架構
- ADR-032: GenUI 動態渲染框架
- ADR-033: K3s HA 架構設計

GenUI 組件:
- GenUIRenderer, K8sPodStatusCard, SentryErrorCard
- MetricsSummaryCard, IncidentTimelineCard
- TraceWaterfallCard, ApprovalCard, NuclearKeyButton

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 00:17:26 +08:00
OG T
4ee5376bd1 docs: 告警機制優化計畫 + ADR-030 Phase 6 + Skill 03 v1.5
- LOGBOOK: 新增告警機制完整審查記錄
- ADR-030: 新增 Phase 6 非同步分析優化章節
- Skill 03: v1.5 Stream Key 統一 + Telegram 去重

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-27 09:42:53 +08:00
OG T
d3a0ed4253 docs(adr): ADR-030 智能自動修復系統完整設計
五階段實施計畫:
- Phase 1: 智能診斷基礎  已完成
- Phase 2: 資料收集強化 (K8s Events + SignOz 深度整合)
- Phase 3: Playbook RAG (向量化 + 語意搜尋)
- Phase 4: 自動執行機制 (信任度 + 風險評估)
- Phase 5: 持續學習迴圈 (反饋 + 信任度調整)

架構相容性分析:
- 介面擴展點定義
- 資料庫 Schema 變更
- 風險評估與回滾計畫

預計時程: 10-15 週

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 21:48:41 +08:00
OG T
309a019cc3 docs: 記錄 Telegram 告警轟炸事故修復
更新:
- ADR-027: 新增緊急事故修復章節
- LOGBOOK: 記錄 2026-03-26 事故時間線
- Skill 02 v1.6: 新增 Telegram 去重機制章節

根因: Phase 6.5 修改 + INC- 前綴重複
修復: Redis 去重 (10 分鐘) + 前綴檢查

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 20:13:07 +08:00
OG T
fb03430469 feat(api): ADR-027 Phase 2 - 簽核/拒絕後自動同步 Incident 狀態
Router 整合點:
- POST /approvals/{id}/sign → on_approval_status_change("approved")
- POST /approvals/{id}/reject → on_approval_status_change("rejected")
- POST /approvals/bulk-approve → 批次同步

變更:
- 移除舊的 resolve_incident_after_approval() 調用
- 改用 IncidentApprovalService.on_approval_status_change()
- 同步失敗不阻斷主流程 (容錯設計)

ADR-027 進度: Phase 1-2  完成

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 19:44:59 +08:00
OG T
2f5986df5c docs: ADR 整理與新增 (021-029)
ADR 編號修正:
- ADR-023 failure-auto-repair → ADR-028
- ADR-025 cicd-ai-integration → ADR-029

新增 ADR:
- ADR-021: Playbook 更新驗證
- ADR-022: Sentry 整合架構
- ADR-027: Incident-Approval 同步
- ADR-028: 失敗自動修復閉環
- ADR-029: CI/CD AI 整合 (原 ADR-025)

更新:
- ADR-018: LLM 測試策略狀態更新

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 19:09:08 +08:00
OG T
0a9d94d82b feat(k8s): CoreDNS GitOps 架構 (ADR-026)
問題: DNS 配置沒有版本控制,手動修改易遺失

架構:
- k8s/k3s-system/coredns-custom.yaml: HelmChartConfig
- CD workflow: k3s-system 路徑偵測 + 自動 apply
- ADR-026: CoreDNS GitOps 管控架構

DNS 上游:
- 使用 8.8.8.8 + 1.1.1.1
- 禁止 /etc/resolv.conf (systemd-resolved)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 18:43:28 +08:00
OG T
6e3a7fca20 docs: ADR-006 v1.2 Rate Limiter + LOGBOOK 更新
- ADR-006: 新增 Rate Limiter 實作章節 (v1.2)
- LOGBOOK: 記錄 Gemini 切換 + Rate Limiter 上線

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 18:16:45 +08:00
OG T
30145c7d7e docs: ADR-025 CI/CD AI 整合架構 + Skill 07 更新
- ADR-025: 文檔化 Phase 13.1 CI/CD AI 整合架構決策
  - GitHub Webhook 事件驅動流程
  - 風險分級執行決策 (AUTO/TELEGRAM/APPROVAL/BLOCKED)
  - SignOz Log 整合
- Skill 07 v1.3: 新增 Grafana MCP + SignOz query_logs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 15:41:26 +08:00
OG T
14c81f728f docs: 新增 ADR-025 告警鏈路 E2E 驗證 + 更新 Skills
新增:
- ADR-025: 告警鏈路 E2E 驗證架構 (2026-03-26 事故教訓)

更新:
- ADR-011: 新增 DNS 規則最佳實踐 (附錄 B)
- Skill 04: 新增 NetworkPolicy DNS 規則 + CoreDNS 設定
- Skill 05: 新增告警鏈路 Smoke Test 要求
- CLAUDE.md: 新增告警鏈路驗證到任務前必讀

事故根因:
1. URL 路徑錯誤 (webhook vs webhooks)
2. NetworkPolicy DNS 規則標籤不匹配
3. CoreDNS 上游 DNS 依賴 systemd-resolved

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 15:34:12 +08:00
OG T
579da38b8b feat(api): Phase 13 智能路由 + CI/CD 整合 (#74-88)
Phase 13.1 CI/CD Integration:
- #76 workflow_run handler for CI failure diagnosis
- #77 SignOz log query (query_logs, error_logs_summary MCP)
- #78 CIAutoRepairService with risk-based execution decisions

Phase 13.3 Smart Routing:
- #85 Intent Classifier v2.0 (rule engine + LLM fallback)
- #86 Complexity Scorer (9-dimension scoring)
- #87 AI Router v3.0 (routing decision matrix)
- #88 Token Counter (OTEL + Langfuse integration)

New files:
- services/ci_auto_repair.py (risk stratification)
- services/model_registry.py (centralized model config)
- services/token_counter.py (677 lines)
- Skill 08: Model Router Expert
- Skill 09: Strangler Pattern Expert
- ADR-023: Smart Routing Architecture
- ADR-024: API Layer Architecture

Tests:
- phase11-conversational.spec.ts (E2E tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 15:32:52 +08:00
OG T
30f045bf28 feat: ADR-019 System Prompt 集中管理 + Nightly LLM Workflow
新增:
- docs/adr/ADR-019-system-prompt-management.md - System Prompt 規範
- apps/api/src/core/prompts.py - 集中管理 System Prompts
- .github/workflows/nightly-llm.yaml - 每夜 LLM 迴歸測試

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 12:27:47 +08:00
OG T
edecf7a053 docs: ADR-020 E2E 驗證框架規範
Phase 18.3 配套決策文檔:
- E2E 驗證腳本架構 (5 步驟標準)
- Safe Label 防護機制
- Daily Health Check 排程規範
- 目標資源驗證要求

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 12:27:36 +08:00
OG T
96c3ddd8c4 feat(api): Phase 18.1 K8s 資源名稱驗證 (ADR-016)
三層防禦架構確保 kubectl 指令有效:
1. Webhook 入口正規化 (webhooks.py)
2. OpenClaw 產生指令前驗證 (openclaw.py)
3. 靜態映射表 + 模糊匹配 (k8s_naming.py, resource_resolver.py)

新增:
- src/utils/k8s_naming.py: RFC 1123 正規化 + 靜態映射
- src/services/resource_resolver.py: MCP K8s Tool 動態驗證
- docs/adr/ADR-016-k8s-resource-naming.md: 契約文檔
- scripts/e2e_tool_call_verification.py: E2E 驗證腳本 v2.0

修改:
- webhooks.py: Phase 18.1.7 入口正規化
- openclaw.py: Phase 18.1.6 產生指令前驗證
- Skill 03 v1.4: 新增 K8s 資源驗證章節

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 11:22:47 +08:00
OG T
fe7fd7a3e0 feat(tests): ADR-018 LLM 測試策略三層架構
問題: LLM 測試因模型波動導致 CI 失敗

解決方案: 三層測試策略
- Tier 1 (CI): Schema 驗證 + Golden Responses
- Tier 2 (Nightly): 屬性測試 + Live LLM
- Tier 3 (Weekly): 語意相似度測試

新增檔案:
- ADR-018-llm-testing-strategy.md
- tests/llm_testing/ 框架
  - schema_validators.py: Pydantic Schema 驗證
  - property_validators.py: kubectl/風險等級驗證
  - golden_responses.py: 預錄回應管理
- tests/test_llm_tier1_schema.py: 35 個 Tier 1 測試

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 11:17:00 +08:00
OG T
8a163609bf docs(adr): 更新 ADR-006/009/015 狀態
ADR-015: 標記為「已實作」 (Phase 16 R1 完成)
ADR-009: 標記為「已實作」 (Phase 9.1-9.5 全部完成)
ADR-006: 新增智能路由整合章節 (Phase 13.3)

首席架構師 ADR 審計 P0/P1 完成

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 10:45:29 +08:00
OG T
0003098c55 docs(adr): ADR-017 LLMOps Observability 三層觀測架構
建立 Phase 15 LLMOps 觀測架構決策文件,記錄:
- 三層觀測架構 (Langfuse + SignOz + Sentry)
- Langfuse 整合與 Deep Linking 實作
- Redis Streams Trace Context 傳遞機制
- 取樣率策略與成本估算

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 10:13:12 +08:00
OG T
24e35fee1b docs(adr): ADR-016 智能路由 (Smart Routing)
新增 Intent + Complexity → Model Selection 架構決策文件,
作為 ADR-006 (AI Fallback) 的補充,實現動態模型選擇。

- IntentClassifier: 關鍵字優先 + LLM 備援
- ComplexityScorer: 規則引擎加權評分
- AIRouter: 整合路由決策

Phase 13.3 #85-87

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 10:13:05 +08:00
OG T
42659a271a docs(adr): ADR-014 Dependency Governance 依賴治理
建立前端依賴治理規範文件,.dependency-cruiser.cjs 已參照此 ADR。

內容包含:
- Layer Model 四層架構定義
- Feature Isolation 規則說明
- CI 整合配置 (pnpm dep-check)
- Severity 分級策略

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 10:12:43 +08:00
OG T
604e38cf07 docs: Phase 14 紅區治理 + Skills 01/03 更新
- CLAUDE.md: 紅區治理章節
- Skills 01/03: 版本更新
- ADR/Architecture: 標準化

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 09:55:47 +08:00
OG T
643946e60c refactor(api): ADR-015 MCP 模組化架構重構
## 重構內容

符合 leWOOOgo 積木化原則:
- 新增 interfaces.py: MCPToolProvider ABC 定義
- 新增 registry.py: Provider 註冊中心 (DI 模式)
- 新增 providers/: K8s, SignOz, Database 具體實作
- 重構 mcp_bridge.py: 透過 ProviderRegistry 委派執行

## 修復 Code Review 問題

- 🔴 移除 _execute_stdio logging 敏感 parameters
- 🔴 修復 conversational-view.tsx i18n 硬編碼

## 新增檔案

- apps/api/src/plugins/mcp/interfaces.py
- apps/api/src/plugins/mcp/registry.py
- apps/api/src/plugins/mcp/providers/__init__.py
- apps/api/src/plugins/mcp/providers/k8s_provider.py
- apps/api/src/plugins/mcp/providers/signoz_provider.py
- apps/api/src/plugins/mcp/providers/database_provider.py
- docs/adr/ADR-015-mcp-modular-architecture.md
- .dependency-cruiser.cjs (Phase 14.2 準備)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-25 14:31:32 +08:00