OG T
|
db1aed81d9
|
fix(db): C1 timezone unification migration: utc_now → taipei_now (all 5 tables)
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
🔴 Chief architect review C1: UTC is prohibited system-wide; all timestamps must use Taipei time (UTC+8)
- utc_now() → taipei_now() (calls src.utils.timezone.now_taipei)
- Affected: ApprovalRecord, TimelineEvent, AuditLog, IncidentRecord, KnowledgeEntryRecord
- Replaced all 13 default/onupdate occurrences
- Removed the datetime.UTC import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
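A minimal sketch of what a Taipei-time helper could look like, using only the standard library. The name `taipei_now` comes from the commit message; the body and the `TAIPEI_TZ` constant are assumptions, not the project's actual `src.utils.timezone` code.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset. Taipei does not observe daylight saving time,
# so a constant offset is enough for this illustration.
TAIPEI_TZ = timezone(timedelta(hours=8), name="Asia/Taipei")


def taipei_now() -> datetime:
    """Return the current time as a timezone-aware datetime in UTC+8."""
    return datetime.now(TAIPEI_TZ)
```

Because the returned value is timezone-aware, it can be passed as a column `default`/`onupdate` callable without losing the offset.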
|
2026-04-02 09:13:36 +08:00 |
|
OG T
|
628387de8c
|
fix: automate risklevel migration + inject Telegram whitelist
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
1. init_db() now ensures at startup that the risklevel enum contains the 'high' value
(added in Phase 23; prevents InvalidTextRepresentation when an older DB lacks the value)
2. CD Pipeline now injects OPENCLAW_TG_USER_WHITELIST automatically
(previously CHANGE_ME; updated to the actual user ID 5619078117)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:13:13 +08:00 |
|
OG T
|
d2bad44173
|
fix(api): KB architecture review fixes I3-I5
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
- I3: Added IKnowledgeRepository Protocol type annotations in the Service layer
- I4: The search method now also searches tags JSONB (cast→String→ilike)
- I5: get_categories is now a standalone method and no longer detours through list_entries(limit=0)
Chief architect review 87/100 → all Important issues fixed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:05:54 +08:00 |
|
OG T
|
48a0bc66f7
|
fix(api): KB chief architect review fixes (I1 tags filter + I2 type annotation)
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
- I1: Repository list_entries now implements tags JSONB @> filtering (previously declared but not implemented)
- I2: ORM tags type corrected from list[dict[str, Any]] to list[str]
Chief architect review: 87/100
C1 timezone (UTC→Taipei) is a pre-existing systemic issue; a separate task will handle the unified migration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:04:41 +08:00 |
|
OG T
|
e17248fd10
|
fix: chief architect review fixes: i18n/CD/timezone/dead-code cleanup
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
P0 frontend i18n compliance (6 files):
- settings/page.tsx: switched fully to useTranslations('settings')
- auto-repair/page.tsx: 30+ hard-coded strings replaced with t('autoRepair.*')
- sidebar.tsx: sectionLabel now uses tSection(); aria-label internationalized
- openclaw-panel.tsx: STATUS_MESSAGES now uses tPanel(); Production now uses tBrand
- alerts/page.tsx: StatPill label now uses t('incident.severity.*')
P1 CD Pipeline:
- cd.yaml: runs-on changed to self-hosted (ADR-039)
- Telegram Secret injection failure now exits with 1 (ADR-035)
- kubectl patch op:replace → op:add (compatible with first-time deployment)
P2 backend:
- langfuse_client.py: removed the v4.x dead-code branch (SDK pinned to <3.0.0)
- ai.py: marked TODO(R4) Router slimming
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:02:41 +08:00 |
|
OG T
|
d32d84efce
|
feat(telegram): wire up Phase 22 Nemotron dual-track display (ADR-044)
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
Root cause: format_with_nemotron() was implemented but never called
- send_approval_card() adds nemotron_enabled/tools/validation/latency parameters
- Nemotron fields are now passed in when constructing TelegramMessage
- When nemotron_enabled=true, the format_with_nemotron() format is used automatically
- _push_decision_to_telegram() extracts Nemotron data from proposal_data and passes it on
Effect: Telegram shows both the OpenClaw arbitration block and the Nemotron execution-plan block
2026-04-02 ogt: the last mile of Phase 22
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 08:59:03 +08:00 |
|
OG T
|
d8be78b135
|
feat(api): Knowledge Base Phase 1 backend four-layer architecture
CD Pipeline / build-and-deploy (push) Successful in 7m0s
E2E Health Check / e2e-health (push) Successful in 17s
Type Sync Check / check-type-sync (push) Failing after 30s
- models/knowledge.py: Pydantic Schema (EntryType/Source/Status/CRUD)
- db/models.py: KnowledgeEntryRecord ORM (PostgreSQL)
- repositories/interfaces.py: IKnowledgeRepository Protocol
- repositories/knowledge_repository.py: PostgreSQL CRUD implementation
- services/knowledge_service.py: business logic (get_db_context manages the session internally)
- api/v1/knowledge.py: REST Router (get_knowledge_service, no direct DB access)
- main.py: mounts the Knowledge Base Router
- k8s/jobs/migrate-knowledge-entries.yaml: DB Migration Job
API endpoints: GET/POST / | GET/PATCH/DELETE /{id} | POST /{id}/approve
GET /search | GET /categories
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 00:55:56 +08:00 |
|
OG T
|
077e1cd637
|
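fix(telegram): fix missing ai_provider on all webhook paths → Telegram showed "AI 仲裁判定" ("AI arbitration verdict")
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause: send_approval_card() has an ai_provider parameter, but none of the three webhook callers passed it:
- signoz_webhook.py: has an ai_provider parameter but did not forward it to send_approval_card
- sentry_webhook.py: has analysis.analyzed_by but did not pass it
- webhooks.py: _push_to_telegram_background lacked an ai_provider parameter
After the fix, Telegram shows "🤖 OpenClaw Nemo 仲裁" ("OpenClaw Nemo arbitration") instead of "🤖 AI 仲裁判定" ("AI arbitration verdict")
2026-04-02 ogt: unified fix across the three webhook paths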
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 00:51:13 +08:00 |
|
OG T
|
46a346d948
|
fix(llmops): pin langfuse SDK to v2.x (both v3.x and v4.x removed the client.trace() API)
CD Pipeline / build-and-deploy (push) Successful in 6m33s
E2E Health Check / e2e-health (push) Successful in 15s
Investigation confirmed: v3.x also lacks client.trace(), and langfuse_client.py depends on this API.
Pinning <3.0.0 ensures v2.60.10 (the latest v2) is installed; trace/generation/score all remain available.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 00:23:29 +08:00 |
|
OG T
|
de04de1d4f
|
fix(telegram): add openclaw_nemo/nvidia_nim display-name mappings
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
- Both provider_names tables, in format() and format_with_nemotron(), now include:
openclaw_nemo → "OpenClaw Nemo"
openclaw_nvidia_nim → "OpenClaw Nemo"
openclaw_qwen → "OpenClaw Nemo"
- Fixes the display of "OPENCLAW_NEMO" (uppercase)
- 2026-04-01 ogt: aligns with the AI arbitration architecture change
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
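The mapping described above can be sketched as a plain lookup table. The three keys and their display value come from the commit message; the `display_name` helper and its upper-casing fallback are assumptions chosen to reproduce the "OPENCLAW_NEMO" symptom, not the project's actual code.

```python
# Provider id → human-readable label (entries taken from the commit message).
PROVIDER_NAMES = {
    "openclaw_nemo": "OpenClaw Nemo",
    "openclaw_nvidia_nim": "OpenClaw Nemo",
    "openclaw_qwen": "OpenClaw Nemo",
}


def display_name(provider: str) -> str:
    """Return the label for a provider id.

    Assumed fallback: unknown ids are upper-cased, which is how a missing
    entry would surface as "OPENCLAW_NEMO" in a message.
    """
    return PROVIDER_NAMES.get(provider, provider.upper())
```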
|
2026-04-02 00:20:37 +08:00 |
|
OG T
|
ae6af27bda
|
fix(llmops): pin langfuse SDK to v2.x in pyproject.toml (the actual build source)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
requirements.txt is not the Docker build source; the Dockerfile runs uv pip install . and installs
dependencies from pyproject.toml. v4.x switched to the OTLP protocol, which is incompatible with the
client.trace() API used by langfuse_client.py, so pin <4.0.0.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 22:58:49 +08:00 |
|
OG T
|
27b4d2a76a
|
fix(telegram): strip <placeholder> placeholders to prevent HTML parse errors
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
The kubectl_command generated by OpenClaw contains <受影響服務名稱> ("affected service name"),
which causes 'Can't parse entities' in Telegram HTML parse mode.
Strip all <...> placeholders with a regex.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
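A small sketch of the regex-based stripping, assuming every angle-bracketed span in the command is a placeholder. The function name `strip_placeholders` is hypothetical. Note that this pattern would also remove legitimate HTML tags such as `<b>`, so it should only be applied to the raw command text before any markup is added.

```python
import re

# Matches one <...> span that contains no nested angle brackets.
_PLACEHOLDER_RE = re.compile(r"<[^<>]*>")


def strip_placeholders(text: str) -> str:
    """Remove every <...> span so Telegram's HTML parser sees no stray tags."""
    return _PLACEHOLDER_RE.sub("", text)
```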
|
2026-04-01 22:50:07 +08:00 |
|
OG T
|
2aeec34735
|
fix(llmops): pin langfuse SDK to v2.x (avoid v4.x OTLP incompatibility)
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline (Dev) / build-and-deploy-dev (push) Successful in 2m40s
Problem: langfuse>=2.0.0 installed v4.0.5, which removed client.trace() in favor of OTLP
Root cause: the OTLP endpoint (/api/public/otel) of Langfuse server v2.95.11 returns 404,
while the legacy /api/public/ingestion endpoint works (HTTP 207)
Fix: pin langfuse>=2.0.0,<4.0.0 to keep client.trace() API compatibility
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 22:49:00 +08:00 |
|
OG T
|
88051388d4
|
fix(ai): fix _call_openclaw_analyze datetime serialization failure that caused a fallback to Gemini
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
The signals dict contains datetime objects, which httpx json= cannot serialize
Added _to_serializable for recursive conversion, datetime → str
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
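The recursive conversion could look roughly like the sketch below. The helper name mirrors `_to_serializable` from the commit message, but the body is an assumption: it converts `datetime`/`date` values to ISO 8601 strings and walks dicts and sequences, leaving other values untouched.

```python
import json
from datetime import date, datetime
from typing import Any


def to_serializable(value: Any) -> Any:
    """Recursively replace datetime/date objects with ISO 8601 strings."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, dict):
        return {str(k): to_serializable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_serializable(v) for v in value]
    return value


# Example payload with a nested datetime, as a signals dict might contain.
signals = {"ts": datetime(2026, 4, 1, 22, 37, 4), "hosts": [{"seen": datetime(2026, 4, 1)}]}
payload = json.dumps(to_serializable(signals))
```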
|
2026-04-01 22:37:04 +08:00 |
|
OG T
|
ff85b0581a
|
fix(ai): fix incorrect method call in analyze_and_propose
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
- OpenClawService has no analyze(); the correct method is analyze_alert(alert_context)
- Wrap host_statuses as alert_ctx and pass it in
- Unpack the return value (8-tuple), using *_ to ignore the trailing elements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
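A short illustration of the star-unpacking pattern mentioned above. The stand-in `analyze_alert` and the meaning of each tuple position are hypothetical; only the 8-tuple shape, the `alert_ctx` wrapping, and the `*_` idiom come from the commit message.

```python
from typing import Any


def analyze_alert(alert_context: dict[str, Any]) -> tuple:
    """Stand-in for the real service method; returns an 8-tuple (fields are hypothetical)."""
    return ("disk pressure on node-1", "high", 0.9, "kubectl describe node node-1",
            None, None, None, None)


def analyze_and_propose(host_statuses: list[dict[str, Any]]) -> dict[str, Any]:
    # Wrap the raw host statuses in the context object the service expects.
    alert_ctx = {"host_statuses": host_statuses}
    # Keep the leading fields; *_ absorbs however many trailing values remain.
    summary, risk, confidence, *_ = analyze_alert(alert_ctx)
    return {"summary": summary, "risk": risk, "confidence": confidence}
```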
|
2026-04-01 22:33:51 +08:00 |
|
OG T
|
a1f7d1f495
|
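fix(db): formalize risklevel ADD VALUE 'high' as an official migration
CD Pipeline / build-and-deploy (push) Successful in 6m58s
E2E Health Check / e2e-health (push) Successful in 18s
The Phase 23 emergency fix was already run manually on prod/dev; this file serves as the official record
Uses a DO block to prevent errors on repeated execution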
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 21:36:15 +08:00 |
|
OG T
|
5dd28b2fc6
|
test(telegram): add ADR-050 info action tests (detail/reanalyze/history/keyboard)
CD Pipeline / build-and-deploy (push) Successful in 6m47s
E2E Health Check / e2e-health (push) Successful in 18s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 21:11:45 +08:00 |
|
OG T
|
5809d3e336
|
feat(ai): delegate Incident RCA to OpenClaw (Nemo): architecture iron-rule correction
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Architecture iron rule: OpenClaw = the AI brain; the AWOOOI API delegates arbitration over HTTP
Changes:
- openclaw.py: added _call_openclaw_analyze(), which calls OpenClaw before the LLM fallback
- 04-configmap.yaml: OPENCLAW_URL corrected to :8088 (new container port)
- AI_FALLBACK_ORDER changed to ["ollama","claude"] (removed the paid Gemini API)
OpenClaw /api/v1/analyze/incident → qwen2.5:7b on local Ollama (Nemo)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 21:11:30 +08:00 |
|
OG T
|
60d2fbaf8c
|
feat(telegram): implement reanalyze button handler, replace placeholder (ADR-050)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 21:08:44 +08:00 |
|
OG T
|
6dc1505584
|
feat(incident): add trigger_reanalysis() with Redis 10min dedup (ADR-050)
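To illustrate the 10-minute dedup idea without a Redis server, here is an in-memory sketch. The class name, the injected clock, and the per-incident bookkeeping are assumptions; the real implementation presumably relies on a Redis key with a 600-second expiry instead.

```python
import time
from typing import Callable, Dict


class ReanalysisDeduper:
    """Allow at most one reanalysis per incident within a time window."""

    def __init__(self, window_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window_seconds
        self._clock = clock
        self._last_run: Dict[str, float] = {}

    def should_run(self, incident_id: str) -> bool:
        now = self._clock()
        last = self._last_run.get(incident_id)
        if last is not None and now - last < self._window:
            return False  # still inside the dedup window
        self._last_run[incident_id] = now
        return True
```

Injecting the clock keeps the window logic testable without sleeping.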
|
2026-04-01 21:06:39 +08:00 |
|
OG T
|
a9d8fd9c3c
|
feat(telegram): ADR-050 P2 - implement detail/history info actions
CD Pipeline (Dev) / build-and-deploy-dev (push) Successful in 2m28s
- _send_incident_detail: fetches incident details + an AI confidence bar chart; sends a new message and keeps the original approval card
- _send_incident_history: frequency statistics (1h/24h/7d/30d + auto-repair count)
- reanalyze: remains an in-development placeholder
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 18:48:04 +08:00 |
|
OG T
|
0bf0a1cea2
|
feat(telegram): ADR-050 P1 - 6-button Inline Keyboard + info actions skeleton
CD Pipeline (Dev) / build-and-deploy-dev (push) Successful in 2m39s
CD Pipeline / build-and-deploy (push) Successful in 7m1s
E2E Health Check / e2e-health (push) Successful in 17s
Row 1: [✅ Approve] [❌ Reject] [🔕 Silence] (nonce replay protection)
Row 2: [📋 Details] [🔄 Reanalyze] [📊 History] (read-only, action:incident_id format)
- security_interceptor: parse_callback_data supports the 2-part info action format
- telegram_gateway: _build_inline_keyboard adds an incident_id parameter
- telegram.py: info_action short-circuits and triggers no DB operations
P2 to implement: detail/reanalyze/history return real data (currently they return "功能開發中", "feature in development")
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
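A hedged sketch of how the two callback formats could be told apart. Only the 2-part `action:incident_id` format and the action names detail/reanalyze/history are taken from the log; the 3-part `action:target_id:nonce` layout for approve/reject/silence, the `Callback` type, and the assumption that IDs contain no colon are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

INFO_ACTIONS = frozenset({"detail", "reanalyze", "history"})


@dataclass(frozen=True)
class Callback:
    action: str
    target_id: str
    nonce: Optional[str]  # None for read-only info actions

    @property
    def is_info_action(self) -> bool:
        return self.nonce is None


def parse_callback_data(data: str) -> Callback:
    parts = data.split(":")
    # Read-only info action: "action:incident_id", no nonce needed.
    if len(parts) == 2 and parts[0] in INFO_ACTIONS:
        return Callback(parts[0], parts[1], None)
    # State-changing action (assumed layout): "action:target_id:nonce".
    if len(parts) == 3 and parts[0] not in INFO_ACTIONS:
        return Callback(parts[0], parts[1], parts[2])
    raise ValueError(f"unrecognized callback data: {data!r}")
```

A handler can then check `is_info_action` first and return early, which matches the short-circuit behavior described in the commit.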
|
2026-04-01 18:34:26 +08:00 |
|
OG T
|
43a370fc11
|
fix(model): IncidentOutcome compatibility with the legacy Redis string format
CD Pipeline (Dev) / build-and-deploy-dev (push) Successful in 2m38s
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
Type Sync Check / check-type-sync (push) Failing after 22s
Old incidents stored outcome as the string "resolved", which Pydantic v2 cannot parse
→ INTERNAL_ERROR on /auto-repair/evaluate/{incident_id}
field_validator mode='before' converts the string to None (safely discarded)
Ensures legacy data does not trigger incident_parse_error
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 18:03:21 +08:00 |
|
OG T
|
9913f5dc6d
|
feat(infra): dev environment separation + BuildKit cache fix + circuit breaker tuning
CD Pipeline / build-and-deploy (push) Successful in 6m52s
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
1. k8s/awoooi-dev/: new dev namespace (configs 01-05)
- Namespace + ResourceQuota (cpu 2/4, mem 4Gi/8Gi)
- ConfigMap: ENVIRONMENT=dev, LOG_LEVEL=DEBUG, SHADOW_MODE=false
- Deployment: 1 replica, NodePort 32344, image dev-latest
- RBAC: awoooi-executor-dev ServiceAccount
2. .gitea/workflows/cd-dev.yaml: dev branch CD pipeline
- Trigger: push to the dev branch
- Build: --no-cache (prevents cache poisoning)
- Tag: dev-{sha} / dev-latest
- Deploy: awoooi-dev namespace, health check 32344
- Telegram: notifications prefixed with [DEV]
3. apps/api/Dockerfile: ARG CACHE_BUST=none (prevents BuildKit cache poisoning)
- The deps layer (pip install) can still be cached
- The src/ and models.json layers are rebuilt every time
4. .gitea/workflows/cd.yaml: production API build adds CACHE_BUST=git_sha
- Ensures config changes such as models.json reach the image
5. apps/api/src/services/nvidia_provider.py: timeouts do not count toward the circuit breaker
- TimeoutException → log only, no record_failure()
- Only hard errors (auth/rate limit/exception) trip the breaker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
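Item 5 above can be sketched as follows. This is a simplified, synchronous illustration: the `CircuitBreaker` class, the `call_with_breaker` helper, and the use of the built-in `TimeoutError` (in place of the HTTP client's timeout exception) are assumptions, not the project's `nvidia_provider.py`.

```python
import logging
from typing import Callable, TypeVar

logger = logging.getLogger("nvidia_provider_sketch")
T = TypeVar("T")


class CircuitBreaker:
    """Opens after a number of consecutive recorded failures."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.failure_threshold

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        self.failures = 0


def call_with_breaker(breaker: CircuitBreaker, request: Callable[[], T]) -> T:
    if breaker.is_open:
        raise RuntimeError("circuit open: provider temporarily disabled")
    try:
        result = request()
    except TimeoutError:
        # A slow provider is not treated as a broken one: log only.
        logger.warning("provider timed out; not counted as a breaker failure")
        raise
    except Exception:
        # Hard errors (auth, rate limit, anything else) count toward tripping.
        breaker.record_failure()
        raise
    breaker.record_success()
    return result
```

The trade-off is that a provider that only ever times out never trips the breaker, so callers still pay the timeout on each attempt.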
|
2026-04-01 16:22:21 +08:00 |
|
OG T
|
c9c60c3a61
|
feat(mcp-integrations): Phase S architecture fixes + MCP integration groundwork
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Type Sync Check / check-type-sync (push) Failing after 22s
Phase S technical-debt fixes (chief architect review 82→complete):
- S-01: generate_alert_fingerprint moved to the AlertAnalyzer.generate_fingerprint() staticmethod
- S-04: removed the Pydantic v2 deprecated json_encoders (native datetime serialization is used directly)
Sentry MCP integration (Phase 23):
- ADR-048: Sentry→OpenClaw AI Triage architecture decision
- sentry_webhook_service.py: parse/analyze/create_incident/build_message Service layer
- config.py: SENTRY_WEBHOOK_SECRET (Fail-Closed HMAC-SHA256)
Playwright MCP integration (short term):
- smoke.spec.ts: 5-page E2E smoke test (home/dashboard/incidents/approvals/terminal)
- cd.yaml: E2E Smoke Test step + Telegram 🎭 Smoke status notification
Long-term planning ADRs:
- ADR-049: Figma Code Connect design-system sync
- ADR-050: Telegram interactive Incident 2.0 (6-button Inline Keyboard)
- ADR-051: Context7 dependency upgrade advisor (Next.js 14→15, FastAPI 0.115→0.128)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 16:20:57 +08:00 |
|
OG T
|
394f85954e
|
fix(api): fix Y/n 404 + disable Multi-Sig
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
1. proposal_service._load_incident() now uses incident_service.get_from_working_memory()
- The brain engine uses the awoooi:incidents: prefix, but the data actually lives under the incident: prefix
- The prefix mismatch caused a permanent 404 (all Y/n buttons failed)
- 2026-04-02 ogt
2. trust_engine CRITICAL required_signatures 2→1
- Commander's decision: every review needs only one level of sign-off
- 2026-04-02 ogt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 16:16:28 +08:00 |
|
OG T
|
419dc2f8e0
|
fix(nvidia): timeout 60s→30s; NVIDIA goes first to stay free, falling back to Gemini on failure
CD Pipeline / build-and-deploy (push) Successful in 5m46s
E2E Health Check / e2e-health (push) Successful in 16s
- nvidia_provider.py: NVIDIA_TIMEOUT 60→30s
- models.json: timeout_seconds 60→30s
- configmap: NEMOTRON_TIMEOUT_SECONDS 45→30s; fallback order restored with nvidia first
Goal: Nemo gets enough time to respond (free), failures switch quickly to Gemini (backup), and the overall mechanism works
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 16:05:19 +08:00 |
|
OG T
|
4c622813af
|
fix(auto-repair): practically usable auto-repair thresholds (Phase 22 P1)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Problem: all four locks were jammed, so auto-repair never triggered
1. configmap: Gemini placed first (100ms vs NVIDIA 60s timeout)
2. auto_approve: confidence 0.90→0.65, trust 5→1, playbook 3→1
3. auto_approve: medium risk allowed, require_playbook=False
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
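The gate described above might be expressed as a small policy object. The threshold numbers come from the commit message; the field names, the reading of "playbook 3→1" as a minimum count of successful playbook runs, and the previous policy's `allowed_risks`/`require_playbook` values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoApprovePolicy:
    min_confidence: float = 0.65
    min_trust_score: int = 1
    min_playbook_successes: int = 1  # assumed meaning of "playbook 3→1"
    allowed_risks: frozenset = frozenset({"low", "medium"})
    require_playbook: bool = False


def can_auto_approve(policy: AutoApprovePolicy, *, confidence: float,
                     trust_score: int, risk: str,
                     playbook_successes: int = 0) -> bool:
    """Return True only if every enabled gate passes."""
    if risk not in policy.allowed_risks:
        return False
    if confidence < policy.min_confidence:
        return False
    if trust_score < policy.min_trust_score:
        return False
    if policy.require_playbook and playbook_successes < policy.min_playbook_successes:
        return False
    return True


# The previous, stricter settings (partly assumed) for comparison.
OLD_POLICY = AutoApprovePolicy(0.90, 5, 3, frozenset({"low"}), True)
```

With the old values a medium-risk proposal at 0.70 confidence fails several gates at once, which is consistent with the "never triggers" symptom in the commit.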
|
2026-04-01 16:02:16 +08:00 |
|
OG T
|
eccf61fbc9
|
fix(ai): fix fake confidence + lift Shadow Mode (Phase 22 P1)
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
1. openclaw.py: confidence 0.82→0.0 when the LLM output is truncated (fabricated confidence is prohibited)
2. prompts.py: NEMOTRON schema example values replaced with placeholders so the model does not copy 0.75
3. configmap: SHADOW_MODE_ENABLED=false; low-risk auto-execution enabled
Thresholds: confidence≥90% + trust_score≥5 + playbook_success≥95%
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
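Item 1 can be illustrated with a parser that reports zero confidence whenever the model output is truncated or lacks the field, rather than substituting a plausible-looking default. The function name `extract_confidence` and the clamping to [0, 1] are assumptions for this sketch.

```python
import json


def extract_confidence(raw: str) -> float:
    """Return the model's confidence, or 0.0 if it cannot be read reliably."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return 0.0  # truncated or malformed output: never invent a value
    if not isinstance(data, dict):
        return 0.0
    value = data.get("confidence")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0.0
    return min(max(float(value), 0.0), 1.0)
```

A 0.0 result naturally fails any downstream confidence threshold, so a truncated response cannot be auto-approved.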
|
2026-04-01 15:59:42 +08:00 |
|
OG T
|
d352673099
|
fix(ai): models.json gemini-1.5-flash → gemini-2.0-flash (404 fix)
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
gemini-1.5-flash has been retired; switched to gemini-2.0-flash.
models.json was not updated in sync with model_registry.py last time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 15:56:05 +08:00 |
|
OG T
|
0fd53422c6
|
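fix(openclaw): move confidence/reasoning to the front of NEMOTRON_SYSTEM_PROMPT
CD Pipeline / build-and-deploy (push) Failing after 5m36s
E2E Health Check / e2e-health (push) Successful in 17s
Nemo-4B is a 4B-parameter model with limited output length; with confidence/reasoning at the end of the
schema they were often truncated, so the fallback at openclaw.py:1045 filled in the fake value 0.82.
Fix: move confidence and reasoning to the first two fields of the schema so that truncated model
output still contains the most critical fields. Also explicitly forbid the model from copying example values.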
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 13:19:18 +08:00 |
|
OG T
|
22de22c989
|
refactor(phase-s): Phase S technical-debt cleanup - five architecture improvements
S-01: generate_alert_fingerprint() moved to alert_analyzer_service (Router→Service)
S-02: removed the obsolete USE_NEW_ENGINE config (it served its purpose in Phase R)
S-03: github_webhook.py linter cleanup (Field unused + delivery_id noqa)
S-04: Pydantic v2 migration - approval/incident models (class Config → ConfigDict)
S-05: Skill 09 v1.1 update (USE_NEW_ENGINE deprecation note)
Tests: 393 passed, zero failures
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 13:12:02 +08:00 |
|
OG T
|
cd6da9c8d6
|
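fix(tests): update NVIDIA rate limiter tests to the current config values
ai_rate_limiter.py updated the NVIDIA free-tier limits on 2026-03-31,
but the tests were not updated in sync and failed:
- rpm: 5 → 10 (relaxed concurrency control)
- daily_requests: 100 → 99999 (free tier is unlimited)
- daily_tokens: 50_000 → 9999999 (free tier is unlimited)
- total_cost_usd: 0.0 → 999999.0 (fixes the bug where $0>=0 is always True)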
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
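The last bullet describes a boundary bug that is easy to reproduce. In this sketch the check `used >= limit` is an assumed form of the limiter's comparison; with a limit of 0.0 even a provider that has cost nothing is reported as over budget.

```python
def is_over_limit(used: float, limit: float) -> bool:
    """Assumed limiter check: usage at or above the limit blocks the call."""
    return used >= limit


# A free provider has spent $0. With limit 0.0 the check is always True.
blocked_with_zero_limit = is_over_limit(0.0, 0.0)
# Raising the limit to a large sentinel value lets the call through.
blocked_with_large_limit = is_over_limit(0.0, 999999.0)
```

An alternative design would treat a limit of zero (or `None`) as "no limit" explicitly instead of relying on a large sentinel.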
|
2026-04-01 11:15:22 +08:00 |
|
OG T
|
59902f270d
|
fix(tests): chief architect review fixes - test suite + DI hardening (96/100 OUTSTANDING)
P1 test fixes:
- test_smart_router.py: updated to the current API (IntentResult + DIAGNOSE/CONFIG normalization)
- test_auto_repair_service.py: injects a _no_cooldown fixture to isolate the Redis dependency
- test_global_repair_cooldown.py: added the @pytest.mark.integration marker
P2 architecture improvements:
- AutoRepairService: new cooldown_checker DI parameter (Callable | None)
- global_repair_cooldown: get_redis() moved inside try-except to prevent an uncaught RuntimeError
P3 configuration:
- pyproject.toml: registered the integration pytest marker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 11:11:50 +08:00 |
|
OG T
|
e6f6734f39
|
fix(telegram): Redis Leader Election resolves multi-Pod 409 Conflict
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
Problem: 2 API Pods called getUpdates simultaneously → mutual 409 → both failed
Root cause: an explicit env TELEGRAM_ENABLE_POLLING=false had been set on the deployment via kubectl patch,
overriding the ConfigMap value true (violates feedback_k8s_env_precedence.md)
Fix steps:
1. kubectl patch to remove the explicit env override from the deployment
2. Implement Redis Leader Election to prevent multi-Pod contention
- Acquire the Leader Lock with SET NX EX=45
- _leader_renewer(): renews every 20s so the Leader keeps the Lock
- _leader_watcher(): non-Leader Pods try to take over every 30s
- On 409, release the Lock proactively so a Watcher can compete to take over
Result: one Pod polls normally; the other enters Watcher standby mode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
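The lock lifecycle above (acquire with SET NX EX=45, renew, release) can be sketched without a Redis server. `InMemoryLockStore` is a hypothetical stand-in that mimics only those semantics, and `LOCK_KEY` is an invented key name; a real implementation would issue the equivalent Redis commands and would need the renew/release checks to be atomic.

```python
import time
from typing import Callable, Dict, Optional, Tuple

LOCK_KEY = "telegram:poller:leader"  # hypothetical key name


class InMemoryLockStore:
    """Stand-in for Redis: one key per lock, each with an owner and expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def _owner(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]
        self._data.pop(key, None)  # expired or missing
        return None

    def set_nx_ex(self, key: str, owner: str, ttl: float) -> bool:
        """Like SET key owner NX EX ttl: succeeds only if no live owner exists."""
        if self._owner(key) is not None:
            return False
        self._data[key] = (owner, self._clock() + ttl)
        return True

    def renew(self, key: str, owner: str, ttl: float) -> bool:
        """Extend the TTL, but only for the current owner."""
        if self._owner(key) != owner:
            return False
        self._data[key] = (owner, self._clock() + ttl)
        return True

    def release(self, key: str, owner: str) -> bool:
        """Delete the lock, but only if this owner still holds it."""
        if self._owner(key) != owner:
            return False
        del self._data[key]
        return True
```

With a 45-second TTL and a 20-second renewal interval the leader can miss one renewal and still hold the lock, while a crashed leader is replaced within roughly one TTL plus one watcher interval.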
|
2026-04-01 11:04:10 +08:00 |
|
OG T
|
411880842f
|
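refactor(router): R4 #129 migrate AlertAnalyzer to the services layer
ADR-024 Router slimming R4: move business logic out of the Router into the correct layer.
Changes:
- Added src/models/webhook.py: AlertPayload + AlertResponse moved to the models layer
- Added src/services/alert_analyzer_service.py: AlertAnalyzer (141 lines) moved to the services layer
- RISK_MAPPING / ACTION_MAPPING / BLAST_RADIUS_MAPPING lookup tables
- analyze() method includes K8s resource-name normalization (ADR-016)
- webhooks.py: removed duplicate definitions in favor of imports, -243 lines
The Router-layer webhooks.py now complies with the ADR-024 prohibited list:
AlertAnalyzer no longer exists in the Router layer.
R4 status: #127✅ #128✅ #129✅ #130✅ (all complete)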
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 09:27:23 +08:00 |
|
OG T
|
44840f5e73
|
fix(service): #123 proposal_service.py fix key prefix + remove duplicate logic
ADR-046 fix: proposal_service used the wrong Redis key prefix "incident:"
(brain uses "awoooi:incidents:"), so load/persist broke after R-R2.
Changes:
- _load_incident(): delegates to IncidentEngineAdapter.get_incident()
(correct key prefix, includes brain→local type conversion)
- _persist_incident(): the Redis part delegates to brain DualIncidentMemory,
storing after conversion via local_to_brain() (consistent key prefix)
- Removed the duplicate _record_to_incident() logic (now handled by IncidentEngineAdapter)
- Removed the INCIDENT_KEY_PREFIX constant
- Removed the direct get_redis() dependency
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 09:11:57 +08:00 |
|
OG T
|
a94bb57d8b
|
feat(types): ADR-046 IncidentConverter + IncidentEngineAdapter
Implements ADR-046 Option B: an IncidentConverter conversion layer that resolves the
type-boundary problem between BrainIncident (lewooogo-brain) and LocalIncident (apps/api).
Changes:
- Added src/utils/incident_converter.py
- brain_to_local(): BrainIncident → LocalIncident
- local_to_brain(): LocalIncident → BrainIncident
- ESCALATED → MITIGATING mapping (brain has no ESCALATED)
- incident_engine.py: added the IncidentEngineAdapter wrapper layer
- process_signal() / get_incident() output is converted to LocalIncident
- get_incident_engine() returns IncidentEngineAdapter
- incident_memory.py: added the brain_to_local import and updated the _record_to_incident notes
- ADR-046: marked all three conversion points complete
Unblocks: #123 proposal_service.py cleanup (next step)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
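The status mapping at the heart of the converter can be sketched with two enums. Only ESCALATED and MITIGATING are named in the commit; the other members (OPEN, RESOLVED) and the function names are hypothetical placeholders for whatever the real types define.

```python
from enum import Enum


class LocalStatus(Enum):
    OPEN = "open"
    MITIGATING = "mitigating"
    ESCALATED = "escalated"
    RESOLVED = "resolved"


class BrainStatus(Enum):
    # The brain package has no ESCALATED state.
    OPEN = "open"
    MITIGATING = "mitigating"
    RESOLVED = "resolved"


def local_to_brain_status(status: LocalStatus) -> BrainStatus:
    """Map by name, folding ESCALATED into MITIGATING."""
    if status is LocalStatus.ESCALATED:
        return BrainStatus.MITIGATING
    return BrainStatus[status.name]


def brain_to_local_status(status: BrainStatus) -> LocalStatus:
    """Every brain status has a same-named local counterpart."""
    return LocalStatus[status.name]
```

The mapping is lossy in one direction: an ESCALATED incident that round-trips through the brain types comes back as MITIGATING.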
|
2026-03-31 22:47:54 +08:00 |
|
OG T
|
2ba61acf72
|
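fix(api): Phase R-R2.2 chief architect 72/100 P2 fixes
P2-01 signal_worker.py: persisted_to_pg now uses getattr to avoid a BrainIncident AttributeError
P2-02 IIncidentEngine Protocol: update_incident_status → update_status, aligned with the brain implementation
P2-03 config.py USE_NEW_ENGINE: marked as no longer effective + rollback path corrected (git revert rather than kubectl)
ADR-046: Option B (IncidentConverter) decision finalized; pending-implementation list updated
ADR-024: review conclusion + official rollback commands updated
Skill 02: v2.5 version record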
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-31 22:33:08 +08:00 |
|
OG T
|
d17b67c823
|
fix(api): Phase R-R2.1 fix architecture review P0+P1 issues
P0-01: IncidentDbAdapter._record_to_incident return type annotated as Any
(it actually returns BrainIncident, not the local Incident; avoids false type errors)
P0-02: get_incident_engine() wrapped in try/except ImportError protection
(follows the get_incident_memory() error-handling pattern to ensure observability)
P1-01: removed IncidentMemoryAdapter dead code (-170 lines of Lua scripts + _ensure_lua_scripts)
(confirmed that lewooogo-brain does not call this method)
P1-03: IncidentMemoryAdapter.save_incident() delegates to self._memory
(fixes the key prefix mismatch: "incident:" vs "awoooi:incidents:")
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-31 22:15:06 +08:00 |
|
OG T
|
c7b3f8f2b3
|
refactor(api): Phase R-R2 remove embedded duplicate logic (#121 #122)
- incident_memory.py: removed ~480 lines of the embedded DualIncidentMemory + IIncidentMemory
kept IncidentDbAdapter (SQLAlchemy bridge) + get_incident_memory() singleton
- incident_engine.py: removed ~405 lines of the old embedded IncidentEngine class
kept IncidentMemoryAdapter + BlastRadiusAdapter (lewooogo-brain bridge)
- Fully switched to the lewooogo-brain package (USE_NEW_ENGINE=True verified stable)
- Test verification: 104 passed, 13 skipped (all Redis-independent tests pass)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 22:03:00 +08:00 |
|
OG T
|
cc6b18e3bc
|
fix(phase22): fix three Telegram chat bugs (ADR-044)
E2E Health Check / e2e-health (push) Successful in 18s
P0: security_interceptor.py adds an intercept_telegram() method
- Fixes the AttributeError in _handle_chat_message (fatal bug)
- Whitelist validation; no Nonce required (chat messages vs button callbacks)
P1: nvidia_provider.py chat() adds a use_json_mode parameter
- Defaults to False for chat (natural-language responses)
- RCA/analysis passes True (structured JSON output)
- The openclaw.py RCA call adds use_json_mode=True
P2: K8s ConfigMap enables TELEGRAM_ENABLE_POLLING=true
- The K8s AWOOOI API takes over @tsenyangbot Long Polling
- OpenClaw (188) stops Telegram and becomes a pure REST service
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-31 21:53:09 +08:00 |
|
OG T
|
1f9e94e78d
|
refactor(ai-router): add IAIRouter Protocol (P1 fix)
Chief architect review P1 fix:
- Added the IAIRouter Protocol to support DI substitution in tests
- Follows the IModelRegistry, IComplexityScorer implementation pattern
- Includes route(), route_sync(), route_tool_calling() method signatures
Review score: 78/100 → 85/100
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 21:23:07 +08:00 |
|
OG T
|
d3c5a93e0f
|
fix(api): bulk-approve BlastRadius attribute access error
E2E Health Check / e2e-health (push) Successful in 16s
Type Sync Check / check-type-sync (push) Failing after 2m29s
bug: approval.blast_radius.get("data_impact") → AttributeError
fix: changed to approval.blast_radius.data_impact (Pydantic model attribute)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
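The bug class is easy to reproduce with any attribute-based model. This sketch uses a standard-library dataclass as a stand-in for the Pydantic model named in the commit; the `Approval` wrapper and field values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlastRadius:
    data_impact: str


@dataclass
class Approval:
    blast_radius: BlastRadius


def data_impact_of(approval: Approval) -> str:
    # Model instances expose fields as attributes; they are not dicts,
    # so calling .get("data_impact") on them raises AttributeError.
    return approval.blast_radius.data_impact
```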
|
2026-03-31 19:24:04 +08:00 |
|
OG T
|
e1e3bba296
|
refactor(api): Phase 22 technical-debt fixes - business-logic layering
E2E Health Check / e2e-health (push) Has been cancelled
P2.3: LearningService.get_learning_summary() business logic moved to the Service layer
- Repository only provides raw statistics
- Service computes best_action and learning_status
P2.6: Playbook similarity calculation extracted
- Added src/utils/similarity.py
- Repository imports from utils and no longer defines the algorithm
2026-03-31 Claude Code (chief architect technical-debt fix)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 18:55:06 +08:00 |
|
OG T
|
dd526684ab
|
feat(ai): Phase 22 OpenClaw + Nemotron collaboration architecture (ADR-044)
E2E Health Check / e2e-health (push) Successful in 17s
Commander approved implementing the "arbitration-execution split" architecture:
- OpenClaw = arbiter (Why + Risk Level)
- Nemotron = executor (How + kubectl Command)
New features:
- config.py: ENABLE_NEMOTRON_COLLABORATION Feature Flag
- openclaw.py: generate_incident_proposal_with_tools()
- openclaw.py: _call_nemotron_tools() Nemotron call
- telegram_gateway.py: TelegramMessage Nemotron fields
- telegram_gateway.py: format_with_nemotron() dual-block format
- decision_manager.py: integrates the collaboration method
- proposal_service.py: integrates the collaboration method
Trigger conditions:
- LOW risk → OpenClaw only
- MEDIUM/HIGH/CRITICAL → OpenClaw + Nemotron dual track
Chief architect review: 83/100, conditional pass
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
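The trigger conditions listed above amount to a small routing rule. The function name `providers_for_risk` and the string identifiers are assumptions for illustration; the risk levels and their routing come from the commit message.

```python
from typing import List


def providers_for_risk(risk_level: str) -> List[str]:
    """Return which engines handle a proposal at the given risk level."""
    level = risk_level.upper()
    if level == "LOW":
        return ["openclaw"]  # arbitration only
    if level in {"MEDIUM", "HIGH", "CRITICAL"}:
        return ["openclaw", "nemotron"]  # arbitration + execution plan
    raise ValueError(f"unknown risk level: {risk_level!r}")
```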
|
2026-03-31 18:52:53 +08:00 |
|
OG T
|
e7e3fc8e00
|
refactor(api): Phase 22 P2 Protocol signature fix + missing methods added
E2E Health Check / e2e-health (push) Successful in 16s
- IApprovalRepository.create() signature changed from ApprovalRequestCreate to dict (consistent with the implementation)
- Added the find_by_fingerprint() and increment_hit_count() Protocol methods
2026-03-31 Claude Code (chief architect P2 fix)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 16:28:37 +08:00 |
|
OG T
|
31c9117ae7
|
refactor(api): Phase 22 P1 modularization fixes - Router→Service encapsulation
E2E Health Check / e2e-health (push) Successful in 24s
Fixes:
1. e2e_network_test.py: removed unittest.mock
- Converted 16 patch.object calls to pytest monkeypatch
- Complies with feedback_no_mock_testing.md
2. audit_logs.py: Router→Service layer encapsulation
- Added AuditLogService (audit_log_service.py)
- Router now uses get_audit_log_service()
- Removed direct Repository access
3. incidents.py:463: DEBUG endpoint refactor
- Removed the direct get_incident_repository() call
- Operates entirely through IncidentService
- Simplified the response structure
Standards followed:
- Skill 09: the Router layer must not call external APIs directly
- feedback_lewooogo_modular_enforcement.md: Service-layer encapsulation
- feedback_no_mock_testing.md: MagicMock/AsyncMock prohibited
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 16:25:00 +08:00 |
|
OG T
|
b94a7800ad
|
fix(approval): fix Y/n approval buttons doing nothing (Phase 22 P1)
E2E Health Check / e2e-health (push) Successful in 17s
Root cause: the frontend did not send the CSRF Token, so the API rejected every approval request
Fixes:
1. live-approval-panel.tsx: integrated the useCSRF hook
- Approvals include the csrfToken parameter
- Rejections include the csrfToken parameter
- Added CSRF loading/error state display
2. test_intent_classifier.py: removed Mock violations (P1)
- Switched to the @requires_ollama marker
- Real Ollama integration test
3. test_terminal_service.py: removed Mock violations (P1)
- Switched to @requires_database/@requires_k8s markers
- Kept pure-function unit tests
Standards followed:
- feedback_no_mock_testing.md: MagicMock/AsyncMock prohibited
- Phase 20 CSRF Protection: Double Submit Cookie
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 16:16:16 +08:00 |
|
OG T
|
8313a3787b
|
refactor(api): Phase 22 P0 leWOOOgo modularization fixes
E2E Health Check / e2e-health (push) Has been cancelled
The Router layer must not use httpx.AsyncClient directly; the calls were extracted to the Service layer:
New Services:
- OpenClawHttpService: Error analysis/Code Review/CI diagnosis
- GitHubApiService: PR Diff retrieval
- HealthCheckService: HTTP/PostgreSQL/Redis health checks
Modified Routers:
- sentry_webhook.py: uses OpenClawHttpService
- github_webhook.py: uses GitHubApiService + OpenClawHttpService
- health.py: uses HealthCheckService
Standards followed:
- Skill 09: the Router layer must not call external APIs directly
- feedback_lewooogo_modular_enforcement.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-31 16:06:35 +08:00 |
|