OG T
af49a54728
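fix(playbook): bypass the similarity threshold when alert_names match exactly
...
CD Pipeline / build-and-deploy (push) Successful in 12m58s
Symptom: SentryDown/OllamaDown alerts trigger an incident, but the playbook search
returns NO_MATCH even though the alert_names are identical.
Root cause: in the weighted Jaccard calculation, affected_services stores the Prometheus
instance IP (192.168.0.110:9000) while the Playbook stores the service name (sentry),
so the services dimension scores 0 and the final score is 0.35 < min_similarity=0.4.
Fix: when alert_names intersect, the match passes directly and is no longer dragged down by the other dimensions.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>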
2026-04-09 12:05:07 +08:00
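The bypass described in af49a54728 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names, the two-dimension layout, and the weights are assumptions chosen only so that the example reproduces the 0.35 score quoted in the commit message.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Plain Jaccard similarity; two empty sets count as 0.0 here."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical weights: picked so that a perfect alert_names match plus a
# zero services match yields the 0.35 mentioned in the commit message.
WEIGHTS = {"alert_names": 0.35, "affected_services": 0.65}
MIN_SIMILARITY = 0.4


def weighted_score(incident: dict[str, set[str]], playbook: dict[str, set[str]]) -> float:
    return sum(w * jaccard(incident[k], playbook[k]) for k, w in WEIGHTS.items())


def matches(incident: dict[str, set[str]], playbook: dict[str, set[str]]) -> bool:
    # The fix: any overlap in alert_names passes without consulting the score.
    if incident["alert_names"] & playbook["alert_names"]:
        return True
    return weighted_score(incident, playbook) >= MIN_SIMILARITY


incident = {"alert_names": {"SentryDown"}, "affected_services": {"192.168.0.110:9000"}}
playbook = {"alert_names": {"SentryDown"}, "affected_services": {"sentry"}}
```

With these inputs the weighted score stays below min_similarity, yet matches() returns True because the alert names overlap.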
OG T
79a9a514dd
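fix(rules): ADR-064 L1 Redis distributed lock prevents multiple Pods from generating the same rule
...
CD Pipeline / build-and-deploy (push) Has started running
Problem: the _generating set is process-level, so each Pod keeps its own copy and the same alertname may be
sent to Ollama/Gemini for rule generation by several Pods at once
Fix: SET NX EX lock_key, so only the first Pod acquires the lock and the other Pods skip generation
Degradation: when Redis is unavailable, fall back to the process-level set (preserving the original behavior)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>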
2026-04-09 12:03:51 +08:00
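The lock-with-fallback pattern from 79a9a514dd might look like the sketch below. The key format, the TTL, and the function name are assumptions. The client is expected to expose a redis-py style set(name, value, nx=..., ex=...) call; an in-memory double stands in for it so the example runs without a server.

```python
from typing import Optional, Protocol


class RedisLike(Protocol):
    def set(self, name: str, value: str, *, nx: bool = False, ex: Optional[int] = None): ...


class InMemoryRedis:
    """Test double so the sketch runs without a server; it ignores expiry."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def set(self, name, value, *, nx=False, ex=None):
        if nx and name in self._data:
            return None  # mirrors SET NX returning nil when the key exists
        self._data[name] = value
        return True


_generating: set[str] = set()  # process-level fallback, as before the fix


def try_acquire(client: Optional[RedisLike], alertname: str, ttl_s: int = 300) -> bool:
    """Return True if this caller should generate the rule for alertname."""
    lock_key = f"rule_gen_lock:{alertname}"  # hypothetical key format
    if client is not None:
        try:
            return bool(client.set(lock_key, "1", nx=True, ex=ttl_s))
        except Exception:
            pass  # Redis unavailable: degrade to the process-level set
    if alertname in _generating:
        return False
    _generating.add(alertname)
    return True
```

Two Pods sharing one Redis would both call try_acquire(client, alertname); only the first call returns True until the key expires.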
OG T
b66263ad36
fix(decision_manager): do not resend resolved Incidents to Telegram
...
CD Pipeline / build-and-deploy (push) Has started running
After the 10-minute dedup TTL expired, already-resolved Incidents were still being pushed again
Added a status check; resolved/closed Incidents are skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 12:00:39 +08:00
OG T
b43e1f1818
feat(rules): L2-2 alerts-unified: add 14 Prometheus alert rules + target_down auto-remediation
...
CD Pipeline / build-and-deploy (push) Has been cancelled
New rules:
- postgresql_down / postgresql_connection_pool / postgresql_slow_queries
- redis_down / ollama_down / minio_down / minio_disk_high / harbor_down
- k3s_node_down / awoooi_api_down / alert_chain_broken / nvidia_circuit_breaker
Fixes:
- target_down: kubectl_command changed from diagnostics to automatically restarting the exporter (docker restart / systemctl)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:49:28 +08:00
OG T
9361fd1fa7
fix(decision_manager): do not apply strip_placeholders to action, to avoid truncating the deployment name
...
CD Pipeline / build-and-deploy (push) Has been cancelled
_strip_placeholders removes <...>, so kubectl rollout restart deployment/<name>
becomes kubectl rollout restart deployment/ and Telegram shows an incomplete suggested command
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:45:33 +08:00
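The failure mode in 9361fd1fa7 is easy to reproduce. The function below is a hypothetical stand-in for _strip_placeholders (the real implementation is not shown in the log); it assumes the helper simply drops any <...> token.

```python
import re


def strip_placeholders(text: str) -> str:
    """Hypothetical stand-in: remove every <...> token from the text."""
    return re.sub(r"<[^>]*>", "", text)


action = "kubectl rollout restart deployment/<name>"
stripped = strip_placeholders(action)
# stripped now ends at "deployment/", which is the incomplete command
# that appeared in Telegram before the fix.
```

Leaving the action field out of the stripping step keeps the full command, placeholder included, visible to the operator.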
OG T
d467fc11be
fix(nemotron): fix the deployment_name placeholder problem
...
Root cause: Nemotron tool calling receives target_resource=DockerContainerUnhealthy
(not a real K8s deployment name) and fills in <deployment_name> when unsure
Fix:
1. The prompt states explicitly that deployment_name must be filled with target_resource
2. After the tool call result arrives, detect the placeholder and overwrite it with target_resource
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:44:25 +08:00
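Step 2 of the fix in d467fc11be (detect the placeholder, then overwrite it) could be sketched as below. The function name, the argument shape, and the regular expression are assumptions made for illustration.

```python
import re

# Treat any value that is entirely one <...> token as a placeholder.
PLACEHOLDER = re.compile(r"^<[^<>]+>$")


def resolve_deployment_name(tool_args: dict, target_resource: str) -> dict:
    """Return tool_args with a placeholder deployment_name replaced by target_resource."""
    name = str(tool_args.get("deployment_name", "")).strip()
    if PLACEHOLDER.match(name):
        return {**tool_args, "deployment_name": target_resource}
    return tool_args


# "awoooi-api" is a made-up resource name used only for this example.
fixed = resolve_deployment_name({"deployment_name": "<deployment_name>"}, "awoooi-api")
kept = resolve_deployment_name({"deployment_name": "redis"}, "awoooi-api")
```

A real deployment name passes through untouched, so the override only affects calls where the model left the field unfilled.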
OG T
580053394b
fix(web): C4 monitoring-tool emoji → Lucide icon (feedback_no_emoji_use_icons.md)
...
CD Pipeline / build-and-deploy (push) Has been cancelled
TOOL_EMOJI Record<string> changed to TOOL_ICON Record<React.ReactNode>
Uses BarChart3/Flame/Telescope/FlaskConical/Activity/GitBranch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:28:53 +08:00
OG T
4a94588766
fix(web): I3 approve/reject API + I4 SIGNOZ_URL env + I5 ErrorsPanel nothing-gray
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- I3: Approve/Reject buttons wired to /api/v1/approvals/{id}/sign|reject
- I4: ApmPanel SIGNOZ_URL now reads the NEXT_PUBLIC_SIGNOZ_URL environment variable
- I5: ErrorsPanel outer frame now uses the nothing-gray palette via inline style
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:20:44 +08:00
OG T
28d2ff704e
fix(web): C1 leftover i18n: 5 hard-coded Chinese strings switched to useTranslations
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- Alert badge: alertBadge / alertBadgeZero
- Awaiting confirmation: awaitingConfirm
- Host/topology toggle: hostView / topoView
- HOST_CATALOG description confirmed as never rendered, so it needs no i18n
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:18:05 +08:00
OG T
c5e475121a
fix(telegram): fix truncated suggested command + decision_manager enum string correction
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause 1: telegram_gateway.py suggested_action[:35] cut the text exactly after deployment/
→ changed to [:80] so the full kubectl command is shown
Root cause 2: old Incident proposal_data stores an enum string (RESTART_DEPLOYMENT)
→ decision_manager.py now detects this and looks up the kubectl command again through the rule engine
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:14:30 +08:00
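Root cause 1 in c5e475121a comes down to string length: the prefix "kubectl rollout restart deployment/" is exactly 35 characters, so a [:35] slice drops the deployment name entirely. The deployment name below is made up for the example.

```python
prefix = "kubectl rollout restart deployment/"
command = prefix + "awoooi-api"  # hypothetical deployment name

old_display = command[:35]  # slice used before the fix
new_display = command[:80]  # slice used after the fix
```

Here len(prefix) is 35, so old_display equals the bare prefix while new_display carries the whole command.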
OG T
fb66ecd2a0
refactor(web): Panel extraction fully complete: three integrated pages resolve the double AppLayout
...
CD Pipeline / build-and-deploy (push) Has been cancelled
/observability: AppsPanel + ServicesPanel (5/5 Tabs done in total)
/automation: AutoRepairPanel + NeuralCommandPanel + DriftPanel (3/3)
/operations: DeploymentsPanel + TicketsPanel + CostPanel + ActionLogsPanel + BillingPanel (5/5)
All original pages are reduced to AppLayout + Panel, with no double Layout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:06:57 +08:00
OG T
7934ade3a6
refactor(web): all 13 Panels extracted + double AppLayout fixed on integrated pages
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Panel extraction (13):
- MonitoringPanel, ApmPanel, ErrorsPanel, AppsPanel, ServicesPanel
- AutoRepairPanel, NeuralCommandPanel, DriftPanel
- DeploymentsPanel, TicketsPanel, CostPanel, ActionLogsPanel, BillingPanel
Integrated pages updated (all use Panels, no double AppLayout):
- /observability: 5 Panels
- /automation: 3 Panels
- /operations: 5 Panels
Chief Architect issue I2 resolved
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:05:37 +08:00
OG T
9e10305acc
fix(web): C2 topology component i18n: 10+ hard-coded Chinese strings switched to useTranslations
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 11:04:35 +08:00
OG T
7153395267
fix(web): Chief Architect P0 fixes: hard-coded i18n strings + polling performance
...
CD Pipeline / build-and-deploy (push) Has been cancelled
C1: 30+ hard-coded Chinese strings across the 4 home page Tabs switched to useTranslations
- Added 30+ i18n keys such as dashboard.tabs.* / alertEvents / approve / reject
- zh-TW + en kept in sync
C3: automation/operations Loading now uses LobsterLoading (i18n)
I1: 100ms setInterval replaced with popstate + a 1s low-frequency fallback (10x performance improvement)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:01:07 +08:00
OG T
5ea6c3fb91
feat: alert_operation_log query API + frontend page (Sprint 5.2)
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Backend:
- Added the list_recent() pagination method (alert_operation_log_repository)
- Added the /api/v1/alert-operation-logs GET + /stats endpoints
- main.py registers alert_operation_logs_v1.router
Frontend:
- /alert-operation-logs page with color tags for 18 event_type values
- Pagination, event_type filter, incident_id filter
- 24h statistics cards (total / blocked by guardrail / auto-repaired / resolved)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:57:40 +08:00
OG T
428e66c111
fix(arch-review): Chief Architect review S1×3 S2×3 S3×3 all fixed + ADR-064
...
CD Pipeline / build-and-deploy (push) Has been cancelled
S1 Critical:
- S1-1: asyncio trigger moved into the async context of _call_with_fallback; get_event_loop() removed from sync code
- S1-2: _append_rule_to_yaml applies textwrap.dedent() to normalize the indentation of LLM output
- S1-3: _matches() returns False outright for alertname=["*"], preventing accidental hits
S2 Major:
- S2-1: auto_generate_rule() now receives its parameters via DI (ollama_url/model/gemini_api_key); import settings removed
- S2-4: _generate_mock_response docstring clarified: it is the rule engine's production path, not fake data
- S2-5: suggested_action .strip() prevents whitespace-only strings from bypassing the or fallback
S3 Minor:
- S3-2: priority upper bound min(next, 890)
- S3-3: alertname sanitized with re.sub([{}]) to prevent a format KeyError
- S3-4: model_registry.py last-modified timestamp updated
Docs:
- ADR-064: Alert Rule Engine, YAML-driven + AI self-learning
- Skills 02: alert rule engine DI conventions + asyncio prohibitions
- Skills 03: _generate_mock_response semantics clarified + rule engine degradation flow
- LOGBOOK: full record of this session
2026-04-09 ogt: Chief Architect review fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:52:40 +08:00
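Three of the smaller fixes in 428e66c111 (S1-2, S3-2, S3-3) can be illustrated in a few lines. The helper names are hypothetical, and how the real code derives the next priority is not shown in the log; only textwrap.dedent, the brace-stripping re.sub, and the min(next, 890) cap come from the commit message.

```python
import re
import textwrap


def normalize_rule_yaml(llm_output: str) -> str:
    """S1-2: remove the common leading indentation from an LLM-produced YAML fragment."""
    return textwrap.dedent(llm_output).strip("\n") + "\n"


def sanitize_alertname(alertname: str) -> str:
    """S3-3: drop braces so the name cannot introduce a str.format() field."""
    return re.sub(r"[{}]", "", alertname)


def next_priority(current_max: int) -> int:
    """S3-2: cap the next priority at 890 (current_max + 1 is an assumption)."""
    return min(current_max + 1, 890)


fragment = "    - id: x\n      priority: 500\n"
normalized = normalize_rule_yaml(fragment)

# Without sanitizing, "Disk{Full}" embedded in a template makes .format()
# look for a field named "Full" and raise KeyError.
template = "check " + sanitize_alertname("Disk{Full}") + " on {host}"
rendered = template.format(host="h1")
```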
OG T
11fc2860cf
refactor(web): ErrorsPanel extraction: 3 Tabs of /observability no longer have a double Layout
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:51:59 +08:00
OG T
22fa6ea413
refactor(web): ApmPanel extraction: the monitoring + apm Tabs of /observability have no double Layout
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:49:39 +08:00
OG T
4b3fdd82f9
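fix(api): incidents list no longer waits synchronously for AI decisions (performance fix)
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Problem: GET /api/v1/incidents awaited AI analysis (120-180s) for every incident
With several active incidents the timeouts compounded → the frontend could not load at all
Fix:
- The list endpoint only reads decision tokens already cached in Redis (returns immediately)
- When nothing is cached it returns decision=null and triggers the AI in the background, fire-and-forget
- The frontend then calls the single-incident GET endpoint for incidents it cares about to fetch the decision
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>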
2026-04-09 10:49:30 +08:00
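The cache-first listing in 4b3fdd82f9 can be sketched with plain asyncio. Everything here is illustrative: a dict stands in for Redis, a short sleep stands in for the 120-180s AI analysis, and the function names are invented.

```python
import asyncio

decision_cache: dict[str, str] = {}  # stand-in for the Redis decision cache
_background: set[asyncio.Task] = set()  # keeps references so tasks are not garbage-collected


async def analyze(incident_id: str) -> None:
    await asyncio.sleep(0.01)  # stands in for the slow AI analysis
    decision_cache[incident_id] = f"decision-for-{incident_id}"


async def list_incidents(ids: list[str]) -> list[dict]:
    rows = []
    for incident_id in ids:
        decision = decision_cache.get(incident_id)
        if decision is None:
            # Fire-and-forget: schedule the analysis but do not await it here.
            task = asyncio.create_task(analyze(incident_id))
            _background.add(task)
            task.add_done_callback(_background.discard)
        rows.append({"id": incident_id, "decision": decision})
    return rows


async def demo() -> tuple[list[dict], list[dict]]:
    first = await list_incidents(["inc-1"])  # returns at once with decision=None
    await asyncio.gather(*_background)  # only the demo waits for the background work
    second = await list_incidents(["inc-1"])  # now served from the cache
    return first, second
```

A production version would also need to avoid scheduling a second analysis for an incident whose first one is still running; that is left out of this sketch.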
OG T
f05a391d02
feat(web): panels/index.ts exports + Panel extraction progress markers
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:42:30 +08:00
OG T
770667eed4
refactor(web): MonitoringPanel extraction: resolves the double AppLayout on /observability
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:40:07 +08:00
OG T
89da2d24be
fix(model-registry): fallback config updated to deepseek-r1:14b + gemma3:4b
...
CD Pipeline / build-and-deploy (push) Successful in 13m20s
- model_registry._get_default_config: ollama summary llama3.2:3b → gemma3:4b
- model_registry._get_default_config: ollama default/rca → deepseek-r1:14b
- Fixed the failing test_smart_router::test_simple_context (asserts gemma3:4b)
- alert_rule_engine: removed unused asyncio/time imports
- 2026-04-09 ogt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 09:52:47 +08:00
OG T
c26c4030e4
feat(web): /topology upgraded to the full React Flow version (wired to the real dashboard API)
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 09:49:31 +08:00
OG T
71437db0e9
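feat(rule-engine): automatic rule generation: generic_fallback triggers AI learning
...
CD Pipeline / build-and-deploy (push) Successful in 11m25s
Flow:
1. An alert hits the generic_fallback rule
2. auto_generate_rule() is triggered in the background
3. Ollama (deepseek-r1:14b) generates a YAML rule fragment
4. If Ollama fails → Gemini as backup
5. Validate the format → append to alert_rules.yaml → clear the lru_cache
6. The next alert of the same kind hits its dedicated rule and no longer goes through the fallback
Dedup: each alertname is generated only once per process
Handwritten rules use priority 1-499, AI-generated rules 500-899, the fallback 999
2026-04-09 ogt: self-learning AI rule engine
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>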
2026-04-09 09:20:33 +08:00
OG T
db02eb41d0
fix(docker): COPY alert_rules.yaml into the container
...
CD Pipeline / build-and-deploy (push) Has been cancelled
The rule engine loads from ./alert_rules.yaml, but the Dockerfile was missing the COPY
2026-04-09 ogt: fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 09:12:42 +08:00
OG T
030f4f7c32
feat(web): topology Toggle added to the home page infrastructure section (host/topology switch, wired to the real API)
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 09:12:31 +08:00
OG T
d1ede7f989
feat(openclaw): alert rule engine: alert_rules.yaml replaces hard-coded if/elif
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- Added alert_rules.yaml: 6 rules (docker/target_down/oom/cpu/5xx/crash) + a generic fallback
- Added alert_rule_engine.py: YAML loading, matching logic, variable substitution
- openclaw.py _generate_mock_response: refactored to call the rule engine (v8.0)
- New rules only require editing the YAML and restarting the Pod; no code change needed
- 2026-04-09 ogt: architecture refactor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 09:05:23 +08:00
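A data-driven matcher of the kind d1ede7f989 describes might be shaped like the sketch below. The rule schema (id, priority, alertname, fallback, command) is an assumption, since alert_rules.yaml itself is not shown; the rules are written as Python dicts here, in the form a YAML loader would return them, so the example has no third-party dependency.

```python
# Hypothetical rule shape; the real schema lives in alert_rules.yaml.
RULES = [
    {"id": "generic_fallback", "priority": 999, "fallback": True,
     "command": "kubectl describe deployment/{target}"},
    {"id": "crash", "priority": 60, "alertname": ["KubePodCrashLooping"],
     "command": "kubectl rollout restart deployment/{target}"},
]


def match_rule(alertname: str, rules: list[dict]) -> dict:
    """Return the lowest-priority-number rule that matches, else the fallback."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule.get("fallback") or alertname in rule.get("alertname", []):
            return rule
    raise LookupError(f"no rule and no fallback for {alertname}")


def render(rule: dict, **variables: str) -> str:
    """Variable substitution: fill {placeholders} in the rule's command."""
    return rule["command"].format(**variables)


hit = match_rule("KubePodCrashLooping", RULES)
miss = match_rule("SomethingNew", RULES)
suggested = render(hit, target="awoooi-api")  # made-up deployment name
```

Adding a rule then means adding one more entry to the data, which is the property the commit is after.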
OG T
1e1f24c561
fix(test): ComplexityScorer model name updated llama3.2:3b → gemma3:4b
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 09:01:59 +08:00
OG T
3abc7c2f85
fix(openclaw): dedicated rule matching for DockerContainerUnhealthy + TargetDown
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- DockerContainerUnhealthy: ssh docker inspect + docker restart, including verification of the healthcheck command
- TargetDown / IP:port instance: ssh to confirm the exporter is alive
- Fixed target mistakenly using the alertname as the deployment name
- alertname/labels are extracted from alert_context for rule evaluation
- 2026-04-09 ogt: added two dedicated rules
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 09:00:31 +08:00
OG T
4b6f14d9a1
fix(webhook): alertmanager path suggested_action now uses kubectl_command
...
CD Pipeline / build-and-deploy (push) Failing after 1m43s
- Line 1399: suggested_action.value (RESTART_DEPLOYMENT) → kubectl_command
- Consistent with line 887 on the /alerts path
- Fixed Telegram showing "kubectl rollout restart deployment/" with nothing after it
- 2026-04-09 ogt: bug fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 08:57:56 +08:00
OG T
65e1edb0ad
feat(web): OpenClaw-style lobster SVG + three-color status light + test fix
...
CD Pipeline / build-and-deploy (push) Failing after 1m39s
Frontend:
- Brand-new OpenClawLobster SVG (based on dashboardicons.com/icons/openclaw)
rounded body + big eyes + claws + antennae + smile + small legs
- Three color variants: red (abnormal/default) / green (healthy) / yellow (warning)
- LobsterLoading switched to the new SVG
Test fix:
- test_nemotron_failure_still_returns_proposal: func_body slice 5000→10000
Reason: the function exceeds 5000 characters, so rfind could not find the final return
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:55:21 +08:00
OG T
f32b077336
fix(models): update Ollama settings: M1 Pro + deepseek-r1:14b
...
CD Pipeline / build-and-deploy (push) Failing after 1m36s
E2E Health Check / e2e-health (push) Successful in 44s
- endpoint: 188 → 111 (M1 Pro, 40+ tok/s)
- rca/default model: qwen2.5:7b-instruct → deepseek-r1:14b (strongest reasoning for SRE)
- summary model: llama3.2:3b → gemma3:4b (fast summaries)
- timeout: 90s → 120s (slowest measured deepseek-r1:14b run: 54s)
- version: 1.1.0 → 1.2.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:59:53 +08:00
OG T
d80153bdce
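fix(openclaw): fall back to Gemini to produce the execution plan after NIM fails completely
...
CD Pipeline / build-and-deploy (push) Failing after 1m34s
After repeated NIM tool calling timeouts, the execution plan is no longer left blank;
Gemini instead produces the kubectl commands on NIM's behalf (parsed from JSON).
This triggers only when NIM fails completely, in line with the Commander's "must wait until there is a response" principle.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>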
2026-04-08 22:55:25 +08:00
OG T
c669069427
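feat: little lobster loading animation + HostAggregator performance optimization
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Frontend:
- Shared LobsterLoading component (chibi lobster bobbing up and down + text hint)
- Replaced every "Loading..." on the home page with the lobster animation
- PageTabs skeleton screen also switched to the lobster
Backend:
- TCP probe timeout: 3.0s → 1.5s
- HTTP probe timeout: 5.0s → 2.0s
- 30-second in-memory cache (keeps unreachable hosts from slowing down the frontend)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>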
2026-04-08 22:44:24 +08:00
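The 30-second in-memory cache from c669069427 could be as small as the class below. This is a sketch under assumptions: the class name and the injectable clock are invented for the example, and the real HostAggregator code is not shown in the log.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Caches one computed value and recomputes it once ttl_s seconds have passed."""

    def __init__(self, ttl_s: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_s
        self._clock = clock  # injectable so tests need not sleep
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get_or_compute(self, compute: Callable[[], T]) -> T:
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return self._value  # still fresh: skip the slow probes
        self._value = compute()
        self._expires_at = now + self._ttl
        return self._value
```

Wrapping the probe round in get_or_compute means a burst of dashboard requests within the TTL pays for the unreachable-host timeouts at most once.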
OG T
6f475000f6
fix(db): alert_operation_log.event_type String→PgEnum (create_type=False)
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Fixes DatatypeMismatchError: the DB column is the native enum alert_event_type, but
the SQLAlchemy model mistakenly used String(50), so writes to alert_operation_log failed.
PgEnum(create_type=False) lets SQLAlchemy map the existing DB enum
without recreating the type. The 18 event_type values match the M-003 migration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:42:36 +08:00
OG T
86ac6ed028
perf(api): HostAggregator performance optimization: shorter probe timeouts + 30-second in-memory cache
2026-04-08 22:42:01 +08:00
OG T
2a6977343a
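fix(telegram): pass incident_id at every _push_to_telegram_background call site
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Rule-matched messages show six buttons but the Ollama/OpenClaw path shows only three. The root cause is that
the alertmanager and fallback paths called _push_to_telegram_background
without incident_id, so the details/re-diagnose/history buttons were not shown.
- _push_to_telegram_background: added the incident_id parameter
- alertmanager main path: now passes incident_id
- alertmanager fallback path: stores the return value and passes it
- /alerts path: no incident exists yet, so an empty string is passed explicitly
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>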
2026-04-08 22:40:22 +08:00
OG T
ef17720dfe
fix(web): home page Tab switching sync fix: activeTabId tracks URL query changes
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-08 22:36:39 +08:00
OG T
286df4b3e3
fix(web): Sidebar section label fix: main shows no heading, legacy uses a divider
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-08 22:33:17 +08:00
OG T
9188e499cc
feat(web): Sprint 5 Phase 3+4: integrated pages complete + old routes kept alongside
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Phase 3: 5 integrated pages (lazy import of existing content)
Phase 4: old routes remain independently usable for now; old and new coexist
- /monitoring is still reachable (original page)
- /observability?tab=monitoring (integrated entry point)
- Avoids redirect loops
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:10:46 +08:00
OG T
1413804378
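feat(web): Sprint 5 Phase 3: 5 integrated pages + Sidebar route update
...
New pages:
- /observability: service monitoring + APM + error tracking + apps + service catalog (5 Tabs)
- /automation: auto repair + neural command + Drift (3 Tabs)
- /operations: deployments + tickets + cost + action logs + billing (5 Tabs)
- /security-compliance: security + compliance (2 Tabs)
- /knowledge: knowledge base
All Tabs load existing page content with React.lazy + Suspense
No fake data: every Tab is an existing real page
Sidebar routes updated to point at the new integrated pages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>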
2026-04-08 22:09:53 +08:00
OG T
8b5db2f58e
feat(infra): switch Ollama to the M1 Pro at 192.168.0.111 + NetworkPolicy update
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- OLLAMA_URL: 188 → 111 (M1 Pro, 40+ tok/s vs 0.45 tok/s)
- OPENCLAW_DEFAULT_MODEL: qwen2.5:7b-instruct → deepseek-r1:14b (strongest reasoning for SRE)
- OPENCLAW_TIMEOUT: 90s → 120s (slowest measured deepseek-r1:14b run: 54s)
- NetworkPolicy v1.3: added egress to 192.168.0.111:11434, removed the Ollama port on 188
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:05:14 +08:00
OG T
c9f1bcd122
fix(api): service_registry safe degradation: no crash when the YAML is missing in Docker, fallback AUTO
CD Pipeline / build-and-deploy (push) Successful in 11m37s
2026-04-08 21:47:38 +08:00
OG T
db4b28c49d
fix(ci): force-trigger CD: the service_registry.py Docker path fix is already included in 1f9eea5
...
CD Pipeline / build-and-deploy (push) Failing after 8m45s
Pod CrashLoopBackOff: IndexError parents[5]
Fix: _find_registry_path() searches safely (parents[4]/parents[3]/absolute path)
1f9eea5 already contained the fix but did not trigger CI; this commit forces a rebuild
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 21:37:49 +08:00
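The crash in db4b28c49d arises because Path.parents raises IndexError when a file sits fewer directories deep than the index requested, which is typical inside a container. A guarded search along the lines the commit describes might look like this; the signature and the idea of returning None so the caller can degrade are assumptions.

```python
from pathlib import Path
from typing import Optional


def find_registry_path(start: Path, filename: str, absolute_fallback: Path) -> Optional[Path]:
    """Try parents[4], then parents[3], then an absolute path; never index out of range."""
    candidates = []
    for depth in (4, 3):
        if depth < len(start.parents):  # guard instead of indexing blindly
            candidates.append(start.parents[depth] / filename)
    candidates.append(absolute_fallback)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller decides how to degrade when nothing is found


# A shallow container-style path: only two parents, so parents[5] would raise.
shallow = Path("/app/service_registry.py")
```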
OG T
1f9eea5b74
fix(api): service_registry.py Path index fix: compatible with the Docker container environment
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-08 21:34:40 +08:00
OG T
f7c1c46f96
chore: trigger CD to deploy the Sprint 5 frontend
CD Pipeline / build-and-deploy (push) Failing after 10m29s
2026-04-08 21:23:13 +08:00
OG T
14cb015826
fix(openclaw): Nemotron retry logic + exhausted log key (previously uncommitted changes)
...
CD Pipeline / build-and-deploy (push) Has been cancelled
- generate_incident_proposal_with_tools: single try/except → retry loop with 2 attempts
- Failure log key: nemotron_collaboration_failed → nemotron_collaboration_exhausted
- nemotron_enabled=True on failure (so the Commander can see the failed state)
- _call_nemotron_tools: a timeout now raises an exception (so the outer loop can retry)
- These are local changes from an earlier session; they fix a mismatch between the tests and the actual implementation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 21:16:34 +08:00
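The retry shape described in 14cb015826 (the inner call raises on timeout, the outer loop retries, and exhaustion is reported once at the end) can be sketched as below. The exception class and helper name are illustrative, not taken from the codebase.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class NemotronExhausted(RuntimeError):
    """Raised after every attempt has failed (illustrative name)."""


async def call_with_retries(call: Callable[[], Awaitable[T]], attempts: int = 2) -> T:
    last_error = None
    for _ in range(attempts):
        try:
            return await call()
        except asyncio.TimeoutError as exc:
            # The inner call raises on timeout so this loop can try again.
            last_error = exc
    raise NemotronExhausted(f"failed after {attempts} attempts") from last_error
```

The caller can catch NemotronExhausted, log it under a single key, and still return a proposal that marks the attempt as having been made.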
OG T
d276b39bd5
feat(web): Sprint 5 Phase 2: React Flow topology component (wired to the real dashboard API)
...
Added 7 files:
- ServiceTopology.tsx: main component (ReactFlow + Controls + MiniMap + empty state)
- GroupNode.tsx: group node (memo + collapsed summary + CPU/RAM metrics)
- ServiceNode.tsx: service node (memo + status light + port + latency)
- TopologyEdge.tsx: custom edge (gradient + dashed line)
- useTopologyData.ts: reads real data from the dashboard store → nodes/edges
- index.ts: exports
Data source: useDashboardStore → hosts[] (real TCP/HTTP probes from HostAggregator)
Dependencies: statically defined (matching the ConfigMap environment variables)
No fake data: all node data comes from the real API
TypeScript: no new errors
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:14:29 +08:00
OG T
eaa6102e69
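feat(web): Sprint 5 Phase 1.3: Sidebar trimmed 25→6+2+Classic
...
Navigation reorganization (approved by the Commander 2026-04-08):
- Command Center / → merges: dashboard + authorization + alerts + reports (4 Tabs)
- Observability → merges: monitoring + APM + errors + apps + services (5 Tabs)
- Automation → merges: auto repair + neural command + Drift (3 Tabs)
- Operations → merges: deployments + tickets + cost + action logs + billing (5 Tabs)
- Security & Compliance → merges: security + compliance (2 Tabs)
- Knowledge → knowledge base
- Legacy: Classic AI Center (/classic)
- Bottom: terminal + settings
i18n: 7 new navigation keys added to zh-TW + en
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>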
2026-04-08 21:10:11 +08:00
OG T
b380b6a34c
fix(ci): fix the nemotron test function-body slice 5000→10000 characters
...
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 21:09:19 +08:00