OG T
|
e93a50a4b4
|
feat(pages): upgrade all ComingSoon pages to real UI, wired to real APIs or empty-state pages
CD Pipeline / build-and-deploy (push) Successful in 6m47s
- services/topology: wire up /api/v1/dashboard; show a service list table and a host topology card grid
- notifications: wire up /api/v1/notifications/channels; show an empty list on 404
- reports: wire up /api/v1/stats/incident-summary + /api/v1/stats/resolution-stats; show stat cards
- apm: clean empty-state page (SigNoz integration pending)
- apps/tickets/users/deployments: empty-list table structure
- billing/compliance/cost/security: empty-state card structure
- help: static system version info page
- zh-TW.json + en.json: add i18n keys for all pages (zero hardcoded strings)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:49:24 +08:00 |
|
OG T
|
6266a4fc01
|
fix(test): update AIProviderEnum tests, NVIDIA → NEMOTRON (Phase 24 B3)
CD Pipeline / build-and-deploy (push) Successful in 7m6s
- test_nvidia_provider_in_router: now verifies the NEMOTRON enum
- test_tool_calling_route: now expects the NEMOTRON provider
- test_existing_routing_not_affected: excludes NEMOTRON (not a general route)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:39:46 +08:00 |
|
OG T
|
e9a1ac6276
|
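fix(ui): align with the figma-v2 design: IncidentCard + OpenClawPanel visual fixes
CD Pipeline / build-and-deploy (push) Failing after 35s
IncidentCard:
- Background #fff, 12px border radius, 4px top bar (matches the design)
- P1 severity color corrected to #F59E0B (amber, not orange)
- Severity badge changed to a 4px-radius uppercase style
- Impact metrics row: removed the gray background blocks, replaced with thin border dividers
- AI proposal button changed to a full-width, centered, orange style
OpenClawPanel:
- Removed redundant rounded-xl/backdrop/border (provided by the parent card container)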
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:36:59 +08:00 |
|
OG T
|
97d86861ed
|
fix(ai_router): C1 fix: align AIProviderEnum with the actual provider names in the Registry
CD Pipeline / build-and-deploy (push) Failing after 37s
Problem: AIProviderEnum.NVIDIA = "nvidia" has no matching provider in the Registry
OpenClawNemoProvider.name = "openclaw_nemo"
NemotronProvider.name = "nemotron"
→ High-complexity / Tool Calling routes were always skipped, silently falling back to Gemini/Ollama
Fix:
- Add OPENCLAW_NEMO = "openclaw_nemo" (general inference, via .188 → NVIDIA NIM)
- Add NEMOTRON = "nemotron" (Tool Calling, direct NVIDIA NIM)
- Remove NVIDIA = "nvidia" (no match in the Registry)
- Rule 4 (complexity >= 4 / HIGH risk): NVIDIA → OPENCLAW_NEMO
- route_tool_calling: NVIDIA → NEMOTRON
- Rate Limiter check: "nvidia" → "openclaw_nemo"
- _full_fallback_chain: OPENCLAW_NEMO first
- _tool_calling_fallback_chain: NEMOTRON first
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
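The mismatch described above (an enum value with no provider of the same name in the Registry) can be caught with a small consistency check. This is a minimal sketch, not the project's code: `LegacyProviderEnum`, `REGISTERED_NAMES`, and `unregistered_members` are hypothetical names, and the `"gemini"` / `"ollama"` values are assumed.

```python
from enum import Enum


class AIProviderEnum(str, Enum):
    # These two values come from the commit message.
    OPENCLAW_NEMO = "openclaw_nemo"  # general inference
    NEMOTRON = "nemotron"            # Tool Calling
    # Assumed values for providers the commit only mentions by name.
    GEMINI = "gemini"
    OLLAMA = "ollama"


class LegacyProviderEnum(str, Enum):
    # Reproduces the pre-fix mismatch, for illustration only.
    NVIDIA = "nvidia"
    GEMINI = "gemini"
    OLLAMA = "ollama"


# Hypothetical stand-in for the provider names registered in the Registry.
REGISTERED_NAMES = {"openclaw_nemo", "nemotron", "gemini", "ollama"}


def unregistered_members(enum_cls, registered_names):
    """Return enum member names whose value has no provider in the registry."""
    return [m.name for m in enum_cls if m.value not in registered_names]
```

Running such a check in a unit test would turn the silent fallback into a test failure whenever an enum value and a provider name drift apart.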
|
2026-04-02 23:31:31 +08:00 |
|
OG T
|
a3f02888a1
|
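feat(ui): add a chibi lobster swimming strip + card-style homepage layout aligned with the design
CD Pipeline / build-and-deploy (push) Has been cancelled
- Add a lobster swimming animation strip at the top of the Metrics Strip
- Main Feed and Right Panel changed to rounded cards (white background / shadow)
- Section headers get an orange dot decoration, matching the figma-v2 design
- All data wired to real APIs; no fake data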
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:31:01 +08:00 |
|
OG T
|
ef5b1ab85a
|
fix(knowledge-base): use NEXT_PUBLIC_API_URL instead of relative paths
CD Pipeline / build-and-deploy (push) Successful in 7m6s
- /api/v1/knowledge is now prefixed with process.env.NEXT_PUBLIC_API_URL
- Ensures requests reach the backend API after a Docker build instead of hitting the Next.js app server
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:19:14 +08:00 |
|
OG T
|
2d87eca5f6
|
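fix(ci): remove the e2e-health push trigger, a root fix for the "two runs per commit" problem
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause:
cd.yaml + e2e-health.yaml both listen for pushes to main
→ every push produces two runs that cancel each other, so code commits get skipped
Solution:
e2e-health.yaml drops the push trigger and keeps only the schedule (daily at 00:00) and the manual trigger
CD already has its own smoke test, so E2E does not need to rerun on every push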
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 23:17:31 +08:00 |
|
OG T
|
cde61b06ae
|
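fix(ci): switch CD to preemptive mode with cancel-in-progress: true
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Successful in 17s
Problem: runs pile up in the queue when several commits are pushed quickly; a stuck docker build blocks the whole queue
Root cause: cancel-in-progress:false makes every commit wait in the queue, and newer runs cannot cancel older ones
Fix: cancel-in-progress:true, so a new push immediately cancels the old build and only the latest commit is deployed
Safety: the concurrency group guarantees only one job runs at a time; kubectl rollout status guards against half-finished deployments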
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:16:24 +08:00 |
|
OG T
|
1e1d7e34cd
|
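fix(ci): add timeout-minutes:45 to keep the CD job from hanging indefinitely
CD Pipeline / build-and-deploy (push) Waiting to run
E2E Health Check / e2e-health (push) Successful in 18s
Problem: task 288 was stuck for 71 minutes (docker build/push Harbor network issue)
Impact: subsequent tasks were queued and could not run
Fix: the job fails automatically after 45 minutes; the next push re-triggers it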
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:15:05 +08:00 |
|
OG T
|
58002e6bf4
|
feat(phase24-b3): extract NemotronProvider + refactor incident-card
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Phase 24 B3:
- Add ai_providers/nemotron.py: NemotronProvider wraps K8s Tool Calling
Moved from openclaw.py _call_nemotron_tools (L1623-1785)
capabilities=tool_calling, privacy_level=cloud
- ai_router.py: register NemotronProvider in the Registry
- ai_providers/__init__.py: export NemotronProvider
Phase R-UI2 (architect Warning):
- incident-card.tsx: extract the useApprovalAction hook
60 lines of duplicated handleApprove/handleReject logic → shared hook
Behavior is unchanged; maintainability improves
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 23:12:42 +08:00 |
|
OG T
|
5a8aae89c4
|
fix(phase24): Chief Architect review fixes C1/C2/C3/I4
CD Pipeline / build-and-deploy (push) Successful in 7m12s
E2E Health Check / e2e-health (push) Successful in 18s
C1 (P0): add a Langfuse Trace to AIRouterExecutor.execute() (D5)
- Create langfuse_trace("ai_router_execute") wrapping the entire execution chain
- On success, record a generation (model/input/output/tokens/cost)
- All AI calls in prod now have LLMOps tracing
C2 (P0): the Strangler now calls AIRouter.route() for smart routing
- First obtain a RoutingDecision (intent classification + complexity scoring)
- provider_order is built dynamically from selected_provider + fallback_chain
- D1 intent routing matrix and D7 privacy protection (DIAGNOSE forced to local) now take effect
C3 (P1): type annotation typo fixes
- AIProviderEnumEnum → AIProviderEnum
- AIProviderEnumProtocol → AIProviderProtocol
I4 (P1): add the close() definition to the AIProvider Protocol in interfaces.py
S1: ai_router.py module version header updated to v4.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
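For C2, building provider_order from selected_provider + fallback_chain might look like the sketch below. The `RoutingDecision` dataclass and `build_provider_order` helper are hypothetical; only the two field names come from the commit message, and the de-duplication step is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingDecision:
    # Field names mirror the commit message; the real class may carry more.
    selected_provider: str
    fallback_chain: list = field(default_factory=list)


def build_provider_order(decision):
    """Selected provider first, then fallbacks, skipping duplicates (assumed)."""
    order = []
    for name in [decision.selected_provider, *decision.fallback_chain]:
        if name not in order:
            order.append(name)
    return order
```

Keeping the selected provider at index 0 means the executor can simply walk the list until one provider succeeds.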
|
2026-04-02 21:47:06 +08:00 |
|
OG T
|
9d00b0389e
|
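fix(ci): CD path filter, so only changes under apps/k8s/workflows trigger a deployment
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Problem: docs/memory/ADR commits also triggered CD, crowding out the runs for code commits
and leaving the live version (28bd06d) 6 commits behind main (2d5f1a7)
Solution: a push paths filter that excludes paths which do not affect deployment
The workflow_dispatch manual trigger is always available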
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 21:43:27 +08:00 |
|
OG T
|
2d5f1a71ad
|
chore(observability): ClickHouse TTL configured, Phase O fully accepted
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
signoz_logs: 30 days (built-in _retention_days DEFAULT 30)
signoz_metrics, 8 tables: 233280000s (2700 days) → 7776000s (90 days)
- samples_v4, samples_v4_agg_5m, samples_v4_agg_30m
- exp_hist, time_series_v4, time_series_v4_6hrs
- time_series_v4_1day, time_series_v4_1week
Phase O acceptance checklist fully checked ✅
Co-Authored-By: Claude Code <noreply@anthropic.com>
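The TTL figures above can be sanity-checked with a few lines of arithmetic. `ttl_days` is a hypothetical helper used only for this check.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def ttl_days(ttl_seconds):
    """Convert a TTL expressed in seconds to days."""
    return ttl_seconds / SECONDS_PER_DAY


old_days = ttl_days(233_280_000)  # previous signoz_metrics TTL
new_days = ttl_days(7_776_000)    # new signoz_metrics TTL
```

Both values divide evenly by 86400, matching the 2700-day and 90-day figures in the commit message.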
|
2026-04-02 21:38:39 +08:00 |
|
OG T
|
ba4ee46514
|
fix(ui): architect review fixes for i18n/keyframes/types/layout
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Critical:
- flow-pipeline.tsx: remove 4 duplicated lobster-bob keyframes and inject them once in the parent component
Fix the isResolved routing logic, preserving the severity visual identity (a resolved P0 still uses StyleA)
- incident-card.tsx: fix 4 hardcoded Chinese strings (affectedServices/signalCount/statusLabel/aiProposal)
Add the matching i18n keys to zh-TW.json + en.json
Warning:
- page.tsx: MetricItem type hoisted to module scope; null-safety check for pendingApprovals
Metrics Strip: fixed height:68px replaced with auto + padding:8px
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:36:51 +08:00 |
|
OG T
|
08f73dfce8
|
docs: Phase O-5 Wave 5.4 alert chain E2E verification Runbook
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- Architecture diagram, manual test steps, smoke test checklist
- generate_monitoring.py usage notes
- Known-issue exemption list, rollback commands
- First acceptance record: 2026-04-02, 8/8
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:34:43 +08:00 |
|
OG T
|
234f7febd0
|
feat(ci): Phase O-5 Wave C.2: add a monitoring coverage check step
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
- cd.yaml: add a Monitoring Coverage Check step (generate_monitoring.py --check)
- continue-on-error: true, so it does not block deployment
- Telegram notification now includes the 📊 Monitoring coverage status
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:33:59 +08:00 |
|
OG T
|
827923b9b9
|
feat(monitoring): Phase O-5 Wave C.1: generate_monitoring.py auto-discovery
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- Query the Prometheus targets API for the full scrape status
- Coverage computed over 10 expected services (threshold 70%)
- Exemption list for known DOWN targets (does not affect the health verdict)
- --json machine-readable output / --check CI mode (exit 1 if coverage < threshold)
- First run: 100% coverage, no real issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
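The coverage calculation and the --check exit code could be sketched as follows. This is not the actual generate_monitoring.py: the function names are hypothetical, and treating exempt services as removed from the expected set is an assumption about how the exemption list works.

```python
def coverage_report(expected, up_targets, exempt_down=(), threshold=0.70):
    """Compute scrape coverage over the expected services.

    Assumption: services on the exemption list are dropped from the
    denominator, so a known-DOWN target cannot lower the coverage.
    """
    considered = set(expected) - set(exempt_down)
    covered = considered & set(up_targets)
    coverage = len(covered) / len(considered) if considered else 0.0
    return {
        "coverage": coverage,
        "missing": sorted(considered - covered),
        "ok": coverage >= threshold,
    }


def check_exit_code(report):
    """CI mode: exit code 1 when coverage is below the threshold, else 0."""
    return 0 if report["ok"] else 1
```

A CI step would call `sys.exit(check_exit_code(report))`; with `continue-on-error: true` a non-zero code is reported but does not block the deployment.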
|
2026-04-02 21:33:28 +08:00 |
|
OG T
|
28bd06d7b3
|
feat(homepage): Metrics Strip visual enhancement for 7 metrics + real data wiring
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
- Add podHealth/allRunning i18n keys (zh-TW + en)
- Metrics Strip: all 6 metrics wired to real APIs
- Active incidents: incidents count + P0 badge
- Service health: dashboard services healthy/total + RPS sparkline
- Pending approvals: dashboard pendingApprovals + orange badge
- Auto-remediation rate: incidents resolved rate + error rate sparkline
- Average MTTR: incidents resolved avg duration
- Pod health: dashboard services up/total + color status
- Right panel fixed at 530px width (55/45 ratio)
- No fake data: show "--" when no API data is available
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:27:59 +08:00 |
|
OG T
|
48c65756da
|
chore(config): write USE_AI_ROUTER=true into the ConfigMap (Phase 24 B2)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Prevents the next CD deploy from overwriting the setting applied via kubectl set env.
B2 observation period: 48h, ending 2026-04-04 18:40 Taipei time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:26:53 +08:00 |
|
OG T
|
3f339110dd
|
fix(observability): sync the actual .188 deployment adjustments back to the repo
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
Differences from the original plan:
1. MinIO Bearer Token authentication
- Original plan: MINIO_PROMETHEUS_AUTH_TYPE=public (not supported in this version)
- Actual: mc admin prometheus generate produces a Bearer Token
- Update: add bearer_token to prometheus-config-phase-o.yaml
2. remote_write dropped → OTEL Collector Prometheus scrape
- Original plan: Prometheus remote_write → SigNoz OTEL /api/v1/write
- Actual: the SigNoz OTEL Collector does not support the Prometheus remote_write format (404)
- Replacement: the OTEL Collector prometheus receiver scrapes node-exporter + kube-state-metrics directly
- Added: ops/signoz/otel-collector-config-phase-o.yaml (version-controlled copy)
3. ADR-053 acceptance checklist updated with the actual results
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 21:23:47 +08:00 |
|
OG T
|
93e3aa6811
|
feat(ui): four severity pipeline animations + WoooClaw rename
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- flow-pipeline.tsx: add a severity prop with four pipeline styles
- P0 → Style A: pulse wave + flowing light effect (#cc2200)
- P1 → Style B: progress bar, with the lobster standing at the progress endpoint (#F59E0B)
- P2 → Style C: card steps, with the lobster floating above the active card (#4A90D9)
- P3 → Style D: timeline with a flowing dashed-line animation (#22C55E)
- incident-card.tsx: pass severity={sev} to FlowPipeline
- openclaw-panel.tsx: NemoClaw→WoooClaw, OpenClaw Pipeline→WoooClaw Pipeline
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:18:22 +08:00 |
|
OG T
|
04978995c1
|
fix(metrics): actually call record_alert_chain_success (Wave A.5)
CD Pipeline / build-and-deploy (push) Successful in 6m47s
E2E Health Check / e2e-health (push) Successful in 17s
The alert_chain_last_success_timestamp metric was defined but never set.
Call record_alert_chain_success() on the two main success paths in alertmanager_webhook:
- After a CI/CD alert is handled successfully
- After LLM analysis completes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 20:10:58 +08:00 |
|
OG T
|
f5b8738185
|
fix(wave-a): Wave A alert chain acceptance fixes
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
- sentry_webhook: add a GET /health endpoint (for smoke test probing)
- smoke_test: alertmanager path changed to /webhooks/health (already exists)
- smoke_test: Prometheus URL corrected to 110:9090
- smoke_test: Alert chain metric marked critical=False (expected during initialization)
Wave A.6 smoke test now goes from 6/8 to 7/8 checks passing (8/8 once sentry health is deployed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
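The effect of marking a check critical=False can be illustrated with a small summary routine. This is a sketch only: `CheckResult` and `summarize` are hypothetical names, and the rule that only critical failures make the run unhealthy is an assumption about the smoke test's design.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    critical: bool = True  # non-critical failures are reported but tolerated


def summarize(results):
    """Count passing checks and decide overall health from critical ones only."""
    passed = sum(1 for r in results if r.ok)
    critical_failures = [r.name for r in results if not r.ok and r.critical]
    return {
        "score": f"{passed}/{len(results)}",
        "healthy": not critical_failures,
        "critical_failures": critical_failures,
    }
```

Under this rule a run can score 7/8 and still count as healthy when the only failing check is the non-critical alert chain metric.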
|
2026-04-02 20:08:26 +08:00 |
|
OG T
|
5a7919f55c
|
fix(test): AIProvider → AIProviderEnum (Phase 24 C1 rename fix)
CD Pipeline / build-and-deploy (push) Successful in 7m11s
E2E Health Check / e2e-health (push) Successful in 16s
The C1 fix (3ad7b60) renamed the AIProvider Enum to AIProviderEnum, but
test_nvidia_provider.py was not updated to match, which made the CD tests fail.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 19:38:04 +08:00 |
|
OG T
|
9afb518ea6
|
fix(ui): fix incident card overflow + incorrect field mapping for infrastructure data
CD Pipeline / build-and-deploy (push) Failing after 49s
E2E Health Check / e2e-health (push) Successful in 21s
- incident-card: the AI proposal button's width 100% + margin made it stick out past the right edge; changed to calc(100%-20px)
- page.tsx: useHosts() returns Host[] but it was passed directly to HostGrid, which expects HostInfo[];
added a mapper (name→hostname, metrics.cpu_percent→cpuPct, service.status→healthy)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 19:01:07 +08:00 |
|
OG T
|
9c01ed85a9
|
chore: trigger CD rebuild for Phase 24 (3e4612f not yet built)
CD Pipeline / build-and-deploy (push) Failing after 35s
E2E Health Check / e2e-health (push) Successful in 18s
|
2026-04-02 18:32:39 +08:00 |
|
OG T
|
3e4612f259
|
docs(observability): ADR-053 SigNoz unified architecture + Phase O acceptance
CD Pipeline / build-and-deploy (push) Failing after 36s
E2E Health Check / e2e-health (push) Successful in 16s
- Add ADR-053: decision record for the unified observability architecture
- Update service-registry.yaml: add the MinIO/Kali monitoring entries
- Update LOGBOOK: Phase O completion status
Phase O acceptance checklist:
✅ kubectl passwordless on the local Mac
✅ OTEL Collector 2 Pod Running
✅ Event Exporter 1 Pod Running
✅ Descheduler CronJob Completed
✅ MinIO + Kali alert rules
✅ Alert Chain Smoke Test
✅ CD Pipeline integration
⏳ ClickHouse TTL / remote_write / SigNoz rules (pending manual work on .188)
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 18:26:57 +08:00 |
|
OG T
|
d2b337430a
|
feat(cd): Phase O-4 Wave A wrap-up: Sentry Token injection + Alert Chain Smoke Test
CD Pipeline / build-and-deploy (push) Failing after 35s
E2E Health Check / e2e-health (push) Successful in 17s
Wave A.1: CD auto-injects SENTRY_AUTH_TOKEN into the K8s Secret
- kubectl patch runs automatically on every deploy (per the ADR-035 hard rule)
- A missing token warns rather than fails (graceful degradation)
Wave A.6 + B.2: Alert Chain Smoke Test
- scripts/alert_chain_smoke_test.py (new)
- Checks: API Health / Alert Chain Metric / 3 Webhooks /
SigNoz / OTEL Collector / Event Exporter
- Integrated into cd.yaml (Alert Chain Smoke Test step)
- continue-on-error: true (does not block deployment; the result is shown in TG)
- TG deploy notification adds an Alert Chain status field
Wave A.2/A.3/A.4: the SigNoz/Sentry code was already implemented on 2026-03-29
- signoz_webhook.py / sentry_webhook.py are both deployed
- SigNoz alert rules still need to be deployed manually to .188
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 18:22:13 +08:00 |
|
OG T
|
99be215e83
|
fix(monitoring): R1 review fixes: Blackbox DNS / PSA labels / alert thresholds
Critical: Blackbox Exporter replacement changed from K8s DNS to the host IP (192.168.0.188:9115)
Important: the Descheduler namespace explicitly declares PSA restricted labels
Suggestion: failedJobsHistoryLimit 3→1, add a MinioDiskUsageCritical 5% alert
R1 Review by: Chief Architect (Phase O-1)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 14:02:50 +08:00 |
|
OG T
|
41bf0681cf
|
feat(observability): Phase O-2/O-3 OTEL log pipeline + Event Exporter + Remote Write
O-2.1: OTEL Collector DaemonSet (filelog receiver)
- Collect Pod stdout/stderr from all K3s nodes → SigNoz ClickHouse
- CRI log parser (Go time layout for +08:00 timezone)
- filter processor excludes kube-system debug noise
- observability namespace is PSA privileged (the log directory requires root)
- Resource limits: 50m-200m CPU / 64-128Mi Memory
O-2.2: kubernetes-event-exporter
- K8s Event → structured JSON log → SigNoz
- Warning/Error events kept in full; high-frequency Normal events filtered out
- Resolves the critical blind spot that Events are retained for only ~1hr by default
O-3: Prometheus remote_write config template
- Allowlist: ~50 key metric series (node/container/kube/api/db)
- Goal: 90-day long-term storage in SigNoz ClickHouse
Deployed and verified: 3 Pods Running, 0 errors, filelog monitoring all namespaces normally
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 14:01:42 +08:00 |
|
OG T
|
1dd0ff8cf4
|
fix(cd): revert runs-on to ubuntu-latest (Gitea runner labels do not support self-hosted)
CD Pipeline / build-and-deploy (push) Failing after 43s
E2E Health Check / e2e-health (push) Successful in 19s
Root cause: the Gitea act_runner only has the ubuntu-latest/24.04/22.04 labels
After switching to self-hosted, no runner matched → CD failed silently
None of the Phase 24 code was deployed to K8s
Gitea ≠ GitHub: GitHub has a built-in self-hosted label
Gitea requires an explicit match against the labels the runner registered with
2026-04-02 ogt: root-cause fix for the CD failure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:59:58 +08:00 |
|
OG T
|
1ec342db0c
|
fix(web): Chief Architect review fixes (82/100 → Pass)
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
- Missed font migrations: host-grid (2 places), sidebar (1 place) → var(--font-body)
- time-series-chart tick → var(--font-mono) (chart axis labels intentionally stay monospace)
- Duplicate i18n key: removed incident.anomaly, kept incident.card.anomaly
- Site-wide inline fontFamily: 'monospace' reduced to zero
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:56:43 +08:00 |
|
OG T
|
f0f9cc87a1
|
fix(web): monitoring page QA fixes: NaN% + HostGrid + i18n
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
- HostGrid now consumes useHosts() SSE data (no longer passed an empty array)
- HealthSummary NaN% fix: show 0% instead of NaN% when total_count=0
- 8 hardcoded Chinese strings moved to i18n (normal/warning/abnormal/golden signals/host status/service list/table headers)
- Add monitoring namespace i18n keys (11 keys × 2 langs)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:55:29 +08:00 |
|
OG T
|
6ce82ff883
|
fix(k3s): Phase O-1 infrastructure fixes: Descheduler + MinIO/Kali monitoring
O-1.1: Descheduler securityContext fix (PodSecurity restricted compliance)
- Add pod securityContext (runAsNonRoot, runAsUser:65534, seccompProfile)
- Add container securityContext (allowPrivilegeEscalation:false, drop ALL)
- Complete RBAC: list permission on namespaces + replicasets
- Deployed and verified: CronJob ran successfully (Status: Completed)
O-1.3: MinIO Prometheus scrape config + alert rules
O-1.4: Kali Blackbox TCP probe + alert rules
- MinioDown, MinioDiskUsageHigh, MinioOfflineDisk
- KaliScannerDown
Pending manual deployment: Prometheus config → .188, kubectl kubeconfig → 120/121
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:55:26 +08:00 |
|
OG T
|
95343de782
|
chore: trigger CD (Phase 24 review fixes already pushed)
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-04-02 13:52:23 +08:00 |
|
OG T
|
51961b9f03
|
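docs: design spec for the Phase O observability final completion plan
SigNoz-unified architecture, addressing 6 major blind spots (Event/Log/Metrics/Descheduler/kubectl/MinIO-Kali)
+ Monitoring Master Plan Wave A-D wrap-up
+ 5 Chief Architect review checkpoints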
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:45:23 +08:00 |
|
OG T
|
3ad7b60f68
|
fix(ai): Phase 24 R1+R2 Chief Architect review fixes (C1-C3 + I1-I5)
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
Critical fixes:
- C1: AIProvider Enum renamed to AIProviderEnum (avoids a name clash with the Protocol)
- C2: shared Circuit Breaker → per-provider _SimpleCircuitBreaker
(prevents Ollama from being blocked when Gemini goes down)
- C3: cache_key moved outside the try block (avoids UnboundLocalError)
Important fixes:
- I1: Claude hardcoded model → use get_model_registry()
- I2: Claude tracks tokens/cost (input_tokens + output_tokens)
- I3: Ollama tracks tokens (eval_count + prompt_eval_count)
- I4: Gemini temperature → use model_registry
- I5: AIProviderRegistry.close_all() shutdown hook
2026-04-02 ogt: Phase 24 fixes applied after the Chief Architect review passed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
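The C2 change (one breaker per provider instead of a shared one) can be sketched as below. This is an illustrative stand-in, not the project's _SimpleCircuitBreaker: the class names, thresholds, and reset behavior are all assumptions.

```python
import time


class SimpleCircuitBreaker:
    """Opens after N consecutive failures; allows calls again after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allow(self):
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self._reset_after:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_failure(self):
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()

    def record_success(self):
        self._failures = 0
        self._opened_at = None


class BreakerRegistry:
    """Hands out one breaker per provider name, so failures stay isolated."""

    def __init__(self, **breaker_kwargs):
        self._kwargs = breaker_kwargs
        self._breakers = {}

    def for_provider(self, name):
        if name not in self._breakers:
            self._breakers[name] = SimpleCircuitBreaker(**self._kwargs)
        return self._breakers[name]
```

With a shared breaker, repeated Gemini failures would open the circuit for every provider; keyed by provider name, only the failing provider is skipped.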
|
2026-04-02 13:40:58 +08:00 |
|
OG T
|
1f174e1268
|
fix(web): full homepage QA fixes: hosts data + incident title + i18n + fonts
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
- HostGrid now consumes useHosts() SSE data (no longer passed an empty array)
- IncidentCard title changed from description ?? '--' to decision.action ?? services + the "anomaly" label
- 6 hardcoded Chinese strings moved to i18n (active incidents/loading/system stable/OpenClaw cognitive engine/infrastructure)
- fontFamily: Inter/monospace → var(--font-body), all occurrences replaced
- Add dashboard.openclawEngine / infrastructure i18n keys
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:33:48 +08:00 |
|
OG T
|
1628f659e3
|
fix(web): tDashboard is not defined; add the missing useTranslations('dashboard')
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
A ReferenceError put the web pod into a crash loop.
page.tsx called tDashboard() without declaring it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:17:32 +08:00 |
|
OG T
|
73e8f8ab77
|
feat(ai): Phase 24-A+B1: AI Provider Registry + Strangler wrapper (ADR-052)
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Brain Layer dual-track Registry architecture:
- New src/services/ai_providers/ directory (interfaces + 4 providers)
- OllamaProvider (local, rca/chat/code_review)
- GeminiProvider (cloud, rca/chat)
- ClaudeProvider (cloud, rca/chat/code_review)
- OpenClawNemoProvider (cloud, rca; delegates 188→NIM)
- Extend ai_router.py with:
- AIProviderRegistry (dynamic registration / enable and disable)
- AIRouterExecutor (Cache + gates CB/RL/Sem + execution)
- openclaw.py Strangler wrapper: USE_AI_ROUTER=true takes the new path
- config.py + ConfigMap add USE_AI_ROUTER=false (safe default)
- ADR-052 formal document (14 decisions, D1-D14)
- HARD_RULES v1.7 adds the AI Router rules
Safety: USE_AI_ROUTER=false by default, so the new path stays off until enabled manually for observation
Rollback: kubectl set env deployment/awoooi-api USE_AI_ROUTER=false
2026-04-02 ogt: Phase 24 first batch of implementation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
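The Strangler wrapper gated by USE_AI_ROUTER might be sketched like this. Only the environment variable name and its safe default come from the commit message; `use_ai_router`, `analyze`, and the two callables are hypothetical.

```python
import os


def use_ai_router(env=None):
    """Read the feature flag; anything other than 'true' keeps the legacy path."""
    env = os.environ if env is None else env
    return env.get("USE_AI_ROUTER", "false").strip().lower() == "true"


def analyze(prompt, legacy_call, router_call, env=None):
    """Strangler wrapper: route to the new AI Router only when the flag is on."""
    if use_ai_router(env):
        return router_call(prompt)
    return legacy_call(prompt)
```

Defaulting to the legacy path means a missing or misspelled flag can never enable the new code by accident, and the rollback command above only has to flip one variable.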
|
2026-04-02 13:16:09 +08:00 |
|
OG T
|
1123eb4107
|
feat(web): Metrics Strip auto-remediation rate + real MTTR calculation
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
- autoRemediationRate: resolved+closed / total incidents
- mttrAvg: average of (updated_at - created_at), shown in minutes/hours
- Replaces the previous static '--' values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
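The two formulas above can be written out as a short sketch. It is expressed in Python for illustration only; the real frontend code is not shown in this log. The dict field names follow the commit message, while restricting MTTR to resolved/closed incidents and the display format are assumptions.

```python
from datetime import datetime

DONE_STATUSES = {"resolved", "closed"}


def auto_remediation_rate(incidents):
    """(resolved + closed) / total incidents; None when there is no data."""
    if not incidents:
        return None
    done = sum(1 for i in incidents if i["status"] in DONE_STATUSES)
    return done / len(incidents)


def mttr_avg_minutes(incidents):
    """Average (updated_at - created_at) in minutes over finished incidents."""
    durations = [
        (i["updated_at"] - i["created_at"]).total_seconds() / 60
        for i in incidents
        if i["status"] in DONE_STATUSES
    ]
    if not durations:
        return None
    return sum(durations) / len(durations)


def format_duration(minutes):
    """Show minutes below one hour, hours otherwise, and '--' with no data."""
    if minutes is None:
        return "--"
    if minutes < 60:
        return f"{round(minutes)}m"
    return f"{minutes / 60:.1f}h"
```

Returning None for the empty case keeps the "--" placeholder as the single way to say "no API data".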
|
2026-04-02 13:03:20 +08:00 |
|
OG T
|
05cd9cbab4
|
fix(web): fix 6 issues from the acceptance report
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
1. [Medium] Metrics Strip [object Object]: stop rendering the pendingApprovals array directly
+ hardcoded labels moved to i18n (activeIncidents/serviceHealth/todayIncidents, etc.)
2. [Low] KB GET /{id} did not filter archived entries: get_by_id adds status != ARCHIVED
3. [Low] favicon.ico 404: add a NemoClaw SVG favicon + layout metadata
4. [Medium] auto-repair console errors: fetchEval wrapped in try-catch and handled silently
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 10:30:43 +08:00 |
|
OG T
|
db2a2852b8
|
docs: frontend refactor acceptance report, 87/100
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Playwright browser screenshots + KB API endpoint tests + console analysis
- 24/24 routes, zero 404s
- 7 complete pages + 15 ComingSoon
- All 7 KB API endpoints working
- 1 Low bug (archived entry still accessible via GET)
- Metrics Strip [object Object] still to be fixed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 10:20:27 +08:00 |
|
OG T
|
25889d4b8e
|
docs: archive the ADR-050 reanalyze implementation plan (completed)
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:38:03 +08:00 |
|
OG T
|
4d46e6b9a7
|
style(web): site-wide font-mono → font-body (apply the DM Mono design system)
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
45 components + 6 pages migrated from the legacy font-mono to
font-body (DM Mono) for design system consistency.
font-body = DM Mono (monospace); it looks the same but goes through the new design token.
Kept: font-heading (Syne), font-dot-matrix (VT323/DSEG7)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:37:03 +08:00 |
|
OG T
|
db1aed81d9
|
fix(db): C1 timezone unification migration: utc_now → taipei_now (all 5 tables)
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
🔴 Chief Architect review C1: UTC is banned system-wide; Taipei time (+8) is mandatory
- utc_now() → taipei_now() (calls src.utils.timezone.now_taipei)
- Affects: ApprovalRecord, TimelineEvent, AuditLog, IncidentRecord, KnowledgeEntryRecord
- All 13 default/onupdate occurrences replaced
- Removed the datetime.UTC import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
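A timezone-aware Taipei clock of the kind this migration relies on could look like the sketch below. It is not the project's src.utils.timezone module: the fixed +08:00 offset (Taiwan currently observes no DST) and the `to_taipei` helper are assumptions made to keep the example dependency-free.

```python
from datetime import datetime, timedelta, timezone

# Fixed +08:00 offset; avoids needing the tz database for this illustration.
TAIPEI = timezone(timedelta(hours=8), name="Asia/Taipei")


def now_taipei():
    """Return the current time as a timezone-aware datetime in Taipei time."""
    return datetime.now(TAIPEI)


def to_taipei(dt):
    """Convert any timezone-aware datetime to Taipei time."""
    return dt.astimezone(TAIPEI)
```

Passing the function itself (not its result) as an ORM column default ensures the timestamp is taken at insert time rather than at import time.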
|
2026-04-02 09:13:36 +08:00 |
|
OG T
|
628387de8c
|
fix: automate the risklevel migration + inject the Telegram whitelist
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
1. init_db() ensures at startup that the risklevel enum includes the 'high' value
(added in Phase 23; avoids InvalidTextRepresentation on old DBs that lack the value)
2. CD Pipeline now auto-injects OPENCLAW_TG_USER_WHITELIST
(previously CHANGE_ME; updated to the actual user ID 5619078117)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:13:13 +08:00 |
|
OG T
|
3ecfe7b3f5
|
chore: clean up NemoNodeAnimation leftovers + fix the Migration YAML
E2E Health Check / e2e-health (push) Successful in 19s
CD Pipeline / build-and-deploy (push) Has been cancelled
- Remove nemo-node-animation.tsx (unreferenced; superseded by NemoClaw)
- Migration YAML: fix $$ being interpreted by the shell inside a YAML heredoc
by switching to the single-quoted string DO '' syntax
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:09:25 +08:00 |
|
OG T
|
d2bad44173
|
fix(api): KB architecture review fixes I3-I5
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline / build-and-deploy (push) Has been cancelled
- I3: Service layer adds the IKnowledgeRepository Protocol type annotation
- I4: search method adds tags JSONB search (cast→String→ilike)
- I5: get_categories is now a standalone method and no longer detours through list_entries(limit=0)
Chief Architect review 87/100 → all Important issues fixed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:05:54 +08:00 |
|
OG T
|
48a0bc66f7
|
fix(api): KB Chief Architect review fixes (I1 tags filter + I2 type annotation)
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
- I1: Repository list_entries implements tags JSONB @> filtering (previously declared but not implemented)
- I2: ORM tags type corrected from list[dict[str, Any]] to list[str]
Chief Architect review: 87/100
C1 timezone (UTC→Taipei) is a pre-existing systemic issue; a separate task will handle the unified migration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:04:41 +08:00 |
|