OG T
|
e59f8181b3
|
fix(telegram): 歷史按鈕改從 AnomalyCounter(Redis) 讀頻率,修復永遠顯示「無頻率統計資料」
CD Pipeline / build-and-deploy (push) Failing after 1m45s
根本原因: frequency_stats 從未持久化到 DB,get_by_id() 回傳永遠是 None
修復: 用 AnomalyCounter.derive_key_from_incident() 推導 anomaly_key,
直接從 Redis 查即時頻率與處置分佈統計
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:56:23 +08:00 |
|
OG T
|
e2c6ca598e
|
fix(approval_db): update_telegram_message 用 raw SQL + CAST BIGINT 避免 int32 overflow
CD Pipeline / build-and-deploy (push) Has been cancelled
telegram_chat_id 為 BIGINT (5619078117 > 2^31-1),SQLAlchemy ORM 會推斷為 $N::INTEGER
改用 raw SQL + CAST(:telegram_chat_id AS BIGINT) 繞過型別推斷
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:53:50 +08:00 |
|
OG T
|
dbb8104557
|
fix(drift): kubectl not found + RBAC services/configmaps/ingresses
CD Pipeline / build-and-deploy (push) Has been cancelled
drift_detector 用 kubectl 比對 Git YAML vs K8s 實際狀態,但:
1. API image 沒有 kubectl binary → No such file or directory: 'kubectl'
2. awoooi-executor ClusterRole 缺少 services/configmaps/ingresses list 權限
修復:
- Dockerfile: apt install curl + download kubectl v1.29.0 amd64
- 07-rbac.yaml: 加入 services/configmaps (core) + ingresses (networking.k8s.io) get/list/watch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:49:56 +08:00 |
|
OG T
|
0571ad15d5
|
fix(signoz_webhook): AIDataImpact.value 大寫 → .lower() 轉 DataImpact
CD Pipeline / build-and-deploy (push) Has been cancelled
AIDataImpact enum value 為 'NONE'/'READ_ONLY' 等大寫,
DataImpact enum value 為 'none'/'read_only' 等小寫,
轉換時補 .lower() 避免 ValueError。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:38:29 +08:00 |
|
OG T
|
f8c6dfc642
|
feat(web): Header ⌘K 搜尋提示按鈕 + sensor service file 補齊
CD Pipeline / build-and-deploy (push) Has started running
Header:
- 新增 ⌘K 入口按鈕(搜尋圖示 + "搜尋..." + ⌘K badge)
- 點擊觸發 window keydown(meta+k) 開啟 CommandPalette
- hover 變藍(UX 提示)
Sensor:
- 補齊 apps/sensor/awoooi-sensor.service(PYTHONUNBUFFERED=1 + --loop)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:29:15 +08:00 |
|
OG T
|
c132fd423a
|
fix(drift): COPY k8s/ 進 API image — drift_detector Git state 比對
CD Pipeline / build-and-deploy (push) Has been cancelled
drift_detector 的 GitStateReader 需要讀 k8s/*.yaml 來比對 K8s 實際狀態,
但 API Pod 沒有此目錄導致 k8s_dir_not_found,掃描結果永遠為空。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:23:54 +08:00 |
|
OG T
|
5d591c4639
|
fix(drift_repository): CAST(:param AS jsonb) 取代 :param::jsonb
CD Pipeline / build-and-deploy (push) Has been cancelled
asyncpg 不支援 named param 混用 :: cast 語法,導致 PostgresSyntaxError。
改用 CAST() 函數語法,與 SQLAlchemy text() named params 相容。
影響: drift_reports 現在可正常寫入 DB
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:22:43 +08:00 |
|
OG T
|
25412807f5
|
docs(logbook): B1 Sensor Agent 兩台主機部署完成
|
2026-04-10 00:16:45 +08:00 |
|
OG T
|
7e498621e0
|
fix(signoz_webhook): AIBlastRadius → BlastRadius 型別轉換
CD Pipeline / build-and-deploy (push) Has been cancelled
blast_radius 欄位傳入 AIBlastRadius 物件導致 Pydantic validation error,
approval 無法存進 DB(Telegram 仍送出但無法批准)。
修法:明確轉換 AIBlastRadius → BlastRadius,data_impact enum 用 .value 橋接。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:15:40 +08:00 |
|
OG T
|
3fa377cce9
|
fix(web): en.json 多餘的右括號導致 webpack JSON parse 失敗
CD Pipeline / build-and-deploy (push) Has been cancelled
position 41700 附近有雙重 }} 結尾,移除多餘的一個。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:08:04 +08:00 |
|
OG T
|
c803e94370
|
docs(logbook): Sprint 5R Phase 3 閉環記錄
|
2026-04-10 00:04:46 +08:00 |
|
OG T
|
524423577a
|
feat(web): 基礎架構主機卡點擊 → 詳情抽屜展開
CD Pipeline / build-and-deploy (push) Failing after 3m57s
E2E Health Check / e2e-health (push) Successful in 35s
點擊主機卡展開行內抽屜:
- CPU/RAM 大字顯示(含顏色警示:>80% 紅/>60% 橙)
- 完整服務清單(狀態點 + port + latency_ms)
- 相關事件(按 affected_services 過濾)
- ✕ 關閉 / 再點同卡收合
- 選中狀態:藍色邊框高亮
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:49:00 +08:00 |
|
OG T
|
2897007014
|
fix(web): 修復 webpack build 錯誤 — 重複 flexShrink + firing_count undefined
CD Pipeline / build-and-deploy (push) Has been cancelled
header.tsx: 移除重複的 flexShrink: 0 屬性 (TS1117)
classic/page.tsx: firing_count ?? 0 處理 undefined (TS2322)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:45:52 +08:00 |
|
OG T
|
df0afa654f
|
feat(soul): SOUL.md + capabilities.json v5.0 → v5.5
- AI fallback: ollama_tool→openclaw_nemo→gemini→nvidia (ADR-052)
- Phase 25 能力:Config Drift Detection / Auto-Harvesting / Sensor Agent
- ADR-059 K8s ClusterIP override 文件化
- Telegram dedup TTL=600s + model name 顯示
- Discord 移除(已停用)
- capabilities.json: llama3.1:8b / DB 10 / stream key awoooi:signals
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:40:40 +08:00 |
|
OG T
|
a303b5ef91
|
feat(chat): NemoClaw 改接 Ollama 111 deepseek-r1:14b
CD Pipeline / build-and-deploy (push) Failing after 4m6s
2026-04-09 ogt: 棄用 Claude Haiku,改用本地 deepseek-r1:14b
- 端點: http://192.168.0.111:11434
- 過濾 <think>...</think> 推理區塊,只回傳結論
- timeout 120s(14b 推理較慢)
- 完全免費,不計入 Claude API 費用
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:38:57 +08:00 |
|
OG T
|
62cb274735
|
feat(host_aggregator+k8s): 新增 121 K3s Worker 主機監控
CD Pipeline / build-and-deploy (push) Has been cancelled
HOST_CONFIGS 加入 192.168.0.121(K3s Worker):
- K3s API tcp:6443
- awoooi-api NodePort tcp:32334
- awoooi-web NodePort tcp:32335
NetworkPolicy 補開 121 egress: 6443/32334/32335
NodePort 服務實際在 121(mon1),非 120(mon)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:36:36 +08:00 |
|
OG T
|
2bc2a2f174
|
test(integration): drift API + DB 持久化整合測試
CD Pipeline / build-and-deploy (push) Has been cancelled
覆蓋 GET /drift/reports、POST /drift/internal/scan
驗證掃描後 DB 有新資料(B5 整合測試框架擴充)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:36:17 +08:00 |
|
OG T
|
fc9d0f9c1f
|
fix(host_aggregator): total=1 時 total//2=0 導致服務全 up 仍顯示 unhealthy
CD Pipeline / build-and-deploy (push) Has been cancelled
112(Kali) 和 120(K3s) 各只有 1 個服務,down_count=0 >= total//2=0
永遠成立 → 永遠 unhealthy。改為 total > 1 才套用 >=half 門檻。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:35:37 +08:00 |
|
OG T
|
d324dd7aed
|
fix(telegram): 移除所有告警訊息欄位截斷限制,放寬至 Telegram 4096 字元硬限
CD Pipeline / build-and-deploy (push) Has been cancelled
問題:root_cause[:50/100]、suggested_action[:80]、suggestion[:50]、
note[:150]、fix_action[:100]、impact[:150]、hypothesis[:200]
以及 message[:900]/[:1000] 導致告警內容顯示不完整。
修復:移除欄位截斷,整體上限改為 4096(Telegram API 硬限制)。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:32:51 +08:00 |
|
OG T
|
31d45f0c99
|
feat(sensor): Phase 5.5 B1 Sensor Agent v2.0 — 三層真實採集
CD Pipeline / build-and-deploy (push) Has been cancelled
- NodeMetricsCollector: node-exporter CPU/Mem/Disk/Load 閾值告警
- JournalCollector: systemd journal ERROR/OOM/KernelPanic 偵測
- ServiceProbeCollector: TCP port 存活探測 (188: PG/Redis/Ollama/Nginx/SigNoz, 110: Harbor/Gitea)
- 10分鐘 fingerprint dedup (Redis sensor:dedup:{fp})
- 正確 Stream key: awoooi:signals DB10 (ADR-038)
- HOST_CONFIGS 自動 IP 偵測 (110/188)
- 已部署 cron @188/@110,E2E 驗證:sensor→stream→INC-20260409-2F1DD6
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:31:35 +08:00 |
|
OG T
|
eb46079b4a
|
fix(telegram): root_cause 顯示長度 50→100 字元,符合 SOUL.md 鐵律
CD Pipeline / build-and-deploy (push) Has been cancelled
SOUL.md 明定根因摘要上限 100 字元,但程式碼兩處 IncidentApprovalCard
均截在 [:50],導致告警卡片訊息被截斷。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:30:58 +08:00 |
|
OG T
|
89db96fc21
|
feat(web): ⌘K Command Palette — 全局指令面板 + 高斯模糊
CD Pipeline / build-and-deploy (push) Has been cancelled
- ⌘K (Mac) / Ctrl+K (其他) 開啟/關閉
- 高斯模糊背景 (backdrop-blur 8px + rgba overlay)
- 搜尋過濾:導航 9 頁 + 快速動作(開 Terminal)
- 鍵盤完整支援:↑↓ 選擇 / Enter 執行 / Esc 關閉
- 滑鼠 hover 同步 activeIdx
- 100% i18n (commandPalette namespace)
- Z-Index: DIALOG(70),掛載於 providers.tsx 全局層
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:28:36 +08:00 |
|
OG T
|
5b42bd34e6
|
docs(logbook): Sprint 5R Phase 2 完整閉環記錄
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:24:50 +08:00 |
|
OG T
|
764dcf24e9
|
fix(i18n): byAnomalyAutoRate 插值修正 + mttrUnit 單位改分鐘
CD Pipeline / build-and-deploy (push) Successful in 12m22s
byAnomalyAutoRate: "自動修復率" → "自動修復率 {pct}%" (缺少 {pct} 插值導致顯示原始 key)
mttrUnit: "秒" → "分鐘" (前端已做 /60 換算)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 23:11:02 +08:00 |
|
OG T
|
af7b6beba8
|
fix(web): Tab4 by_anomaly 欄位修正 — 適配真實 API 結構
CD Pipeline / build-and-deploy (push) Successful in 12m8s
by_anomaly 回傳結構為 {alert_name, anomaly_key, disposition:{total,auto_repair,auto_rate,...}}
修正:
- 排序依 disposition.total(非 count)
- 名稱顯示用 alert_name || anomaly_key
- auto_rate 取自 disposition.auto_rate * 100
- 計數取自 disposition.total
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:57:58 +08:00 |
|
OG T
|
ab5ba7062c
|
feat(web): Tab3 Chain-of-Thought 面板 + Tab4 by_anomaly Top5 + MTTR
CD Pipeline / build-and-deploy (push) Successful in 13m1s
Tab 3 ActivityStreamTab:
- 點擊 SSE 事件展開 COT 側面板(含 provider/confidence/latency/tools/reasoning)
- 有 proposal_data 的事件顯示 COT badge
- 點擊同一事件收合面板
Tab 4 DispositionTab:
- by_anomaly Top5 水平進度條(按 auto-repair 率著色:≥80% 綠/≥50% 橙/其他紅)
- MTTR 大字顯示(分鐘)+ 無資料時 fallback
i18n: cotTitle/cotReasoning/cotConfidence/cotProvider/cotLatency/cotTools/
cotClickHint/byAnomalyTitle/byAnomalyAutoRate/mttrTitle/mttrUnit/mttrNoData
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:42:02 +08:00 |
|
OG T
|
3bdac2e68e
|
docs(logbook): Sprint 5R 架構審查+QA全驗收閉環記錄
- 首席架構師審查 9 項修復完成
- CORS/sign/host_aggregator 修復
- QA 9/9 頁面通過,無假資料
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:33:55 +08:00 |
|
OG T
|
c92cdeea0f
|
feat(drift): B4 drift_reports DB 持久化 + CronJob 修復
CD Pipeline / build-and-deploy (push) Successful in 12m17s
- drift_repository.py: DriftReportRepository (save/get/list/update)
- drift.py router: 移除 in-memory dict,改用 DB repository
- drift-cronjob.yaml: 修正 SA/NetworkPolicy/NodePort 問題
- allow-intra-namespace NetworkPolicy (已套用至 prod)
- migrate-phase8/9: symptoms_hash + drift_reports migration Job YAML
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:28:55 +08:00 |
|
OG T
|
b1e207ffae
|
fix(host_aggregator): E2E驗證後修正 HOST_CONFIGS — Ollama位置+NodePort+Nginx
CD Pipeline / build-and-deploy (push) Has been cancelled
從 K3s Pod 內 Python socket 實測確認後修正:
- 110: 加 Prometheus(9090) Grafana(3002),移除 GH Runner(3000 refused)
- 112: 移除 SSH:22 (K3s Pod NetworkPolicy 未開)
- 120: 移除 awoooi NodePort(只在121不在120)
- 188: 移除 Ollama(在111非188) 和 Nginx:443(Pod內打不通)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:27:46 +08:00 |
|
OG T
|
c200d7a52d
|
fix(web+k8s): CSRF mismatch + NetworkPolicy 缺少監控端口
CD Pipeline / build-and-deploy (push) Successful in 12m19s
1. pending-approvals-card: 改為點擊時即時 fetch 新 CSRF token
避免多 useCSRF 實例互相覆蓋 cookie 導致 header/cookie 不一致
2. NetworkPolicy: 補開 110:3002(Grafana) 9090(Prometheus) 3001(Gitea)
修正 monitoring probe "All connection attempts failed"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 20:11:00 +08:00 |
|
OG T
|
21567a7a6d
|
fix(host_aggregator): 修正四台主機 probe 端點錯誤導致全部顯示 unhealthy
CD Pipeline / build-and-deploy (push) Successful in 12m1s
- 110: Harbor http→tcp(5000), Docker 2375→Gitea tcp(3001)
- 120: K3s 6443 https(401誤判)→tcp, 移除 Traefik 80(closed)
- 188: OpenClaw 8089→8088 (實際端口)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 19:52:34 +08:00 |
|
OG T
|
8c2983b70a
|
fix(api+web): CORS 補 K3s NodePort origins + sign 補 signer_id/name
CD Pipeline / build-and-deploy (push) Has been cancelled
CORS (config.py):
- 補 http://192.168.0.125:32335 (K3s VIP NodePort)
- 補 http://192.168.0.120:32335 + 121:32335 (K3s nodes)
- 修前: 內網瀏覽器開 :32335 打 API 全 CORS blocked
(incidents Failed to fetch / monitoring 無法連線根因)
sign body (pending-approvals-card.tsx):
- signer: 'web-ui' → signer_id: CURRENT_USER.id + signer_name: CURRENT_USER.name
- 修前: POST /approvals/{id}/sign 回 403 (缺必填欄位 422 誤報為 403)
— 實際是 422 Field required signer_id + signer_name
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 19:50:48 +08:00 |
|
OG T
|
34f0228d92
|
fix(executor): K8s ClusterIP 10.43.0.1 不可達 — 加 K8S_API_SERVER_URL 覆蓋 + migration job
CD Pipeline / build-and-deploy (push) Successful in 12m0s
問題: in-cluster config 讀到 10.43.0.1:443,但 K3s Pod 內 iptables/kube-proxy
沒把流量導到實際 API server,導致 Connection refused,批准後 kubectl 永遠失敗
修復:
- executor.py: load_incluster_config() 後讀 K8S_API_SERVER_URL env 覆蓋 host
- 04-configmap.yaml: 設 K8S_API_SERVER_URL=https://192.168.0.120:6443
- migrate-sprint5r-telegram-message-id.yaml: approval_records 新增兩欄 migration job
E2E 驗證: kubectl rollout restart deployment/awoooi-worker success=True ✅
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 19:10:27 +08:00 |
|
OG T
|
ebccb88278
|
fix(approval_db): 修復 incident_id 篩選查空 DB 欄位而非 JSON 導致執行斷路
CD Pipeline / build-and-deploy (push) Has been cancelled
get_all_approvals(incident_id=...) 原本在應用層過濾
a.metadata.get("incident_id"),但 ApprovalRecord.incident_id
是直接欄位,不在 extra_metadata JSON,導致永遠返回空列表,
Telegram 批准後出現 telegram_approval_not_found_by_incident,
審批從未實際執行。改為 .where(ApprovalRecord.incident_id == incident_id)
DB 層直接篩選,同時效能更佳。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 19:05:48 +08:00 |
|
OG T
|
9a8f410f23
|
fix(web): PendingApprovalsCard 批准/拒絕補 CSRF Token — 修復 403
CD Pipeline / build-and-deploy (push) Has been cancelled
根因: fetch 沒帶 X-CSRF-Token header + credentials:include
API 回 403 "CSRF token cookie missing"
修復: 加 useCSRF hook,sign/reject 請求帶 ...getHeaders() + credentials:include
與 incident-card.tsx / openclaw-state-machine.tsx 同一模式
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 19:00:02 +08:00 |
|
OG T
|
a2a98452ad
|
fix(web): 移除 AIModelStatus 假綠燈 — Gemini/NVIDIA 不應 assumed up
CD Pipeline / build-and-deploy (push) Has been cancelled
根因: /api/v1/health 的 components 只有 api/database/redis/ollama/openclaw
d.components.gemini 永遠 undefined → healthy: true 是硬編碼假數據
修復: 改為只有 components 有對應 key 才更新狀態
無 health 資料時保持 false(unknown),不顯示假綠燈
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:51:14 +08:00 |
|
OG T
|
a4d6b3f3e6
|
fix(review): 首席架構師+QA 修復 C1/P1/P2/I2/I3 — Sprint 5R Review 修正
CD Pipeline / build-and-deploy (push) Has been cancelled
C1/P1-1: DB migration — approval_records 新增 telegram_message_id/telegram_chat_id
- apps/api/migrations/sprint5r_telegram_message_id.sql (新增)
- apps/api/src/db/base.py: init_db() 加 ALTER TABLE ADD COLUMN IF NOT EXISTS
- k8s/jobs/migrate-sprint5r-telegram-message-id.yaml (追蹤)
P1-2: risk_map 補 "high" 鍵防止 LLM 回傳 high 時降為 MEDIUM
- apps/api/src/services/proposal_service.py:183
I2/M3: kubectl_command 回填補齊 delete_deployment/drain_node/cordon_node/delete_service
+ 抽取 _backfill_kubectl_command() helper 消除重複邏輯
- apps/api/src/services/openclaw.py
I3: _notify_approval_result 靜默 except 改為 logger.warning
- apps/api/src/services/telegram_gateway.py
P2-2: PendingApprovalsCard 審批動作加 loading/disabled 防止重複點擊
- apps/web/src/components/shared/pending-approvals-card.tsx
P2-3: SecurityPanel/CompliancePanel error 死碼修復 — catch() 補 setError()
- apps/web/src/components/panels/SecurityPanel.tsx (含 'Unresolved' i18n)
- apps/web/src/components/panels/CompliancePanel.tsx
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:38:10 +08:00 |
|
OG T
|
896bef94ee
|
fix(web): pending-approvals-card 加防重複點擊 + loading 狀態
linter 自動強化: actioningId state 防止同一張卡重複操作
- disabled + opacity 0.6 + cursor not-allowed
- loading 時按鈕顯示 '...'
- finally() 確保 actioningId 清除
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:38:08 +08:00 |
|
OG T
|
890e2a9568
|
fix(review): 架構審查修復 — P0 import crash + i18n 零 hardcode + 靜默錯誤
P0:
- proposal_service.py: 補 get_redis + INCIDENT_KEY_PREFIX import
(修前: resolve_incident_after_approval 必 NameError crash)
P1 i18n:
- page.tsx: 拓撲群組移除 emoji,改用 tTopo() i18n key
- page.tsx: 主機標籤 (DevOps金庫等) 改 tTopo() i18n
- ai-model-status.tsx: 加 useTranslations,AI 模型狀態 → t('aiModelStatus')
- disposition-mini.tsx: 查看完整報表 → t('viewAllReport')
- recent-activity.tsx: 查看活動串流 → t('viewAllAlerts')
P2 品質:
- pending-approvals-card.tsx: approve/reject 加 r.ok 檢查+錯誤顯示,查看全部授權加路由+i18n
- page-tabs.tsx: TabSkeleton 載入中... → t('loading')
- page.tsx: ↑5% → tDashboard('trendUp', {pct}) 動態值
- page.tsx: Prometheus '23' hardcode → '-- targets'
i18n 新增 key (zh-TW + en 同步):
- dashboard: viewAllAlerts/viewAllAuth/viewAllReport/aiModelStatus/loading/trendUp
- topology: groupExternal/allReachable/investigating/hostDevops/hostAiData/hostK3sMaster/hostK3sWorker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:34:50 +08:00 |
|
OG T
|
309fe04698
|
docs(adr066): 批准執行閉環修復記錄 — LOGBOOK + ADR-066 + Skill 02 更新
- LOGBOOK.md: 新增 2026-04-09 批准執行閉環修復狀態區塊
- ADR-066: 記錄根本問題鏈條、決策與受影響檔案
- Skills/02: v2.7 新增 Nemotron tool→kubectl_command 回填鐵律 + 教訓
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:23:55 +08:00 |
|
OG T
|
c01026be9b
|
docs(skills+adr): 自動修復全鏈路知識更新 — ADR-058 Appendix A + Skills v2.5
ADR-058: 188白名單補完 + Appendix A (12 Bug修復記錄 + E2E驗證 + Playbook覆蓋矩陣)
Skill-04 DevOps v2.5: SSH自動修復架構章節 (白名單/SOP/陷阱)
Skill-05 SRE: 自動修復E2E驗收規範 + 診斷表
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:21:24 +08:00 |
|
OG T
|
2779233b25
|
docs: Sprint 5R 實施完成紀錄更新
- LOGBOOK: 13/14 步驟全部完成,CD 部署中
- ADR-065: 狀態更新為「實施完成」
- Skills 01 v1.8: Sprint 5R 完成記錄
- Memory: project_current_status + sprint5r_plan 已更新
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:19:57 +08:00 |
|
OG T
|
1483218bab
|
feat(approval): 批准/拒絕後立即回應 Telegram + 持久化 message_id 到 DB
CD Pipeline / build-and-deploy (push) Successful in 13m9s
問題:按下 TG 批准/拒絕按鈕後完全沒有任何回應,使用者不知道是否成功。
Telegram message_id 只存 Redis 24h TTL,過期後無法追蹤。
修正:
- approval_records 加 telegram_message_id / telegram_chat_id 欄位(已 ALTER TABLE)
- approval_db.update_telegram_message() — 持久化 message_id 到 DB
- decision_manager: 發送告警卡片後同時寫 Redis + DB
- telegram_gateway._notify_approval_result() — 批准/拒絕後:
1. editMessageReplyMarkup 移除批准/拒絕按鈕,保留資訊按鈕
2. sendMessage reply_to 在原訊息下回覆狀態行
3. fallback: send_notification 發新訊息
- _handle_group_command: chat_id 改為 _chat_id 消除 IDE lint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:19:31 +08:00 |
|
OG T
|
2c7d5d049c
|
fix(openclaw): Nemotron tool call 回填 kubectl_command,讓批准後執行器能解析
CD Pipeline / build-and-deploy (push) Has been cancelled
根本問題:Nemotron 產生的 restart_deployment(deployment_name=sentry) tool call
只存在 nemotron_tools[],沒有回填到 proposal["kubectl_command"]。
proposal_service 拿到的 kubectl_command 是空的,approval_records.action 存空值,
parse_operation_from_action 永遠返回 None,execute_approved_action 永遠 skip。
修正:Nemotron (和 Gemini fallback) 成功後,將 tool call 轉換為 kubectl 指令
並回填 proposal["kubectl_command"],讓 proposal_service 能取到可執行指令。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:15:01 +08:00 |
|
OG T
|
a39647d793
|
docs(logbook): 自動修復全鏈路完整閉環記錄 — 雙主機 E2E 驗證通過
CD Pipeline / build-and-deploy (push) Has been cancelled
docker-110: SentryDown → REPAIR_OK:sentry (6208ms)
docker-188: MoWoooWorkDown → REPAIR_OK:momo-app (3791ms)
20 Playbooks (8 auto-generated), repair-bot 雙主機白名單更新
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:14:17 +08:00 |
|
OG T
|
ae9780837d
|
fix(proposal): action 優先用 kubectl_command,修復批准後永遠 skip 執行的根本 bug
根本問題:approval_records.action 存的是 LLM action_title(中文標題,如「重啟 sentry 服務」),
parse_operation_from_action() 無法解析,導致 execute_approved_action() 每次都 skip。
修正:action 優先取 llm_proposal["kubectl_command"](可執行的 kubectl 指令),
僅在沒有 kubectl_command 時才 fallback 到 action_title。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:13:22 +08:00 |
|
OG T
|
49a15e1ac9
|
feat(web): G1 骨架屏取代載入中 + S8 完整提交 — Sprint 5R
CD Pipeline / build-and-deploy (push) Has been cancelled
- G1: PulseSkeleton + CardSkeleton 元件
- 首頁所有 LobsterLoading 替換為 PulseSkeleton/CardSkeleton
- Tab 2/4 載入狀態用 CardSkeleton
- 活躍事件載入用 PulseSkeleton
Sprint 5R Phase 1B+1C 全部完成:
S1(KPI卡片) S2(FlowPipeline OpenClaw) S3(AI提案) S4(環形圖)
S5(時間線) S6(Terminal) S7(待審批) S8(拓撲群組+主機)
S9(AI模型) S10(監控3×2) S11(Tab修復) S12(頁面修復) G1(骨架屏)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:09:26 +08:00 |
|
OG T
|
09c6eb3358
|
feat(web): S2 FlowPipeline 龍蝦→OpenClaw icon — Sprint 5R
CD Pipeline / build-and-deploy (push) Has been cancelled
- LobsterSVG 替換為 OpenClawIcon (dashboardicons.com/openclaw PNG)
- 4 種嚴重度渲染全部更新 (P0/P1/P2/P3)
- icon 直接取代圓圈作為活躍步驟標記(非浮動)
- S3 確認: AI 提案橫幅已存在且樣式正確
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:07:53 +08:00 |
|
OG T
|
03b07d5bc5
|
feat(web): S8 基礎架構拓撲群組 2×2 + 主機 4 台 — Sprint 5R
CD Pipeline / build-and-deploy (push) Has been cancelled
- 拓撲模式(預設): 4 群組 2×2 網格 (基礎設施/AI數據/K3s/外部)
每群組含名稱+服務數+健康摘要+服務列表(色點)
有 warning 的群組加橘色光暈
- 主機模式: 4 台 2×2 (110/188/120/121) 含 CPU/RAM 進度條
優先使用 API 真實數據,fallback 靜態值
- 預設切換為拓撲模式 (設計稿要求)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:06:01 +08:00 |
|
OG T
|
07a097c259
|
fix(infra): Sprint 3 自動修復全鏈路修復 — docker-188 SSH egress + service registry 擴充
CD Pipeline / build-and-deploy (push) Has been cancelled
NetworkPolicy: 新增 192.168.0.188:22 egress — repair-bot-188.sh 執行路徑
service-registry.yaml: 新增 signoz/bitan-app (AUTO, 188主機)
修復覆蓋: Bug #11 補完 (188 SSH) + 188 服務分級覆蓋
E2E 驗證: MoWoooWorkDown → SSH → REPAIR_OK:momo-app (3791ms)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 18:04:19 +08:00 |
|