Commit Graph

219 Commits

Author SHA1 Message Date
Your Name
e45b055e0e feat(governance): deliver the AI governance event-handling chain in four tracks (C/D/B/A)
Some checks failed
Code Review / ai-code-review (push) Successful in 48s
run-migration / migrate (push) Failing after 45s
CD Pipeline / tests (push) Successful in 3m46s
Type Sync Check / check-type-sync (push) Successful in 2m8s
CD Pipeline / build-and-deploy (push) Failing after 31m14s
CD Pipeline / post-deploy-checks (push) Has been skipped
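【Twelve-agent expert team: full-scope scan + four parallel implementation tracks】

After the commander asked "are the 12 agents actually collaborating?", the full chain was delivered according to team rules:
onboarder + critic + db-expert + debugger + frontend-designer scanned in parallel
and found 6 major gaps, which fullstack-engineer × 4 and refactor-specialist then implemented together.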

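【Track C: merge the trust_drift double write】

Two independent code paths wrote event_type=trust_drift without calling each other, so
downstream consumers received duplicate records and could not tell which was the source of truth.
The merge keeps governance_agent.check_trust_drift (the more complete path: auto-deprecate +
Telegram + PG), demotes TrustDriftDetector to a pure statistics lib, and points the W-6 watchdog
at governance_agent. The new TestSinglePgWritePerDriftScenario verifies that one drift scenario
triggers exactly one PG write.

  Changes:
    - apps/api/src/services/trust_drift_detector.py (lib only, no longer writes to PG)
    - apps/api/tests/test_trust_drift_watchdog.py (W-6 now mocks governance_agent)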

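【Track D: governance_remediation_dispatch dispatch table】

ai_governance_events is an immutable event-sourcing log and cannot carry execution state. A new
dispatch table serves as the projection layer: 1 event → 0..N dispatches, with mutable, retryable, auditable state.

  - PgEnum with 5 event_type values + a 7-stage state machine (pending → dispatched → executing →
    succeeded/failed/cancelled/skipped)
  - A failed retry INSERTs a new row (the old row's status is left unchanged to preserve the audit trail)
  - Partial unique index ux_grd_one_active_per_event enforces "one active dispatch per event"
  - 4 composite indexes support worker polling, dedup queries, and observability dashboards
  - FKs reference ai_governance_events / playbooks / incidents / approval_records;
    all use SET NULL (avoid cascade lock) except governance_event, which uses RESTRICT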
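
The 7-stage state machine above can be sketched as a transition table. This is a minimal illustration, not the repository code: the commit lists only the forward path, so the exact set of allowed edges in `ALLOWED` is an assumption, and `InvalidStatusTransition` here is a local stand-in for the exception of the same name in the repo module.

```python
from enum import Enum


class DispatchStatus(str, Enum):
    PENDING = "pending"
    DISPATCHED = "dispatched"
    EXECUTING = "executing"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"
    SKIPPED = "skipped"


class InvalidStatusTransition(Exception):
    """Local stand-in for the repo's exception of the same name."""


S = DispatchStatus

# Assumed edges: the commit message only gives
# pending -> dispatched -> executing -> succeeded/failed/cancelled/skipped.
ALLOWED = {
    S.PENDING: {S.DISPATCHED, S.CANCELLED, S.SKIPPED},
    S.DISPATCHED: {S.EXECUTING, S.CANCELLED},
    S.EXECUTING: {S.SUCCEEDED, S.FAILED, S.CANCELLED},
}


def transition(current: DispatchStatus, target: DispatchStatus) -> DispatchStatus:
    """Return the new status, or raise if the edge is not in the table."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidStatusTransition(f"{current.value} -> {target.value}")
    return target


def is_terminal(status: DispatchStatus) -> bool:
    """Terminal rows are never mutated; a retry inserts a new row instead."""
    return status not in ALLOWED
```

Keeping terminal states out of `ALLOWED` matches the "retry INSERTs a new row" rule: a failed dispatch has no outgoing edge, so the only way forward is a fresh row.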

  Changes:
    - apps/api/src/db/models.py (GovernanceRemediationDispatch ORM class)
    - apps/api/migrations/governance_remediation_dispatch_2026-05-03.sql
    - apps/api/src/repositories/governance_remediation_dispatch_repo.py
      (6 async functions + 3 custom exceptions: DispatchAlreadyActive /
       InvalidStatusTransition / DispatchNotFound)
    - apps/api/src/models/governance_dispatch.py (4 schemas including DecisionContextV1)
    - apps/api/tests/test_governance_remediation_dispatch.py (29 tests)

【Track B: /governance page】

Backend PR1 adds three endpoints; frontend PR2-5 deliver all three tabs.

PR1 backend:
  - GET /api/v1/ai/governance/events (events_tab, with event_type / severity /
    status / time-range filters + pagination)
  - GET /api/v1/ai/governance/queue (queue_tab, with graceful fallback:
    when the dispatch table does not exist, return table_pending=True instead of raising a 500)
  - GET /api/v1/ai/governance/summary (slo_tab, 30d violation time series)
  - Severity mapping rules are hardcoded (critic suggests moving them to settings later)

PR2-5 frontend:
  - /governance route + AppLayout + Compliance Badge banner + PageTabs
  - SLO Tab: 3 KPI cards (Syne 28px + StatusOrb + 7d sparkline) +
    30d violation stacked BarChart
  - Events Tab: filter bar + table + inline expandable row (JSON / remediation suggestion / dispatch records)
  - Queue Tab: HITL to-do cards + trust progress bar + Approve/Reject buttons (console.log only in this PR)
  - Sidebar gains an "AI Governance" entry (ShieldCheck icon)
  - Complete bilingual i18n (governance namespace + nav.governance)
  - 7 new components: slo-kpi-card / slo-violation-chart / events-table /
    events-filter-bar / event-detail-drawer / queue-item-card / queue-history-tabs

  Changes:
    - apps/api/src/api/v1/ai_governance.py (router)
    - apps/api/src/services/governance_query_service.py
    - apps/api/src/models/governance.py (Pydantic V2 schemas)
    - apps/api/tests/test_ai_governance_endpoints.py (21 tests)
    - apps/web/src/app/[locale]/governance/ (page + 3 tabs)
    - apps/web/src/components/governance/ (7 components)
    - apps/web/messages/{zh-TW,en}.json (governance namespace)
    - apps/web/src/components/layout/sidebar.tsx (+1 line)
    - apps/api/src/main.py (router include)

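【Track A: GovernanceDispatcher decision fusion】

Wires governance events to the remediation executor via North Star decision fusion (LLM × Playbook trust
× MCP), in line with the iron rule "no hardcoded rules".

  - Design iron rule: DecisionFusionAdapter is a new wrapper and **does not modify any Tier 3 file**
    (decision_manager / learning_service / trust_engine); it only consumes existing APIs
  - Three-way fusion formula: confidence = 0.4×llm + 0.3×playbook_trust + 0.3×mcp_consistency
    (the weights carry a TODO noting that the AI should eventually tune them itself)
  - Three-branch decision path:
    confidence ≥ 0.85 → auto_dispatch (status=dispatched)
    0.65 ≤ confidence < 0.85 → pending_approval (HITL)
    confidence < 0.65 → skip + log
  - decision_context JSONB records a full snapshot of the three inputs (for future fine-tuning)
  - Polls every 30s for unresolved events, modeled on the governance loop
  - Duplicate events are deduplicated (via get_active_for_event)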
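
The fusion formula and the three-branch decision path above can be sketched as follows. The weights and thresholds come from the commit message; the function names `fuse_confidence` and `decide` and the returned strings are illustrative assumptions, not the actual DecisionFusionAdapter API.

```python
# Weights and thresholds as stated in the commit message.
W_LLM, W_PLAYBOOK, W_MCP = 0.4, 0.3, 0.3
AUTO_THRESHOLD = 0.85
HITL_THRESHOLD = 0.65


def fuse_confidence(llm: float, playbook_trust: float, mcp_consistency: float) -> float:
    """Weighted sum of the three inputs, each expected in [0, 1]."""
    return W_LLM * llm + W_PLAYBOOK * playbook_trust + W_MCP * mcp_consistency


def decide(confidence: float) -> str:
    """Map a fused confidence to one of the three branches (hypothetical labels)."""
    if confidence >= AUTO_THRESHOLD:
        return "auto_dispatch"
    if confidence >= HITL_THRESHOLD:
        return "pending_approval"
    return "skip"


if __name__ == "__main__":
    score = fuse_confidence(llm=0.9, playbook_trust=0.8, mcp_consistency=0.5)
    print(round(score, 2), decide(score))
```

Because the weights sum to 1.0, the fused score stays in [0, 1] whenever the inputs do, which keeps the two thresholds meaningful.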

  Changes:
    - apps/api/src/services/governance_dispatcher.py
    - apps/api/src/services/decision_fusion_adapter.py
    - apps/api/tests/test_governance_dispatcher.py (14 tests)
    - apps/api/src/main.py (lifespan task wired to run_governance_dispatcher_loop)

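【Verification】

1836 unit tests pass (29 skipped due to a pre-existing PG integration env issue)

【Orchestration lessons: recorded in memory】

- vuln-verifier should run **before** fullstack-engineer (so a parallel run does not read already-fixed code and misjudge)
- critic's two-round review cannot be skipped (the second round caught the NaN sentinel + Prom rule chain reaction)
- The North Star "no hardcoded rules" principle is genuinely implemented together with decision-fusion

【Tier 3 untouched: verified】

git diff confirms this commit does not modify decision_manager.py / learning_service.py /
trust_engine.py at all; it only adds a wrapper service that consumes existing APIs.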

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 12:42:40 +08:00
Your Name
f1362fcc8d fix(governance): fix 4 silent failures in governance alerting + Prom sentinel chain reaction
Some checks failed
Code Review / ai-code-review (push) Successful in 49s
CD Pipeline / tests (push) Successful in 2m9s
CD Pipeline / build-and-deploy (push) Failing after 31m11s
CD Pipeline / post-deploy-checks (push) Has been skipped
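【Full-scope inspection: a 12-agent parallel scan located 4 major bugs and 1 P0 chained regression】

Bug 1 (P0 silent failure): governance_agent.check_trust_drift
  The original `await db.commit()` was mis-indented outside the async with block (8 spaces vs 12).
  The session had already auto-committed and closed, so the second commit raised InvalidRequestError,
  which was swallowed, and the governance_trust_drift_auto_deprecated log never appeared.
  Fix: move commit/log back inside the with block.
  An AST regression guard test is included to prevent regression.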
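
An AST guard of the kind mentioned for Bug 1 could look like the sketch below. It is an assumption about the approach, not the test in the repository: the helper name `commits_outside_async_with` is hypothetical, and it flags every `await <x>.commit()` that is not nested inside some `async with` block.

```python
import ast


def _is_commit_await(node: ast.AST) -> bool:
    """True for expressions of the form `await <something>.commit(...)`."""
    return (
        isinstance(node, ast.Await)
        and isinstance(node.value, ast.Call)
        and isinstance(node.value.func, ast.Attribute)
        and node.value.func.attr == "commit"
    )


def commits_outside_async_with(source: str) -> list[int]:
    """Return line numbers of awaited commit() calls outside any `async with`."""
    tree = ast.parse(source)
    # Collect every awaited commit that lives somewhere under an AsyncWith node.
    guarded = {
        id(inner)
        for outer in ast.walk(tree)
        if isinstance(outer, ast.AsyncWith)
        for inner in ast.walk(outer)
        if _is_commit_await(inner)
    }
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if _is_commit_await(node) and id(node) not in guarded
    )
```

A regression test would read the real module source and assert that the returned list is empty, so an accidental dedent fails in CI instead of silently swallowing the error at runtime.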

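Bug 2: flywheel_stats_service / W-3 false alert on fresh deploy
  With Redis empty, total_exec=0 → rate=0.0 → the watchdog's `< 0.30` check fires immediately
  with a false "flywheel success rate 0%" alert. Fix: return None when total_exec < FLYWHEEL_MIN_SAMPLE(10);
  the watchdog skips W-3 on None. The Prometheus sentinel is NaN (not -1.0)
  so it does not trip the `< 0.1` condition in 3 prom rules such as ops/monitoring/alerts.yml:775,
  which would cause a chain of false alerts 2h later. The frontend type is synced to number | null.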
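
The Bug 2 fix can be illustrated with a small sketch. `FLYWHEEL_MIN_SAMPLE` and the `0.30` threshold come from the commit message; the function names are hypothetical and the real service reads its counters from Redis rather than taking them as arguments.

```python
import math
from typing import Optional

FLYWHEEL_MIN_SAMPLE = 10   # from the commit message
W3_THRESHOLD = 0.30        # from the commit message


def flywheel_success_rate(success: int, total_exec: int) -> Optional[float]:
    """Return None instead of 0.0 when there is too little data to judge."""
    if total_exec < FLYWHEEL_MIN_SAMPLE:
        return None
    return success / total_exec


def w3_should_alert(rate: Optional[float]) -> bool:
    """The watchdog skips W-3 entirely when the rate is unknown."""
    return rate is not None and rate < W3_THRESHOLD


def prometheus_value(rate: Optional[float]) -> float:
    """Export NaN as the sentinel: any `< 0.1` comparison on NaN is false."""
    return math.nan if rate is None else rate
```

NaN works as a sentinel because every ordered comparison against it evaluates to false, whereas `-1.0 < 0.1` is true and would trip the alert rules.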

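Bug 3: failover_alerter dedup key
  The original key looked only at event_type, not the payload, so a trust_drift change from 4 to 25 IDs
  was swallowed entirely by the 1h dedup window. Fix: append sha256(impact subdict)[:8] to the dedup key,
  and sanitize event_type so special characters cannot pollute the Redis key.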
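
A payload-aware dedup key along the lines of the Bug 3 fix might be built like this. The `sha256(...)[:8]` digest comes from the commit message; the `alert:dedup:` prefix, the sanitizing regex, and the canonical JSON encoding are assumptions made for the sketch.

```python
import hashlib
import json
import re


def dedup_key(event_type: str, impact: dict) -> str:
    """Build a Redis key that changes whenever the impact payload changes."""
    # Replace anything outside a conservative character set (assumed policy).
    safe_type = re.sub(r"[^A-Za-z0-9_.-]", "_", event_type)
    # Canonical JSON so that dict key order does not affect the hash.
    canonical = json.dumps(impact, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]
    return f"alert:dedup:{safe_type}:{digest}"


if __name__ == "__main__":
    print(dedup_key("trust_drift", {"ids": [1, 2, 3, 4]}))
    print(dedup_key("trust_drift", {"ids": list(range(25))}))
```

Sorting keys before hashing gives identical impacts identical digests, so true duplicates are still suppressed while a changed ID list produces a new key.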

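Bug 4: ai_slo_watchdog_job W-4 evolver "all archived" false alert during initialization
  The original logic alerted whenever approved==0, without excluding the case where the playbooks table is still initializing.
  Fix: _count_approved_playbooks returns (approved, total); total==0 → skip.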

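【Results】
- 39 related unit tests pass (test_failover_alerter / test_governance_agent /
  test_trust_drift_watchdog / test_check_trust_drift_commit_outside_context_poc)
- 6 critical paths verified hands-on: NaN sentinel / float rendering / hash distinctness / identical hash
  for identical impact in dedup / datetime tolerance / py_compile passes on all 4 files

【Orchestration lessons: for future improvement】
- During 12-agent parallel orchestration, vuln-verifier raced with fullstack-engineer,
  so vuln-verifier read already-fixed code and wrongly reported NOT REPRODUCIBLE.
  Going forward: run vuln-verifier before fullstack, or compare against the pre-fix state with git show HEAD~1.
- fullstack-engineer introduced a P0 regression (a ternary embedded in an f-string produced an illegal format spec);
  critic caught it along with the Prom sentinel chain reaction, which shows critic review is necessary and cannot be skipped.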

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:18:57 +08:00
Your Name
639bb64788 feat(flywheel): surface ai automation and code review
Some checks failed
Code Review / ai-code-review (push) Successful in 31s
CD Pipeline / build-and-deploy (push) Failing after 5m23s
2026-04-30 00:09:25 +08:00
Your Name
4a57c2d04f feat(flywheel): expose incident processing timeline
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 10m56s
2026-04-29 23:38:30 +08:00
Your Name
1096da12ae feat(p2.5): aiops timeline frontend panel, visualizing the 6 Incident stages
Wave 6 P2.5 frontend-designer industrial-grade visualization (no AI slop):

Added (1824 lines):
- apps/web/src/app/[locale]/aiops/timeline/page.tsx
- apps/web/src/components/aiops/timeline/
  · AiopsTimelinePanel.tsx (413): main panel component
  · TimelineStage.tsx (279): 6-stage timeline cards
  · TimelineStageDetails.tsx (359): expandable stage details
  · EvidenceViewer.tsx (144): Evidence Snapshot viewer
  · TimelineFilter.tsx (109): incident_id / severity / time-range filter
  · types.ts (118): TS type definitions
  · mock-data.ts (357): mock fallback for development
  · index.ts (7): barrel export
- i18n: messages/en.json + messages/zh-TW.json (Timeline translations)

Design principles:
- No AI slop (no generic emoji or gradients; industrial dashboard style)
- Backend endpoint wired to /api/v1/aiops/timeline (critic B4 fix)
- Mock-mode fallback in case the endpoint is temporarily unreachable

Corresponding backend: a3b4595e (aiops_timeline.py + aiops_timeline_service.py)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: frontend-designer agent (Wave 6) <noreply@anthropic.com>
2026-04-27 08:11:40 +08:00
Your Name
cc547736ab feat(wave6-8): P2.1 fusion + P2.2 governance + P2.4 consensus + Wave 7/8 BLOCKER fixes
Picks up code that multiple Wave 6/7/8 engineers finished before hitting the agent quota, and commits it to resolve
a latent import error on production HEAD (decision_fusion was already imported by decision_manager, but the file was untracked).

Added (backend core):
- decision_fusion.py (562 lines): P2.1 method III (three-LLM fusion of OpenClaw + Hermes + Elephant)
- aiops_timeline.py + aiops_timeline_service.py: critic B4 fix
  /api/v1/aiops/timeline endpoint, with DB access moved into the service layer to follow leWOOOgo modularization
- migrations/p2_decision_fusion_columns.sql + rollback: fusion columns on approval_records

Modified (backend integration):
- decision_manager.py: repairs three broken fusion links (critic B1+B2+B3):
  · B1: write _evidence_snapshot_ref to token.proposal_data
  · B2: compute complexity_score before fusion and write it to the token
  · B3: write the fusion composite to token.proposal_data["decision_fusion"]
- auto_approve.py: fusion + consensus awareness (critic B3+B5):
  · composite > 0.7 → auto_execute_eligible bypasses min_confidence
  · source=consensus_engine + score>=0.6 → rule-trusted path
- consensus_engine.py: db-fix, _save_consensus reuses agent_sessions
- governance_agent.py: db-fix, _alert writes to ai_governance_events in PG
- approval_db.py: 3 fusion columns + 2 partial indexes + CheckConstraint
- db/models.py: schema aligned with the migration
- core/config.py: vuln #1 fix: OLLAMA_URL/_FALLBACK_URL field_validator
  rejects public IPs + external domains, allowing only private network / loopback / K8s SVC allowlist entries
- core/feature_flags.py: P2 fusion + consensus flags
- main.py: start governance_agent in lifespan
- failover_alerter.py: Wave8-X2: in-memory dedup fallback (does not fail open after Redis refuses)
- ollama_*.py: metrics integration + recovery improvements
- auto_repair_service.py: verifier wiring
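
The URL restriction described for core/config.py (vuln #1) can be sketched with the standard library alone. This is an assumption-laden illustration: the real code is a pydantic field_validator, and the function name `validate_internal_url` and the `ALLOWED_HOST_SUFFIXES` allowlist are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse

# Assumed allowlist for in-cluster service names; the real list is not in the commit.
ALLOWED_HOST_SUFFIXES = (".svc", ".svc.cluster.local")


def validate_internal_url(url: str) -> str:
    """Accept only loopback, private-network, or allowlisted service hosts."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"URL has no host: {url!r}")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: allow localhost and allowlisted service names only.
        if host == "localhost" or host.endswith(ALLOWED_HOST_SUFFIXES):
            return url
        raise ValueError(f"external domain not allowed: {host}") from None
    # is_private covers the RFC 1918 ranges (and other non-global ranges).
    if ip.is_loopback or ip.is_private:
        return url
    raise ValueError(f"public IP not allowed: {host}")
```

Note that a hostname check like this does not defend against DNS names that resolve to public addresses; it only narrows what can be configured.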

Added (tests, 2438 lines):
- test_decision_fusion.py / test_governance_agent.py / test_consensus_integration.py
- test_p2_db_fixes.py / test_wave8_fusion_fixes.py
- test_config_url_validation.py (vuln #1, 12 tests)
- test_failover_alerter.py + Wave8-X2 in-memory dedup tests added

Acceptance: 116 tests pass (decision_fusion + wave8_fusion + config_url + consensus +
                      governance + p2_db_fixes + failover_alerter)

Conflict resolution:
- 3 files (config.py + auto_approve.py + decision_manager.py) conflicted on git stash pop;
  kept the stashed version (the engineers' final version) and restored the "公網 IP" wording in the ValueError to match the tests

Note: this commit resolves the latent import error on production HEAD
Still unfixed: vuln #4 prompt injection / debugger B14 quota fail-closed
       / B25-B26 drain_pending_tasks / B8 governance fail alert

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Multiple Engineers (Wave 6/7/8) <noreply@anthropic.com>
2026-04-27 08:11:40 +08:00
Your Name
d0591c54b0 fix(security): health-check remediation, all 7 Critical/Major security issues fixed
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 35s
## Critical fixes (C1-C5)
- C1: git rm --cached 03-secrets.yaml (the CHANGE_ME template is no longer tracked)
- C2: git rm --cached awoooi.db + add *.db to .gitignore (SQLite HARD_RULES violation)
- C3: sentry-tunnel SENTRY_HOST changed to a process.env fallback
- C4: config.py DATABASE_URL drops the changeme default and becomes required
- C5: run_migration.py now uses os.environ["DATABASE_URL"]

## Major fixes (M1-M4)
- M1: add CSRF protection to auto_repair /execute + sync AutoRepairPanel.tsx
- M2: add CSRF protection to drift /rollback and /adopt (/internal/scan stays without CSRF)
- M3: add CSRF protection to terminal /intent + sync terminal.store.ts
- M4: live-dashboard HOST_IPS + host-grid VIP moved to env vars

## Other
- Add apps/web/.env.example (documents 6 env vars)
- K8s deployment-web gains 3 new env vars
- Integration tests: add real-DB tests for aider_event_repository + ai_router_feedback
- Fix the CSRF dependency override in test_terminal.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 01:27:39 +08:00
OG T
149065e3de perf(e2e): switch CI smoke test to retain-on-failure to cut recording overhead
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 51m8s
E2E Health Check / e2e-health (push) Successful in 3m18s
video/screenshot changed from 'on' to retain-on-failure/only-on-failure
The CI remote smoke test is expected to drop from 13min+ to ~1min

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 23:44:20 +08:00
OG T
65a5220e16 feat(flywheel-c2-c3): C2 hasType4 wired to the real API + C3 WebSocket exponential-backoff reconnect
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 3m41s
C2: flywheel_stats_service adds a type4_count query → returned by the API
    flywheel-diagram.tsx hasType4 is now driven by the type4Count prop (no longer hardcoded false)
    flywheel-kpi-card.tsx passes type4Count={flowData?.type4_count}

C3: WebSocket onclose adds exponential-backoff reconnect (1s→2s→4s→max 30s)
    A cancelled flag ensures no reconnect after unmount
    wsRetryTimer is added to cleanup
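
The C3 reconnect schedule (1s→2s→4s→max 30s) boils down to a capped doubling delay. The sketch below shows only the delay calculation in Python; the actual change lives in the frontend component, and the function name `reconnect_delay` is hypothetical.

```python
def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before reconnect number `attempt` (0-based)."""
    # Clamp the exponent so very large attempt counts cannot overflow.
    exponent = min(max(attempt, 0), 10)
    return min(base * (2 ** exponent), cap)


if __name__ == "__main__":
    print([reconnect_delay(n) for n in range(7)])
```

Resetting `attempt` to 0 after a successful `onopen` is the usual companion to this schedule, so a later disconnect starts again from the short delay.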

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 18:45:40 +08:00
OG T
db282cd0e9 perf(cd): speed up Web build with buildx registry cache + turbo cache mount
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Switch to docker buildx + type=registry cache (mode=max):
- More reliable than inline cache; deps/runner layers are stored in Harbor web-cache:buildcache
- Remove BUILDKIT_INLINE_CACHE=1 (no longer needed)

Dockerfile adds a /root/.cache/turbo mount:
- Turborepo task hashes persist across builds, so unchanged packages are skipped
- Combined with the existing .next/cache mount, expected to save 1-2 min

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 18:33:27 +08:00
OG T
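9b1812cdef feat(c4): ADR-073-C C4, visualize the flywheel's human-intervention paths
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 14m5s
Add the FlywheelDiagram SVG component:
- Six-node flow diagram (monitor → dedup → diagnose → reason → execute → learn)
- When TYPE-3 fires: red dashed line from reason → human handling center
- When TYPE-4 fires: orange dashed line from reason → root-cause confirmation
- Active node highlight + incident count badge
- Integrated into FlywheelKPICard (consumes /api/v1/stats/flywheel)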

2026-04-12 ogt (ADR-073-C C4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 15:41:33 +08:00
OG T
0c2892ac19 feat(c3): ADR-073-C C3, real-time flywheel push over WebSocket
Backend:
- stats.py adds @router.websocket('/flywheel/ws')
- Pushes flywheel_summary JSON every 10 seconds

Frontend FlywheelKPICard:
- WebSocket first; on WS disconnect, automatically degrade to 30s HTTP polling
- Stop HTTP polling on onopen, resume on onclose

2026-04-12 ogt (ADR-073-C C3)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 15:40:20 +08:00
OG T
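4b51f9b60d feat(c2): ADR-073-C C2, frontend flywheel KPI component wired to the real API
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- Add the FlywheelKPICard component
- Consumes GET /api/v1/stats/summary with 30-second polling
- Shows Playbooks, repair success rate, today's conversions, KM vectorization rate
- Warning bar for stuck Incidents
- Inserted in the home page right column after PendingApprovalsCard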

2026-04-12 ogt (ADR-073-C C2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 15:39:10 +08:00
OG T
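0b93f0e5c6 feat(topology): B2 elkjs auto-layout + expand/collapse interaction + filter controls
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- Add useElkLayout.ts: an elkjs compound graph computes node positions automatically
  - When collapsed, a group is a leaf node; when expanded, child services join the compound layout
  - Edges participate in cross-group layout
  - Computed asynchronously; falls back to the original positions on failure
- GroupNode.tsx: add onToggle/isExpanded props, ChevronDown/Right icons
- ServiceTopology.tsx: integrate elkjs, expand/collapse state, 3 control buttons
  - Expand all / Collapse all / Show anomalies only
  - "Laying out" indicator text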

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 11:29:16 +08:00
OG T
f8c6dfc642 feat(web): Header ⌘K search hint button + complete the sensor service file
Some checks are pending
CD Pipeline / build-and-deploy (push) Has started running
Header:
- Add a ⌘K entry button (search icon + "搜尋..." label + ⌘K badge)
- Clicking dispatches a window keydown (meta+k) to open CommandPalette
- Turns blue on hover (UX hint)

Sensor:
- Complete apps/sensor/awoooi-sensor.service (PYTHONUNBUFFERED=1 + --loop)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 00:29:15 +08:00
OG T
3fa377cce9 fix(web): extra closing brace in en.json broke webpack JSON parsing
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Near position 41700 the file ended with a doubled }}; the extra brace was removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 00:08:04 +08:00
OG T
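524423577a feat(web): clicking an infrastructure host card expands a detail drawer
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 3m57s
E2E Health Check / e2e-health (push) Successful in 35s
Clicking a host card expands an inline drawer:
- CPU/RAM in large type (with color warnings: >80% red / >60% orange)
- Full service list (status dot + port + latency_ms)
- Related events (filtered by affected_services)
- ✕ to close / click the same card again to collapse
- Selected state: blue border highlight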

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:49:00 +08:00
OG T
2897007014 fix(web): fix webpack build errors (duplicate flexShrink + firing_count undefined)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
header.tsx: remove the duplicate flexShrink: 0 property (TS1117)
classic/page.tsx: firing_count ?? 0 handles undefined (TS2322)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:45:52 +08:00
OG T
89db96fc21 feat(web): ⌘K Command Palette, a global command panel with Gaussian blur
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- ⌘K (Mac) / Ctrl+K (others) toggles open/closed
- Gaussian-blur backdrop (backdrop-blur 8px + rgba overlay)
- Search filter: 9 navigation pages + quick actions (open Terminal)
- Full keyboard support: ↑↓ select / Enter run / Esc close
- Mouse hover syncs activeIdx
- 100% i18n (commandPalette namespace)
- Z-Index: DIALOG(70), mounted at the global layer in providers.tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:28:36 +08:00
OG T
764dcf24e9 fix(i18n): fix byAnomalyAutoRate interpolation + change mttrUnit to minutes
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 12m22s
byAnomalyAutoRate: "自動修復率" → "自動修復率 {pct}%" (the missing {pct} interpolation caused the raw key to be displayed)
mttrUnit: "秒" → "分鐘" (seconds → minutes; the frontend already divides by 60)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:11:02 +08:00
OG T
af7b6beba8 fix(web): fix Tab4 by_anomaly fields to match the real API structure
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 12m8s
by_anomaly returns {alert_name, anomaly_key, disposition:{total,auto_repair,auto_rate,...}}
Fixes:
- Sort by disposition.total (not count)
- Display name uses alert_name || anomaly_key
- auto_rate comes from disposition.auto_rate * 100
- Count comes from disposition.total
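
The field mapping in this fix can be sketched as a small transformation over the by_anomaly rows. The row shape follows the structure quoted in the commit; the function name `top_anomalies`, the output keys, and the default limit of 5 (borrowed from the Top5 panel) are illustrative assumptions, and the real code is frontend TypeScript.

```python
def top_anomalies(rows: list[dict], limit: int = 5) -> list[dict]:
    """Rank by disposition.total and derive display fields for each row."""
    ranked = sorted(rows, key=lambda r: r["disposition"]["total"], reverse=True)
    return [
        {
            # Prefer alert_name, fall back to anomaly_key when it is empty or missing.
            "name": r.get("alert_name") or r["anomaly_key"],
            "count": r["disposition"]["total"],
            # auto_rate arrives as a 0..1 fraction; the UI shows a percentage.
            "auto_rate_pct": r["disposition"]["auto_rate"] * 100,
        }
        for r in ranked[:limit]
    ]
```

Reading every number from `disposition` avoids the earlier mistake of sorting on a top-level `count` field that the API does not return.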

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 20:57:58 +08:00
OG T
ab5ba7062c feat(web): Tab3 Chain-of-Thought 面板 + Tab4 by_anomaly Top5 + MTTR
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 13m1s
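Tab 3 ActivityStreamTab:
- Clicking an SSE event expands the COT side panel (provider/confidence/latency/tools/reasoning)
- Events with proposal_data show a COT badge
- Clicking the same event collapses the panel

Tab 4 DispositionTab:
- by_anomaly Top5 horizontal progress bars (colored by auto-repair rate: ≥80% green / ≥50% orange / otherwise red)
- MTTR in large type (minutes) + fallback when there is no data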

i18n: cotTitle/cotReasoning/cotConfidence/cotProvider/cotLatency/cotTools/
      cotClickHint/byAnomalyTitle/byAnomalyAutoRate/mttrTitle/mttrUnit/mttrNoData

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 20:42:02 +08:00
OG T
c200d7a52d fix(web+k8s): CSRF mismatch + NetworkPolicy missing monitoring ports
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 12m19s
1. pending-approvals-card: fetch a fresh CSRF token at click time,
   so multiple useCSRF instances do not overwrite each other's cookie and leave header and cookie inconsistent
2. NetworkPolicy: open 110:3002 (Grafana), 9090 (Prometheus), 3001 (Gitea);
   fixes the monitoring probe error "All connection attempts failed"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 20:11:00 +08:00
OG T
8c2983b70a fix(api+web): add K3s NodePort origins to CORS + add signer_id/name to sign
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
CORS (config.py):
- Add http://192.168.0.125:32335 (K3s VIP NodePort)
- Add http://192.168.0.120:32335 + 121:32335 (K3s nodes)
- Before: browsers on the internal network opening :32335 had every API call blocked by CORS
  (root cause of incidents "Failed to fetch" / monitoring being unable to connect)

sign body (pending-approvals-card.tsx):
- signer: 'web-ui' → signer_id: CURRENT_USER.id + signer_name: CURRENT_USER.name
- Before: POST /approvals/{id}/sign was reported as a 403, but the actual response
  was 422 Field required (signer_id + signer_name) because required fields were missing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:50:48 +08:00
OG T
9a8f410f23 fix(web): add CSRF token to PendingApprovalsCard approve/reject (fixes 403)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause: fetch sent neither the X-CSRF-Token header nor credentials:include,
     so the API returned 403 "CSRF token cookie missing"

Fix: add the useCSRF hook; sign/reject requests now carry ...getHeaders() + credentials:include,
     the same pattern as incident-card.tsx / openclaw-state-machine.tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:00:02 +08:00
OG T
a2a98452ad fix(web): remove fake green lights from AIModelStatus (Gemini/NVIDIA must not be assumed up)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause: the components in /api/v1/health only include api/database/redis/ollama/openclaw,
     so d.components.gemini is always undefined → healthy: true was hardcoded fake data

Fix: update status only when components has the matching key;
     with no health data, stay false (unknown) and show no fake green light

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:51:14 +08:00
OG T
896bef94ee fix(web): pending-approvals-card prevents duplicate clicks + adds a loading state
Linter auto-hardening: actioningId state prevents repeated actions on the same card
- disabled + opacity 0.6 + cursor not-allowed
- Button shows '...' while loading
- finally() ensures actioningId is cleared

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:38:08 +08:00
OG T
890e2a9568 fix(review): architecture review fixes (P0 import crash + zero hardcoded i18n strings + silent errors)
P0:
- proposal_service.py: add the get_redis + INCIDENT_KEY_PREFIX imports
  (before: resolve_incident_after_approval always crashed with NameError)

P1 i18n:
- page.tsx: remove emoji from topology groups, use tTopo() i18n keys
- page.tsx: host labels ("DevOps金庫" etc.) moved to tTopo() i18n
- ai-model-status.tsx: add useTranslations, "AI 模型狀態" → t('aiModelStatus')
- disposition-mini.tsx: "查看完整報表" → t('viewAllReport')
- recent-activity.tsx: "查看活動串流" → t('viewAllAlerts')

P2 quality:
- pending-approvals-card.tsx: approve/reject add an r.ok check + error display; the "查看全部授權" (view all approvals) link gains a route + i18n
- page-tabs.tsx: TabSkeleton "載入中..." → t('loading')
- page.tsx: ↑5% → tDashboard('trendUp', {pct}) dynamic value
- page.tsx: hardcoded Prometheus '23' → '-- targets'

New i18n keys (zh-TW + en in sync):
- dashboard: viewAllAlerts/viewAllAuth/viewAllReport/aiModelStatus/loading/trendUp
- topology: groupExternal/allReachable/investigating/hostDevops/hostAiData/hostK3sMaster/hostK3sWorker

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:34:50 +08:00
OG T
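49a15e1ac9 feat(web): G1 skeleton screens replace the loading text + S8 fully committed (Sprint 5R)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- G1: PulseSkeleton + CardSkeleton components
- All LobsterLoading on the home page replaced with PulseSkeleton/CardSkeleton
- Tab 2/4 loading states use CardSkeleton
- Active-incident loading uses PulseSkeleton

Sprint 5R Phase 1B+1C fully complete:
S1 (KPI cards) S2 (FlowPipeline OpenClaw) S3 (AI proposals) S4 (donut chart)
S5 (timeline) S6 (Terminal) S7 (pending approvals) S8 (topology groups + hosts)
S9 (AI models) S10 (monitoring 3×2) S11 (Tab fixes) S12 (page fixes) G1 (skeleton screens)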

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:09:26 +08:00
OG T
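09c6eb3358 feat(web): S2 FlowPipeline lobster → OpenClaw icon (Sprint 5R)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- LobsterSVG replaced with OpenClawIcon (dashboardicons.com/openclaw PNG)
- All 4 severity renderings updated (P0/P1/P2/P3)
- The icon directly replaces the circle as the active-step marker (not floating)
- S3 confirmed: the AI proposal banner already exists and is styled correctly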

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:07:53 +08:00
OG T
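03b07d5bc5 feat(web): S8 infrastructure topology groups 2×2 + 4 hosts (Sprint 5R)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- Topology mode (default): 4 groups in a 2×2 grid (infrastructure / AI data / K3s / external)
  Each group shows name + service count + health summary + service list (color dots)
  Groups with a warning get an orange glow
- Host mode: 4 hosts in 2×2 (110/188/120/121) with CPU/RAM progress bars
  Prefers real API data, falls back to static values
- Default switched to topology mode (required by the design mockup)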

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:06:01 +08:00
OG T
895784e646 feat(web): S7+S9+S10 pending approvals + AI models + monitoring tools 3×2 (Sprint 5R)
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 12m15s
- S7: PendingApprovalsCard with risk labels + Approve/Reject buttons
- S9: AIModelStatus 2×2 (OpenClaw/Ollama/Gemini/NVIDIA)
- S10: MonitoringTools changed to a 3×2 grid (name + meta info + left color bar)
- Right column order: OpenClaw → pending approvals → infrastructure → monitoring tools → AI models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 16:10:28 +08:00
OG T
a0f3a7d532 feat(web): S6 OpenClaw AI Terminal + status data (Sprint 5R)
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 13m15s
- Added below the divider: model name + running status
- Live stats: analyses today / success rate / MTTR
- AI reasoning terminal: #141413 background + #a0e8a0 neon green + JetBrains Mono
- Blinking yellow cursor ▎ on the last line
- Data sources: /api/v1/alert-operation-logs + /api/v1/stats/disposition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:56:03 +08:00
OG T
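b85a0e232e feat(web): S4+S5 disposition stats donut chart + recent activity timeline (Sprint 5R)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- S4: DispositionMini component (SVG donut chart + four-category list)
- S5: RecentActivity component (timeline + color dots + JetBrains Mono)
- Left column becomes a flex:6 scrollable multi-card column
- Right column becomes flex:4 (60:40 ratio)
- Left column structure: active incidents → disposition stats → recent activity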

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:51:54 +08:00
OG T
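7a2e07f74f feat(web): S1 KPI Strip changed to 5 cards (Sprint 5R Phase 1B)
- 7 divider-separated metrics → 5 kpi-card cards in a row
- System health (progress bar) / active incidents (P1:P2) / auto-repair rate (progress bar + ↑5%) / pending approvals / operations this week
- Remove the swimming-lobster row (removed at the commander's instruction)
- Add weeklyOps, fetched from /api/v1/audit-logs/stats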

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:48:04 +08:00
OG T
289dac6bd1 fix(web): S11+S12 load-failure fixes (Sprint 5R Phase 1A)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- S11: Tab 2 approvals API path corrected (?status=pending → /pending)
- S11: Tab 2 fetch adds an r.ok check to avoid parsing an error response as JSON
- S12: Security & Compliance now uses SecurityPanel + CompliancePanel (fixes the double AppLayout)
- S12: Knowledge Base now redirects to /knowledge-base (avoids the lazy import problem)
- S12: Topology calls useDashboardStore.connect() to start SSE

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:43:06 +08:00
OG T
73ef9c6b12 fix(web): QA sweep (alert-operation-logs i18n + classic emoji→icon + knowledge loading text)
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 12m28s
- alert-operation-logs: 30+ hardcoded Chinese strings moved to useTranslations (18 event types + UI)
- classic: alert badge + awaiting confirmation + TOOL_EMOJI → Lucide icon
- knowledge: "載入中" → common.loading
- Add the alertOpLogs i18n section (zh-TW + en)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 13:58:04 +08:00
OG T
580053394b fix(web): C4 monitoring tools emoji → Lucide icon (feedback_no_emoji_use_icons.md)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
TOOL_EMOJI Record<string> changed to TOOL_ICON Record<React.ReactNode>
Uses BarChart3/Flame/Telescope/FlaskConical/Activity/GitBranch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:28:53 +08:00
OG T
4a94588766 fix(web): I3 approve/reject API + I4 SIGNOZ_URL env + I5 ErrorsPanel nothing-gray
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- I3: Approve/Reject buttons wired to /api/v1/approvals/{id}/sign|reject
- I4: ApmPanel SIGNOZ_URL now uses the NEXT_PUBLIC_SIGNOZ_URL environment variable
- I5: ErrorsPanel outer frame now uses the nothing-gray palette via inline style

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:20:44 +08:00
OG T
28d2ff704e fix(web): C1 leftover i18n, 5 hardcoded Chinese strings moved to useTranslations
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- Alert badge: alertBadge / alertBadgeZero
- Awaiting confirmation: awaitingConfirm
- Host/topology toggle: hostView / topoView
- HOST_CATALOG description confirmed not rendered; no i18n needed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:18:05 +08:00
OG T
fb66ecd2a0 refactor(web): Panel extraction fully complete, three consolidated pages resolve the double AppLayout
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
/observability: AppsPanel + ServicesPanel (5/5 tabs done in total)
/automation: AutoRepairPanel + NeuralCommandPanel + DriftPanel (3/3)
/operations: DeploymentsPanel + TicketsPanel + CostPanel + ActionLogsPanel + BillingPanel (5/5)

All original pages are reduced to AppLayout + Panel, with zero double Layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:06:57 +08:00
OG T
7934ade3a6 refactor(web): all 13 Panels extracted + double AppLayout on consolidated pages fixed
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Panel extraction (13):
- MonitoringPanel, ApmPanel, ErrorsPanel, AppsPanel, ServicesPanel
- AutoRepairPanel, NeuralCommandPanel, DriftPanel
- DeploymentsPanel, TicketsPanel, CostPanel, ActionLogsPanel, BillingPanel

Consolidated page updates (all use Panels, no double AppLayout):
- /observability: 5 Panels
- /automation: 3 Panels
- /operations: 5 Panels

Chief architect issue I2 resolved

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:05:37 +08:00
OG T
9e10305acc fix(web): C2 topology component i18n, 10+ hardcoded Chinese strings moved to useTranslations
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 11:04:35 +08:00
OG T
7153395267 fix(web): chief architect P0 fixes (hardcoded i18n strings + polling performance)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
C1: 30+ hardcoded Chinese strings across the 4 home-page tabs moved to useTranslations
  - Add 30+ i18n keys such as dashboard.tabs.* / alertEvents / approve / reject
  - zh-TW + en kept in sync
C3: automation/operations Loading now uses LobsterLoading (i18n)
I1: 100ms setInterval replaced with popstate + a 1s low-frequency fallback (10x performance improvement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:01:07 +08:00
OG T
5ea6c3fb91 feat: alert_operation_log query API + frontend page (Sprint 5.2)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
Backend:
- Add the list_recent() pagination method (alert_operation_log_repository)
- Add /api/v1/alert-operation-logs GET + /stats endpoints
- Register alert_operation_logs_v1.router in main.py

Frontend:
- /alert-operation-logs page with color tags for 18 event_type values
- Pagination, event_type filter, incident_id filter
- 24h stat cards (total / guardrail blocks / auto-repairs / resolved)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:57:40 +08:00
OG T
11fc2860cf refactor(web): extract ErrorsPanel, 3 tabs of /observability no longer have a double Layout
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:51:59 +08:00
OG T
22fa6ea413 refactor(web): extract ApmPanel, the monitoring+apm tabs of /observability have no double Layout
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:49:39 +08:00
OG T
f05a391d02 feat(web): panels/index.ts exports + Panel extraction progress markers
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:42:30 +08:00
OG T
770667eed4 refactor(web): extract MonitoringPanel, fixes the /observability double AppLayout
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 10:40:07 +08:00
OG T
c26c4030e4 feat(web): upgrade /topology to the full React Flow version (wired to the real dashboard API)
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
2026-04-09 09:49:31 +08:00