OG T
|
0dec007673
|
docs(logbook): 記錄 Sprint 3 P0 critical security fixes 完成
CD Pipeline / build-and-deploy (push) Successful in 11m37s
|
2026-04-07 11:10:48 +08:00 |
|
OG T
|
93bcfb4ce8
|
docs: 更新 LOGBOOK — Sprint 3 SSH_COMMAND 指揮權鏈完成
|
2026-04-06 14:48:11 +08:00 |
|
OG T
|
1a8021bfaa
|
docs(plans): Sprint 3 SSH_COMMAND 指揮權鏈實作計畫 (7 tasks)
|
2026-04-06 14:08:28 +08:00 |
|
OG T
|
b9ee58f752
|
fix(cd): 移除 parse_mode=HTML 避免 commit message 特殊字元造成 400 (non-fatal)
CD Pipeline / build-and-deploy (push) Successful in 13m15s
E2E Health Check / e2e-health (push) Successful in 36s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 22:32:02 +08:00 |
|
OG T
|
e51a68d309
|
docs(logbook): 記錄 Telegram/CD 顯示修復 + ADR-059 全部完成
|
2026-04-05 14:49:10 +08:00 |
|
OG T
|
35d37111f0
|
docs(logbook): ADR-059 全計劃執行完畢 (Task 1-9)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:47:05 +08:00 |
|
OG T
|
d9af8e1c7a
|
docs(logbook): ADR-059 Gitea Webhook 遷移完成記錄
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:45:02 +08:00 |
|
OG T
|
23364423fa
|
feat(webhook): ADR-059 GitHub → Gitea Webhook 遷移完成
- gitea_webhook.py: Header 全部改 X-Gitea-*,移除 workflow_run handler
- gitea_webhook_service.py: _fetch_pr_diff 改直接 httpx,不依賴 github_api_service
- 清除兩個檔案的所有殘留 github_ log key,review_id prefix 改 gitea-
- test_gitea_webhook.py: 10/10 通過,docstring 修正
- 03-secrets.yaml: 新增 GITEA_WEBHOOK_SECRET 佔位
- cd.yaml: 新增 GITEA_WEBHOOK_SECRET 注入步驟
- ADR-059: 建立架構決策文件
待統帥操作: Gitea Actions secret + Gitea UI Webhook 設定
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:44:32 +08:00 |
|
OG T
|
be60ec1507
|
docs(plan): ADR-059 Gitea Webhook 遷移實作計畫 (9 Tasks)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:22:29 +08:00 |
|
OG T
|
5cd67d372f
|
docs(spec): ADR-059 Gitea Webhook 遷移設計規格
從 GitHub Webhook (Phase 13.1) 遷移至 Gitea Webhook
最少改動策略:Header 常數替換,業務邏輯層不動
廢棄 workflow_run CI 診斷(CD pipeline 已有 TG 通知覆蓋)
整合首席架構師護欄:防禦性 payload 解析 + Content-Type 設定
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:17:13 +08:00 |
|
OG T
|
6937238174
|
docs(logbook): 記錄 Telegram 按鈕修復 + SRE 群組格式升級
|
2026-04-05 14:17:11 +08:00 |
|
OG T
|
a49faf7baa
|
docs: ADR-058 Host Auto-Repair SSH 白名單 + LOGBOOK 更新
首席架構師 Review 結果: 72→88/100
已修正: C1 C2 C3 M3 m1 m2
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 13:09:58 +08:00 |
|
OG T
|
25e2e45353
|
docs(logbook): Telegram 格式重設計 + 按鈕修復首席架構師 R1 通過記錄
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 13:08:13 +08:00 |
|
OG T
|
2a2a1fac8b
|
docs(logbook): Sprint 3 Host Auto-Repair 全閉環完成記錄
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 12:31:19 +08:00 |
|
OG T
|
66b12bf9eb
|
fix(infra): 根治 Harbor Exited(128) Race Condition + harbor-watchdog 常駐自愈
問題根因:
awoooi-startup-110.sh 在 Harbor 啟動時,第一次 compose up -d 會同時
啟動所有容器。harbor-core/db/portal 嘗試連 syslog:1514(harbor-log 未就緒),
失敗後 exit(128),restart:always 重試直到 backoff 放棄。
即使後來 harbor-log healthy,其他容器已不再重試。
修復 1 — startup-110.sh Harbor 時序(4 Phase 策略):
Phase 1: 清除所有 Exited Harbor 容器(打破 backoff 死鎖)
Phase 2: 只啟動 harbor-log
Phase 3: 等 harbor-log healthy(最多 90s)
Phase 4: 啟動全組件
修復 2 — harbor-watchdog.service(常駐自愈):
Type=simple 常駐進程,每 60s 輪詢 http://127.0.0.1:5000/v2/
不健康 → 等 5s 再確認 → 執行 Phase 1-4 完整修復
修復重開機時序問題無法覆蓋的「運行中崩潰」場景
Bug Fix:curl -f 會把 HTTP 401 視為失敗(exit 22),
Harbor /v2/ 正常回傳 401(需認證),改用 curl -s 不加 -f
REBOOT-RECOVERY-SOP.md → v5.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 12:13:21 +08:00 |
|
OG T
|
f51bf5a6a8
|
feat(backup): 全服務備份覆蓋 + 告警機制 — 9/9 服務完整
新增備份(已部署到 110,首次執行全部通過):
- backup-langfuse.sh: Langfuse AI 追蹤/評測 DB (7238 traces)
- backup-monitoring.sh: Prometheus + Grafana + Alertmanager volumes + configs
- backup-signoz.sh: SignOz ClickHouse + SQLite (分散式追蹤/日誌)
- backup-open-webui.sh: Open-WebUI LLM 對話紀錄 (SSH 188 volume)
- backup-clawbot.sh: ClawBot Redis 狀態/快取 (SSH 188 volume)
- backup-all.sh v3.0: 整合至 9/9 服務
告警機制:
- common.sh: notify_clawbot 改用 /webhook/custom 正確格式
- failed → severity:critical → Telegram 🔴 立即告警
- 告警測試通過:{"status":"ok","alert_id":"878c4c59..."}
GFS 保留:30日/12週/24月 (AWOOOI 額外 28h 高頻)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 11:12:42 +08:00 |
|
OG T
|
91564c6ea3
|
docs(sop): REBOOT-RECOVERY-SOP.md v4.0
更新:
- 加入 Sentry /opt/sentry 啟動說明 (110 Step 7/9)
- 新增 Sentry 重開機損壞修復章節 (PostgreSQL WAL/Redis RDB/ClickHouse parts)
- 告警沉默診斷樹補充「規則未部署」診斷 + deploy-alerts.sh 修復指令
- E2E 驗證腳本加入 Sentry + Prometheus 規則數驗證 (≥25)
- 架構圖補充 Sentry :9000
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 03:11:27 +08:00 |
|
OG T
|
0db9b41808
|
docs(plan): Observability + Auto-healing 完整實施計畫 (15 Tasks, 3 Sprints)
Sprint 1 (P0): Prometheus 統一告警規則 + Sentry 啟動 + CD 同步
Sprint 2 (P1): SigNoz 日誌告警 + Sentry SDK 標籤
Sprint 3 (P2): SSH HostRepairAgent 基礎設施
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 02:24:23 +08:00 |
|
OG T
|
de33abe0e3
|
docs(spec): 全系統自愈閉環設計規格 v1.0
整合三大問題的完整解決方案:
1. Prometheus 規則未部署 (13條→40+條,含SentryDown/AlertChain)
2. 日誌收集但無log-based alerting
3. 自動修復只限K8s層,無Host Docker/systemd修復能力
包含:
- 統一標籤規範 (layer/component/team/host)
- Sprint 1: 規則部署+Sentry啟動+CD同步
- Sprint 2: SigNoz log alert + Sentry整合
- Sprint 3: SSH HostRepairAgent + Playbooks
- SOP v4.0整合更新點
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 02:14:01 +08:00 |
|
OG T
|
2243a21b96
|
fix(ai-router): v4.3 NIM 保護 — timeout 不計 CB 失敗,每次先跑 NIM 才切 Gemini
CD Pipeline / build-and-deploy (push) Failing after 20s
需求: NIM 必須等到有回應才切換,不能因為慢就被 CB 封鎖走 Gemini
變更:
- Timeout exception 不累積 CB failure(只有真實連線錯誤才計)
- NIM CB: failure_threshold=10, recovery_timeout=30s(比預設寬鬆)
- 設計文件 v4.3: 更新方向二,移除錯誤假設
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:51:12 +08:00 |
|
OG T
|
8f64affbdb
|
docs(runbooks): REBOOT-RECOVERY-SOP v3.0 完整重開機自動化方案
## 內容
完整盤點所有主機、服務、工具、監控的:
- 啟動順序與依賴關係圖
- 正常重啟 vs 異常重啟處理流程
- 各主機詳細啟動序列 (188/110/120/121)
- 常見故障排查手冊 (告警沉默/CD失效/數據消失/NodePort)
- E2E 驗證腳本
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:48:29 +08:00 |
|
OG T
|
be3aa6069b
|
feat(backup): AWOOOI 高頻備份 — 每 6 小時備份 awoooi_prod
awoooi_prod 為核心生產 DB,每日一次最大損失 24 小時不可接受:
- backup-awoooi-frequent.sh:每 6 小時備份 awoooi_prod(08/14/20:00)
- 02:00 由 backup-all.sh 完整備份(含 dev/k3s)
- 合計 4次/天,最大數據損失 ≤ 6 小時
- GFS 保留:28h 高頻 + 30日 + 12週 + 24月
首次執行:✅ 680K,4s,snapshot db050dbc
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:14:50 +08:00 |
|
OG T
|
3136fc5ea0
|
feat(backup): 全面自動化備份 + AWOOOI DB + GFS 延長保留
首席架構師備份審計 — 全部自動化完成:
- backup-awoooi.sh:新增 AWOOOI PostgreSQL 備份腳本
- awoooi_prod (KB/事故/AutoRepair/Drift) + k3s_datastore
- 從 110 SSH 到 188 執行 pg_dump,整合進 restic
- 首次執行:680K,9s,snapshot 8750748f ✅
- backup-all.sh v2.0:整合第 4 個服務 AWOOOI DB
- GFS 保留策略延長:
- 每日 7→30 份(覆蓋最近 30 天)
- 每週 4→12 份(覆蓋最近 3 個月)
- 每月 6→24 份(覆蓋最近 2 年)
- BACKUP-STATUS.md:更新為全自動化狀態總覽
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:11:31 +08:00 |
|
OG T
|
84cfdb6195
|
docs(backup): 備份審計完整盤點 + 新增 AWOOOI DB 與 Gitea DB 備份腳本
首席架構師備份審計結論:
- awoooi_prod PostgreSQL:❌ 無備份 (P0 缺口)
- Gitea SQLite DB:❌ 無備份 (今日已損壞,人工修復耗時 2h+)
新增:
- scripts/backup/backup-awoooi-db.sh (188 部署,02:00 daily)
- scripts/backup/backup-gitea-db.sh (110 部署,01:00 daily)
- docs/runbooks/BACKUP-STATUS.md (全景表 + 部署步驟 + SOP)
- LOGBOOK.md 備份審計段落
待手動部署:統帥需 scp 腳本至 188/110 並設定 crontab
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:01:58 +08:00 |
|
OG T
|
45458e8f33
|
docs(adr): ADR-057 狀態更新為已批准並實作
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:44:31 +08:00 |
|
OG T
|
f4f454fd98
|
feat(api): 重開機後自動 warm-up Redis Working Memory from PostgreSQL
- main.py lifespan: 啟動時從 DB restore INVESTIGATING/MITIGATING incidents
- scripts/reboot-recovery: 188 + 110 自動化腳本 + systemd services
- scripts/reboot-recovery: aiops-network 自動建立 (ClawBot 依賴)
- docs/runbooks/REBOOT-RECOVERY-SOP.md: 完整改寫,含自動化腳本說明
Why: 重開機後 Redis 清空導致前端 incidents 顯示 0 筆(DB 完整保存)
統帥批准: 「所有數據必須被長久記錄下來」
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:39:20 +08:00 |
|
OG T
|
ddb75b69c5
|
docs(logbook): Phase 25 Review R2 通過 + ADR-054~057 記錄
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:25:31 +08:00 |
|
OG T
|
15c7f6fcd3
|
docs(adr): 起草 ADR-054/055/056/057 — Phase 25 三方向架構決策
ADR-054: DIAGNOSE Privacy-First Routing (已批准)
- _local_fallback_chain 設計決策
- NEMOTRON privacy_level=local 首席架構師裁示
- 全部 local 失敗 → REJECT + Telegram
ADR-055: Knowledge Auto-Harvesting (已批准)
- AUTO_RUNBOOK DRAFT + ANTI_PATTERN PUBLISHED 設計理由
- compute_hash() 碰撞風險說明
- Fire-and-forget GC 防護強制規範
ADR-056: Config Drift Detection 四層架構 (已批准)
- Detector→Analyzer→Interpreter→Remediator 職責邊界
- AI 只做意圖分析不做修復決策
- adopt() 暫停 + _recent_reports Phase 1 限制
ADR-057: adopt() Gitea PR API 實作路徑 (草案,待批准)
- 解決 API Pod git add -A 安全風險
- PR review 流程保障
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:24:50 +08:00 |
|
OG T
|
c4923b6908
|
docs(logbook): Phase 22.4 + Phase 25 全部驗證通過記錄
- Phase 22.4 tests 18/18 PASSED (b6e12f7)
- embed-all 7/7 prod 成功
- semantic-search E2E score=0.6867 驗證通過
- drift /scan E2E 正常回應
- drift-scanner CronJob 每小時執行
- dev/prod DB migration (symptoms_hash + enum) 完成
- 53 integration tests PASSED
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 18:00:33 +08:00 |
|
OG T
|
0c180dec86
|
docs(spec): 方向二實作修正記錄 — Nemotron privacy_level=cloud (P0)
|
2026-04-04 17:42:53 +08:00 |
|
OG T
|
0b41df45d6
|
docs(plans): 三方向實作計畫 P0/P1/P2
- P0: DIAGNOSE Privacy-First Routing(local chain 隔離 + REJECT 保護)
- P1: Knowledge Auto-Harvesting(Anti-Pattern 閉環 + Runbook 生成)
- P2: Config Drift Detection(GitOps 守門員 + Nemotron 意圖分析)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:31:36 +08:00 |
|
OG T
|
035cb9cd0d
|
docs(spec): Nemotron 主動防禦三方向設計文件
- 方向一:Knowledge Auto-Harvesting(Anti-Pattern 閉環 + Runbook 自動生成)
- 方向二:DIAGNOSE Privacy-First Routing(Local-Only Fallback Chain)
- 方向三:Config Drift Detection(GitOps 守門員 + Nemotron 意圖分析)
首席架構師 ogt 100% 技術背書
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:18:11 +08:00 |
|
OG T
|
369413f87d
|
docs: 更新 LOGBOOK KB Phase 2 全修完成 + 5 tests PASSED
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:55:40 +08:00 |
|
OG T
|
69a9218723
|
docs: 更新 LOGBOOK KB Phase 2 + 首席架構師 Review 紀錄
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:49:31 +08:00 |
|
OG T
|
cddc4cb1fc
|
fix(knowledge): 首席架構師 Review 修復 C1+C2+I1+I2 (71→~88/100)
CD Pipeline / build-and-deploy (push) Successful in 7m16s
C1: IKnowledgeRepository Protocol 補齊 save_embedding + semantic_search +
list_unembedded_entries,恢復 Interface 先行保護層
C2: embed_all_entries Service 層 raw SQL 移至 Repository.list_unembedded_entries()
Service 改透過 Protocol 呼叫,符合 leWOOOgo 積木化原則
I1: asyncio.create_task 加入 _pending_tasks set 持有引用,防 GC 回收與
Shutdown 時 Task 遺失;task done 後自動 discard
I2: OllamaEmbeddingService 從每次 new 改為 KnowledgeService.__init__ 注入,
單一實例重用
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:22:38 +08:00 |
|
OG T
|
15aabd6ac5
|
fix(chat+nim): 修復首席架構師 Review I1-I4 + S3 四項重要問題
CD Pipeline / build-and-deploy (push) Successful in 7m9s
I1: chat_manager._call_openclaw timeout=30.0 → 讀 settings.OPENCLAW_TIMEOUT
I2: nvidia_provider.py stale comment "45" → "55" 對齊 ConfigMap
I3: asyncio.shield 移除 — shield 超時後 task 繼續跑但無人等待 (silent leak)
I4: ChatManager.__init__ 移除 repo 實例 (leWOOOgo 禁 Service 持有 repository)
S3: _check_nemotron_health probe 10s → 25s + /v1/models 輕量端點
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 16:36:16 +08:00 |
|
OG T
|
ce945fe89e
|
rule(cost): 🔴🔴🔴 費用變更強制審批 — HARD_RULES v1.8 + CLAUDE.md
統帥指示 2026-04-03:
所有涉及費用產生的變更必須停下來等統帥明確批准後才可執行
新增:
- HARD_RULES.md v1.8: Cost Change Approval 章節
- 定義涉費變更範圍
- 強制流程: 識別→停→說明→等批准→執行
- 今日違規教訓記錄
- CLAUDE.md 任務前必讀新增費用變更條目
Memory 已同步:
- feedback_cost_change_approval.md (新建)
- feedback_constitution_v2.md 第五章
- MEMORY.md 索引最高鐵律區
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 15:36:47 +08:00 |
|
OG T
|
dc232ebb49
|
docs: LOGBOOK 更新 — KB Phase 1 + monitoring + I1/I3 完成
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 13:22:54 +08:00 |
|
OG T
|
0b83707697
|
feat(web): APM/Apps/Deployments/Tickets 頁面升級 — 串接真實 API 數據
CD Pipeline / build-and-deploy (push) Has been cancelled
- apm/page.tsx: Golden Signals 真實數據 (SignOz ClickHouse)
- apps/page.tsx: 主機服務狀態 (/api/v1/dashboard 真實數據)
- deployments/page.tsx: K8s 部署狀態串接
- tickets/page.tsx: Incidents 列表串接
- i18n: apm/apps/deployments/tickets namespace 雙語補齊
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 00:08:11 +08:00 |
|
OG T
|
2d5f1a71ad
|
chore(observability): ClickHouse TTL 設定完成 — Phase O 全驗收
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
signoz_logs: 30天 (已內建 _retention_days DEFAULT 30)
signoz_metrics 8個表: 233280000s(2700天) → 7776000s(90天)
- samples_v4, samples_v4_agg_5m, samples_v4_agg_30m
- exp_hist, time_series_v4, time_series_v4_6hrs
- time_series_v4_1day, time_series_v4_1week
Phase O 驗收清單全部打勾 ✅
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 21:38:39 +08:00 |
|
OG T
|
08f73dfce8
|
docs: Phase O-5 Wave 5.4 告警鏈路 E2E 驗證 Runbook
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- 架構圖、手動測試步驟、smoke test 清單
- generate_monitoring.py 用法說明
- 已知問題豁免清單、回滾指令
- 首次驗收記錄 2026-04-02 8/8
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:34:43 +08:00 |
|
OG T
|
48c65756da
|
chore(config): USE_AI_ROUTER=true 寫入 ConfigMap (Phase 24 B2)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
防止下次 CD deploy 覆蓋 kubectl set env 的設定。
B2 觀察期 48h, 截止 2026-04-04 18:40 台北時間。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-02 21:26:53 +08:00 |
|
OG T
|
3f339110dd
|
fix(observability): 同步 .188 實際部署調整至 repo
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
與原始計畫的差異:
1. MinIO Bearer Token 認證
- 原計畫: MINIO_PROMETHEUS_AUTH_TYPE=public (此版本不支援)
- 實際: mc admin prometheus generate 產生 Bearer Token
- 更新: prometheus-config-phase-o.yaml 加入 bearer_token
2. remote_write 廢棄 → OTEL Collector Prometheus scrape
- 原計畫: Prometheus remote_write → SigNoz OTEL /api/v1/write
- 實際: SigNoz OTEL Collector 不支援 Prometheus remote_write 格式 (404)
- 改用: OTEL Collector prometheus receiver 直接 scrape node-exporter + kube-state-metrics
- 新增: ops/signoz/otel-collector-config-phase-o.yaml (版本控管副本)
3. ADR-053 驗收清單更新為實際結果
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 21:23:47 +08:00 |
|
OG T
|
3e4612f259
|
docs(observability): ADR-053 SigNoz 統一架構 + Phase O 驗收
CD Pipeline / build-and-deploy (push) Failing after 36s
E2E Health Check / e2e-health (push) Successful in 16s
- 新增 ADR-053: 可觀測性統一架構決策記錄
- 更新 service-registry.yaml: 補齊 MinIO/Kali 監控入口
- 更新 LOGBOOK: Phase O 完成狀態
Phase O 驗收清單:
✅ kubectl Mac 本機免密碼
✅ OTEL Collector 2 Pod Running
✅ Event Exporter 1 Pod Running
✅ Descheduler CronJob Completed
✅ MinIO + Kali 告警規則
✅ Alert Chain Smoke Test
✅ CD Pipeline 整合
⏳ ClickHouse TTL / remote_write / SigNoz rules (待 .188 手動)
Co-Authored-By: Claude Code <noreply@anthropic.com>
|
2026-04-02 18:26:57 +08:00 |
|
OG T
|
51961b9f03
|
docs: Phase O 可觀測性終極補完計畫設計規格
SigNoz 統一派架構,解決 6 大盲區 (Event/Log/Metrics/Descheduler/kubectl/MinIO-Kali)
+ Monitoring Master Plan Wave A-D 收尾
+ 5 個首席架構師 Review 節點
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:45:23 +08:00 |
|
OG T
|
73e8f8ab77
|
feat(ai): Phase 24-A+B1 — AI Provider Registry + 絞殺者包裝 (ADR-052)
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Brain Layer 雙軌 Registry 架構:
- 新建 src/services/ai_providers/ 目錄 (interfaces + 4 providers)
- OllamaProvider (local, rca/chat/code_review)
- GeminiProvider (cloud, rca/chat)
- ClaudeProvider (cloud, rca/chat/code_review)
- OpenClawNemoProvider (cloud, rca — 委派 188→NIM)
- 擴展 ai_router.py 加入:
- AIProviderRegistry (動態註冊/啟停)
- AIRouterExecutor (Cache + 閘門 CB/RL/Sem + 執行)
- openclaw.py 絞殺者包裝: USE_AI_ROUTER=true 走新路徑
- config.py + ConfigMap 加入 USE_AI_ROUTER=false (安全預設)
- ADR-052 正式文件 (14 項決策 D1-D14)
- HARD_RULES v1.7 加入 AI Router 規範
安全: USE_AI_ROUTER=false 預設不啟用,需手動開啟觀察
回滾: kubectl set env deployment/awoooi-api USE_AI_ROUTER=false
2026-04-02 ogt: Phase 24 首批實作
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:16:09 +08:00 |
|
OG T
|
db2a2852b8
|
docs: 前端重構驗收報告 87/100
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Playwright 瀏覽器截圖 + KB API 端點測試 + Console 分析
- 24/24 路由零 404
- 7 完整頁面 + 15 ComingSoon
- KB API 7 端點全部正常
- 1 Low bug (archived entry still accessible via GET)
- Metrics Strip [object Object] 待修
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 10:20:27 +08:00 |
|
OG T
|
25889d4b8e
|
docs: 歸檔 ADR-050 reanalyze 實作計畫 (已完成)
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:38:03 +08:00 |
|
OG T
|
5959855a71
|
feat(web): 字體系統升級 + NemoClaw SVG 還原 + Knowledge Base 設計文件
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- 字體:Syne (標題) + DM Mono (內文) + VT323 (品牌點陣),替換 Inter
- Tailwind: fontFamily 更新 + 5 層文字色彩 token (primary→disabled)
- Sidebar: NemoClaw 白瓷龍蝦爪 SVG + AWOOOI 用 VT323 放大
- OpenClaw Panel: 還原 NemoClaw 3D 白瓷龍蝦爪 (替換 NemoNodeAnimation)
- Knowledge Base 設計文件 (B分離/A K8s Job/Phase1跳過向量搜尋)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 00:48:42 +08:00 |
|
OG T
|
8845377a6d
|
docs: 更新 AI中心重設計規格 (廢棄元件 + 授權邏輯記錄)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 22:28:32 +08:00 |
|