Your Name
|
04ff22563e
|
fix(aiops-p1): fix all 5 breakpoints in the Playbook learning loop + DB migration (ADR-092 B4)
run-migration / migrate (push) Failing after 14s
CD Pipeline / build-and-deploy (push) Failing after 2m7s
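[P0.4 patch] pre_decision_investigator: missing Prometheus query field
- _build_tool_params() now adds the "query" field (a required parameter of the prometheus_query tool)
- Add _build_prometheus_query(), which generates PromQL by alert type (CPU/Memory/Crash/Disk/HTTP/Pod/fallback)
- After the fix, the D3_METRICS sensing dimension actually retrieves data (previously 100% of calls returned missing_query_parameter)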
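A minimal sketch of the alert-type to PromQL mapping described in this patch. The function names echo the commit, but the keyword table, the `namespace` parameter, and the chosen metric names (common cAdvisor / kube-state-metrics / node_exporter names; `http_requests_total` is application-specific) are assumptions for illustration, not the repository's actual code.

```python
# Hypothetical keyword -> PromQL template table (an assumption, not the repo's code).
# Order matters: "crash" is checked before "pod" so PodCrashLooping maps to restarts.
_PROMQL_TEMPLATES = [
    ("cpu", 'sum(rate(container_cpu_usage_seconds_total{{namespace="{ns}"}}[5m]))'),
    ("memory", 'sum(container_memory_working_set_bytes{{namespace="{ns}"}})'),
    ("crash", 'sum(increase(kube_pod_container_status_restarts_total{{namespace="{ns}"}}[15m]))'),
    ("disk", "min(node_filesystem_avail_bytes / node_filesystem_size_bytes)"),
    ("http", 'sum(rate(http_requests_total{{namespace="{ns}"}}[5m]))'),
    ("pod", 'sum(kube_pod_status_phase{{namespace="{ns}", phase!="Running"}})'),
]
_FALLBACK_QUERY = "up"  # generic liveness query when no keyword matches


def build_prometheus_query(alert_name: str, namespace: str = "default") -> str:
    """Pick a PromQL expression based on keywords found in the alert name."""
    name = alert_name.lower()
    for keyword, template in _PROMQL_TEMPLATES:
        if keyword in name:
            return template.format(ns=namespace)
    return _FALLBACK_QUERY


def build_tool_params(alert_name: str, namespace: str = "default") -> dict:
    """Always include the required "query" key so the tool call is well-formed."""
    return {"query": build_prometheus_query(alert_name, namespace)}
```

The point of the fallback entry is that the tool call never goes out without a `query` value, which is the failure mode the patch describes.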
[P1 Playbook learning loop: B1-B5 all fixed]
- B2 db/models.py: ApprovalRecord gains a matched_playbook_id column + ix_approval_matched_playbook index
- B2 db/models.py: TimelineEvent gains an incident_id column (for MCP auditing) + index
- B3 approval_db.py: the record→ApprovalRequest conversion restores incident_id + matched_playbook_id
- B4 approval_repository.py: same as B3 (the two conversion functions must stay in sync)
- B5 approval_db.py: approval_request_to_record_data adds matched_playbook_id so the DB can persist the value
[P1.5 KM write] approval_execution.py: fire-and-forget → await wait_for(30s)
- Root cause: the asyncio.create_task task was killed on Pod recycle, so KM writes were silently lost
- Fix: await asyncio.wait_for(..., timeout=30.0) + log on TimeoutError
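The change from fire-and-forget to a bounded await can be sketched as follows. `write_km_entry` and `run_with_timeout` are hypothetical names used only for illustration; the real call sites live in approval_execution.py and are not shown in this log.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def write_km_entry(payload: dict) -> str:
    """Hypothetical stand-in for the real KM write."""
    await asyncio.sleep(0.01)
    return "ok"


async def run_with_timeout(coro, timeout: float = 30.0):
    """Await the coroutine instead of detaching it with create_task.

    A detached task can be lost if the process is stopped before it finishes;
    awaiting keeps the write inside the request's lifetime. On timeout we log
    and return None so the caller's main flow keeps going.
    """
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        logger.warning("KM write timed out after %.1fs", timeout)
        return None
```

The trade-off is that the caller now waits up to the timeout for the write, in exchange for no longer losing it silently.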
[Migration file] adr092_p1_learning_chain_fix.sql
- ALTER TABLE approval_records ADD COLUMN matched_playbook_id VARCHAR(36)
- ALTER TABLE timeline_events ADD COLUMN incident_id VARCHAR(64)
- Run: psql $DATABASE_URL -f apps/api/migrations/adr092_p1_learning_chain_fix.sql
[Incidental Agent changes]
- decision_manager: Phase 2 YAML NO_ACTION priority gate (host-level/external-service alerts skip Agent Debate)
- alert_rules.yaml: new rules for Sentry/ClickHouse + HostDiskUsageHigh/Critical
- solver_agent: semantic-synthesis fallback for action_title (replaces silent dropping)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-24 15:41:35 +08:00 |
|
Your Name
|
994817a23a
|
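docs: ADR-092 Appendix A+B + LOGBOOK + MASTER §8 record the four fixes and the C1-C4 end-to-end wiring
- ADR-092: Appendix A (B1-B4 four fixes: root cause + commit) + Appendix B (C1-C4 breakpoint fix table + architecture iron rules)
- LOGBOOK: add the 2026-04-20 evening C1-C4 section (breakpoint list + commits + acceptance steps)
- MASTER §8: append the C1-C4 changelog (§3/§1.1 alignment + post-fix behavior notes)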
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-20 20:24:41 +08:00 |
|
Your Name
|
39ac292c90
|
docs(master): §8 appends the ADR-092 four-fix record + project_current_status update
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-20 20:01:50 +08:00 |
|
Your Name
|
803b389f6b
|
security(secrets): replace the real TG bot token in test fixtures with a fake value
run-migration / migrate (push) Failing after 20s
CD Pipeline / build-and-deploy (push) Successful in 9m10s
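## Incident
The aider-watch v1 session wrote the real production TG bot token (NEMOTRON_BOT_TOKEN)
into the following tracked files as a test fixture (all already pushed to Gitea):
- apps/api/tests/test_secret_redactor.py
- docs/superpowers/plans/2026-04-19-aider-watch.md (3 occurrences)
- docs/superpowers/plans/2026-04-20-aider-watch-v2.md
This violates the L2 zero-trust rule in feedback_secrets_leak_incidents_2026-04-18.md (no secrets in source control).
## Response
- Commander's decision: do not revoke the token (risk accepted)
- Replaced with the fake value 111222333:A*35 (an obvious placeholder that still matches the redactor's detection format)
- Reduces future exposure via search engines / forks (but the token remains in git history)
## Verification
All 8 tests for secret_redactor.py pass; the telegram regex still recognizes the new fake-value format.
## P1 backlog
- git history cleanup (git filter-repo) requires the Commander's approval for a force push
- pre-commit hook to prevent future leaks (grep for the TG token format / detect-secrets)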
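To make the "detection format" concrete, here is a small sketch of a Telegram-token redactor. The regex shape (8 to 10 digits, a colon, then 35 URL-safe characters) is an assumption inferred from the placeholder 111222333:A*35 mentioned above; it is not the pattern used by the project's secret_redactor.py.

```python
import re

# Assumed token shape: <8-10 digit bot id>:<35 chars from [A-Za-z0-9_-]>.
# The lookbehind avoids matching the tail of a longer digit run.
_TG_TOKEN_RE = re.compile(r"(?<!\d)\d{8,10}:[A-Za-z0-9_-]{35}")
_PLACEHOLDER = "[REDACTED_TG_TOKEN]"


def redact_telegram_tokens(text: str) -> str:
    """Replace anything that looks like a Telegram bot token."""
    return _TG_TOKEN_RE.sub(_PLACEHOLDER, text)


def contains_telegram_token(text: str) -> bool:
    """Usable as a simple check in a pre-commit style scan."""
    return _TG_TOKEN_RE.search(text) is not None
```

The same `contains_telegram_token` check could back a pre-commit scan over staged files, though a dedicated tool such as detect-secrets covers far more patterns.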
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-20 04:23:09 +08:00 |
|
Your Name
|
8d40bbff2b
|
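docs(aider-watch v2): fill 4 big-picture blind spots
The Commander's 2026-04-20 reminder, "never lose the big picture on any update", prompted a second check before execution.
It found 4 blind spots the plan had not handled; they are now filled in:
Blind spot 1: Mac reachability from the external network
- spec §8 + §8b add a choice of one among Tailscale/nginx/VPN
- plan Task B5: install.sh prompts for this configuration choice as a prerequisite
Blind spot 2: incident flooding (multiple errors in one session)
- spec §8 adds a coalesce strategy (60s window per session_id)
- plan Task A5: the service's create_incident_for_event implementation adds coalesce logic
- Add 2 test cases verifying reuse within the same session + separation across sessions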
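The coalesce strategy for blind spot 2 might look roughly like this. `IncidentCoalescer` and its in-memory dictionary are hypothetical; the commit does not say where the window state is kept or whether the window is fixed or sliding, so this sketch assumes a fixed 60-second window anchored at the first error of a session.

```python
import time
import uuid
from typing import Callable, Dict, Tuple


class IncidentCoalescer:
    """Reuse one incident per session_id within a fixed time window."""

    def __init__(self, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window_seconds
        self._clock = clock
        # session_id -> (incident_id, time the incident was opened)
        self._open: Dict[str, Tuple[str, float]] = {}

    def incident_for(self, session_id: str) -> str:
        now = self._clock()
        entry = self._open.get(session_id)
        if entry is not None and now - entry[1] < self._window:
            return entry[0]  # still inside the window: reuse the incident
        incident_id = uuid.uuid4().hex
        self._open[session_id] = (incident_id, now)
        return incident_id
```

Injecting the clock keeps the two test cases named in the plan (same-session reuse, cross-session separation) deterministic.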
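Blind spot 3: first-rollout risk of AI Router feedback
- spec §8 adds the USE_AIDER_FEEDBACK flag, default false, to be enabled after a 7-day canary
- plan Task A8: the route() hook is wrapped in an if settings.USE_AIDER_FEEDBACK block
- plan Task A9: config adds USE_AIDER_FEEDBACK: bool = False
Blind spot 4: obtaining the AWOOOI_PG_PW secret
- spec §8c adds the kubectl get secret → env → shred flow
- plan Task A0 Step 1 spells out reading the K8s Secret + destroying the file immediately
Complies with the big-picture thinking discipline in feedback_ai_autonomous_direction.md.
Execution strategy: fully subagent-driven (approved by the Commander).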
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-20 04:04:13 +08:00 |
|
Your Name
|
345e6832da
|
docs(aider-watch): v2 implementation plan — 18 tasks across server/client/E2E
Corresponds to the v2 spec 2026-04-20-aider-watch-v2-design.md:
Phase A (server, 10 tasks, TDD):
A0 HMAC secret + env setup
A1 adr091 migration
A2 secret_redactor util
A3 Pydantic AiderEventIn/AiderBatchIn
A4 AiderEventRepository
A5 aider_event_service (classify/incident/pattern)
A6 API webhook HMAC-verified
A7 Redis stream consumer job + daily pattern extract
A8 ai_router feedback_from_aider_events hook
A9 config settings + main.py lifespan register
Phase B (Mac client, 5 tasks):
B1 scaffolding (parsers/config/redactor moved over from v1)
B2 api_client HMAC + retry
B3 JSONL buffer + flush
B4 aiderw wrapper + cli
B5 install.sh + launchd plist
Phase C (E2E, 3 tasks):
C1 happy path Mac → awoooi
C2 degradation + buffer flush
C3 AI Router feedback verification (fixture-driven)
Self-review: 100% spec coverage, no placeholders, types consistent.
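Tasks A0, A6 and B2 revolve around an HMAC-signed webhook. A minimal sketch of the signing and verification step, using only the standard library, is shown below. The choice of SHA-256, the hex encoding, and the function names are assumptions; the plan only states that the webhook is HMAC-verified.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Client side: compute a hex HMAC-SHA256 signature over the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Server side: recompute and compare in constant time."""
    expected = sign_body(secret, body)
    # Compare as bytes so unexpected non-ASCII input cannot raise TypeError.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_hex.encode("utf-8"))
```

The signature has to be computed over the exact bytes that were sent, so a server would typically verify before parsing the JSON body.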
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-20 04:04:13 +08:00 |
|
Your Name
|
8ce8efad29
|
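docs(aider-watch): v2 design draft, fully integrated with the awoooi AI autonomy flywheel
The Commander's 2026-04-20 directive was "route C + bot A". The v1 standalone personal-tool route was
disconnected from the big picture of the awoooi MASTER blueprint and violated the feedback_ai_autonomous_direction
north star (pure recording, not autonomy). v2 realigns:
- DB: goes into the main PG, with a new aider_events table in migration adr091
- Telegram: goes through the existing telegram_gateway @tsenyangbot + Redis dedup
- Incident: aider errors automatically create an incident that follows the existing alert chain
- AI learning loop: symptom_pattern extraction + AI Router feedback hook
- Mac client: thin shell doing HTTP POST + local JSONL fallback buffer
Fate of v1 artifacts: events.py/redactor.py move into awoooi; the rest are discarded.
@NemoTronAwoooI_Bot is repurposed for sandbox use, not deleted.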
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-20 04:04:13 +08:00 |
|
Your Name
|
55486ce2fd
|
docs: aider-watch implementation plan (15 tasks, TDD + frequent commits)
Full breakdown of §1-§7 of the spec 2026-04-19-aider-watch-design.md:
scaffold → events schema → redactor → config → tg format/send → PG DDL
→ storage → parsers → wrapper → CLI → reporter → launchd → install → E2E。
Each task includes TDD steps (test first → verify failure → implement → verify pass → commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-19 22:42:41 +08:00 |
|
Your Name
|
8603bce23b
|
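docs: aider-watch design draft (final §1-§7 approved by the Commander)
End-to-end monitoring system for the aider CLI: a Python wrapper intercepts aider stdout + chat history
→ real-time Telegram DM push (session start/end/file edit/error/commit/silent
timeout) + cumulative storage in PG 192.168.0.188/aider_watch + launchd daily reports at 23:50 and
weekly reports on Sundays at 22:00.
Graceful degradation: when PG is unreachable, fall back to a local JSONL buffer + 5min
flush job; Telegram 429 uses exponential backoff without blocking aider; secret patterns are masked automatically.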
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-19 22:39:40 +08:00 |
|
OG T
|
0670fe4d76
|
docs(master): §8 appends the Phase 7 Round 3 Telegram subsystem fix record
Round 3 Changelog entry:
- Inventory of 9 bugs + list of 5 commits
- git tag v7.3.0
- Handover guide for the next Session
Early hours of 2026-04-19: ogt + Claude Opus 4.7
|
2026-04-19 01:32:52 +08:00 |
|
OG T
|
5ae82d1d1f
|
feat(db): ADR-090 L4 AIOps foundation: asset inventory × 7-item automation coverage matrix persisted to DB
CD Pipeline / build-and-deploy (push) Has been cancelled
Afternoon of 2026-04-18 (Taipei time): ogt + Claude Opus 4.7 (1M)
The RCA of the MoWoooWorkDown false alarm exposed a threefold structural failure:
- Hosts 110/188 at load 18/16 for 13 days / cadvisor at 288% / K3s 120/121 unmonitored
- Prometheus has only 35 targets / 58 rules (coverage below 30%)
- HostHighCpuLoad measures the wrong dimension (CPU idle vs load_avg)
Commander's strategic directive:
- Full-panorama assets × seven automations × persistence in DB
- Four-way AI division of labor (OpenClaw × NemoTron × Hermes × Claude LLM)
- All automation operation history must go into the DB rather than rely on MD (MD drifts)
This commit delivers:
1. SQL migration (apps/api/migrations/adr090_asset_inventory_foundation.sql)
- 11 tables + 33 indexes + 20 CHECK + 3 UNIQUE + 16 FK
- pgcrypto extension dependency
- Fully idempotent (CREATE IF NOT EXISTS + single transaction)
- Already applied to awoooi_prod (PG on 188), verification passed
2. ADR-090 (docs/adr/ADR-090-monitoring-blindspot-governance.md)
- Decision record + mapping to the 7 engines + 4 rejected alternatives
3. Main strategy document (docs/superpowers/specs/2026-04-18-blindspot-governance-capacity-l4.md)
- §0-§14: background / root cause / Schema DDL / 4-layer defense / 7-Phase implementation /
HARD_RULES / AI division-of-labor matrix / acceptance metrics / technical debt / rollback / handover protocol
4. MASTER §8 Living Changelog appends the Phase 7 kickoff entry
The 11 tables:
asset_inventory / asset_discovery_run / asset_coverage_snapshot /
asset_relationship / alert_rule_catalog / asset_change_event /
asset_compliance_snapshot / host_capacity_snapshot /
capacity_violation_event / automation_operation_log /
ai_collaboration_trace
The first bootstrap record has been seeded into asset_discovery_run
(run_id=6760c5bf-57e5-4a40-b82d-31b794464652)
Related Memory (not committed, stored under ~/.claude/...):
- project_blindspot_governance.md (cross-session pointer)
- feedback_monitor_self_monitoring.md (monitoring tools must themselves be monitored)
- feedback_secrets_leak_incidents_2026-04-18.md (three lines of defense against credential leaks)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-18 13:18:46 +08:00 |
|
OG T
|
e465ee1936
|
docs(Phase 3): Evolver drill complete ✅, exit condition #6 passed
- MASTER spec §3/§7/§8: Evolver drill checked off as complete in three places
- LOGBOOK: drill results recorded + next step updated to 7 days of production monitoring
Drill result: POST /api/v1/learning/evolver/run → HTTP 200 errors:[] 2026-04-15
ADR-083 Phase 3, 2026-04-15, ogt + Claude Sonnet 4.6 (APAC)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-15 21:24:33 +08:00 |
|
OG T
|
4718c7667c
|
feat(Phase 3): Evolver loop scheduling + manual trigger endpoint: combined drill gate complete
CD Pipeline / build-and-deploy (push) Has been cancelled
- playbook_evolver.py: add run_evolver_loop() (24h infinite loop)
- main.py: mount run_evolver_loop via asyncio.create_task
- api/v1/learning.py: POST /api/v1/learning/evolver/run (Phase 3 exit #6 drill endpoint)
- MASTER §8: backfill 66c4eda AgentSession + the full Evolver exit-condition checklist for this change
ADR-083 Phase 3, 2026-04-15, ogt + Claude Sonnet 4.6 (APAC)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-15 21:07:56 +08:00 |
|
OG T
|
fb1bbd0e20
|
feat(Phase 3): learning loop completion: Root cause 3 + diagnosis feedback + knowledge forgetting + fine-tune pipeline
CD Pipeline / build-and-deploy (push) Has been cancelled
- approval_execution.py: _run_post_execution_verify() now calls record_verification_result()
Root cause 3 closed: environment verification results (success/degraded/failed/timeout) are no longer orphaned
- learning_service.py: add record_verification_result(): verification result → Redis + Playbook EWMA
- learning_service.py: add record_diagnosis_outcome(): writes back negative signals for misdiagnoses (L3×D4)
- jobs/knowledge_decay_job.py: new 30d knowledge-forgetting Job (unreferenced draft/review → archived)
- services/finetune_exporter.py: new weekly JSONL export (EvidenceSnapshot × AgentSession)
- main.py: mount knowledge_decay_loop (24h) + finetune_export_loop (7d)
- MASTER §8: record that all Phase 3 core rework items have landed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-15 20:57:43 +08:00 |
|
OG T
|
05b774386b
|
feat(Phase 6): AI SLO REST API: GET /api/v1/ai/slo wraps up the phase
CD Pipeline / build-and-deploy (push) Has been cancelled
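ADR-087 Phase 6, the last piece of the self-governance loop:
1. api/v1/ai_slo.py: GET /api/v1/ai/slo
- Service layer is cache-first (TTL 5min, AiSloCalculator.get_cached_report)
- force_refresh=true forces recomputation (AiSloCalculator.run)
- Router layer has zero direct Redis access (leWOOOgo modularization iron rule)
2. main.py: mount the route ai_slo_v1.router (prefix=/api/v1)
3. MASTER §8 Living Changelog appends:
- Full RCA record of the 3 root causes of the P0 alert silence
- Summary of the P2 flywheel broken-chain fix
- Checklist of all completed Phase 6 components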
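The cache-first behavior with a 5-minute TTL and a force_refresh override can be sketched as below. `CachedReport` is a hypothetical in-process stand-in; the commit indicates the real cache sits behind the service layer (with Redis kept out of the router), which this sketch does not reproduce.

```python
import time
from typing import Any, Callable, Optional


class CachedReport:
    """Return a cached value while it is fresh; recompute on expiry or on demand."""

    def __init__(self, compute: Callable[[], Any], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._stored_at: Optional[float] = None

    def get(self, force_refresh: bool = False) -> Any:
        now = self._clock()
        fresh = self._stored_at is not None and now - self._stored_at < self._ttl
        if force_refresh or not fresh:
            self._value = self._compute()  # the expensive recalculation
            self._stored_at = now
        return self._value
```

A route handler would then only call `get(force_refresh=...)` and never touch the cache backend itself, which matches the layering rule stated in the commit.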
Phase 6 exit conditions: 5/6 met (production verification pending image rollout)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-15 19:57:26 +08:00 |
|
OG T
|
7da64eaad2
|
feat(Phase 3): learning loop rebuild: three root-cause fixes + 2x EWMA + Evolver Agent
CD Pipeline / build-and-deploy (push) Failing after 19m7s
Type Sync Check / check-type-sync (push) Failing after 1m18s
ADR-083 Phase 3 learning loop rebuild:
**Three root-cause fixes**
- approval_execution.py: fire-and-forget create_task → await asyncio.wait_for(timeout=30) × 2
(success path L265 + failure path L353; timeouts record the learning_trigger_timeout metric and the main flow does not crash)
- models/approval.py: ApprovalRequestBase gains a matched_playbook_id field
- decision_manager.py: _auto_execute populates matched_playbook_id when building the ApprovalRequest
- learning_service.py: dual-path lookup for _matched_pb_id (matched_playbook_id + metadata fallback)
**2x EWMA negative reinforcement**
- models/playbook.py: add trust_score: float = 0.3 (EWMA dynamic trust field)
- repositories/playbook_repository.py: update_stats adds EWMA
Success: trust = 0.9 × old + 0.1 × 1.0
Failure: trust = 0.8 × old + 0.2 × 0.0 (decays 2x as fast)
trust < 0.1 → log a warning and wait for the Evolver to archive
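The two EWMA formulas above can be written as one small function. `update_trust` and `failures_until_below` are hypothetical helper names; the coefficients (0.1 on success, 0.2 on failure), the 0.3 starting score, and the 0.1 warning threshold come from the commit message.

```python
def update_trust(old: float, success: bool) -> float:
    """EWMA update: failures use a smoothing factor twice as large as successes."""
    alpha = 0.1 if success else 0.2
    target = 1.0 if success else 0.0
    return (1.0 - alpha) * old + alpha * target


def failures_until_below(start: float, threshold: float = 0.1) -> int:
    """Count consecutive failures needed to push trust under the threshold."""
    trust, count = start, 0
    while trust >= threshold:
        trust = update_trust(trust, success=False)
        count += 1
    return count
```

Starting from the default 0.3, one success gives 0.37 and one failure gives 0.24; five consecutive failures (0.24, 0.192, 0.1536, 0.12288, 0.098304) take the score under 0.1.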
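**Evolver Agent (new)**
- services/playbook_evolver.py: three functions, all static rules
1. Low-trust archiving: trust < 0.1 → DEPRECATED
2. Dormancy archiving: unused for 30d AND trust < 0.5 → DEPRECATED
3. Similarity merge: symptom Jaccard > 0.9 → keep the higher-trust one, archive the lower-trust one
AIOPS_P3_EVOLVER_ENABLED=False, off by default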
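The three static rules can be sketched as pure functions. The names and the set-of-strings representation of symptoms are assumptions; the thresholds (0.1, 30 days with 0.5, Jaccard 0.9) are taken from the list above. Whether "unused for 30d" means at least 30 or more than 30 days is not stated, so `>=` here is a guess.

```python
from typing import Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets are treated as 0.0."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def should_deprecate(trust: float, days_unused: int) -> bool:
    """Rule 1 (low trust) or rule 2 (dormant and below medium trust)."""
    return trust < 0.1 or (days_unused >= 30 and trust < 0.5)


def merge_decision(trust_a: float, symptoms_a: Set[str],
                   trust_b: float, symptoms_b: Set[str]) -> Tuple[bool, str]:
    """Rule 3: if symptoms overlap above 0.9, keep the higher-trust playbook.

    Returns (should_merge, which_to_keep) where which_to_keep is "a" or "b".
    """
    if jaccard(symptoms_a, symptoms_b) > 0.9:
        return True, "a" if trust_a >= trust_b else "b"
    return False, ""
```

Because the rules are static and side-effect free, a dry run over all playbooks can be reviewed before the AIOPS_P3_EVOLVER_ENABLED flag is turned on.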
**Docs**
- ADR-083 learning loop rebuild
- MASTER §8 Phase 3 completion record
AIOPS_P3_ENABLED=False (default); the skeleton is in place, awaiting the Commander's approval to enable
Co-Authored-By: Claude Sonnet 4.6 (APAC) <noreply@anthropic.com>
|
2026-04-15 14:01:37 +08:00 |
|
OG T
|
db9e304a14
|
feat(adr-080): Phase 0 guardrails established: AI autonomy flywheel kickoff
- docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
(1456 lines, §0-§8 fully filled in: 42-cell tactical matrix, 7-Phase plan, 7 ADR summaries,
15 KPIs, 21 Feature Flags, 10 risk scenarios)
- docs/adr/ADR-080-ai-autonomy-flywheel-overview.md
(7-Phase structure + 4 north stars + 7 architect Review Gates + Phase exit conditions)
- apps/api/src/core/feature_flags.py
(AIOpsFeatureFlags: P1~P6 master switches all False + 15 fine-grained sub-switches
is_phase_enabled() / is_sub_flag_enabled() + safe bool cast)
- apps/api/src/jobs/__init__.py + baseline_snapshot.py
(Phase 0 baseline snapshot Job: MCP calls / Playbook confidence / general ratio
/ learning loop rate / auto_repair, written to aiops:baseline:latest)
- apps/api/tests/test_feature_flags.py (21 tests, all green)
- docs/HARD_RULES.md → v1.9
(adds the Phase exit-condition iron rule: declaring a Phase complete without passing its exit conditions is forbidden)
- CLAUDE.md anti-amnesia gate 1: mandatory read of the MASTER §0 Session Resume Protocol
Gate 0 Pass: 21/21 tests green
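The "safe bool cast" matters because `bool("False")` is `True` in Python, so a flag read from the environment as a string needs explicit parsing. Below is a minimal sketch; the helper `env_flag`, the accepted truthy spellings, and the rule that a sub-flag only counts when its phase master switch is on are assumptions, while the `AIOPS_P{n}_ENABLED` naming follows flags that appear elsewhere in this log.

```python
import os
from typing import Mapping, Optional

_TRUTHY = frozenset({"1", "true", "yes", "on"})


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Parse a boolean flag from string-valued configuration."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def is_phase_enabled(phase: int,
                     environ: Optional[Mapping[str, str]] = None) -> bool:
    """Master switch per phase; everything is off unless explicitly enabled."""
    return env_flag(f"AIOPS_P{phase}_ENABLED", False, environ)


def is_sub_flag_enabled(phase: int, sub_flag: str,
                        environ: Optional[Mapping[str, str]] = None) -> bool:
    """A sub-flag is honored only when its phase master switch is on."""
    return is_phase_enabled(phase, environ) and env_flag(sub_flag, False, environ)
```

Defaulting every switch to off means a missing or misspelled variable fails closed rather than silently enabling a phase.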
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-15 12:44:53 +08:00 |
|
OG T
|
50edeaa9ea
|
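docs(Phase 5): complete the categorized buttons: full solution and implementation steps
The Commander asked to "propose a complete solution and detailed implementation steps"; this plan is the reply.
Contents:
- Complete action → MCP tool mapping table for 28 buttons (3 categories: query/write/secops)
- Work breakdown into 6 Sprints (5.0 spec → 5.1 dispatch → 5.2 query → 5.3 write → 5.4 secops → 5.5 E2E)
- Architecture design decision (callback_dispatcher registry pattern)
- Dependency and risk matrix
- 5 E2E acceptance cases
- Rollout strategy (query category goes live first; observe 24h before the write category)
Estimate: 3-5 days (5.5 working days in total)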
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2026-04-14 20:22:03 +08:00 |
|
OG T
|
d32d494320
|
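docs: four-stage detailed implementation steps + architecture-transition screenshot finalized + anti-deviation rules
Spec v2.0 adds:
- §11 Four-stage detailed implementation steps (stages 1~4 each include an acceptance checklist)
- Stage 1: CD unblock + debounce + alertname + cold-start Playbook + KM vectorization (9 steps)
- Stage 2: DB Migration + classify_alert_early + outcome write (5 steps)
- Stage 3: triage station + SSH routing + TYPE-1/E/F + action parsing + risk_level (Tier3, 7 steps)
- Stage 4: KMConversionService + manual repair records (4 steps)
- §12 Anti-deviation rules (no skipping steps / Tier3 authorization / no scope changes / report anomalies immediately)
ADR-073 update: architecture-transition screenshot finalized (old architecture interrupted → new triage-flywheel architecture)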
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-12 13:30:37 +08:00 |
|
OG T
|
d3ddaafcfd
|
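docs(spec): v2.2 adds §15 Subsystem 1 core flywheel repair roadmap (2026-04-12)
- Four-stage roadmap finalized (matching the screenshot): CD unblock → data integrity → routing and user experience → knowledge engine
- Unlock conditions and Tier labels for each stage
- Consolidated ADR-073/ADR-074 references
- Flywheel stall statistics (trigger causes)
- Prerequisites for subsequent subsystems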
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-12 13:23:45 +08:00 |
|
OG T
|
77771c16b1
|
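docs(spec): ADR-073/074 AIOps flywheel full-repair integrated spec v1.0
Integrates a complete solution across four layers:
- Layer 1 ADR-073-A: emergency unblock (CD fix/alertname/debounce/Playbook cold start/KM vectorization)
- Layer 2 ADR-073-B: routing fixes (triage station/SSH path/action parsing/KMConversionService)
- Layer 3 ADR-074: monitoring completion (flywheel health Exporter/network/DNS/Gitea CI/backup-restore tests)
- Layer 4 ADR-073-C: real-time frontend flywheel (real API/WebSocket/KPI panel)
Sources integrated: ADR-073 inventory + v2.2 spec §14.11 ADR-071 work sequence + monitoring gap inventory + finalized flywheel screenshot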
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-12 13:21:02 +08:00 |
|
OG T
|
09982fdfaa
|
docs(session6): full Telegram audit + ADR-072 bug list + spec consolidation
- LOGBOOK: Session 6 Redis DB10 audit results (8 systemic issues, graded P0-P2)
- ADR-072: AIOps closed-loop bug fix list (drift_interpreter/deployment_name/KM vectorization, etc.)
- Spec document v2.2: confirms Sprint A/B/C + MCP 1-4 + ADR-071 are all complete and marks ADR-072 as the next step
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 20:04:50 +08:00 |
|
OG T
|
fa7b763689
|
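docs(infra): ADR-069 infrastructure rebuild plan spec v1.3: full Sprint A/B/C design
Adds Sprint A (remove deprecated items, fix errors) + Sprint B (Ansible + ArgoCD GitOps) + Sprint C (Velero + rsync DR)
Full technical investigation: Sentry snuba DNS root cause, Harbor port error, bitan Dockerization requirements, volumes inventory
Adds Section 12 (integration with existing projects) + Section 13 (documentation update schedule)
LOGBOOK updated; project_master_workplan adds ADR-069 Sprint A/B/C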
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 00:01:07 +08:00 |
|
OG T
|
6f7a4be2c7
|
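docs: Sprint 5.1 data safety guardrails: ADR-062/063 + solution compliance validation
- ADR-062: Data Safety Guardrails (service tiering/Pre-flight/MultiSig)
- ADR-063: Service Registry IaC design specification
- Sprint 5.1 solution document: compliance validation passed, issues P1-P5 fixed
- P1: Playbooks are stored in Redis (not SQL); M-001 changed to a Pydantic model modification
- P2: velero_client.py name kept (consistent with the signoz_client convention)
- P3: docker-health-monitor status clarified
- P4/P5: DI setter + Deployment Verification added
- LOGBOOK: current focus updated to Sprint 5.1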
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-08 16:07:12 +08:00 |
|
OG T
|
83e9d3eef8
|
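docs(specs): four Sprint 5 technical documents: Tab spec/route mapping/component extraction/API changes
1. Tab structure spec: Tab configuration, block layout, and component reuse for each new page
2. Route mapping table: exact mapping of 26 old URLs → new locations + redirect implementation
3. Component extraction plan: steps and directory structure for extracting 17 pages into Panel components
4. API change spec: DashboardResponse +3 fields + SSE +1 event (no new APIs)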
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2026-04-08 16:03:58 +08:00 |
|
OG T
|
bb6a57dd87
|
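docs(plan): Sprint 5 frontend information architecture reorganization: full solution
Covers:
- Chapter 1: full asset inventory of the existing 26 pages + 62 components
- Chapter 2: reorganization mapping table (25→6+2 navigation, zero feature loss)
- Chapter 3: Tab structure and component integration for the 6 new pages
- Chapter 4: backward compatibility for old routes (20+ redirects)
- Chapter 5: shared Tab container component spec
- Chapter 6: new navigation Sidebar structure
- Chapter 7: interaction pattern guidelines (Tab/Drawer/Modal/Toggle)
- Chapter 8: detailed implementation steps (6 Phases, 30 Steps)
- Chapter 9: file impact list (15 added + 5 modified)
- Chapter 10: list of 8 technical documents
- Chapter 11: risk matrix
- Chapter 12: schedule estimate (~10 days, 3 delivery batches)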
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2026-04-08 16:01:38 +08:00 |
|
OG T
|
8788c720e4
|
docs(plan): Sprint 5 full solution: detailed implementation plan integrated with the existing architecture
|
2026-04-08 12:22:05 +08:00 |
|
OG T
|
f2b3a7129f
|
docs(plan): Sprint 5 command center redesign: full solution and detailed implementation steps
|
2026-04-08 12:01:14 +08:00 |
|
OG T
|
246587a401
|
fix(web): Sprint F frontend fake-data purge: all 29 instances of fake data removed (chief architect 98/100)
P0: the three Neural Command subcomponents drop all MOCK constants and are wired to real API props
- NeuralLiveCenter: fake history/fake KPI/fake radar → computed live from stats/history/incidents
- NeuralStats: MOCK_HISTORY/SCHEME_STATS/PLAYBOOK_RANKINGS → useMemo aggregation
- NeuralApprovalPanel: MOCK_PENDING → real /api/v1/approvals sign-off operations
P1: 10+ fake user identities (demo-user/user-001/War Room User) → unified under the CURRENT_USER constant
P2: delete 6 Demo exports (GlobalPulseChartDemo/MOCK_APPROVAL/DEMO_DECISION_CHAIN)
P3: /demo page guarded by the NEXT_PUBLIC_ENABLE_DEMO environment variable
i18n: add 22 translation keys (zh-TW + en)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-07 12:53:52 +08:00 |
|
OG T
|
e82d3802c5
|
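docs: Sprint 4 alert disposition statistics system: full plan document + LOGBOOK update
The Sprint 4 plan comprises 6 Phases / 19 work items:
- Phase A: data layer (IncidentFrequencyStats + Redis counters)
- Phase B: write layer (4 trigger points: auto_repair/cold_start/human/manual)
- Phase C: API endpoint (/stats/disposition)
- Phase D: Telegram alert card statistics
- Phase E: frontend (/reports dashboard + home page + auto-repair + neural-command)
- Phase F: weekly report + docs
Chief architect review: 100% Fully Approved
Conflict check: all dependencies correct, DAG is acyclic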
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2026-04-07 11:37:21 +08:00 |
|
OG T
|
1a8021bfaa
|
docs(plans): Sprint 3 SSH_COMMAND chain-of-command implementation plan (7 tasks)
|
2026-04-06 14:08:28 +08:00 |
|
OG T
|
be60ec1507
|
docs(plan): ADR-059 Gitea Webhook migration implementation plan (9 Tasks)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:22:29 +08:00 |
|
OG T
|
5cd67d372f
|
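docs(spec): ADR-059 Gitea Webhook migration design spec
Migrate from GitHub Webhook (Phase 13.1) to Gitea Webhook
Minimal-change strategy: replace header constants, leave the business logic layer untouched
Deprecate workflow_run CI diagnosis (already covered by the CD pipeline's TG notifications)
Incorporates the chief architect's guardrails: defensive payload parsing + Content-Type setting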
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 14:17:13 +08:00 |
|
OG T
|
0db9b41808
|
docs(plan): Observability + Auto-healing full implementation plan (15 Tasks, 3 Sprints)
Sprint 1 (P0): unified Prometheus alert rules + Sentry startup + CD sync
Sprint 2 (P1): SigNoz log alerts + Sentry SDK tags
Sprint 3 (P2): SSH HostRepairAgent infrastructure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 02:24:23 +08:00 |
|
OG T
|
de33abe0e3
|
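docs(spec): system-wide self-healing closed-loop design spec v1.0
A complete solution integrating three major problems:
1. Prometheus rules not deployed (13 → 40+ rules, including SentryDown/AlertChain)
2. Logs are collected but there is no log-based alerting
3. Auto-repair is limited to the K8s layer, with no Host Docker/systemd repair capability
Includes:
- Unified label convention (layer/component/team/host)
- Sprint 1: rule deployment + Sentry startup + CD sync
- Sprint 2: SigNoz log alert + Sentry integration
- Sprint 3: SSH HostRepairAgent + Playbooks
- SOP v4.0 integration update points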
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 02:14:01 +08:00 |
|
OG T
|
2243a21b96
|
fix(ai-router): v4.3 NIM protection: timeouts do not count as CB failures; always try NIM first before switching to Gemini
CD Pipeline / build-and-deploy (push) Failing after 20s
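Requirement: NIM must be waited on until it responds before switching; it must not be blocked by the CB and routed to Gemini just because it is slow
Changes:
- Timeout exceptions do not accumulate CB failures (only real connection errors count)
- NIM CB: failure_threshold=10, recovery_timeout=30s (more lenient than the defaults)
- Design document v4.3: update Direction 2, remove incorrect assumptions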
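A circuit breaker that ignores timeouts when counting failures could look like the sketch below. The class and method names are hypothetical and the half-open handling is simplified; only failure_threshold=10, recovery_timeout=30s, and the "timeouts do not count" rule come from the commit.

```python
import asyncio
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after N counted failures; timeouts are deliberately not counted."""

    def __init__(self, failure_threshold: int = 10, recovery_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # After the recovery window, let a trial request through (half-open).
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_error(self, exc: BaseException) -> None:
        # A slow backend is not treated as a broken one: skip timeouts.
        if isinstance(exc, (TimeoutError, asyncio.TimeoutError)):
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

With this rule a slow but healthy primary keeps receiving traffic, and only genuine connection errors can trip the breaker and trigger the fallback provider.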
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:51:12 +08:00 |
|
OG T
|
0c180dec86
|
docs(spec): Direction 2 implementation correction record: Nemotron privacy_level=cloud (P0)
|
2026-04-04 17:42:53 +08:00 |
|
OG T
|
0b41df45d6
|
docs(plans): implementation plans for the three directions P0/P1/P2
- P0: DIAGNOSE Privacy-First Routing (local chain isolation + REJECT protection)
- P1: Knowledge Auto-Harvesting (Anti-Pattern closed loop + Runbook generation)
- P2: Config Drift Detection (GitOps gatekeeper + Nemotron intent analysis)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:31:36 +08:00 |
|
OG T
|
035cb9cd0d
|
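docs(spec): Nemotron proactive defense three-direction design document
- Direction 1: Knowledge Auto-Harvesting (Anti-Pattern closed loop + automatic Runbook generation)
- Direction 2: DIAGNOSE Privacy-First Routing (Local-Only Fallback Chain)
- Direction 3: Config Drift Detection (GitOps gatekeeper + Nemotron intent analysis)
100% technical endorsement from chief architect ogt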
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:18:11 +08:00 |
|
OG T
|
51961b9f03
|
docs: Phase O observability final completion plan design spec
SigNoz-unified architecture, addressing 6 major blind spots (Event/Log/Metrics/Descheduler/kubectl/MinIO-Kali)
+ wrap-up of Monitoring Master Plan Wave A-D
+ 5 chief architect Review checkpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 13:45:23 +08:00 |
|
OG T
|
db2a2852b8
|
docs: frontend refactor acceptance report 87/100
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Playwright browser screenshots + KB API endpoint tests + Console analysis
- 24/24 routes with zero 404s
- 7 complete pages + 15 ComingSoon
- All 7 KB API endpoints working
- 1 Low bug (archived entry still accessible via GET)
- Metrics Strip [object Object] still to be fixed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 10:20:27 +08:00 |
|
OG T
|
25889d4b8e
|
docs: archive the ADR-050 reanalyze implementation plan (completed)
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
E2E Health Check / e2e-health (push) Successful in 18s
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 09:38:03 +08:00 |
|
OG T
|
5959855a71
|
feat(web): font system upgrade + NemoClaw SVG restored + Knowledge Base design document
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
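- Fonts: Syne (headings) + DM Mono (body) + VT323 (brand bitmap), replacing Inter
- Tailwind: fontFamily update + 5-level text color tokens (primary→disabled)
- Sidebar: NemoClaw white-porcelain lobster claw SVG + AWOOOI enlarged in VT323
- OpenClaw Panel: restore the NemoClaw 3D white-porcelain lobster claw (replacing NemoNodeAnimation)
- Knowledge Base design document (B: separation / A: K8s Job / Phase 1 skips vector search)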
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-02 00:48:42 +08:00 |
|
OG T
|
8845377a6d
|
docs: update the AI Center redesign spec (deprecated components + authorization logic notes)
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 22:28:32 +08:00 |
|
OG T
|
0b04abf990
|
docs(plan): add AI Center v6 redesign implementation plan (13 tasks)
|
2026-04-01 19:39:41 +08:00 |
|
OG T
|
4b84e95723
|
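docs: AI Center UI redesign spec document v6
- Anthropic Warmth (#f5f4ed) + OpenClaw Blue (#4A90D9) color system
- 3-column layout: Sidebar(200px) | Feed(50%) | RightPanel(50%)
- Full sidebar: 19 items in 4 sections (integrating all wooo-aiops menus)
- Event card flowchart + chibi lobster (natural orange-red #E85530)
- NemoClaw node animation on a white background (screenshot style)
- Comprehensive rounded-corner guidelines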
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 19:19:03 +08:00 |
|