Commit Graph

22 Commits

Author SHA1 Message Date
Your Name
8f15c57019 feat(claude): 套用 ty-ai-standards Global-Local 架構
- 新增 .claude/agents/:12 個標準化 subagents(critic / debugger / planner 等)
- 新增 .claude/hooks/secrets.local.json:AWOOOI 專屬 Token 偵測 patterns
- 新增 .claude/hooks/branch-protection.local.json:保護 production 分支
- 更新 .claude/settings.json:加入 hooks 區段(全域 hooks 疊加執行)
- 更新 CLAUDE.md:加入全域參照行 + 安全架構說明

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 00:18:14 +08:00
OG T
bb7441ec8a docs: 更新 CLAUDE.md + HARD_RULES.md v2.0 + LOGBOOK (2026-04-16)
- HARD_RULES.md v2.0: 新增 Self-Loop Workflow、Circuit Breaker Exception、State & Flow Validation
- CLAUDE.md: 補充 §4 必讀Memory 表格
- LOGBOOK: 記錄 AIOps E2E 修復進度

2026-04-16 Claude Sonnet 4.6 Asia/Taipei

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 01:20:16 +08:00
OG T
db9e304a14 feat(adr-080): Phase 0 防護欄建立 — AI 自主化飛輪啟動
- docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md
  (1456 行,§0-§8 全填完:42-cell 戰術矩陣、7 Phase 計畫、7 ADR 摘要、
   15 KPI、21 Feature Flags、10 風險場景)

- docs/adr/ADR-080-ai-autonomy-flywheel-overview.md
  (7 Phase 結構 + 4 北極星 + 7 架構師 Review Gates + Phase 退出條件)

- apps/api/src/core/feature_flags.py
  (AIOpsFeatureFlags: P1~P6 總開關全 False + 15 細粒度子開關
   is_phase_enabled() / is_sub_flag_enabled() + bool cast 安全)

- apps/api/src/jobs/__init__.py + baseline_snapshot.py
  (Phase 0 基線快照 Job:MCP calls / Playbook confidence / general 比例
   / learning loop rate / auto_repair — 寫入 aiops:baseline:latest)

- apps/api/tests/test_feature_flags.py  (21 tests — 全綠)

- docs/HARD_RULES.md → v1.9
  (新增 Phase 退出條件鐵律:禁止未過 exit conditions 宣告 Phase 完成)

- CLAUDE.md 防失憶閘門 1:強制讀 MASTER §0 Session Resume Protocol

Gate 0 Pass — 21/21 tests green

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 12:44:53 +08:00
OG T
88ac1c7f50 feat(phase27): 歷史按鈕雙層頻率統計 + DB frequency_snapshot 持久化
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 1m44s
- telegram_gateway: _send_incident_history 改為 Phase 27 雙層策略
  Layer 1: DB frequency_snapshot (建立時刻永久快照)
  Layer 2: Redis AnomalyCounter disposition 累積統計 (35d TTL)
  修復舊版呼叫 record_anomaly() 導致誤計數的 bug
- 新增 migration: phase27_incident_frequency_snapshot.sql (已在 prod 執行)
- CLAUDE.md: 精簡至 123 行,減少 Token 消耗

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 01:06:51 +08:00
OG T
ce945fe89e rule(cost): 🔴🔴🔴 費用變更強制審批 — HARD_RULES v1.8 + CLAUDE.md
統帥指示 2026-04-03:
所有涉及費用產生的變更必須停下來等統帥明確批准後才可執行

新增:
- HARD_RULES.md v1.8: Cost Change Approval 章節
  - 定義涉費變更範圍
  - 強制流程: 識別→停→說明→等批准→執行
  - 今日違規教訓記錄
- CLAUDE.md 任務前必讀新增費用變更條目

Memory 已同步:
- feedback_cost_change_approval.md (新建)
- feedback_constitution_v2.md 第五章
- MEMORY.md 索引最高鐵律區

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:36:47 +08:00
OG T
73e8f8ab77 feat(ai): Phase 24-A+B1 — AI Provider Registry + 絞殺者包裝 (ADR-052)
Some checks failed
E2E Health Check / e2e-health (push) Successful in 16s
CD Pipeline / build-and-deploy (push) Has been cancelled
Brain Layer 雙軌 Registry 架構:
- 新建 src/services/ai_providers/ 目錄 (interfaces + 4 providers)
  - OllamaProvider (local, rca/chat/code_review)
  - GeminiProvider (cloud, rca/chat)
  - ClaudeProvider (cloud, rca/chat/code_review)
  - OpenClawNemoProvider (cloud, rca — 委派 188→NIM)
- 擴展 ai_router.py 加入:
  - AIProviderRegistry (動態註冊/啟停)
  - AIRouterExecutor (Cache + 閘門 CB/RL/Sem + 執行)
- openclaw.py 絞殺者包裝: USE_AI_ROUTER=true 走新路徑
- config.py + ConfigMap 加入 USE_AI_ROUTER=false (安全預設)
- ADR-052 正式文件 (14 項決策 D1-D14)
- HARD_RULES v1.7 加入 AI Router 規範

安全: USE_AI_ROUTER=false 預設不啟用,需手動開啟觀察
回滾: kubectl set env deployment/awoooi-api USE_AI_ROUTER=false

2026-04-02 ogt: Phase 24 首批實作

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 13:16:09 +08:00
OG T
9145faf24b docs: 前端內網 IP 禁令 - RCA + Hard Rule v1.6
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
E2E Health Check / e2e-health (push) Has been cancelled
事故: 2026-03-30 瀏覽器區域網路權限對話框
根因: CD 用 http://192.168.0.125:32334 建置 NEXT_PUBLIC_API_URL

已更新:
- CLAUDE.md: 新增 🔴🔴🔴 前端內網 IP 禁令章節
- HARD_RULES.md: v1.6 新增 Frontend Internal IP 規則
- LOGBOOK.md: RCA 事故回顧

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:28:38 +08:00
OG T
25e69e6870 feat(cicd): ADR-039 完成 - GitHub Actions 停用,Gitea 主倉
Some checks failed
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
- 停用所有 GitHub Actions workflows (.disabled)
- 更新 CLAUDE.md 添加 Gitea CI/CD 章節
- 更新 LOGBOOK.md 記錄遷移狀態
- Gitea 版本: 1.25.5
- Runner 版本: v0.3.1 (host 網絡模式)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 01:07:32 +08:00
OG T
3bfb9c51f5 chore: Skills + CLAUDE.md + Playwright 配置更新
- SRE-QA Skills 擴充
- CLAUDE.md 指引更新
- playwright.config.ts 優化

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-29 16:04:43 +08:00
OG T
6a38c0c968 fix(cd): ADR-035 Telegram Secrets 自動注入三層防護
🔴 事故根因: K8s Secrets 未注入,Telegram 告警長時間失效
- kustomization.yaml 說「由 CI/CD 處理」但 CD 從未執行

🛡️ 三層防護機制:
- Layer 1: Pre-flight 檢查 GitHub Secrets 存在
- Layer 2: Deploy 時 kubectl patch secret 自動注入
- Layer 3: Post-Deploy E2E 測試告警驗證

📄 文件更新:
- ADR-035: docs/adr/ADR-035-telegram-alert-chain-enforcement.md
- DevOps Skill v1.9: 新增 Secrets 注入鐵律
- CLAUDE.md: 新增告警鏈路章節
- LOGBOOK: 事故記錄

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 21:47:49 +08:00
OG T
bd5648f19d refactor(docs): CLAUDE.md 精簡化 - 引用外部 MD
- 紅區治理:引用 RED_ZONES.md
- 部署層級:引用 feedback_deployment_layer_decision.md
- 積木化:引用 feedback_lewooogo_modular_enforcement.md
- 新增:基礎設施參考 (SERVICE-ENDPOINTS.md + K3S-RUNBOOK.md)
- 減少 35% 內容 (227 → 148 行)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-28 21:19:18 +08:00
OG T
14c81f728f docs: 新增 ADR-025 告警鏈路 E2E 驗證 + 更新 Skills
新增:
- ADR-025: 告警鏈路 E2E 驗證架構 (2026-03-26 事故教訓)

更新:
- ADR-011: 新增 DNS 規則最佳實踐 (附錄 B)
- Skill 04: 新增 NetworkPolicy DNS 規則 + CoreDNS 設定
- Skill 05: 新增告警鏈路 Smoke Test 要求
- CLAUDE.md: 新增告警鏈路驗證到任務前必讀

事故根因:
1. URL 路徑錯誤 (webhook vs webhooks)
2. NetworkPolicy DNS 規則標籤不匹配
3. CoreDNS 上游 DNS 依賴 systemd-resolved

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 15:34:12 +08:00
OG T
604e38cf07 docs: Phase 14 紅區治理 + Skills 01/03 更新
- CLAUDE.md: 紅區治理章節
- Skills 01/03: 版本更新
- ADR/Architecture: 標準化

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-26 09:55:47 +08:00
OG T
9bff46a1b0 feat: integrate Sentry + fix CI/CD issues
Sentry Integration (補強 SignOz):
- Add @sentry/nextjs for frontend error tracking + session replay
- Add sentry-sdk[fastapi] for backend error tracking
- Create sentry.client/server/edge.config.ts
- Integrate with next.config.js + instrumentation.ts
- Add Sentry exception capture in FastAPI error handler
- Create deployment scripts for Self-Hosted @ 192.168.0.110

CI/CD Fixes:
- Fix F821 Undefined name 'Field' in incidents.py
- Add NEXT_PUBLIC_API_URL env var to CI build step
- Add build-arg to Docker build verification

E2E Test Improvements:
- Fix strict mode violations in dashboard-acceptance tests
- Add timeout increase for Phase 4 demo tests
- Make tests more resilient to UI variations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-24 15:19:52 +08:00
OG T
bff031fa8f fix(cd): 修正 kustomize 安裝路徑 (避免 sudo)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-24 12:31:26 +08:00
OG T
ec7e45d538 fix(api): 修復 Incident-Approval 狀態同步 BUG
🔴 P0 核心功能修復:

問題: 審核後頁面重整,Y/n 按鈕重複出現
根因: resolve_incident_after_approval 在 Redis 缺失時靜默跳過

修復:
1. proposal_service.py - 處理 Redis 缺失情況
2. approvals.py - 添加詳細日誌追蹤
3. 設定 resolved_at 時間戳

防禦性增強:
- 日誌記錄 metadata 內容
- 記錄 resolve 成功/失敗狀態
- 警告無 incident_id 的情況

長期規範:
- 新增 feedback_incident_approval_sync.md 記憶
- 更新 HARD_RULES.md API 路徑規範

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-24 10:39:22 +08:00
OG T
6e644d4fd0 docs: 禁止 Mock 測試規則整合至 HARD_RULES + CLAUDE.md
統帥鐵律 (2026-03-24):
- HARD_RULES.md 新增 No Mock Testing 章節
- CLAUDE.md 新增測試主題引用
- Skill 05 新增禁止 Mock 詳細規範
- LOGBOOK.md 更新當前狀態

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-24 10:28:46 +08:00
OG T
00d94ca71c docs: CLAUDE.md 引用 HARD_RULES.md (禁止爆滿)
結構:
- CLAUDE.md: 精簡索引,只放引用連結
- docs/HARD_RULES.md: 詳細規則

這是早就溝通好的做法,不應該忘記。

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 23:32:35 +08:00
OG T
dc30c70e57 docs(CLAUDE.md): 新增絕對禁止規則 (Hard Rules)
問題:
- Memory 有記錄但沒有實際遵守
- CI workflow 被改成 ubuntu-latest 違反 Memory 鐵律
- 長期記憶形同虛設

修復:
- 直接在 CLAUDE.md 寫死禁止項目
- 新增修改前檢查清單
- 這些規則會在每次 Session 自動載入

禁止項目:
- runs-on: ubuntu-latest → self-hosted
- Telegram logOut() → 禁止
- 前端硬編碼 → next-intl
- SQLite → PostgreSQL
- CORS * → 白名單
- 假數據 → 真實 API

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 23:31:04 +08:00
OG T
2b1264df05 docs: 完整治理架構 ADR-010/011/012 + CLAUDE.md 鐵律更新
2026-03-23 重大事故修復與治理:

1. ADR-010: Secrets 集中管理 (Bitwarden + Sealed Secrets)
2. ADR-011: NetworkPolicy 變更治理 (偵測 + 告警 + 人工決策)
3. ADR-012: 危險操作治理 (Tier 分級 + CI/CD 攔截 + 審計)
4. UX-001: 告警疲勞解決方案 (時間衰減 + 智慧分組)

CLAUDE.md 更新:
- 新增最高優先級鐵律 (禁止 ClawBot、OpenClaw 核心、禁止危險 API)
- 新增任務開始前必讀 Memory 對照表

事故教訓:
- Telegram Token 連續三次被 logOut 失效
- AWOOOI API 程式碼呼叫 logOut 導致災難
- 已停用 AWOOOI API Telegram,OpenClaw 為唯一 Gateway

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 19:44:56 +08:00
OG T
7478dc0254 feat(phase6-9): Complete modular architecture and Agent Teams
Phase 6.4 - Modular Architecture:
- Add lewooogo-brain adapters for LLM providers
- Add lewooogo-data dual memory (Redis + PostgreSQL)
- Implement consensus engine for multi-agent decisions
- Add incident memory service for historical context

Phase 9 - Agent Teams (Claude Agent SDK):
- Add base agent class with Claude Sonnet 4 integration
- Implement action planner, blast radius, and security agents
- Add agent API endpoints and proposal workflow
- Integrate ADR-009 OpenClaw Agent Teams architecture

DevOps & CI/CD:
- Add GitHub Actions CI/CD workflows (ci.yaml, cd.yaml)
- Add pre-commit hooks and secrets baseline
- Add docker-compose for local development
- Update Kubernetes network policies

Frontend Improvements:
- Add auto-healing error boundary component
- Update i18n messages for agent features
- Enhance dual-state incident card with execution feedback

Documentation:
- Add 7 ADRs covering MCP, design system, architecture decisions
- Update ARCHITECTURE_MEMORY.md with modular design
- Add GLOBAL_RULES.md and SOUL.md for project identity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 18:40:36 +08:00
OG T
41127f1e8b docs: Add CLAUDE.md for Claude Code auto-load configuration
- Skills index and routing table
- Core rules (simplified)
- Props mapping lesson from Y/n button incident

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 14:05:58 +08:00