Commit Graph

13 Commits

Author SHA1 Message Date
Your Name
cd17a67774 fix(alerts): surface legacy hitl backlog
Some checks failed
CD Pipeline / tests (push) Successful in 1m21s
Code Review / ai-code-review (push) Successful in 13s
Type Sync Check / check-type-sync (push) Failing after 40s
CD Pipeline / build-and-deploy (push) Successful in 5m22s
CD Pipeline / post-deploy-checks (push) Successful in 2m19s
2026-05-31 13:16:22 +08:00
OG T
7da64eaad2 feat(Phase 3): 學習閉環重建 — 三根因修復 + 2x EWMA + Evolver Agent
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 19m7s
Type Sync Check / check-type-sync (push) Failing after 1m18s
ADR-083 Phase 3 學習閉環重建:

**三根因修復**
- approval_execution.py: fire-and-forget create_task → await asyncio.wait_for(timeout=30) × 2
  (成功路徑 L265 + 失敗路徑 L353,超時記錄 learning_trigger_timeout metric,主流程不 crash)
- models/approval.py: ApprovalRequestBase 新增 matched_playbook_id 欄位
- decision_manager.py: _auto_execute 建立 ApprovalRequest 時填充 matched_playbook_id
- learning_service.py: 雙路徑查找 _matched_pb_id(matched_playbook_id + metadata fallback)

**2x EWMA 負向強化**
- models/playbook.py: 新增 trust_score: float = 0.3(EWMA 動態信任度欄位)
- repositories/playbook_repository.py: update_stats 加 EWMA
  成功: trust = 0.9 × old + 0.1 × 1.0
  失敗: trust = 0.8 × old + 0.2 × 0.0(衰減速度 2x)
  trust < 0.1 → log warning,等 Evolver 封存

**Evolver Agent(新建)**
- services/playbook_evolver.py: 三功能全靜態規則
  1. 低信任封存: trust < 0.1 → DEPRECATED
  2. 休眠封存: 30d 未使用 AND trust < 0.5 → DEPRECATED
  3. 相似合併: 症狀 Jaccard > 0.9 → 保留高 trust,封存低 trust
  AIOPS_P3_EVOLVER_ENABLED=False 預設關閉

**文件**
- ADR-083 學習閉環重建
- MASTER §8 Phase 3 完工記錄

AIOPS_P3_ENABLED=False(預設),骨架就位等統帥批准開啟

Co-Authored-By: Claude Sonnet 4.6(亞太)<noreply@anthropic.com>
2026-04-15 14:01:37 +08:00
OG T
914c7e7a90 fix: 9b9ff5b 引發的 NoneAttr bug — incident_id 上移到 Base
Some checks failed
CD Pipeline / build-and-deploy (push) Has started running
Type Sync Check / check-type-sync (push) Failing after 1m17s
bug: 'ApprovalRequestCreate' object has no attribute 'incident_id'
Live-fire #6 整個 webhook 500 fail。

根因: 9b9ff5b 在 approval_db 寫 request.incident_id,
但 ApprovalRequestCreate 繼承 Base 沒這 field(只在 ApprovalRequest 才有)。

修復: 把 incident_id 上移到 ApprovalRequestBase
- ApprovalRequestCreate 自動繼承 → webhook 可建帶 incident_id 的 request
- ApprovalRequest 不重複定義
- 786/786 回歸測試全過

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-14 20:01:47 +08:00
OG T
59c3dfb910 fix(models): approval.py 改用 timezone.utc 相容 Python 3.10
Some checks failed
CD Pipeline / build-and-deploy (push) Successful in 12m12s
Type Sync Check / check-type-sync (push) Failing after 52s
CI runner 用 Python 3.10,datetime.UTC 是 3.11 才加入。
改用 datetime.timezone.utc 全版本相容,修復 CI type-sync 全量失敗。

# 2026-04-06 ogt: root cause — CI Python 3.10 無法 import UTC

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 12:19:23 +08:00
OG T
658337ec18 fix(phase26): 打通 Incident→DB→KM 完整鏈路 + namespace 修正
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 1m29s
Type Sync Check / check-type-sync (push) Failing after 52s
問題根因:
1. create_incident_for_approval 只存 Redis,不存 PostgreSQL
   → TTL 7天後消失,Playbook 萃取永遠找不到 Incident
2. ApprovalRecord 無 incident_id 欄位
   → _trigger_playbook_extraction 靠 regex 掃中文文字找 INC-,永遠失敗
3. operation_parser namespace fallback 是 "default"
   → 所有 deployment 在 awoooi-prod,203 次執行全失敗

修復:
- Incident 同時寫入 Redis + PostgreSQL (save_to_episodic_memory)
- ApprovalRecord 加入 incident_id 欄位 (model + ORM + migration)
- alertmanager_webhook 建立 Approval 後回寫 incident_id
- _trigger_playbook_extraction 直接用 approval.incident_id
- operation_parser DEFAULT_NAMESPACE = "awoooi-prod"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 11:46:05 +08:00
OG T
c9c60c3a61 feat(mcp-integrations): Phase S 架構修復 + MCP 整合基礎建設
Some checks failed
E2E Health Check / e2e-health (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
Type Sync Check / check-type-sync (push) Failing after 22s
Phase S 技術債修復 (首席架構師審查 82→完整):
- S-01: generate_alert_fingerprint 移至 AlertAnalyzer.generate_fingerprint() staticmethod
- S-04: 移除 Pydantic v2 deprecated json_encoders (直接用原生 datetime 序列化)

Sentry MCP 整合 (Phase 23):
- ADR-048: Sentry→OpenClaw AI Triage 架構決策
- sentry_webhook_service.py: parse/analyze/create_incident/build_message Service 層
- config.py: SENTRY_WEBHOOK_SECRET (Fail-Closed HMAC-SHA256)

Playwright MCP 整合 (短期):
- smoke.spec.ts: 5 頁面 E2E smoke test (home/dashboard/incidents/approvals/terminal)
- cd.yaml: E2E Smoke Test 步驟 + Telegram 🎭 Smoke 狀態通知

長期規劃 ADR:
- ADR-049: Figma Code Connect 設計系統同步
- ADR-050: Telegram 互動式 Incident 2.0 (6鍵 Inline Keyboard)
- ADR-051: Context7 依賴升級顧問 (Next.js 14→15, FastAPI 0.115→0.128)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 16:20:57 +08:00
OG T
22de22c989 refactor(phase-s): Phase S 技術債清理 - 五項架構改善
S-01: generate_alert_fingerprint() 移至 alert_analyzer_service (Router→Service)
S-02: 移除廢棄 USE_NEW_ENGINE config (Phase R 已完成歷史使命)
S-03: github_webhook.py linter 清理 (Field unused + delivery_id noqa)
S-04: Pydantic v2 遷移 - approval/incident models (class Config → ConfigDict)
S-05: Skill 09 v1.1 更新 (USE_NEW_ENGINE 廢棄說明)

測試: 393 passed, 零失敗

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 13:12:02 +08:00
OG T
e0584bc181 refactor(api): Phase 16 R2 封存死代碼 + RiskLevel 統一
封存 (866 行):
- routes/approvals.py → _archived/routes/ (477 行,未註冊死代碼)
- services/approval.py → _archived/services/ (389 行,僅被死代碼使用)

合併 RiskLevel:
- models/approval.py 新增 HIGH (從 trust_engine.py 合併)
- trust_engine.py 改 import from models/approval.py
- 保留舊定義為註解供回滾

更新 services/__init__.py:
- 移除已封存模組的 import (註解保留回滾路徑)

驗證:
- RiskLevel 統一: models 與 trust_engine 使用同一 class
- 24 個 action_parsing 測試通過

回滾指令見 _archived/README.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-25 23:14:24 +08:00
OG T
b13b063282 feat(web): Phase 11 對話式 AI UI/UX (#47-59)
Phase 11.1 對話式容器:
- ConversationalView 雙欄佈局 (左側列表 + 右側詳情)
- ApprovalThreadItem 風險等級 + 相對時間顯示
- SSE 即時更新整合

Phase 11.2 批次處理:
- BatchModeSelector 組件 (全部接受/逐一審核/CRITICAL Only)
- POST /api/v1/approvals/bulk-approve API 端點
- CRITICAL + DESTRUCTIVE 安全過濾 (禁止批次核准)

Phase 11.4 鍵盤快捷鍵:
- useKeyboardShortcuts hook (Y/N/方向鍵/Esc)
- Y 鍵長按 2 秒核准 + 頂部進度指示器
- 快捷鍵說明 Modal (Y/N 高亮顯示)

i18n: 100% next-intl 覆蓋

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-25 10:31:35 +08:00
OG T
6f049877fc fix(lint): ruff auto-fix + lewooogo-core src 加入 git
- Python: ruff --fix 修復 280 個 lint 錯誤
- lewooogo-core: src/ 目錄未追蹤,導致 CI eslint 失敗

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 23:51:37 +08:00
OG T
65fa1168b8 feat(api): ApprovalRequestResponse 新增 metadata 欄位
讓前端/API 可見 incident_id,用於除錯和關聯追蹤

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 21:51:05 +08:00
OG T
7478dc0254 feat(phase6-9): Complete modular architecture and Agent Teams
Phase 6.4 - Modular Architecture:
- Add lewooogo-brain adapters for LLM providers
- Add lewooogo-data dual memory (Redis + PostgreSQL)
- Implement consensus engine for multi-agent decisions
- Add incident memory service for historical context

Phase 9 - Agent Teams (Claude Agent SDK):
- Add base agent class with Claude Sonnet 4 integration
- Implement action planner, blast radius, and security agents
- Add agent API endpoints and proposal workflow
- Integrate ADR-009 OpenClaw Agent Teams architecture

DevOps & CI/CD:
- Add GitHub Actions CI/CD workflows (ci.yaml, cd.yaml)
- Add pre-commit hooks and secrets baseline
- Add docker-compose for local development
- Update Kubernetes network policies

Frontend Improvements:
- Add auto-healing error boundary component
- Update i18n messages for agent features
- Enhance dual-state incident card with execution feedback

Documentation:
- Add 7 ADRs covering MCP, design system, architecture decisions
- Update ARCHITECTURE_MEMORY.md with modular design
- Add GLOBAL_RULES.md and SOUL.md for project identity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 18:40:36 +08:00
OG T
196d269b92 feat: add all application source code
- apps/api: FastAPI backend with Dockerfile
- apps/web: Next.js frontend with Dockerfile
- apps/sensor: Signal collection agent
- packages: shared packages

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 18:57:44 +08:00