OG T
|
0d239838b4
|
fix(cr): Code Review P2: test coverage + CronJob script refactor
CD Pipeline / build-and-deploy (push) Has been cancelled
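P2-1: Extract the CronJob's inline Python into scripts/cron_km_vectorize.py
Dockerfile adds COPY scripts/; the CronJob YAML now points at the script path
P2-2: Add test_classify_alert_early.py: 23 tests covering the 7 classification rules
Includes edge cases: VeleroBackupFailed (backup takes precedence over k8s), priority-order verification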
595 unit tests passed
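The precedence edge case in P2-2 can be illustrated with a minimal sketch. The function name is inferred from the test file name; the two rules and their matching conditions are hypothetical stand-ins, not the project's actual 7 rules.

```python
from typing import Callable

Alert = dict  # assumed shape: {"alertname": str, "labels": dict[str, str]}


def _is_backup(alert: Alert) -> bool:
    # Hypothetical rule: anything with "backup" in the alert name.
    return "backup" in alert["alertname"].lower()


def _is_k8s(alert: Alert) -> bool:
    # Hypothetical rule: anything carrying a namespace label.
    return "namespace" in alert.get("labels", {})


# Rules are checked in order, so earlier entries take precedence.
RULES: list[tuple[str, Callable[[Alert], bool]]] = [
    ("backup", _is_backup),
    ("k8s", _is_k8s),
]


def classify_alert_early(alert: Alert) -> str:
    """Return the category of the first matching rule, or "unknown"."""
    for category, matches in RULES:
        if matches(alert):
            return category
    return "unknown"
```

An alert such as VeleroBackupFailed with a namespace label matches both rules here, and the ordering of RULES is what makes backup win.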
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-12 15:14:44 +08:00 |
|
OG T
|
b261a51685
|
feat(rag): Dockerfile adds docs/ + .agents/skills/ as RAG index sources
CD Pipeline / build-and-deploy (push) Failing after 2m11s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 09:16:51 +08:00 |
|
OG T
|
dbb8104557
|
fix(drift): kubectl not found + RBAC services/configmaps/ingresses
CD Pipeline / build-and-deploy (push) Has been cancelled
drift_detector uses kubectl to compare the Git YAML against the actual K8s state, but:
1. The API image has no kubectl binary → No such file or directory: 'kubectl'
2. The awoooi-executor ClusterRole lacks list permission on services/configmaps/ingresses
Fix:
- Dockerfile: apt install curl + download kubectl v1.29.0 amd64
- 07-rbac.yaml: add get/list/watch on services/configmaps (core) + ingresses (networking.k8s.io)
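A missing binary like the one in item 1 can be surfaced as an explicit error instead of a raw "No such file or directory". The sketch below assumes a hypothetical helper name, run_kubectl; it is not taken from drift_detector.

```python
import shutil
import subprocess


def run_kubectl(args: list[str], binary: str = "kubectl") -> tuple[bool, str]:
    """Run a kubectl command and return (ok, stdout or error message)."""
    path = shutil.which(binary)
    if path is None:
        # The binary is not in the image: report it clearly.
        return False, f"{binary} not found on PATH"
    proc = subprocess.run([path, *args], capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        # Covers RBAC denials such as a missing list permission.
        return False, proc.stderr.strip()
    return True, proc.stdout
```

Checking with shutil.which first separates "binary not installed" from "command failed", which are the two distinct failures this commit addresses.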
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:49:56 +08:00 |
|
OG T
|
c132fd423a
|
fix(drift): COPY k8s/ into the API image for drift_detector Git state comparison
CD Pipeline / build-and-deploy (push) Has been cancelled
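drift_detector's GitStateReader needs to read k8s/*.yaml to compare against the actual K8s state,
but the API Pod lacks this directory, which causes k8s_dir_not_found and leaves the scan results permanently empty.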
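The failure mode described here can be sketched as follows. The function name and the return shape are assumptions for illustration; only the k8s/*.yaml pattern and the k8s_dir_not_found error code come from the commit message.

```python
from pathlib import Path


def list_manifests(k8s_dir: str = "k8s") -> dict:
    """List the YAML manifests a Git-state reader would compare against the cluster."""
    root = Path(k8s_dir)
    if not root.is_dir():
        # Without the directory in the image there is nothing to compare.
        return {"error": "k8s_dir_not_found", "manifests": []}
    return {"error": None, "manifests": sorted(str(p) for p in root.glob("*.yaml"))}
```

Returning an explicit error code alongside the empty list makes "directory missing" distinguishable from "no drift found".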
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:23:54 +08:00 |
|
OG T
|
1fb0c0ca90
|
fix(auto-repair): Bug #5+#6: SSH binary + affected_services matching fix
CD Pipeline / build-and-deploy (push) Has been cancelled
Bug #5 (webhooks.py): target_resource now prefers the component label
- The SentryDown alert has labels.component="sentry"
- Old logic: labels.instance="192.168.0.110:9000" → does not match the Playbook's affected_services
- New logic: component → pod → instance → alertname
Bug #6 (Dockerfile): python:3.11-slim has no openssh-client
- The SSH_COMMAND Playbook execution path calls asyncio.create_subprocess_exec("ssh", ...)
- The image has no ssh binary → every SSH repair is bound to fail
- Fix: install openssh-client in the production stage
Service list: add the main sentry service to service-registry.yaml (AUTO level)
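The label priority from Bug #5 (component → pod → instance → alertname) can be sketched as a simple ordered lookup. The function name resolve_target_resource is hypothetical; the actual code in webhooks.py may be structured differently.

```python
from typing import Optional

# Priority order taken from the commit message.
LABEL_PRIORITY = ("component", "pod", "instance", "alertname")


def resolve_target_resource(labels: dict[str, str]) -> Optional[str]:
    """Return the first non-empty label value in priority order."""
    for key in LABEL_PRIORITY:
        value = labels.get(key)
        if value:
            return value
    return None
```

For the SentryDown alert this yields "sentry" instead of "192.168.0.110:9000", which is what lets the Playbook's affected_services match.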
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 14:11:50 +08:00 |
|
OG T
|
db02eb41d0
|
fix(docker): COPY alert_rules.yaml into the container
CD Pipeline / build-and-deploy (push) Has been cancelled
The rule engine loads its rules from ./alert_rules.yaml, but the Dockerfile was missing the COPY
2026-04-09 ogt: fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-09 09:12:42 +08:00 |
|
OG T
|
4762ad924d
|
ci(cd): Chief Architect Review Phase 25: full batch of fixes (C1-C4 / S1-S4 / I1-I4)
Fixes:
C1: DOCKER_BUILDKIT=1 + ARG BUILDKIT_INLINE_CACHE + syntax directive (both Dockerfiles)
C2: Alert Chain Smoke Test: fix the pass/fail output logic (no longer passes unconditionally)
C3: API Dockerfile builder stage runs pip install before COPY src/ (deps cache invalidates correctly)
C4: Deploy step manages its own SSH key + uses ssh-keyscan instead of StrictHostKeyChecking=no
S1/S2: Unify the SSH connection method, remove StrictHostKeyChecking=no
S3: API Dockerfile HEALTHCHECK uses curl instead of httpx (ensures the image has the tool)
S4: type-sync-check.yaml python → python3
I1: Add .dockerignore to keep unrelated files out of the build context
I2: Add a shared Setup Python Tools step
I3: Move the deploy-alerts job to a standalone deploy-alerts.yaml workflow (paths trigger)
I4: E2E Smoke Test adds pnpm install + PLAYWRIGHT_BASE_URL set to the public domain
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 12:42:37 +08:00 |
|
OG T
|
9913f5dc6d
|
feat(infra): separate dev environment + BuildKit cache fix + circuit breaker tuning
CD Pipeline / build-and-deploy (push) Successful in 6m52s
E2E Health Check / e2e-health (push) Successful in 17s
CD Pipeline (Dev) / build-and-deploy-dev (push) Failing after 9s
1. k8s/awoooi-dev/: new dev namespace (configs 01-05)
- Namespace + ResourceQuota (cpu 2/4, mem 4Gi/8Gi)
- ConfigMap: ENVIRONMENT=dev, LOG_LEVEL=DEBUG, SHADOW_MODE=false
- Deployment: 1 replica, NodePort 32344, image dev-latest
- RBAC: awoooi-executor-dev ServiceAccount
2. .gitea/workflows/cd-dev.yaml: dev branch CD pipeline
- Trigger: push to the dev branch
- Build: --no-cache (prevents cache poisoning)
- Tag: dev-{sha} / dev-latest
- Deploy: awoooi-dev namespace, health check on 32344
- Telegram: notifications prefixed with [DEV]
3. apps/api/Dockerfile: ARG CACHE_BUST=none (prevents BuildKit cache poisoning)
- The deps layer (pip install) can still be cached
- The src/ and models.json layers are rebuilt every time
4. .gitea/workflows/cd.yaml: production API build adds CACHE_BUST=git_sha
- Ensures config changes such as models.json actually make it into the image
5. apps/api/src/services/nvidia_provider.py: timeouts no longer count toward the circuit breaker
- TimeoutException → log only, no record_failure()
- Only hard errors (auth/rate limit/exception) trip the breaker
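The breaker behaviour in item 5 can be sketched roughly as below. The CircuitBreaker class and call_with_breaker wrapper are hypothetical, and the built-in TimeoutError stands in for whatever TimeoutException the provider's HTTP client raises; only record_failure() is named in the commit.

```python
import logging

logger = logging.getLogger("nvidia_provider_sketch")


class CircuitBreaker:
    """Tiny consecutive-failure breaker (hypothetical, for illustration)."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        self.failures = 0


def call_with_breaker(breaker: CircuitBreaker, fn):
    """Call fn(); timeouts are logged only, other errors count as failures."""
    if breaker.is_open:
        raise RuntimeError("circuit open")
    try:
        result = fn()
    except TimeoutError:
        # Timeout: log and re-raise without touching the breaker.
        logger.warning("provider call timed out; not counted as a failure")
        raise
    except Exception:
        # Hard error (auth, rate limit, ...): count it toward tripping.
        breaker.record_failure()
        raise
    breaker.record_success()
    return result
```

The except TimeoutError clause has to come before the generic except Exception, otherwise timeouts would still be recorded as failures.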
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-01 16:22:21 +08:00 |
|
OG T
|
fb0ddf305c
|
fix(api): fix dockerfile to include models.json, remove huge prompt example to fit 4K limit
E2E Health Check / e2e-health (push) Successful in 17s
|
2026-03-31 14:03:34 +08:00 |
|
OG T
|
7478dc0254
|
feat(phase6-9): Complete modular architecture and Agent Teams
Phase 6.4 - Modular Architecture:
- Add lewooogo-brain adapters for LLM providers
- Add lewooogo-data dual memory (Redis + PostgreSQL)
- Implement consensus engine for multi-agent decisions
- Add incident memory service for historical context
Phase 9 - Agent Teams (Claude Agent SDK):
- Add base agent class with Claude Sonnet 4 integration
- Implement action planner, blast radius, and security agents
- Add agent API endpoints and proposal workflow
- Integrate ADR-009 OpenClaw Agent Teams architecture
DevOps & CI/CD:
- Add GitHub Actions CI/CD workflows (ci.yaml, cd.yaml)
- Add pre-commit hooks and secrets baseline
- Add docker-compose for local development
- Update Kubernetes network policies
Frontend Improvements:
- Add auto-healing error boundary component
- Update i18n messages for agent features
- Enhance dual-state incident card with execution feedback
Documentation:
- Add 7 ADRs covering MCP, design system, architecture decisions
- Update ARCHITECTURE_MEMORY.md with modular design
- Add GLOBAL_RULES.md and SOUL.md for project identity
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-23 18:40:36 +08:00 |
|
OG T
|
196d269b92
|
feat: add all application source code
- apps/api: FastAPI backend with Dockerfile
- apps/web: Next.js frontend with Dockerfile
- apps/sensor: Signal collection agent
- packages: shared packages
Co-Authored-By: Claude <noreply@anthropic.com>
|
2026-03-22 18:57:44 +08:00 |
|