# WOOO AIOps Core

Core modules of the WOOO intelligent cloud operations (AIOps) platform.

## Architecture Overview

```
aiops-core/
├── deploy_engine/              # Deploy engine
│   ├── deploy_service.py       # Main deployment service
│   ├── template_renderer.py    # Jinja2 template rendering
│   └── k8s_client.py           # Kubernetes client
│
├── monitor_engine/             # Monitor engine
│   ├── monitor_service.py      # Monitoring service
│   ├── prometheus_client.py    # Prometheus API
│   └── alert_manager.py        # Alertmanager API
│
├── repair_engine/              # Auto-repair engine
│   ├── repair_service.py       # Repair decision engine
│   ├── repair_executor.py      # Repair executor
│   └── repair_strategies.py    # Repair strategies
│
├── templates/                  # K8s deployment templates
│   ├── base/                   # Base templates
│   │   ├── namespace.yaml.j2
│   │   ├── service.yaml.j2
│   │   └── ingress.yaml.j2
│   └── frameworks/             # Framework-specific templates
│       ├── fastapi/
│       ├── flask/
│       ├── express/
│       └── nextjs/
│
├── api/                        # FastAPI backend
│   ├── main.py                 # Application entry point
│   └── routers/                # API routes
│       ├── auth.py             # Authentication
│       ├── apps.py             # App management
│       ├── deployments.py      # Deployment management
│       ├── monitoring.py       # Monitoring
│       ├── repairs.py          # Auto-repair
│       └── users.py            # User management
│
└── web/                        # React frontend
    ├── src/
    │   ├── pages/              # Pages
    │   ├── components/         # Components
    │   └── lib/                # Utilities
    └── package.json
```

## Core Features

### 1. Deploy Engine - One-Click Deployment

```python
# AppConfig is assumed to be exported next to DeployService; adjust the import if it lives elsewhere.
from aiops_core.deploy_engine import AppConfig, DeployService

deploy_service = DeployService()

# Deploy a new application
result = deploy_service.deploy(
    app=AppConfig(
        name="my-api",
        framework="fastapi",
        git_repo="https://github.com/user/repo.git",
        branch="main",
    )
)
```

### 2. Monitor Engine - Intelligent Monitoring

```python
# MonitorConfig is assumed to be exported next to MonitorService; adjust the import if it lives elsewhere.
from aiops_core.monitor_engine import MonitorConfig, MonitorService

monitor_service = MonitorService()

# Set up monitoring for an application
monitor_service.setup_monitoring(
    config=MonitorConfig(
        app_name="my-api",
        namespace="default",
        telegram_chat_id="123456789",
    )
)

# Get the application's health status
health = monitor_service.get_app_health("my-api", "default")
```

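
The monitor engine talks to Prometheus through `prometheus_client.py`, whose internals this README does not show. As a rough illustration of what a health lookup might involve, the sketch below builds the URL for a Prometheus instant query (`GET /api/v1/query`). The helper name `build_instant_query_url` and the `app` / `namespace` label names in the PromQL expression are assumptions made for this example, not part of the project's API.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL of a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical PromQL: the `app` and `namespace` label names are assumptions.
url = build_instant_query_url(
    "http://prometheus:9090",  # same value as PROMETHEUS_URL in this README
    'up{app="my-api", namespace="default"}',
)
print(url)
```

The function only builds the URL, so it can be checked without a running Prometheus server.
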
### 3. Repair Engine - Automatic Repair

```python
from aiops_core.repair_engine import RepairService

repair_service = RepairService()

# Process an alert: the service decides on a repair and executes it automatically
repair_service.process_alert({
    "labels": {
        "alertname": "HighMemoryUsage",
        "app": "my-api",
        "namespace": "default",
    }
})
```

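
The alert passed to `process_alert` has the same `labels` shape as an entry in an Alertmanager webhook payload, which carries a list under `alerts`, each with a `status` and `labels`. The sketch below assumes alerts arrive that way and forwards only firing ones; the helper `iter_firing_alerts` is hypothetical, and whether the project filters on status at all is not stated in this README.

```python
from typing import Any, Dict, Iterator


def iter_firing_alerts(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield each alert whose status is "firing" from a webhook payload."""
    for alert in payload.get("alerts", []):
        if alert.get("status") == "firing":
            yield alert


# Trimmed-down example payload; real payloads carry more fields.
webhook_payload = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighMemoryUsage", "app": "my-api", "namespace": "default"},
        },
        {
            "status": "resolved",
            "labels": {"alertname": "AppDown", "app": "my-api", "namespace": "default"},
        },
    ],
}

firing = list(iter_firing_alerts(webhook_payload))
for alert in firing:
    # This is where repair_service.process_alert(alert) would be called.
    print(alert["labels"]["alertname"])
```
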
## Supported Frameworks

| Framework | Status | Default port |
|-----------|--------|--------------|
| FastAPI | ✅ | 8000 |
| Flask | ✅ | 5000 |
| Express.js | ✅ | 3000 |
| Next.js | ✅ | 3000 |
| Django | 🚧 | 8000 |
| NestJS | 🚧 | 3000 |

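
The frameworks marked ✅ line up with the directories under `templates/frameworks/`. A minimal sketch of how a framework name might be resolved to its default port and template directory follows; the names `DEFAULT_PORTS` and `template_dir` are hypothetical, and the real lookup in `deploy_engine` may work differently.

```python
from pathlib import PurePosixPath

# Default ports of the frameworks that have a template directory (from the table above).
DEFAULT_PORTS = {"fastapi": 8000, "flask": 5000, "express": 3000, "nextjs": 3000}


def template_dir(framework: str) -> PurePosixPath:
    """Return the template directory for a supported framework name."""
    key = framework.lower()
    if key not in DEFAULT_PORTS:
        raise ValueError(f"Unsupported framework: {framework!r}")
    return PurePosixPath("templates/frameworks") / key


print(template_dir("FastAPI"), DEFAULT_PORTS["fastapi"])
```
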
## Auto-Repair Strategies

| Alert type | Repair action |
|------------|---------------|
| AppDown | Restart the Pod |
| HighMemoryUsage | Restart the Pod |
| PodOOMKilled | Increase the memory limit by 50% |
| HighCPUUsage | Scale out by 50% |
| HighHTTP5xxRate | Roll back to the previous version |
| PostgresHighConnections | VACUUM ANALYZE |
| DiskSpaceLow | Clear caches |

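
The table above can be read as a lookup from alert name to action. The sketch below shows one way to express it, plus the arithmetic behind the two "+50%" rows. The action identifiers, the function names, and the round-up rule are assumptions for illustration; `repair_strategies.py` may implement these differently.

```python
from typing import Optional

# Alert name -> action identifier. The identifiers are hypothetical labels
# for the actions listed in the table.
REPAIR_ACTIONS = {
    "AppDown": "restart_pod",
    "HighMemoryUsage": "restart_pod",
    "PodOOMKilled": "increase_memory_limit",
    "HighCPUUsage": "scale_out",
    "HighHTTP5xxRate": "rollback",
    "PostgresHighConnections": "vacuum_analyze",
    "DiskSpaceLow": "clear_caches",
}


def choose_action(alert: dict) -> Optional[str]:
    """Return the action for an alert, or None if the alert type is unknown."""
    return REPAIR_ACTIONS.get(alert.get("labels", {}).get("alertname"))


def increase_by_half(value: int) -> int:
    """Return value * 1.5, rounded up, using integer arithmetic only."""
    return (value * 3 + 1) // 2


alert = {"labels": {"alertname": "PodOOMKilled", "app": "my-api", "namespace": "default"}}
print(choose_action(alert))   # increase_memory_limit
print(increase_by_half(512))  # 512 MiB limit -> 768 MiB
print(increase_by_half(3))    # 3 replicas -> 5 replicas (4.5 rounded up)
```

Rounding up is a deliberate choice in this sketch: it guarantees that scaling a single replica by 50% still adds one.
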
## API Endpoints

### Authentication
- `POST /api/auth/login` - Log in
- `POST /api/auth/register` - Register
- `GET /api/auth/me` - Get the current user

### App Management
- `GET /api/apps` - List apps
- `POST /api/apps` - Create an app
- `GET /api/apps/{id}` - Get app details
- `PUT /api/apps/{id}` - Update an app
- `DELETE /api/apps/{id}` - Delete an app
- `POST /api/apps/{id}/start` - Start an app
- `POST /api/apps/{id}/stop` - Stop an app
- `POST /api/apps/{id}/restart` - Restart an app

### Deployments
- `GET /api/deployments` - List deployment records
- `POST /api/deployments` - Create a deployment
- `POST /api/deployments/{id}/cancel` - Cancel a deployment
- `POST /api/deployments/{id}/rollback` - Roll back a deployment

### Monitoring
- `GET /api/monitoring/dashboard` - Dashboard overview
- `GET /api/monitoring/apps/{id}/metrics` - App metrics
- `GET /api/monitoring/apps/{id}/health` - Health status
- `GET /api/monitoring/alerts` - List alerts

### Auto-Repair
- `GET /api/repairs` - Repair records
- `GET /api/repairs/stats` - Repair statistics
- `POST /api/repairs/apps/{id}/trigger` - Manually trigger a repair

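
As an illustration of calling one of these endpoints, the sketch below builds (but does not send) a request to the manual repair trigger using only the standard library. Sending the token as an `Authorization: Bearer` header is an assumption suggested by the `JWT_SECRET` setting; this README does not document the authentication scheme or any request body, so none is included. The helper name and the sample values are hypothetical.

```python
import urllib.request

API_BASE = "http://localhost:8000/api"  # same value as NEXT_PUBLIC_API_URL in this README


def build_trigger_repair_request(app_id: str, token: str) -> urllib.request.Request:
    """Build a POST request for /api/repairs/apps/{id}/trigger."""
    return urllib.request.Request(
        url=f"{API_BASE}/repairs/apps/{app_id}/trigger",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


req = build_trigger_repair_request("42", "example-token")
print(req.get_method(), req.full_url)
# Send it with urllib.request.urlopen(req) once the API service is running.
```
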
## Quick Start

### Start the API Service

```bash
cd aiops-core/api
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
```

### Start the Web Frontend

```bash
cd aiops-core/web
npm install
npm run dev
```

## Environment Variables

```bash
# API
JWT_SECRET=your-secret-key
PROMETHEUS_URL=http://prometheus:9090
ALERTMANAGER_URL=http://alertmanager:9093
TELEGRAM_BOT_TOKEN=your-bot-token

# Web
NEXT_PUBLIC_API_URL=http://localhost:8000/api
```

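
A minimal sketch of how the API-side variables might be loaded at startup follows. The `Settings` class and `load_settings` function are hypothetical, and treating the two secrets as required while defaulting the two URLs is an assumption of this example; the real service may handle configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    jwt_secret: str
    prometheus_url: str
    alertmanager_url: str
    telegram_bot_token: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the API settings from a mapping, failing fast on missing secrets."""
    missing = [name for name in ("JWT_SECRET", "TELEGRAM_BOT_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        jwt_secret=env["JWT_SECRET"],
        prometheus_url=env.get("PROMETHEUS_URL", "http://prometheus:9090"),
        alertmanager_url=env.get("ALERTMANAGER_URL", "http://alertmanager:9093"),
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
    )


# Passing a plain dict instead of os.environ keeps the example self-contained.
settings = load_settings({"JWT_SECRET": "your-secret-key", "TELEGRAM_BOT_TOKEN": "your-bot-token"})
print(settings.prometheus_url)
```
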
## License

© 2026 WOOO TECH. All rights reserved.