2026-03-29 ogt: Complete implementation of the Telegram message templates

New message types:
- SentryErrorMessage: Sentry error notification (with stack trace)
- ResourceWarnMessage: resource exhaustion warning (with CPU/Memory/Disk)
- RepairReportMessage: daily auto-repair report
- DailySummaryMessage: daily system status summary
- DeploySuccessMessage: CD deployment success notification
- RateLimitMessage: API rate limit warning

New send methods:
- send_sentry_error()
- send_resource_warning()
- send_repair_report()
- send_daily_summary()
- send_deploy_success()
- send_rate_limit_warning()

New buttons:
- Sentry: [🔍 View details] [🔕 Silence 1h]
- Resource: [⚡ Auto-scale] [🔕 Silence 1h]

Tests: all 14 test cases pass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Telegram Message Template Specification

> **Version**: v1.0
> **Created**: 2026-03-29 (Taipei time zone)
> **Owner**: CIO
> **Purpose**: Defines the standard format for all Telegram alert messages

---

## Message category overview

| # | Category | Code | Buttons | Status |
|---|----------|------|---------|--------|
| 1 | Incident Alert | `INCIDENT` | 4-5 | ✅ Implemented |
| 2 | CI Failure Diagnosis | `CI_FAILURE` | 0 | ✅ Implemented |
| 3 | GitHub PR Review | `PR_REVIEW` | 0 | ✅ Implemented |
| 4 | Execution Result | `EXEC_RESULT` | 0 | ✅ Implemented |
| 5 | Heartbeat | `HEARTBEAT` | 0 | ✅ Implemented |
| 6 | Silence Alert | `SILENCE` | 0 | ✅ Implemented |
| 7 | Sentry Error | `SENTRY_ERROR` | 2 | ✅ Implemented |
| 8 | Resource Exhaustion | `RESOURCE_WARN` | 2 | ✅ Implemented |
| 9 | Auto-Repair Report | `REPAIR_REPORT` | 0 | ✅ Implemented |
| 10 | Daily Summary | `DAILY_SUMMARY` | 0 | ✅ Implemented |
| 11 | Deployment Success | `DEPLOY_SUCCESS` | 0 | ✅ Implemented |
| 12 | Rate Limit Warning | `RATE_LIMIT` | 0 | ✅ Implemented |

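
The twelve category codes can be modelled as an enumeration so that callers cannot pass an unknown code. This is a minimal sketch and not the project's actual code: `MessageCategory`, `MAX_BUTTONS`, and `max_buttons` are hypothetical names; only the codes and button counts come from the overview table.

```python
from enum import Enum


class MessageCategory(str, Enum):
    """Hypothetical enum mirroring the codes in the overview table."""
    INCIDENT = "INCIDENT"
    CI_FAILURE = "CI_FAILURE"
    PR_REVIEW = "PR_REVIEW"
    EXEC_RESULT = "EXEC_RESULT"
    HEARTBEAT = "HEARTBEAT"
    SILENCE = "SILENCE"
    SENTRY_ERROR = "SENTRY_ERROR"
    RESOURCE_WARN = "RESOURCE_WARN"
    REPAIR_REPORT = "REPAIR_REPORT"
    DAILY_SUMMARY = "DAILY_SUMMARY"
    DEPLOY_SUCCESS = "DEPLOY_SUCCESS"
    RATE_LIMIT = "RATE_LIMIT"


# Upper bound on inline buttons per category, taken from the table.
# INCIDENT has 4 fixed buttons plus 1 optional auto-tuning button.
MAX_BUTTONS = {
    MessageCategory.INCIDENT: 5,
    MessageCategory.SENTRY_ERROR: 2,
    MessageCategory.RESOURCE_WARN: 2,
}


def max_buttons(category: MessageCategory) -> int:
    """Return the button limit; categories not listed carry no buttons."""
    return MAX_BUTTONS.get(category, 0)


print(max_buttons(MessageCategory("SENTRY_ERROR")))
```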

---

## Implemented message templates

### 1️⃣ Incident Alert (INCIDENT)

**File**: `telegram_gateway.py:99` - `TelegramMessage`

**Button layout**:
```
Row 1: [✅ Approve] [❌ Reject]
Row 2: [⏰ Snooze] [🔕 Silence 1h]
Row 3: [⚡ Run auto-tuning] (optional; shown only when auto_tuning_command is set)
```

**Template**:
```
═══════════════════════════
🚨 CRITICAL | harbor-core
═══════════════════════════
📋 INC-20260321-0001
🎯 Resource: harbor-core-7d4b8c9f5
━━━━━━━━━━━━━━━━━━━
🤖 Gemini arbitration
👥 Owner: ⚙️ BE (backend)
📊 Confidence: 🟢 88%
💰 Tokens: 1,234 / $0.0012
💡 Cause: misconfigured JVM heap
━━━━━━━━━━━━━━━━━━━
📊 Frequency ⚠️
├ 1h: 3 times
├ 24h: 8 times
└ Repairs: 2
🔺 Escalation: REPEAT
━━━━━━━━━━━━━━━━━━━
📊 SigNoz metrics
├ RPS: 150.2 📈
├ Error: 🟢 0.5%
└ P99: 245ms ➡️
━━━━━━━━━━━━━━━━━━━
🔧 Suggested action: delete Pod
⏱️ Downtime: ~30s
🔍 View SigNoz trace (±5min)

[✅ Approve] [❌ Reject]
[⏰ Snooze] [🔕 Silence 1h]
[⚡ Run auto-tuning]
```

**Button functions**:

| Button | Callback | Handler | Function |
|--------|----------|---------|----------|
| ✅ Approve | `approve` | `sign_approval()` | Execute the suggested action |
| ❌ Reject | `reject` | `reject_approval()` | Reject and log |
| ⏰ Snooze | `snooze` | `_handle_snooze()` | Remind again after 30 minutes |
| 🔕 Silence 1h | `silence` | `_handle_silence()` | Silence alerts for the same resource for 1 hour |
| ⚡ Run auto-tuning | `tune` | Runs a kubectl command | Automatically adjust the resource configuration |

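
The button rows above map naturally onto the Telegram Bot API's `InlineKeyboardMarkup` object, whose `inline_keyboard` field is a list of button rows. The sketch below builds that structure as a plain dict. The function name `build_incident_keyboard` and the `"<callback>:<incident_id>"` payload format are assumptions made for illustration; the spec only names the callbacks themselves. Note that Telegram limits `callback_data` to 64 bytes.

```python
import json
from typing import Optional


def build_incident_keyboard(incident_id: str,
                            auto_tuning_command: Optional[str] = None) -> dict:
    """Build an InlineKeyboardMarkup dict for an INCIDENT message.

    Hypothetical helper: the "<callback>:<incident_id>" payload format
    is an assumption, not something the spec defines.
    """
    def button(text: str, callback: str) -> dict:
        return {"text": text, "callback_data": f"{callback}:{incident_id}"}

    rows = [
        [button("✅ Approve", "approve"), button("❌ Reject", "reject")],
        [button("⏰ Snooze", "snooze"), button("🔕 Silence 1h", "silence")],
    ]
    # Row 3 is optional: only shown when an auto-tuning command exists.
    if auto_tuning_command:
        rows.append([button("⚡ Run auto-tuning", "tune")])
    return {"inline_keyboard": rows}


# "<kubectl command>" is a placeholder, not a real command.
markup = build_incident_keyboard("INC-20260321-0001", "<kubectl command>")
print(json.dumps(markup, ensure_ascii=False, indent=2))
```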

---

### 2️⃣ CI Failure Diagnosis (CI_FAILURE)

**File**: `github_webhook.py:1186`

**Buttons**: none

**Template**:
```
🔴 **CI failure diagnosis** | awoooi

📋 **Workflow**: CD
👤 **Triggered by**: ogt
🔗 [View workflow](https://github.com/...)

📝 **Summary**: Build failed due to missing dependency
🔍 **Root cause**: prometheus-client package not installed
⚠️ **Error type**: DEPENDENCY_ERROR
🎯 **Risk level**: MEDIUM
🔧 **Repair decision**: 🤖 Auto-fix

💡 **AI suggestions**:
1. Install the prometheus-client dependency
2. Update pyproject.toml

🔨 **Repair options**:
1. `AUTO_FIX` (85% confidence)
`pip install prometheus-client`

🆔 `CI-20260329-0001`
```

---

### 3️⃣ GitHub PR Review (PR_REVIEW)

**File**: `github_webhook.py:1400+`

**Buttons**: none

**Template**:
```
📝 **PR review** | awoooi

🔗 PR #123: feat: add NVIDIA integration
👤 Author: @ogt
📋 Reviewer: @reviewer

✅ **Approved**

💬 Comment: LGTM! Code quality is good
```

---

### 4️⃣ Execution Result (EXEC_RESULT)

**File**: `approval_execution.py:250` - `NotificationMessage`

**Buttons**: none

**Template (success)**:
```
✅ **Execution complete** | restart-pod

📋 Approval: APR-20260329-0001
🔧 Action: RESTART_DEPLOYMENT
📁 Namespace: awoooi-prod
⏱️ Duration: 2.3s

👥 Approvers:
- @ogt: authorized

🎯 Impact:
├ Pods: 2
├ Downtime: ~30s
└ Service: awoooi-api
```

**Template (failure)**:
```
❌ **Execution failed** | restart-pod

📋 Approval: APR-20260329-0001
🔧 Action: RESTART_DEPLOYMENT
📁 Namespace: awoooi-prod

⚠️ Error: Deployment not found

👥 Approvers:
- @ogt: authorized
```

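
The success and failure templates share most of their lines and differ only in the title, the duration line, and the error line. One way to express that is a single formatter that branches on whether an error is present. This is a sketch under stated assumptions: `ExecResult`, its field names, and `format_exec_result` are hypothetical and are not the `NotificationMessage` class referenced above; the impact block is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExecResult:
    """Hypothetical container; field names are illustrative only."""
    name: str
    approval_id: str
    action: str
    namespace: str
    approvers: List[str] = field(default_factory=list)
    duration_s: Optional[float] = None  # expected on success
    error: Optional[str] = None         # set on failure


def format_exec_result(r: ExecResult) -> str:
    """Render the success or failure variant of the EXEC_RESULT template."""
    ok = r.error is None
    title = "✅ **Execution complete**" if ok else "❌ **Execution failed**"
    lines = [
        f"{title} | {r.name}",
        "",
        f"📋 Approval: {r.approval_id}",
        f"🔧 Action: {r.action}",
        f"📁 Namespace: {r.namespace}",
    ]
    if ok and r.duration_s is not None:
        lines.append(f"⏱️ Duration: {r.duration_s:.1f}s")
    if not ok:
        lines += ["", f"⚠️ Error: {r.error}"]
    lines += ["", "👥 Approvers:"]
    lines += [f"- @{a}: authorized" for a in r.approvers]
    return "\n".join(lines)


failed = format_exec_result(ExecResult(
    "restart-pod", "APR-20260329-0001", "RESTART_DEPLOYMENT",
    "awoooi-prod", ["ogt"], error="Deployment not found"))
print(failed)
```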

---

### 5️⃣ Heartbeat (HEARTBEAT)

**File**: `telegram_gateway.py:1408`

**Buttons**: none

**Template**:
```
📡 Alert pipeline: ✅ OK

⏰ 2026-03-29 12:00 (Taipei)
🔗 AWOOOI API → Telegram
```

---

### 6️⃣ Silence Alert (SILENCE)

**File**: `telegram_gateway.py:1462`

**Buttons**: none

**Template**:
```
⚠️ **Silence alert**

Telegram has not received any messages for **2 hours**!
Please check that the alert pipeline is working.

🔗 Checklist:
├ Alertmanager → AWOOOI API
├ OpenClaw → Telegram Bot
└ K8s NetworkPolicy
```

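
The silence alert fires when nothing has been sent for 2 hours. The check itself can be as small as a timestamp comparison. The sketch below assumes the gateway records the time of the last outgoing message; `is_silent` and `SILENCE_THRESHOLD` are hypothetical names, and how the real code tracks the last message is not stated in this spec.

```python
from datetime import datetime, timedelta, timezone

# Taipei is UTC+8 with no daylight saving time.
TAIPEI = timezone(timedelta(hours=8))

# From the template: alert after 2 hours without any message.
SILENCE_THRESHOLD = timedelta(hours=2)


def is_silent(last_message_at: datetime, now: datetime,
              threshold: timedelta = SILENCE_THRESHOLD) -> bool:
    """Return True when no message was sent for at least `threshold`."""
    return now - last_message_at >= threshold


now = datetime(2026, 3, 29, 12, 0, tzinfo=TAIPEI)
last = datetime(2026, 3, 29, 9, 30, tzinfo=TAIPEI)
print(is_silent(last, now))
```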

---

## ✅ Implemented message templates (added 2026-03-29)

### 7️⃣ Sentry Error (SENTRY_ERROR)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `SentryErrorMessage`

**Button layout**:
```
Row 1: [🔍 View details] [🔕 Silence 1h]
```

**Template**:
```
═══════════════════════════
🐛 SENTRY ERROR | awoooi-api
═══════════════════════════
📋 SENTRY-abc123
🎯 Error: TypeError: Cannot read property 'x'

━━━━━━━━━━━━━━━━━━━
📊 Statistics
├ Occurrences: 15
├ Affected users: 3
└ First seen: 10 minutes ago
━━━━━━━━━━━━━━━━━━━
📍 Location: src/api/v1/incidents.py:123
🔗 Stack trace (first 3 lines):
→ incidents.py:123 in get_incident
→ service.py:45 in fetch_data
→ db.py:89 in query

🔍 [View in Sentry](http://192.168.0.110:9000/...)

[🔍 View details] [🔕 Silence 1h]
```

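
The template shows only the first 3 stack frames, each as `→ file:line in function`. A minimal sketch of that rendering follows. The `Frame` tuple layout and the `format_stack_trace` name are assumptions for illustration; the real `SentryErrorMessage` may store frames differently.

```python
from typing import List, Tuple

# Hypothetical frame shape: (filename, line number, function name).
Frame = Tuple[str, int, str]


def format_stack_trace(frames: List[Frame], limit: int = 3) -> str:
    """Render the first `limit` frames in the "→ file:line in func" style."""
    shown = frames[:limit]
    return "\n".join(f"→ {name}:{line} in {func}" for name, line, func in shown)


frames = [
    ("incidents.py", 123, "get_incident"),
    ("service.py", 45, "fetch_data"),
    ("db.py", 89, "query"),
    ("pool.py", 12, "acquire"),  # dropped: beyond the 3-frame limit
]
print(format_stack_trace(frames))
```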

---

### 8️⃣ Resource Exhaustion (RESOURCE_WARN)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `ResourceWarnMessage`

**Button layout**:
```
Row 1: [⚡ Auto-scale] [🔕 Silence 1h]
```

**Template**:
```
═══════════════════════════
⚠️ Resource alert | awoooi-api
═══════════════════════════
📋 RES-20260329-0001
🎯 Pod: awoooi-api-7d4b8c9f5-abc12

━━━━━━━━━━━━━━━━━━━
📊 Resource usage
├ CPU: 🔴 92% (limit: 500m)
├ Memory: 🟡 78% (limit: 512Mi)
└ Disk: 🟢 45%
━━━━━━━━━━━━━━━━━━━
📈 Trend: up 25% over the past 30 minutes
💡 Suggestion: increase replicas or adjust limits

[⚡ Auto-scale] [🔕 Silence 1h]
```

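
Each usage line pairs a percentage with a traffic-light emoji. The spec does not state the cut-offs, so the sketch below assumes 70% for 🟡 and 90% for 🔴, which happens to reproduce the sample values (92%, 78%, 45%). Treat both thresholds and the function names as assumptions.

```python
from typing import Optional


def usage_indicator(percent: float, warn: float = 70.0, crit: float = 90.0) -> str:
    """Map a usage percentage to 🟢/🟡/🔴. Thresholds are assumed, not from the spec."""
    if percent >= crit:
        return "🔴"
    if percent >= warn:
        return "🟡"
    return "🟢"


def format_usage(label: str, percent: float, limit: Optional[str] = None) -> str:
    """Render one usage line, appending the resource limit when known."""
    line = f"{label}: {usage_indicator(percent)} {percent:.0f}%"
    if limit:
        line += f" (limit: {limit})"
    return line


print(format_usage("CPU", 92, "500m"))
print(format_usage("Memory", 78, "512Mi"))
print(format_usage("Disk", 45))
```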

---

### 9️⃣ Auto-Repair Report (REPAIR_REPORT)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `RepairReportMessage`

**Buttons**: none

**Template**:
```
═══════════════════════════
🔧 Auto-repair report | Daily roll-up
═══════════════════════════
📅 2026-03-29

━━━━━━━━━━━━━━━━━━━
📊 Statistics
├ Total repairs: 12
├ Succeeded: ✅ 10 (83%)
├ Failed: ❌ 2 (17%)
└ Manual effort saved: ~45 minutes
━━━━━━━━━━━━━━━━━━━
🔝 Top 3 issues:
1. Pod CrashLoopBackOff (5 times)
2. OOM Killed (4 times)
3. Image Pull Failed (3 times)
━━━━━━━━━━━━━━━━━━━
💰 AI cost
├ Gemini: $0.0234 (1,823 tokens)
├ Nvidia: $0.00 (free)
└ Total: $0.0234
```

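
The statistics and Top 3 blocks are simple aggregations: 10 of 12 repairs is 83%, and the failure share is the remainder so the two always sum to 100%. The sketch below shows one way to compute both. `repair_stats` and `top_issues` are hypothetical helpers, and the input shape (a flat list of issue names) is an assumption.

```python
from collections import Counter
from typing import List


def repair_stats(succeeded: int, failed: int) -> List[str]:
    """Render the total/succeeded/failed lines with rounded percentages."""
    total = succeeded + failed
    if total == 0:
        return ["└ Total repairs: 0"]
    ok_pct = round(100 * succeeded / total)
    return [
        f"├ Total repairs: {total}",
        f"├ Succeeded: ✅ {succeeded} ({ok_pct}%)",
        # Derived from ok_pct so the two percentages always sum to 100.
        f"└ Failed: ❌ {failed} ({100 - ok_pct}%)",
    ]


def top_issues(issues: List[str], n: int = 3) -> List[str]:
    """Rank issue names by frequency and keep the top `n`."""
    ranked = Counter(issues).most_common(n)
    return [f"{i}. {name} ({count} times)"
            for i, (name, count) in enumerate(ranked, start=1)]


issues = (["Pod CrashLoopBackOff"] * 5 + ["OOM Killed"] * 4
          + ["Image Pull Failed"] * 3)
print("\n".join(repair_stats(10, 2) + top_issues(issues)))
```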

---

### 🔟 Daily Summary (DAILY_SUMMARY)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `DailySummaryMessage`

**Buttons**: none

**Template**:
```
═══════════════════════════
📊 Daily summary | AWOOOI
═══════════════════════════
📅 2026-03-29

━━━━━━━━━━━━━━━━━━━
🚨 Alert statistics
├ Total: 45
├ Critical: 2
├ Medium: 18
└ Low: 25
━━━━━━━━━━━━━━━━━━━
✅ Handling statistics
├ Auto-repaired: 30 (67%)
├ Manually approved: 10 (22%)
├ Ignored/silenced: 5 (11%)
└ Average response: 2.3 minutes
━━━━━━━━━━━━━━━━━━━
📈 Availability
├ API: 99.95%
├ Web: 99.98%
└ Worker: 99.90%
━━━━━━━━━━━━━━━━━━━
💰 Cost
├ AI: $0.15
├ Cloud: $0.00
└ Budget remaining: $9.85
```

---

### 1️⃣1️⃣ Deployment Success (DEPLOY_SUCCESS)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `DeploySuccessMessage`

**Buttons**: none

**Template**:
```
✅ **Deployment succeeded** | awoooi

📋 Commit: abc1234
👤 Triggered by: @ogt
📁 Environment: Production

━━━━━━━━━━━━━━━━━━━
📊 Deployment details
├ API: v1.2.3 ✅
├ Web: v1.2.3 ✅
├ Worker: v1.2.3 ✅
└ Duration: 3m 45s
━━━━━━━━━━━━━━━━━━━
🧪 E2E tests: 26/26 PASSED
📊 Health checks: ✅ all passed

🔗 [View workflow](https://github.com/...)
```

---

### 1️⃣2️⃣ Rate Limit Warning (RATE_LIMIT)

**Status**: ✅ **Implemented** (2026-03-29)

**File**: `telegram_gateway.py` - `RateLimitMessage`

**Buttons**: none

**Template**:
```
⚠️ **API rate limit warning**

━━━━━━━━━━━━━━━━━━━
📊 Gemini API
├ Usage today: 450/500 (90%)
├ Tokens: 85,000/100,000
└ Cost: $0.08

💡 Suggestions:
- Consider switching to Ollama first
- Or raise the daily limit

🔄 Resets tomorrow at 00:00
```

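
The usage line is a ratio against the daily quota: 450 of 500 requests is 90%. The sketch below computes that percentage and decides whether to warn. The 90% trigger is an assumption inferred from the sample message, not a value the spec states, and the function names are hypothetical.

```python
def usage_percent(used: int, limit: int) -> int:
    """Return usage as a whole-number percentage of the quota."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return round(100 * used / limit)


def should_warn(used: int, limit: int, threshold_pct: int = 90) -> bool:
    """Warn once usage reaches the threshold (90% is an assumed default)."""
    return usage_percent(used, limit) >= threshold_pct


def quota_line(label: str, used: int, limit: int) -> str:
    """Render "label: used/limit (pct%)" with thousands separators."""
    return f"{label}: {used:,}/{limit:,} ({usage_percent(used, limit)}%)"


print(quota_line("Usage today", 450, 500))
print(should_warn(450, 500))
```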

---

## Button callback reference

| Callback | Handler | Function | Applies to |
|----------|---------|----------|------------|
| `approve` | `sign_approval()` | Approve and execute | INCIDENT |
| `reject` | `reject_approval()` | Reject execution | INCIDENT |
| `snooze` | `_handle_snooze()` | Snooze for 30 minutes | INCIDENT |
| `silence` | `_handle_silence()` | Silence for 1 hour | INCIDENT, SENTRY_ERROR, RESOURCE_WARN |
| `tune` | kubectl execution | Auto-tuning | INCIDENT |
| `scale` | HPA trigger | Auto-scaling | RESOURCE_WARN |
| `view` | Opens link | View details | SENTRY_ERROR |

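
Because each callback is only valid for certain message categories, a handler can reject mismatched button presses before doing any work. The sketch below encodes the table as a lookup. `CALLBACK_SCOPE`, `parse_callback`, and `is_allowed` are hypothetical names, and the `"action:target"` payload format is an assumption; the spec does not define how `callback_data` is encoded.

```python
from typing import Dict, FrozenSet, Tuple

# Which message categories each callback may come from (per the table).
CALLBACK_SCOPE: Dict[str, FrozenSet[str]] = {
    "approve": frozenset({"INCIDENT"}),
    "reject": frozenset({"INCIDENT"}),
    "snooze": frozenset({"INCIDENT"}),
    "silence": frozenset({"INCIDENT", "SENTRY_ERROR", "RESOURCE_WARN"}),
    "tune": frozenset({"INCIDENT"}),
    "scale": frozenset({"RESOURCE_WARN"}),
    "view": frozenset({"SENTRY_ERROR"}),
}


def parse_callback(data: str) -> Tuple[str, str]:
    """Split "action:target" (an assumed format) into its two parts."""
    action, _, target = data.partition(":")
    return action, target


def is_allowed(action: str, category: str) -> bool:
    """Return True when `action` is valid for messages of `category`."""
    return category in CALLBACK_SCOPE.get(action, frozenset())


action, target = parse_callback("silence:RES-20260329-0001")
print(action, target, is_allowed(action, "RESOURCE_WARN"))
```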

---

## Message format conventions

### Emoji usage rules

| Category | Emoji | Meaning |
|----------|-------|---------|
| Severity | 🚨 🔴 ⚠️ ℹ️ | Critical/High/Medium/Low |
| Status | ✅ ❌ 🟢 🟡 | Success/Failure/OK/Warning |
| Action | 🔧 ⚡ 🔍 📊 | Repair/Execute/View/Statistics |
| Resource | 📋 🎯 📁 👤 | ID/Resource/Directory/User |
| Cost | 💰 💵 | Tokens/Cost |

### Separator styles

```
═══════════════════════════ # Main title separator (27 × ═)
━━━━━━━━━━━━━━━━━━━ # Section separator (19 × ━)
├ ─ └ # Tree structure
```

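
Defining the separators once as constants keeps their lengths from drifting between templates. The sketch below is illustrative only: `TITLE_RULE`, `SECTION_RULE`, `title_block`, and `tree` are hypothetical names, not identifiers from `telegram_gateway.py`.

```python
from typing import List

TITLE_RULE = "═" * 27    # main title separator
SECTION_RULE = "━" * 19  # section separator


def title_block(label: str, subject: str) -> List[str]:
    """Wrap "label | subject" between two title rules."""
    return [TITLE_RULE, f"{label} | {subject}", TITLE_RULE]


def tree(items: List[str]) -> List[str]:
    """Prefix items with ├, using └ for the last one."""
    lines = []
    for i, item in enumerate(items):
        branch = "└" if i == len(items) - 1 else "├"
        lines.append(f"{branch} {item}")
    return lines


message = (title_block("🚨 CRITICAL", "harbor-core")
           + [SECTION_RULE]
           + tree(["1h: 3 times", "24h: 8 times", "Repairs: 2"]))
print("\n".join(message))
```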
### Character limits

- Total length: **900 characters** (Telegram's limit is 4096; the rest is headroom)
- Title: **25 characters**
- Summary: **50 characters**
- Suggested action: **35 characters**

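
These limits can be enforced with a small truncation helper applied to each field before the message is assembled. The sketch below is one possible approach, not the project's implementation: `LIMITS` and `truncate` are hypothetical names, and marking cut text with a trailing `…` is an assumption. Note that Python's `len()` counts Unicode code points; the spec does not say how characters should be counted.

```python
# Per-field limits from the list above, in characters.
LIMITS = {"total": 900, "title": 25, "summary": 50, "action": 35}


def truncate(text: str, field: str) -> str:
    """Cut `text` to the limit for `field`, ending with "…" when shortened.

    len() counts code points, which is an assumption about how the
    spec measures "characters".
    """
    limit = LIMITS[field]
    if len(text) <= limit:
        return text
    return text[: limit - 1] + "…"


print(truncate("Build failed due to missing dependency", "title"))
```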

---

## Implementation status

| Message category | Status | Implemented on |
|------------------|--------|----------------|
| SENTRY_ERROR | ✅ Implemented | 2026-03-29 |
| RESOURCE_WARN | ✅ Implemented | 2026-03-29 |
| RATE_LIMIT | ✅ Implemented | 2026-03-29 |
| REPAIR_REPORT | ✅ Implemented | 2026-03-29 |
| DAILY_SUMMARY | ✅ Implemented | 2026-03-29 |
| DEPLOY_SUCCESS | ✅ Implemented | 2026-03-29 |

**All 12 message templates are now implemented!**

---

## Change log

| Date | Version | Changes |
|------|---------|---------|
| 2026-03-29 | v1.1 | ✅ Implemented 6 new message templates (Sentry/Resource/Repair/Daily/Deploy/RateLimit) |
| 2026-03-29 | v1.0 | Initial version; defined 12 message templates |