fix(awooop): mirror ops notifications through api
All checks were successful
Code Review / ai-code-review (push) Successful in 10s
@@ -823,7 +823,8 @@ jobs:
 
       # 2026-04-09 Claude Sonnet 4.6: Sprint 5.2: sync ops scripts to 188 (ollama user)
       # DEPLOY_SSH_KEY_188 = gitea-cd-deploy-188 (ed25519, present only in 188's authorized_keys)
-      # Scripts: docker-health-monitor.sh + pg-backup.sh (sensing layer + backup)
+      # Scripts: docker-health-monitor.sh + pg-backup.sh + notify-awoooi-ops.sh
+      # Sensing-layer and backup notifications go through the AWOOI API/AwoooP first; direct Telegram sends remain only as a fallback when the API is offline.
       - name: Sync Ops Scripts to 188
         continue-on-error: true
         env:
@@ -870,9 +871,16 @@ jobs:
             && echo "✅ pg-backup.sh synced" \
             || echo "⚠️ pg-backup.sh sync failed"
 
+          # Sync the ops notification helper
+          timeout -k 5s 60s scp "${SCP_188_OPTS[@]}" \
+            scripts/ops/notify-awoooi-ops.sh \
+            ollama@192.168.0.188:~/awoooi-ops/notify-awoooi-ops.sh \
+            && echo "✅ notify-awoooi-ops.sh synced" \
+            || echo "⚠️ notify-awoooi-ops.sh sync failed"
+
           # Ensure execute permissions
           timeout -k 5s 30s ssh "${SSH_188_OPTS[@]}" ollama@192.168.0.188 \
-            "chmod +x ~/awoooi-ops/docker-health-monitor.sh ~/awoooi-ops/pg-backup.sh && echo '✅ Permissions set'" \
+            "chmod +x ~/awoooi-ops/docker-health-monitor.sh ~/awoooi-ops/pg-backup.sh ~/awoooi-ops/notify-awoooi-ops.sh && echo '✅ Permissions set'" \
             || echo "⚠️ Failed to set permissions"
 
       - name: Notify Pipeline Failure
@@ -1,3 +1,73 @@
+## 2026-05-12 | Converge ops notification bypasses onto the AWOOI API / AwoooP
+
+**Background**: CI/CD notifications already go through the AWOOI Alertmanager entry point first, and TelegramGateway mirrors them to the AwoooP Run Timeline. The ops scripts on 188, however, still had direct Telegram send paths. As a result, operational events such as backups, DR drills, and host backups bypassed AwoooP governance and auditing and appeared only in the Telegram group.
+
+**Changes in this commit**:
+- Add `scripts/ops/notify-awoooi-ops.sh`:
+  - Wraps the ops job status in an Alertmanager payload.
+  - Posts to `${AWOOOI_API_URL}/api/v1/webhooks/alertmanager` by default.
+  - Supports both `AWOOI_OPS_*` and `AWOOOI_OPS_*` environment variables.
+  - Supports `AWOOI_OPS_DRY_RUN=1`, which prints the JSON so it can be verified before deployment.
+- `pg-backup.sh`:
+  - DB backup success / failure goes through `notify-awoooi-ops.sh` first.
+  - Alertname is `Backup.PG` with severity fixed at `info`, so backup status notifications do not enter the LLM path and burn tokens.
+  - Direct Telegram sends remain only as a fallback when the API is unreachable.
+- `dr-drill.sh`:
+  - DR dry-run / failure / monthly drill results go through the AWOOI API first.
+  - Alertname is `DRDrillStatus`, and the elapsed time is included.
+- `backup-from-110.sh`:
+  - Host backup failures go through the AWOOI API first; Telegram is sent directly only as a fallback.
+  - Alertname is `HostBackupFailed` with severity fixed at `info`, so the script's immediate notification and the Prometheus long-duration backup alert do not both trigger the LLM.
+- `.gitea/workflows/cd.yaml`:
+  - `Sync Ops Scripts to 188` now also syncs `notify-awoooi-ops.sh`.
+  - The chmod step now includes the helper, so `pg-backup.sh` on 188 can use the helper in the same directory.
+- The Telegram fallback now uses `--data-urlencode text=...`, so multi-line HTML messages no longer break the formatting inside a JSON string.
+
+**Verification**:
+- `bash -n scripts/ops/notify-awoooi-ops.sh scripts/ops/pg-backup.sh scripts/ops/dr-drill.sh scripts/ops/backup-from-110.sh` → passed.
+- `AWOOI_OPS_DRY_RUN=1 ... scripts/ops/notify-awoooi-ops.sh` → the JSON parses, and the multi-line detail is preserved.
+- `ruby -e 'require "yaml"; YAML.load_file(".gitea/workflows/cd.yaml")'` → `yaml ok`.
+- `git diff --check` → clean.
+
+Assessment: this round closes the main notification bypass for the 188 ops scripts. Production messages now go to AWOOI API / TelegramGateway / AwoooP first; direct Telegram sends remain only as a last-resort fallback when the API is offline. Next steps: deploy `backup-from-110.sh`, which is not yet covered by the CD sync, onto 188, and gradually clean up the direct Telegram fallbacks in other workflows.
+
+## 2026-05-12 | CI/CD outbound messages now land in the AwoooP Run Timeline
+
+**Background**: CI/CD notifications had been switched to the AWOOI API, but at first they did not show up in the production AwoooP Run Monitor. Tracing the logs confirmed the cause: the legacy outbound mirror relied on DB defaults when creating `awooop_run_state`, but NOT NULL columns such as `attempt_count` in the production table had no default applied, which produced `telegram_outbound_mirror_failed`.
+
+**Changes in this commit**:
+- `ensure_completed_shadow_run()` in `channel_hub.py` now writes explicitly:
+  - `attempt_count = 0`
+  - `max_attempts = 3`
+  - `cost_usd = 0.0000`
+  - `step_count = 0`
+- `platform_operator_service.py` changes the outbound timeline title for messages containing `[AWOOOI CI/CD]` to `TELEGRAM:CI/CD 狀態通知` (CI/CD status notification) instead of the generic `TELEGRAM:處置結果` (handling result).
+- `.gitea/workflows/cd.yaml` fixes a self-match in the Docker build lock check: `grep 'docker build'` no longer matches its own shell script, which had prevented orphan locks from clearing themselves.
+
+**Verification**:
+- Gitea CD `#1885` success:
+  - `tests` success.
+  - `build-and-deploy` success.
+  - `post-deploy-checks` success.
+- K8s live image:
+  - `awoooi-api` → `192.168.0.110:5000/awoooi/api:03ba9678d54cd24038cbe3162b6c03c31956548c`.
+  - `awoooi-web` → `192.168.0.110:5000/awoooi/web:03ba9678d54cd24038cbe3162b6c03c31956548c`.
+  - `awoooi-worker` → `192.168.0.110:5000/awoooi/api:03ba9678d54cd24038cbe3162b6c03c31956548c`.
+- Production smoke:
+  - `/api/v1/health` → 200.
+  - `/zh-TW/awooop/runs` → 200.
+  - `/api/v1/platform/runs/list?per_page=3` → `total=11`.
+- Run detail `5f422d51-f967-532b-9eaf-46c1616ef455`:
+  - timeline contains `TELEGRAM:CI/CD 狀態通知`.
+  - content preview contains `[AWOOOI CI/CD] | post-deploy`.
+- A short window of the production API log shows:
+  - `alertmanager_cicd_detected`
+  - `completed_shadow_run_created`
+  - `outbound_message_recorded`
+  - No further `telegram_outbound_mirror_failed`, `NotNullViolation`, or `IntegrityError`.
+
+Assessment: CI/CD outbound messages are no longer just Telegram messages; they are governance events that can be looked up in the AwoooP Run Monitor / Timeline. This is the first verifiable closed loop in folding AWOOOP back into the control plane of the AI automation flywheel.
+
 ## 2026-05-07 | AwoooP legacy Channel Event: add a completed shadow run anchor
 
 **Background**: Production `/api/v1/platform/runs/list` returned `total=0`, yet the system kept producing Telegram outbound messages and grouped child alerts. An audit confirmed that legacy Telegram outbound only writes `awooop_outbound_message` with a soft `run_id` and no matching `awooop_run_state`, and that grouped child alerts only land in `awooop_conversation_event`. As a result, the AwoooP Console had event / outbound data, but the Run Monitor main list had no aggregation anchor and looked like an empty shell.
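The CI/CD changelog entry above attributes the mirror failure to NOT NULL columns that had no default in production. A minimal sketch of that failure mode and of the explicit-write fix follows. It uses SQLite from the Python standard library as a stand-in for the production database, and the table layout is a simplified assumption; only the column names and values come from the changelog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: NOT NULL columns with no DEFAULT,
# which is the state the changelog describes for the production table.
conn.execute(
    """
    CREATE TABLE awooop_run_state (
        run_id TEXT PRIMARY KEY,
        attempt_count INTEGER NOT NULL,
        max_attempts INTEGER NOT NULL,
        cost_usd REAL NOT NULL,
        step_count INTEGER NOT NULL
    )
    """
)


def insert_relying_on_defaults(run_id):
    """Insert only the key and rely on the database to fill in the rest."""
    try:
        conn.execute("INSERT INTO awooop_run_state (run_id) VALUES (?)", (run_id,))
        return True
    except sqlite3.IntegrityError:
        # SQLite reports the NOT NULL violation as an IntegrityError.
        return False


def insert_explicit(run_id):
    """Write every NOT NULL column explicitly, using the values from the changelog."""
    conn.execute(
        "INSERT INTO awooop_run_state "
        "(run_id, attempt_count, max_attempts, cost_usd, step_count) "
        "VALUES (?, 0, 3, 0.0, 0)",
        (run_id,),
    )
    return True


implicit_ok = insert_relying_on_defaults("run-a")
explicit_ok = insert_explicit("run-b")
row_count = conn.execute("SELECT COUNT(*) FROM awooop_run_state").fetchone()[0]
print(implicit_ok, explicit_ok, row_count)
```

Writing the values in the insert keeps the code independent of whether a given environment's table actually carries the defaults.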
@@ -31,6 +31,7 @@ TEXTFILE_DIR="${TEXTFILE_DIR:-/home/ollama/node_exporter_textfiles}"
 TEXTFILE_PROM="${TEXTFILE_DIR}/backup.prom"
 DATE=$(date +%Y%m%d-%H%M%S)
 ERRORS=0
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 
 log() {
     echo "[$DATE] $*" | tee -a "$LOG"
@@ -38,6 +39,45 @@ log() {
 
 log "=== Starting backup from 110 ==="
 
+notify_awoooi_ops() {
+    local status="$1"
+    local msg="$2"
+    local helper="${SCRIPT_DIR}/notify-awoooi-ops.sh"
+    [[ -x "$helper" ]] || return 1
+
+    AWOOI_OPS_ALERTNAME="HostBackupFailed" \
+    AWOOI_OPS_JOB_NAME="188 host-level backup" \
+    AWOOI_OPS_STATUS="$status" \
+    AWOOI_OPS_SEVERITY="info" \
+    AWOOI_OPS_SOURCE="backup-from-110" \
+    AWOOI_OPS_COMPONENT="host-backup" \
+    AWOOI_OPS_SUMMARY="188 host-level backup ${status}" \
+    AWOOI_OPS_DETAIL="$msg" \
+    "$helper" >/dev/null
+}
+
+notify_telegram_fallback() {
+    local msg="$1"
+    local tg_token="${TG_BOT_TOKEN:-${TELEGRAM_BOT_TOKEN:-}}"
+    local tg_chat="${TELEGRAM_ALERT_CHAT_ID:-${SRE_GROUP_CHAT_ID:--1003711974679}}"
+    if [ -n "$tg_token" ] && [ -n "$tg_chat" ]; then
+        curl -s -X POST "https://api.telegram.org/bot${tg_token}/sendMessage" \
+            -d "chat_id=${tg_chat}" \
+            --data-urlencode "text=${msg}" \
+            > /dev/null || true
+    fi
+}
+
+notify_ops() {
+    local status="$1"
+    local msg="$2"
+
+    # Primary path: hand off to the AWOOI API first; TelegramGateway sends the message and mirrors it to AwoooP.
+    # Fall back to a direct Telegram send only when the API is unreachable or the helper is not deployed.
+    notify_awoooi_ops "$status" "$msg" && return 0
+    notify_telegram_fallback "$msg"
+}
+
 # ── Harbor registry data ──────────────────────────────────────────────────────
 # 2026-04-17 ogt: read volumes through the docker socket instead (/var/lib/docker/volumes/ is 710 root:root)
 # wooo is a member of the docker group and can mount volumes via docker run, but cannot read the filesystem path directly
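The `notify_ops` function in the hunk above tries the AWOOI API helper first and sends to Telegram directly only if that fails. The control flow can be sketched in Python as follows. The three sender functions are hypothetical stand-ins for the helper script and the `curl` call, and treating any exception as "unreachable" is an assumption of this sketch.

```python
from typing import Callable

Sender = Callable[[str, str], bool]


def notify_ops(status: str, msg: str, primary: Sender, fallback: Sender) -> str:
    """Try the primary channel first; use the fallback only if it fails."""
    try:
        if primary(status, msg):
            return "primary"
    except Exception:
        # Any error on the primary path is treated like "API unreachable".
        pass
    # Like `|| true` in the shell version, a fallback failure is swallowed.
    try:
        fallback(status, msg)
    except Exception:
        pass
    return "fallback"


# Hypothetical stand-ins for the API helper and the direct Telegram call.
sent = []


def api_up(status: str, msg: str) -> bool:
    sent.append(f"api:{status}")
    return True


def api_down(status: str, msg: str) -> bool:
    raise ConnectionError("API unreachable")


def telegram(status: str, msg: str) -> bool:
    sent.append(f"telegram:{msg}")
    return True


route_ok = notify_ops("failed", "backup failed", api_up, telegram)
route_down = notify_ops("failed", "backup failed", api_down, telegram)
print(route_ok, route_down, sent)
```

Keeping the fallback inside the caller rather than inside the helper means the emergency path still works when the helper itself is missing from the host.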
@@ -100,15 +140,6 @@ EOF
     exit 0
 else
     log "=== Backup FAILED ($ERRORS errors) ==="
-
-    # Telegram alert: the official destination is the SRE war-room group.
-    TG_TOKEN="${TG_BOT_TOKEN:-}"
-    TG_CHAT="${TELEGRAM_ALERT_CHAT_ID:-${SRE_GROUP_CHAT_ID:--1003711974679}}"
-    if [ -n "$TG_TOKEN" ] && [ -n "$TG_CHAT" ]; then
-        curl -s -X POST "https://api.telegram.org/bot${TG_TOKEN}/sendMessage" \
-            -d "chat_id=${TG_CHAT}" \
-            -d "text=🚨 backup-from-110.sh FAILED on 188 - ${ERRORS} error(s) at ${DATE}" \
-            > /dev/null || true
-    fi
+    notify_ops "failed" "🚨 backup-from-110.sh FAILED on 188 - ${ERRORS} error(s) at ${DATE}"
     exit 1
 fi
@@ -22,6 +22,7 @@ DR_NAMESPACE="awoooi-dr-test"
 RESTORE_TIMEOUT="${RESTORE_TIMEOUT:-600}"  # 10 minutes
 SECRETS_FILE="${SECRETS_FILE:-/home/wooo/awoooi-ops-secrets/secrets.env}"
 DRY_RUN="${1:-}"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 
 [[ -f "$SECRETS_FILE" ]] && source "$SECRETS_FILE"
 
@@ -31,13 +32,38 @@ START_TIME=$(date +%s)
 
 log() { echo "[$(date '+%Y-%m-%d %H:%M:%S %z')] $*"; }
 
+notify_awoooi_ops() {
+    local status="$1"
+    local msg="$2"
+    local helper="${SCRIPT_DIR}/notify-awoooi-ops.sh"
+    [[ -x "$helper" ]] || return 1
+
+    AWOOI_OPS_ALERTNAME="DRDrillStatus" \
+    AWOOI_OPS_JOB_NAME="DR Drill monthly exercise" \
+    AWOOI_OPS_STATUS="$status" \
+    AWOOI_OPS_SEVERITY="info" \
+    AWOOI_OPS_SOURCE="dr-drill" \
+    AWOOI_OPS_COMPONENT="disaster-recovery" \
+    AWOOI_OPS_SUMMARY="DR Drill ${status}" \
+    AWOOI_OPS_DETAIL="$msg" \
+    AWOOI_OPS_DURATION_SECONDS="$(elapsed)" \
+    "$helper" >/dev/null
+}
+
 notify_telegram() {
     local msg="$1"
+    local status="${2:-success}"
+
+    # Primary path: hand off to the AWOOI API first; TelegramGateway sends the message and mirrors it to AwoooP.
+    # Fall back to a direct Telegram send only when the API is unreachable or the helper is not deployed.
+    notify_awoooi_ops "$status" "$msg" && return 0
+
     local chat_id="${TELEGRAM_ALERT_CHAT_ID:-${SRE_GROUP_CHAT_ID:--1003711974679}}"
     if [[ -n "${TELEGRAM_BOT_TOKEN:-}" && -n "$chat_id" ]]; then
         curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-            -H "Content-Type: application/json" \
-            -d "{\"chat_id\":\"${chat_id}\",\"text\":\"${msg}\",\"parse_mode\":\"HTML\"}" \
+            -d "chat_id=${chat_id}" \
+            -d "parse_mode=HTML" \
+            --data-urlencode "text=${msg}" \
             > /dev/null 2>&1 || true
     fi
 }
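The hunk above replaces a hand-built JSON body with form fields and `--data-urlencode`. A minimal Python sketch of why that matters for multi-line messages follows, using only the standard library. The sample message, the chat id, and the function names are illustrative and do not come from the scripts.

```python
import json
from urllib.parse import parse_qs, urlencode

# Illustrative multi-line HTML message, similar in shape to the DR drill report.
msg = '🔍 <b>DR Drill DRY RUN</b>\n├ backup: "daily-01"\n└ status: Completed'


def naive_json_body(chat_id: str, text: str) -> str:
    """Mimic the old shell approach: paste the text straight into a JSON string."""
    return '{"chat_id":"' + chat_id + '","text":"' + text + '","parse_mode":"HTML"}'


def form_body(chat_id: str, text: str) -> str:
    """Mimic the new approach: form fields, with every value percent-encoded."""
    return urlencode({"chat_id": chat_id, "parse_mode": "HTML", "text": text})


def is_valid_json(body: str) -> bool:
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False


# Raw newlines and unescaped quotes make the pasted JSON unparseable.
naive_ok = is_valid_json(naive_json_body("-100", msg))
# The form-encoded body decodes back to the exact original text.
round_trip = parse_qs(form_body("-100", msg))["text"][0]
print(naive_ok, round_trip == msg)
```

An alternative would be to build the JSON with a real serializer, which is what `notify-awoooi-ops.sh` does for the Alertmanager payload; form encoding is the lighter option when only `curl` is available.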
@@ -189,18 +215,18 @@ main() {
     if [[ "$DRY_RUN" == "--dry-run" ]]; then
         log "🔍 DRY RUN mode: check the backup only, no restore"
         local backup
-        backup=$(find_latest_backup) || { notify_telegram "❌ DR Drill failed: no valid backup found"; exit 1; }
+        backup=$(find_latest_backup) || { notify_telegram "❌ DR Drill failed: no valid backup found" "failed"; exit 1; }
         log "✅ Latest backup: ${backup}"
         notify_telegram "🔍 <b>DR Drill DRY RUN</b>
 ├ Latest backup: ${backup}
-└ Status: Completed ✅ (restore not executed)"
+└ Status: Completed ✅ (restore not executed)" "success"
         return 0
     fi
 
     local backup
     backup=$(find_latest_backup) || {
         notify_telegram "❌ <b>DR Drill failed</b>
-└ No valid Velero backup found"
+└ No valid Velero backup found" "failed"
         exit 1
     }
     log "📦 Using backup: ${backup}"
@@ -233,12 +259,15 @@ main() {
 
     log "=== DR Drill finished: ${overall} (${minutes}m${seconds}s) ==="
 
+    local notify_status="success"
+    [[ "$overall" == *"FAIL"* ]] && notify_status="failed"
+
     notify_telegram "${overall} <b>DR Drill monthly exercise</b>
 ├ Backup: ${backup}
 ├ Restore: ${pod_status}
 ├ API Health: ${health_status}
 ├ Duration: ${minutes}m${seconds}s
-└ Time: $(date '+%Y-%m-%d %H:%M') +0800"
+└ Time: $(date '+%Y-%m-%d %H:%M') +0800" "$notify_status"
 
     [[ "$overall" == *"FAIL"* ]] && exit 1
     return 0
100
scripts/ops/notify-awoooi-ops.sh
Executable file
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+# 2026-05-12 Codex: ops notifications go through the AWOOI Alertmanager entry point first, so that TelegramGateway
+# sends them centrally and mirrors them to AwoooP. Callers keep a direct Telegram fallback for when the API is offline.
+set -euo pipefail
+
+API_BASE="${AWOOOI_API_URL:-https://awoooi.wooo.work}"
+ALERTMANAGER_URL="${AWOOOI_ALERTMANAGER_URL:-${API_BASE%/}/api/v1/webhooks/alertmanager}"
+
+JOB_NAME="${AWOOI_OPS_JOB_NAME:-${AWOOOI_OPS_JOB_NAME:-Ops Job}}"
+STATUS_RAW="${AWOOI_OPS_STATUS:-${AWOOOI_OPS_STATUS:-success}}"
+SEVERITY="${AWOOI_OPS_SEVERITY:-${AWOOOI_OPS_SEVERITY:-info}}"
+ALERTNAME="${AWOOI_OPS_ALERTNAME:-${AWOOOI_OPS_ALERTNAME:-OpsJobStatus}}"
+SOURCE="${AWOOI_OPS_SOURCE:-${AWOOOI_OPS_SOURCE:-ops-script}}"
+HOSTNAME_VALUE="${AWOOI_OPS_HOST:-${AWOOOI_OPS_HOST:-$(hostname 2>/dev/null || echo unknown)}}"
+COMPONENT="${AWOOI_OPS_COMPONENT:-${AWOOOI_OPS_COMPONENT:-ops}}"
+SUMMARY="${AWOOI_OPS_SUMMARY:-${AWOOOI_OPS_SUMMARY:-${JOB_NAME}}}"
+DETAIL="${AWOOI_OPS_DETAIL:-${AWOOOI_OPS_DETAIL:-}}"
+DURATION_SECONDS="${AWOOI_OPS_DURATION_SECONDS:-${AWOOOI_OPS_DURATION_SECONDS:-0}}"
+
+if ! command -v python3 >/dev/null 2>&1; then
+    echo "python3 missing; cannot build Alertmanager JSON payload" >&2
+    exit 2
+fi
+
+payload_file="$(mktemp)"
+trap 'rm -f "$payload_file"' EXIT
+
+JOB_NAME="$JOB_NAME" \
+STATUS_RAW="$STATUS_RAW" \
+SEVERITY="$SEVERITY" \
+ALERTNAME="$ALERTNAME" \
+SOURCE="$SOURCE" \
+HOSTNAME_VALUE="$HOSTNAME_VALUE" \
+COMPONENT="$COMPONENT" \
+SUMMARY="$SUMMARY" \
+DETAIL="$DETAIL" \
+DURATION_SECONDS="$DURATION_SECONDS" \
+python3 - <<'PY' > "$payload_file"
+from __future__ import annotations
+
+import datetime as dt
+import json
+import os
+import re
+
+status = (os.environ.get("STATUS_RAW") or "success").strip().lower()
+if status not in {"success", "failed", "warning", "running", "skipped"}:
+    status = "warning"
+
+severity = (os.environ.get("SEVERITY") or "info").strip().lower()
+if severity not in {"info", "warning", "critical"}:
+    severity = "info"
+
+alertname = (os.environ.get("ALERTNAME") or "OpsJobStatus").strip()
+safe_alertname = re.sub(r"[^A-Za-z0-9_.:-]+", "_", alertname).strip("_") or "OpsJobStatus"
+
+payload = {
+    "version": "4",
+    "status": "firing",
+    "receiver": "awoooi-ops",
+    "groupLabels": {"alertname": safe_alertname},
+    "commonLabels": {"alertname": safe_alertname, "severity": severity},
+    "commonAnnotations": {},
+    "alerts": [
+        {
+            "status": "firing",
+            "labels": {
+                "alertname": safe_alertname,
+                "severity": severity,
+                "status": status,
+                "source": os.environ.get("SOURCE", "ops-script"),
+                "job": os.environ.get("JOB_NAME", "Ops Job"),
+                "host": os.environ.get("HOSTNAME_VALUE", "unknown"),
+                "component": os.environ.get("COMPONENT", "ops"),
+                "duration_seconds": os.environ.get("DURATION_SECONDS", "0"),
+            },
+            "annotations": {
+                "summary": os.environ.get("SUMMARY", ""),
+                "description": os.environ.get("DETAIL", ""),
+            },
+            "startsAt": dt.datetime.now(dt.timezone.utc).isoformat().replace("+00:00", "Z"),
+        }
+    ],
+}
+print(json.dumps(payload, ensure_ascii=False))
+PY
+
+if [ "${AWOOI_OPS_DRY_RUN:-${AWOOOI_OPS_DRY_RUN:-0}}" = "1" ]; then
+    cat "$payload_file"
+    exit 0
+fi
+
+curl -fsS \
+    --connect-timeout "${AWOOI_OPS_CONNECT_TIMEOUT:-5}" \
+    --max-time "${AWOOI_OPS_MAX_TIME:-12}" \
+    -H "Content-Type: application/json" \
+    --data-binary "@${payload_file}" \
+    "$ALERTMANAGER_URL" >/dev/null
+
+echo "AwoooP-mirrored ops notification sent via ${ALERTMANAGER_URL}"
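The embedded Python in the helper above normalizes three caller-supplied values before they become Alertmanager labels. The sketch below pulls those rules out into standalone functions so their edge cases are easier to see. The function names and the sample inputs are illustrative; the allowed sets and the regular expression are copied from the helper.

```python
import re

ALLOWED_STATUS = {"success", "failed", "warning", "running", "skipped"}
ALLOWED_SEVERITY = {"info", "warning", "critical"}


def normalize_status(raw):
    """Unknown statuses degrade to "warning"; empty or missing means "success"."""
    status = (raw or "success").strip().lower()
    return status if status in ALLOWED_STATUS else "warning"


def normalize_severity(raw):
    """Unknown, empty, or missing severities all fall back to "info"."""
    severity = (raw or "info").strip().lower()
    return severity if severity in ALLOWED_SEVERITY else "info"


def safe_alertname(raw):
    """Collapse runs of disallowed characters to "_" and trim the edges."""
    name = (raw or "OpsJobStatus").strip()
    return re.sub(r"[^A-Za-z0-9_.:-]+", "_", name).strip("_") or "OpsJobStatus"


statuses = [normalize_status(" FAILED "), normalize_status("done"), normalize_status(None)]
severities = [normalize_severity("Critical"), normalize_severity("HIGH")]
names = [
    safe_alertname("Backup.PG"),
    safe_alertname("Host Backup (110)!"),
    safe_alertname("備份"),
]
print(statuses, severities, names)
```

Because an unknown severity falls back to `info` rather than raising, a typo in a caller cannot escalate a routine backup report.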
@@ -12,6 +12,7 @@ BACKUP_DIR="${BACKUP_DIR:-/home/ollama/backups}"
 SECRETS_FILE="${SECRETS_FILE:-/home/ollama/awoooi-ops-secrets/secrets.env}"
 RETAIN_DAYS="${RETAIN_DAYS:-7}"
 AWOOOI_API_URL="${AWOOOI_API_URL:-https://awoooi.wooo.work}"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 
 # Load secrets (including the Telegram token for the fallback)
 [[ -f "$SECRETS_FILE" ]] && source "$SECRETS_FILE"
@@ -21,13 +22,37 @@ LOG_PREFIX="[$(date '+%Y-%m-%d %H:%M:%S %z')]"
 
 log() { echo "${LOG_PREFIX} $*"; }
 
+notify_awoooi_ops() {
+    local status="$1"
+    local msg="$2"
+    local helper="${SCRIPT_DIR}/notify-awoooi-ops.sh"
+    [[ -x "$helper" ]] || return 1
+
+    AWOOI_OPS_ALERTNAME="Backup.PG" \
+    AWOOI_OPS_JOB_NAME="AWOOOI DB backup" \
+    AWOOI_OPS_STATUS="$status" \
+    AWOOI_OPS_SEVERITY="info" \
+    AWOOI_OPS_SOURCE="pg-backup" \
+    AWOOI_OPS_COMPONENT="postgres-backup" \
+    AWOOI_OPS_SUMMARY="AWOOOI DB backup ${status}" \
+    AWOOI_OPS_DETAIL="$msg" \
+    "$helper" >/dev/null
+}
+
 notify_telegram() {
     local msg="$1"
+    local status="${2:-success}"
+
+    # Primary path: hand off to the AWOOI API first; TelegramGateway sends the message and mirrors it to AwoooP.
+    # Fall back to a direct Telegram send only when the API is unreachable or the helper is not deployed.
+    notify_awoooi_ops "$status" "$msg" && return 0
+
     local chat_id="${TELEGRAM_ALERT_CHAT_ID:-${SRE_GROUP_CHAT_ID:--1003711974679}}"
     if [[ -n "${TELEGRAM_BOT_TOKEN:-}" && -n "$chat_id" ]]; then
         curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-            -H "Content-Type: application/json" \
-            -d "{\"chat_id\":\"${chat_id}\",\"text\":\"${msg}\",\"parse_mode\":\"HTML\"}" \
+            -d "chat_id=${chat_id}" \
+            -d "parse_mode=HTML" \
+            --data-urlencode "text=${msg}" \
             > /dev/null 2>&1 || true
     fi
 }
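The wrapper above sets only the `AWOOI_OPS_*` spelling, while `notify-awoooi-ops.sh` also accepts `AWOOOI_OPS_*` through nested `${VAR:-...}` expansions. A small Python sketch of that lookup order follows. The function name `ops_setting` and the sample environments are hypothetical; the prefixes and the empty-counts-as-unset behavior mirror the shell expansions.

```python
import os


def ops_setting(name, default, env=None):
    """Return AWOOI_OPS_<name>, else AWOOOI_OPS_<name>, else the default."""
    env = os.environ if env is None else env
    for prefix in ("AWOOI_OPS_", "AWOOOI_OPS_"):
        value = env.get(prefix + name)
        # Bash `${VAR:-fallback}` treats an empty value like an unset one.
        if value:
            return value
    return default


# Hypothetical environments passed in explicitly instead of using os.environ.
both_set = ops_setting(
    "STATUS", "success", {"AWOOI_OPS_STATUS": "failed", "AWOOOI_OPS_STATUS": "running"}
)
only_long = ops_setting("STATUS", "success", {"AWOOOI_OPS_STATUS": "running"})
empty_short = ops_setting("STATUS", "success", {"AWOOI_OPS_STATUS": ""})
print(both_set, only_long, empty_short)
```

When both spellings are present the shorter `AWOOI_OPS_*` one wins, which matches the order of the nested expansions in the helper.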
@@ -110,10 +135,13 @@ main() {
     local icon="✅"
     [[ $fail_count -gt 0 ]] && icon="⚠️"
 
+    local notify_status="success"
+    [[ $fail_count -gt 0 ]] && notify_status="failed"
+
     notify_telegram "${icon} <b>AWOOOI DB backup</b>
 ├ Time: $(date '+%Y-%m-%d %H:%M') +0800
 ├ Succeeded: ${success_count} | Failed: ${fail_count}
-└ ${details}"
+└ ${details}" "$notify_status"
 
     [[ $fail_count -gt 0 ]] && exit 1
     return 0