Files
awoooi/.gitea/workflows/e2e-health.yaml
Your Name ee2cc2bfc3
Some checks failed
CD Pipeline / tests (push) Failing after 1m23s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 15s
fix(alerts): 收斂 Telegram 告警到 SRE 戰情室
2026-06-12 11:06:16 +08:00

103 lines
4.3 KiB
YAML
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# =============================================================================
# AWOOOI E2E Health Check (Gitea Actions - 方案 B)
# =============================================================================
# 替代 GitHub Actions 的本地 CI/CD
# 2026-03-29 Claude Code (ADR-039)
# 2026-03-31 Claude Code (Phase 21.1 - 每日排程 + 失敗通知)
name: E2E Health Check
on:
workflow_dispatch:
schedule:
- cron: '0 16 * * *' # 每日 00:00 台北 (UTC+8)
# push 觸發已移除 (2026-04-02): E2E health check 不需要每次 push 都跑
# CD pipeline 本身已有 smoke testE2E 用排程或手動觸發即可
# OTEL CI/CD 監控 (2026-03-31 #46c)
env:
OTEL_EXPORTER_OTLP_ENDPOINT: http://192.168.0.188:24318
OTEL_SERVICE_NAME: awoooi-e2e
OTEL_RESOURCE_ATTRIBUTES: deployment.environment=production
SRE_GROUP_CHAT_ID: "-1003711974679"
jobs:
e2e-health:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Check API Health
id: health_check
run: |
sudo apt-get update && sudo apt-get install -y curl || (apt-get update && apt-get install -y curl)
# 2026-03-31 ogt: Docker 容器無法訪問內網 IP改用公網域名
API_URL="https://awoooi.wooo.work"
echo "🔗 檢查 API 健康狀態..."
for i in 1 2 3; do
HTTP_CODE=$(curl -s -w "%{http_code}" -o /dev/null --connect-timeout 10 "$API_URL/api/v1/health" 2>&1) || true
echo "📊 嘗試 #$i: HTTP $HTTP_CODE"
if [ "$HTTP_CODE" = "200" ]; then
echo "✅ API 可用"
echo "status=success" >> $GITHUB_OUTPUT
exit 0
fi
sleep 2
done
echo "❌ API 無法連線"
echo "status=failed" >> $GITHUB_OUTPUT
exit 1
- name: Source Provider Freshness Smoke
run: |
SOURCE_CANARY_RUN_REF="gitea-e2e-${GITHUB_RUN_ID:-manual}-${GITHUB_RUN_ATTEMPT:-1}"
echo "SOURCE_CANARY_RUN_REF=${SOURCE_CANARY_RUN_REF}" >> "$GITHUB_ENV"
echo "SOURCE_LINK_CANARY_WORK_ITEM_ID=source-evidence:sentry:upstream_canary:awoooi-source-link-canary-${SOURCE_CANARY_RUN_REF}" >> "$GITHUB_ENV"
OPERATOR_KEY="$(cat <<'AWOOOI_SECRET_AWOOOP_OPERATOR_API_KEY'
${{ secrets.AWOOOP_OPERATOR_API_KEY }}
AWOOOI_SECRET_AWOOOP_OPERATOR_API_KEY
)"
AWOOOP_OPERATOR_API_KEY="${OPERATOR_KEY}" \
AWOOOP_OPERATOR_ID=gitea-e2e-health \
python3 scripts/alert_chain_smoke_test.py \
--api-url https://awoooi.wooo.work \
--metrics-api-url http://192.168.0.125:32334 \
--source-provider-heartbeat \
--source-provider-upstream-canary \
--run-ref "${SOURCE_CANARY_RUN_REF}" \
--source-link-canary-target-incident-id INC-20260505-25E744 \
--json
- name: Source Correlation Applied-Link Smoke
run: |
python3 scripts/awooop_source_correlation_apply_smoke.py \
--api-url https://awoooi.wooo.work \
--target-incident-id INC-20260505-25E744 \
--allow-existing-apply \
--refresh-if-stale-days 6 \
--refresh-work-item-id "${SOURCE_LINK_CANARY_WORK_ITEM_ID}" \
--verify-refresh-candidate \
--reviewer-id gitea_e2e_source_link_canary \
--operator-note "T124 dedicated source-link canary refresh; append-only status-chain proof"
- name: Notify Telegram on Failure
if: failure()
run: |
MSG="E2E Health Check 失敗API 健康檢查未通過"
if AWOOI_CICD_STATUS=failed \
AWOOI_CICD_STAGE=e2e-health \
AWOOI_CICD_JOB_NAME="E2E Health Check" \
AWOOI_CICD_COMMIT_SHA="${{ github.sha }}" \
AWOOI_CICD_SUMMARY="${MSG}" \
scripts/ci/notify-awoooi-cicd.sh; then
echo "E2E failure notification mirrored through AWOOI API"
else
curl -s -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_BOT_TOKEN }}/sendMessage" \
-d chat_id="${{ env.SRE_GROUP_CHAT_ID }}" \
-d parse_mode="HTML" \
-d text="🔴 <b>[E2E Health Check]</b> 失敗%0A%0A📅 $(TZ=Asia/Taipei date '+%Y-%m-%d %H:%M')%0A🔗 API 健康檢查未通過%0A%0A請檢查 K3s 叢集狀態"
fi