docs(ops): record live runner guardrail fix

Your Name
2026-05-05 15:34:00 +08:00
parent 819734f655
commit 44d8322c4d


@@ -1613,13 +1613,13 @@ psql $DATABASE_URL -f apps/api/migrations/cleanup_duplicate_deprecated_playbooks
| `209da7b` | deploy API image `docker-limit-alert-20260505-d08d1e4` |
| `96c1ba2` | CD host-runner helper containers now get fixed names and CPU/memory caps, avoiding unnamed, uncapped containers such as `funny_davinci` |
| `1cc9de5` | Systemd runner alert/runbook now point to the 110 host script `/home/wooo/scripts/apply-runner-systemd-guardrails.sh` |
+| `live` | 110 runner guardrails applied via sudo by 統帥 (the commander); all 5 runners now report Watchdog=0, CPUQuota=2 cores, MemoryMax=2GiB |
### Next steps
-1. On 110, run `sudo /home/wooo/scripts/apply-runner-systemd-guardrails.sh --apply`
-2. Verify that the Prometheus alerts `SystemdRunnerWatchdogEnabled` / `SystemdRunnerMissingResourceQuota` clear
-3. Deploy `/home/wooo/scripts/stop-stale-gitea-actions-jobs.sh`; for `DockerGiteaActionsJobStale`, dry-run first, then run with `--apply` after human/AI review
-4. Watch whether 110 load5/core stays below 1.5; if it remains high, tune Sentry ingestion/ClickHouse parts.
+1. Once the 15-minute sliding window has passed, confirm that `SystemdRunnerRestartSpike` clears on its own
+2. Watch whether 110 load5/core stays below 1.5; if it remains high, tune Sentry ingestion/ClickHouse parts
+3. Ongoing: for `DockerGiteaActionsJobStale`, dry-run first, then run with `--apply` after human/AI review
---
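
The load5/core check in the next steps can be sketched as a small shell helper. This is only an illustration, assuming a Linux host where `/proc/loadavg` and `nproc` are available; the function names `load5_per_core` and `load5_below` are hypothetical and are not part of the scripts referenced in this commit.

```shell
# Sketch only: compute the 5-minute load average per core and compare it
# with the 1.5 threshold mentioned in the notes. Function names are hypothetical.

# load5_per_core [loadavg_file] [cores]
# Both arguments are optional overrides, mainly useful for testing.
load5_per_core() {
  # Field 2 of /proc/loadavg is the 5-minute load average.
  awk -v cores="${2:-$(nproc)}" '{ printf "%.2f\n", $2 / cores }' "${1:-/proc/loadavg}"
}

# load5_below THRESHOLD RATIO
# Exits 0 when RATIO is strictly below THRESHOLD, non-zero otherwise.
load5_below() {
  awk -v t="$1" -v r="$2" 'BEGIN { exit !(r < t) }'
}

# Report the current ratio when run on a host that exposes /proc/loadavg.
if [ -r /proc/loadavg ]; then
  ratio="$(load5_per_core)"
  if load5_below 1.5 "$ratio"; then
    echo "load5/core=${ratio} (below 1.5)"
  else
    echo "load5/core=${ratio} (at or above 1.5)"
  fi
fi
```

A single reading says little about stability, so in practice the check would be repeated over time (or read from the existing Prometheus metrics) before deciding whether to tune Sentry ingestion or ClickHouse parts.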