docs(ops): record live runner guardrail fix
@@ -1613,13 +1613,13 @@ psql $DATABASE_URL -f apps/api/migrations/cleanup_duplicate_deprecated_playbooks
| `209da7b` | deploy API image `docker-limit-alert-20260505-d08d1e4` |
| `96c1ba2` | CD host-runner helper containers now get fixed names and CPU/memory caps, preventing unnamed, uncapped containers such as `funny_davinci` |
| `1cc9de5` | Systemd runner alert/runbook now points to the script on host 110: `/home/wooo/scripts/apply-runner-systemd-guardrails.sh` |
+| `live` | Runner guardrails on host 110 have been applied by the Commander via sudo; all 5 runners now have Watchdog=0, CPUQuota=2 cores, MemoryMax=2GiB |

### Next steps

-1. On host 110, run `sudo /home/wooo/scripts/apply-runner-systemd-guardrails.sh --apply`.
-2. Verify that the Prometheus alerts `SystemdRunnerWatchdogEnabled` / `SystemdRunnerMissingResourceQuota` clear.
-3. Deploy `/home/wooo/scripts/stop-stale-gitea-actions-jobs.sh` so that `DockerGiteaActionsJobStale` is handled with a dry-run first, then `--apply` after human/AI review.
-4. Watch whether load5/core on host 110 stays below 1.5; if it is still high, tune Sentry ingestion/ClickHouse parts next.
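+1. Wait for the 15-minute sliding window to pass, then confirm that `SystemdRunnerRestartSpike` clears on its own.
+2. Watch whether load5/core on host 110 stays below 1.5; if it is still high, tune Sentry ingestion/ClickHouse parts next.
+3. Keep handling `DockerGiteaActionsJobStale` with a dry-run first, then `--apply` after human/AI review.
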
---
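The "load5/core below 1.5" check from the next steps could be sketched roughly as follows. This is only an illustration, not part of the commit: the helper names `load_per_core` and `below_threshold` are hypothetical, and it assumes a Linux host where `/proc/loadavg` and `nproc` are available. The 1.5 threshold is the one stated in the runbook.

```shell
#!/bin/sh
# Hypothetical helper: divide a 5-minute load average by a core count.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Hypothetical helper: exit 0 when the ratio is strictly below the limit.
below_threshold() {
    awk -v ratio="$1" -v limit="$2" 'BEGIN { exit !(ratio < limit) }'
}

# On Linux, the second field of /proc/loadavg is the 5-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    load5=$(awk '{ print $2 }' /proc/loadavg)
    ratio=$(load_per_core "$load5" "$(nproc)")
    if below_threshold "$ratio" 1.5; then
        echo "load5/core=$ratio (below 1.5)"
    else
        echo "load5/core=$ratio (at or above 1.5)"
    fi
fi
```

A single reading says little about whether the load is stable, so in practice this would be sampled repeatedly, or the same ratio would be watched in the existing Prometheus dashboards.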