fix(k8s): rebalance topology spread rollouts
@@ -33900,3 +33900,21 @@ production browser smoke:
**Boundaries**:
- This round is a GitOps manifest candidate only: no manual Pod deletion, no direct `kubectl patch` against live, no Docker restart, no Nginx reload, and no firewall changes.
- Until this is complete, the only valid claim is that service / cold-start is green; do not claim the API workload is balanced.

## 2026-06-13 — API topology spread rollout strategy follow-up

**New evidence**:
- The first-round hard spread is live at ArgoCD revision `17e017f5`; the `awoooi-api` deployment already has `minDomains=2` and `whenUnsatisfiable=DoNotSchedule`.
- After the rollout completed, API was still `2/2` on 121. Cause: during the rolling update the old Pods were still terminating on 120, so when both new Pods were placed on 121 the scheduler still counted the old Pods toward skew; once the old Pods disappeared, the final state was skewed.

**Follow-up fix**:
- API / Web rolling strategy changed to `maxSurge: 0`, `maxUnavailable: 1`.
- API / Web templates add `awoooi.dev/topology-rebalance-generation=2026-06-13T13:05:00+08:00` to force one more rollout under the new strategy.
- Worker keeps its single-replica strategy and only retains the hard spread constraint for future multi-replica use.

**Acceptance criteria**:
- ArgoCD syncs to the new revision.
- API / Web rollouts succeed.
- API / Web each have one pod on 120 and one on 121.
- Public API / Web smoke passes.
- Cold-start scorecard remains `WARN=0 BLOCKED=0`.
@@ -180,6 +180,8 @@ sudo kubectl top pods -A --sort-by=cpu
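
The fix changes the constraint to `maxSkew: 1`, `minDomains: 2`, `topologyKey: kubernetes.io/hostname`, and `whenUnsatisfiable: DoNotSchedule`. When both 120 and 121 are Ready, API / Web with replicas=2 must span both nodes; the single-replica Worker can still run, and must also spread if it is scaled out later.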
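
As a rough illustration of the constraint described above, the fragment below shows how these four fields might sit in a Deployment pod template. The `labelSelector` block is an assumption made for the sketch (only the `app: awoooi-api` label itself appears in the manifests); the real selector may differ.

```yaml
# Sketch only: fragment of spec.template.spec in a Deployment.
topologySpreadConstraints:
  - maxSkew: 1                          # at most one pod of difference between nodes
    minDomains: 2                       # with fewer than two eligible nodes, the global minimum counts as 0
    topologyKey: kubernetes.io/hostname # one topology domain per node
    whenUnsatisfiable: DoNotSchedule    # hard rule; ScheduleAnyway is only a soft preference
    labelSelector:                      # assumed selector, not taken from the manifest
      matchLabels:
        app: awoooi-api
```

Note that `minDomains` is only valid together with `whenUnsatisfiable: DoNotSchedule`, which is why the two settings change as a pair.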
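
2026-06-13 13:00 follow-up reading: hardening the spread alone is not enough. During the API rollout the old Pods were still terminating on 120 and both Pods of the new ReplicaSet were placed on 121; at that moment the scheduler saw no skew violation because the old Pods were still present, but once they disappeared the live API ended up concentrated on 121. API / Web must therefore also use `maxSurge: 0` and `maxUnavailable: 1`, so the rollout frees a slot before scheduling a new Pod and interleaving of old and new ReplicaSets cannot leave a skewed final state.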
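
The ordering argument above can be sketched as a Deployment strategy fragment with the expected placement written as comments. The trace is an illustration under stated assumptions (both old Pods start on the same node, here 120, and a Pod that has been removed no longer counts toward skew); it is not a readback from the live cluster.

```yaml
# Sketch only: fragment of spec in a Deployment with replicas: 2.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 0        # never create a new pod before an old one is removed
    maxUnavailable: 1  # allow one replica to be down while it is replaced
# Assumed trace with maxSkew: 1 (pod counts shown as 120 / 121):
#   1. one old pod is removed                                   -> 1 / 0
#   2. new pod: 120 would give 2 / 0 (skew 2, rejected), so 121 -> 1 / 1
#   3. the other old pod is removed                             -> 0 / 1
#   4. new pod: 121 would give 0 / 2 (skew 2, rejected), so 120 -> 1 / 1
```

With a surge allowed instead, a new Pod can be scheduled while an old Pod still occupies the other node, so either node can satisfy the constraint at that moment and the final layout is not guaranteed.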

Verification:

- `kubectl kustomize k8s/awoooi-prod` renders three `topologySpreadConstraints`.
@@ -57,6 +57,7 @@ Full cold-start may be declared green only for the latest verified evidence set.
| 2026-06-13 `km-vectorize` health remediation | IN_PROGRESS_90 | 00:50 live readback: CronJob `lastScheduleTime=2026-06-12T11:00:00Z`, `lastSuccessfulTime=2026-06-04T11:00:37Z`; retained 6/2, 6/3, 6/4 Jobs are `Complete`, latest visible pod log returned `embed-all: 200 {"total":32,"success":32,"failed":0}`. Gitea main `47ee96b0` and ArgoCD sync now corrected live spec to `schedule=0 3 * * *`, `timeZone=Asia/Taipei`, with Job/Pod labels `app/component/environment/phase/system`. 01:04 cold-start is `PASS=83 WARN=0 BLOCKED=0`. Next gate is the official 03:00 CronJob success readback. |
| 2026-06-13 post-CD trust / workload verification | SERVICE_GREEN_CD_GUARDRAIL_HELD | Gitea main advanced to deploy marker `e4a349bc chore(cd): deploy 414413a [skip ci]`; ArgoCD revision is `e4a349bc`, sync `Synced`, health still `Degraded` only by `km-vectorize` stale success. Live K3s image readback uses `414413a59268eedd391648f112e228716dd05362`; API pods split `mon1` / `mon`, Web pods split `mon` / `mon1`, Worker is single replica on `mon`. 01:28 `/home/wooo/.ssh/known_hosts` mtime remains `2026-06-13 01:20:02 +0800` with 120 / 188 entries present; deploy-specific `/home/wooo/.ssh/deploy_known_hosts` mtime is `01:24:05`, proving CD fix `80e6ec1a` stopped clobbering global trust. 01:26 cold-start: `PASS=83 WARN=0 BLOCKED=0`. |
| 2026-06-13 API placement hardening | IN_PROGRESS | 12:43 live refresh showed cold-start `PASS=83 WARN=0 BLOCKED=0`, but API replicas `2/2` were on 120 even though topology spread existed. Root cause: `whenUnsatisfiable=ScheduleAnyway` is a soft preference. GitOps candidate changes API/Web/Worker to `minDomains=2` + `DoNotSchedule`; completion requires ArgoCD sync, rollout readback, public route smoke, and cold-start rerun. |
| 2026-06-13 API rollout strategy hardening | IN_PROGRESS | First hard-spread rollout reached ArgoCD revision `17e017f5`; `DoNotSchedule` was live, but API completed with both new pods on 121 because old 120 pods were still terminating during scheduling. Second GitOps candidate sets API/Web `maxSurge=0`, `maxUnavailable=1`, and adds a topology rebalance annotation to force a clean rollout. |

---
@@ -24,14 +24,17 @@ spec:
   strategy:
     type: RollingUpdate
     rollingUpdate:
-      maxSurge: 1
-      maxUnavailable: 0
+      # 2026-06-13 Codex: no surge keeps topology spread honest during rollouts.
+      maxSurge: 0
+      maxUnavailable: 1
   template:
     metadata:
       labels:
         app: awoooi-web
         system: awoooi
         environment: prod
+      annotations:
+        awoooi.dev/topology-rebalance-generation: "2026-06-13T13:05:00+08:00"
     spec:
       # 2026-06-13 Codex: when 120 / 121 are both Ready, force spreading across nodes so replicas=2 cannot legally land on a single node.
       topologySpreadConstraints:
@@ -24,7 +24,10 @@ spec:
   strategy:
     type: RollingUpdate
     rollingUpdate:
-      maxSurge: 1
+      # 2026-06-13 Codex: no surge keeps topology spread honest during rollouts.
+      # With surge=1, two new pods can both schedule to the opposite node while
+      # old pods are terminating, then become imbalanced after old pods exit.
+      maxSurge: 0
       # 2026-05-24 Codex: allow one unavailable replica so rollout can replace
       # a bad old ReplicaSet instead of deadlocking at 1/2 when probes regress.
       maxUnavailable: 1
@@ -34,6 +37,8 @@ spec:
         app: awoooi-api
         system: awoooi
         environment: prod
+      annotations:
+        awoooi.dev/topology-rebalance-generation: "2026-06-13T13:05:00+08:00"
     spec:
       # 2026-06-13 Codex: when 120 / 121 are both Ready, force spreading across nodes so replicas=2 cannot legally land on a single node.
       topologySpreadConstraints: