From 4407b46bb6a4ccfb2167dc20bd08de85b3db429a Mon Sep 17 00:00:00 2001 From: Your Name Date: Sun, 24 May 2026 09:52:04 +0800 Subject: [PATCH] ops(runner): inventory workflow labels [skip ci] --- docs/LOGBOOK.md | 61 +++++ ...-04-15-MASTER-ai-autonomous-flywheel-v2.md | 8 + ops/runner/README.md | 36 +++ ops/runner/audit-workflow-labels.py | 259 ++++++++++++++++++ 4 files changed, 364 insertions(+) create mode 100755 ops/runner/audit-workflow-labels.py diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md index 3fc4ab69..75db46f4 100644 --- a/docs/LOGBOOK.md +++ b/docs/LOGBOOK.md @@ -1,3 +1,64 @@ +## 2026-05-24|T141 workflow label matrix + +**觸發**: + +- T140 只回答 live runner config:同一個 110 host runner 同時宣告 `awoooi-host`、`ewoooc-host` 與 `ubuntu-latest`。 +- 下一步若要 runner label isolation,必須先知道各 repo workflow 實際用哪些 `runs-on`,避免把 `ewoooc` 或 `stockplatform-v2` 的 CI/CD 直接切斷。 + +**修正**: + +- 新增 `ops/runner/audit-workflow-labels.py`。 + - 只讀 Gitea `.gitea/workflows/*.yml` / `.yaml`,擷取 `runs-on`。 + - Gitea auth 從環境或目前 repo `gitea` remote 解析,token 不輸出。 + - Gitea 不可讀時可用 `--local-repo OWNER/NAME=/path/to/repo` fallback。 + - 輸出 per workflow line inventory、label summary、inventory warnings。 +- `ops/runner/README.md` 補第七層 workflow label matrix 與隔離判讀。 + +**Verification**: + +```text +python3 -m py_compile ops/runner/audit-workflow-labels.py -> pass +ops/runner/audit-workflow-labels.py --local-repo wooo/stockplatform-v2=/Users/ogt/stockplatform-v2 -> pass + +Evidence: +wooo/awoooi: + awoooi-host -> .gitea/workflows/cd.yaml lines 64 / 313 / 1132 + ubuntu-latest -> ansible-lint, cd-dev, code-review, deploy-alerts, e2e-health, run-migration, type-sync + +wooo/ewoooc: + ewoooc-host -> .gitea/workflows/cd.yaml line 67 + +wooo/stockplatform-v2: + ubuntu-latest -> .gitea/workflows/ci.yaml lines 12 / 23 + Gitea API for stockplatform-v2 returned 404 in this session; local repo fallback used. +``` + +**判讀**: + +- AWOOI production CD 已用 `awoooi-host`,但 AWOOI code-review / health / aux workflows 仍走 shared `ubuntu-latest`。 +- EwoooC CD 明確使用 `ewoooc-host`,而這個 foreign label 仍在 110 同一個 user-level runner config 內。 +- Stockplatform-v2 CI 走 `ubuntu-latest`,會和 AWOOI 的 non-CD workflows 共用同一條 runner queue。 +- 真正修復不是在同一份 config 繼續加 label,而是 runner registration / service split,或將非 AWOOI repo 搬到獨立 runner;在替代 runner ready 前不可直接移除 `ewoooc-host` 或 `ubuntu-latest`。 + +**目前整體進度**: + +- AwoooP 告警可觀測鏈:99.998%。 +- Incident-level source correlation 可見性:98.8%。 +- Source correlation apply 狀態鏈可驗證性:99.72%。 +- Source correlation freshness / rolling gate:98.2%。 +- 前端 AI 自動化管理介面同步:99.999%。 +- Dashboard snapshot / SSE console noise 收斂:99.2%。 +- CI/CD runner hygiene:99.5%。 +- Runner ownership 收斂:96%。 +- Runner pool inventory:70% → 82%。 +- Workflow label matrix:0% → 85%。 +- API image build layer hygiene:88%。 +- Deploy rollout-risk 可觀測性:91%。 +- CI/CD evidence 前端可見性:92%。 +- Pipeline stage 可觀測性:88%。 +- Build host pressure治理:86%。 +- 完整 AI 自動化管理產品化:99.965% → 99.966%。 + ## 2026-05-24|T140 runner pool live inventory **觸發**: diff --git a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md index 4f569346..afa7a331 100644 --- a/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md +++ b/docs/superpowers/specs/2026-04-15-MASTER-ai-autonomous-flywheel-v2.md @@ -2665,6 +2665,14 @@ Phase 6 完成後 - 判讀:T135 已把 runner ownership 從雙 runner 搶工收斂到 host runner 單一主控;下一段不要重新啟用 Docker-wrapped runner,而是做 runner pool / repo label 隔離、API image `apt-get` / `chown -R` 分層、Web build cache/offload、Playwright apt source-list hygiene。 - 目前進度更新:AwoooP 告警可觀測鏈約 99.998%;Incident-level source correlation 可見性約 98.8%;Source correlation apply 狀態鏈可驗證性約 99.72%;Source correlation freshness / rolling gate 約 98.2%;前端 AI 自動化管理介面同步約 99.999%;Dashboard snapshot / SSE console noise 收斂約 99.2%;CI/CD runner hygiene 約 99.2%;Runner ownership 收斂約 96%;Build host pressure治理約 82%;完整 AI 自動化管理產品化約 99.960%。 +**T141 workflow label matrix(2026-05-24 台北)**: +- 觸發:T140 只回答 110 live runner config,尚未回答各 repo workflow 實際使用哪些 `runs-on`。要做 runner label isolation 前,必須避免切斷 `ewoooc` 或 `stockplatform-v2` 的 CI/CD。 +- 修正:新增 `ops/runner/audit-workflow-labels.py`,只讀 Gitea `.gitea/workflows/*.yml` / `.yaml` 並擷取 `runs-on`;Gitea auth 由 env 或目前 repo `gitea` remote 解析,token 不輸出;Gitea 不可讀時可用 `--local-repo OWNER/NAME=/path/to/repo` fallback。`ops/runner/README.md` 補第七層 workflow label matrix。 +- Evidence:AWOOI `.gitea/workflows/cd.yaml` 三個 job 使用 `awoooi-host`,但 code-review / e2e-health / deploy-alerts / cd-dev / ansible-lint / type-sync / run-migration 使用 `ubuntu-latest`;EwoooC `.gitea/workflows/cd.yaml` 使用 `ewoooc-host`;stockplatform-v2 local `.gitea/workflows/ci.yaml` 兩個 job 使用 `ubuntu-latest`,本 session Gitea API 對該 repo 回 404,因此用 local fallback。 +- 判讀:AWOOI production CD label 已專用,但 runner queue 仍共享,因為同一個 110 runner service 同時宣告 `awoooi-host`、`ewoooc-host`、`ubuntu-latest`。真正修復是 runner registration / service split 或將非 AWOOI repo 搬到獨立 runner;不可只在同一份 config 繼續加 label,也不可在替代 runner ready 前直接移除 `ewoooc-host` 或 `ubuntu-latest`。 +- Verification:`python3 -m py_compile ops/runner/audit-workflow-labels.py` pass;`ops/runner/audit-workflow-labels.py --local-repo wooo/stockplatform-v2=/Users/ogt/stockplatform-v2` pass。 +- 目前進度更新:AwoooP 告警可觀測鏈約 99.998%;Incident-level source correlation 可見性約 98.8%;Source correlation apply 狀態鏈可驗證性約 99.72%;Source correlation freshness / rolling gate 約 98.2%;前端 AI 自動化管理介面同步約 99.999%;Dashboard snapshot / SSE console noise 收斂約 99.2%;CI/CD runner hygiene 約 99.5%;Runner ownership 收斂約 96%;Runner pool inventory 約 82%;Workflow label matrix 約 85%;API image build layer hygiene 約 88%;Deploy rollout-risk 可觀測性約 91%;CI/CD evidence 前端可見性約 92%;Pipeline stage 可觀測性約 88%;Build host pressure治理約 86%;完整 AI 自動化管理產品化約 99.966%。 + **T140 runner pool live inventory(2026-05-24 台北)**: - 觸發:T139 已把 CI/CD stage transition 寫回 AwoooP Deployments,但 shared runner pool 仍是部署證據與 post-deploy queue 的風險來源。直接改 runner labels 會影響 `ewoooc` / `stockplatform-v2` 等 repo,因此先建立可重跑的 live inventory。 - 修正:新增 `ops/runner/audit-runner-pool.sh`,只讀盤點 `gitea-act-runner-host.service`、`/home/wooo/act-runner/config.yaml` labels、Docker-wrapped `gitea-runner`、active `GITEA-ACTIONS-TASK-*` containers、近 2 小時 runner journal repo counts。`ops/runner/README.md` 補第六層 shared runner label inventory,明確禁止把 `capacity: 2` 當修復。 diff --git a/ops/runner/README.md b/ops/runner/README.md index cda118ca..da4fef95 100644 --- a/ops/runner/README.md +++ b/ops/runner/README.md @@ -296,6 +296,42 @@ recent 2h repo counts: none - 下一步應先讀各 repo workflow 實際使用的 labels,再規劃 repo label isolation 或獨立 runner registration;不可在沒有替代 runner 前直接移除 live `ewoooc-host`。 +### 第七層修復: workflow label matrix + +Runner config 只能看到「這台 runner 願意接什麼 label」,不能回答「哪些 repo 實際在使用」。 +T141 新增 workflow label 盤點工具: + +```bash +ops/runner/audit-workflow-labels.py \ + --local-repo wooo/stockplatform-v2=/Users/ogt/stockplatform-v2 +``` + +工具會透過 Gitea API 讀 `.gitea/workflows/*.yml` / `.yaml` 的 `runs-on`,Gitea 不可讀時可指定 +local fallback;Gitea token 只從 env 或目前 repo `gitea` remote 解析,永不輸出。 + +T141 evidence 摘要(2026-05-24 台北): + +```text +wooo/awoooi: + awoooi-host: cd.yaml tests / build-and-deploy / post-deploy-checks + ubuntu-latest: code-review, e2e-health, deploy-alerts, cd-dev, ansible-lint, type-sync, run-migration + +wooo/ewoooc: + ewoooc-host: cd.yaml deploy + +wooo/stockplatform-v2: + ubuntu-latest: ci.yaml hygiene / frontend +``` + +風險判讀: + +- `awoooi-host` 已經是 AWOOI CD 專用 label,但同一個 runner service 仍同時宣告 + `ewoooc-host` 與 `ubuntu-latest`,所以 runner queue 仍共享。 +- `ubuntu-latest` 是最主要共享入口;AWOOI code-review / e2e-health 與 stockplatform-v2 CI + 仍可能互相排隊。 +- 下一步若要真正隔離,必須做新的 runner registration / service split,或把非 AWOOI repo 移到 + 另一台 runner。不可只在同一個 runner config 加更多 label,因為 `capacity: 1` 仍是同一條隊列。 + --- 版本: v2.0 | 更新: 2026-03-29 | 作者: Claude Code 變更: v1.0→v2.0 序列建構取代 Job Concurrency Groups diff --git a/ops/runner/audit-workflow-labels.py b/ops/runner/audit-workflow-labels.py new file mode 100755 index 00000000..37a3db34 --- /dev/null +++ b/ops/runner/audit-workflow-labels.py @@ -0,0 +1,259 @@ +#!/usr/bin/env python3 +"""Read-only inventory for Gitea workflow runner labels. + +The script never prints credentials. It reads workflow files from Gitea when +GITEA_BASE/GITEA_USER/GITEA_TOKEN are available, or derives them from the +current repository's `gitea` remote when that remote embeds basic auth. + +Example: + ops/runner/audit-workflow-labels.py \ + --local-repo wooo/stockplatform-v2=/Users/ogt/stockplatform-v2 +""" + +from __future__ import annotations + +import argparse +import base64 +import json +import re +import subprocess +import sys +import urllib.error +import urllib.request +from dataclasses import dataclass +from pathlib import Path +from typing import Iterable + + +DEFAULT_REPOS = ("wooo/awoooi", "wooo/ewoooc", "wooo/stockplatform-v2") +WORKFLOW_DIRS = (".gitea/workflows",) +RUNS_ON_RE = re.compile(r"^\s*runs-on:\s*(?P