docs(awooop): record ai route visibility rollout [skip ci]

Your Name
2026-05-19 14:19:34 +08:00
parent 815dcf370f
commit dc34e81224

@@ -1,3 +1,134 @@
## 2026-05-19 | T79/T80 AI provider route frontend visibility + CI/CD notification primary-path fix
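**Background**
- Commander's correction: every Ollama-type path must be pinned to `GCP-A → GCP-B → 111 local → Gemini`, and this order must not exist only in Telegram or pod smoke checks; the Operator Console must show the current primary / fallback / health.
- The T79 goal is to make AI provider route status a visible state in AwoooP Runs, so the Operator no longer has to guess whether a run landed on GCP-A, GCP-B, 111, or Gemini.
- The T79 production CD also exposed a piece of notification tech debt: when the post-deploy job container has no `python3`, `scripts/ci/notify-awoooi-cicd.sh` cannot generate the Alertmanager JSON, so the success notification falls back to direct Telegram. T80 fixes this immediately, returning CI/CD success notifications to the AWOOI API / AwoooP timeline primary path.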
**T79 completed changes**
- Added read-only `GET /api/v1/platform/ai-route-status?workload_type=deep_rca`.
- The response returns:
  - `schema_version=awooop_ai_route_status_v1`
  - `policy_order=ollama_gcp_a → ollama_gcp_b → ollama_local → gemini`
  - live `selected_provider / selected_url / selected_model`
  - `fallback_chain`
  - health map: when GCP-A is healthy, GCP-B / 111 show `not_checked`, so they are not misread as broken.
- The AwoooP Runs frontend gains an "AI Provider 路由" (AI provider route) section that shows the policy order, current primary, model, health, latency, URL, and active/standby.
- i18n: added `zh-TW` / `en` strings; the new section introduces no new literal-string warnings.
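
The selection behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `platform_operator_service.py`: the function name `build_route_status`, the `is_healthy` probe, the `health` key, and the `unhealthy` label are all assumptions; only the policy order, `schema_version`, and the `not_checked` state come from this entry.

```python
from typing import Callable, Dict, List, Optional

# Policy order taken from this entry; everything else below is illustrative.
POLICY_ORDER = ["ollama_gcp_a", "ollama_gcp_b", "ollama_local", "gemini"]


def build_route_status(
    is_healthy: Callable[[str], bool],
    policy_order: List[str] = POLICY_ORDER,
) -> Dict[str, object]:
    """Probe providers in policy order and stop at the first healthy one."""
    health: Dict[str, str] = {}
    selected: Optional[str] = None
    for provider in policy_order:
        if selected is not None:
            # Providers after the chosen primary are not probed at all.
            health[provider] = "not_checked"
        elif is_healthy(provider):
            health[provider] = "healthy"
            selected = provider
        else:
            health[provider] = "unhealthy"  # hypothetical label
    if selected is None:
        fallback_chain: List[str] = []
    else:
        fallback_chain = policy_order[policy_order.index(selected) + 1:]
    return {
        "schema_version": "awooop_ai_route_status_v1",
        "policy_order": list(policy_order),
        "selected_provider": selected,
        "fallback_chain": fallback_chain,
        "health": health,  # hypothetical key name
    }
```

Marking lower-priority providers as `not_checked` rather than probing them keeps the endpoint read-only and cheap, and avoids showing a standby provider as broken when it was simply never contacted.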
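**T80 completed changes**
- `scripts/ci/notify-awoooi-cicd.sh` keeps the Python payload builder.
- Added a Node.js payload builder fallback: when the job container has no `python3` but does have `node`, it still produces the same Alertmanager/AWOOI JSON payload.
- Only when neither `python3` nor `node` exists does it return an explicit error, letting the caller fall back to Telegram.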
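
The builder preference above can be pictured with a small sketch. The real logic lives in the shell script; the version below is only an illustration of the same ordering, and the helper name `pick_payload_builder` and its injectable `which` parameter are assumptions made for this example.

```python
import shutil
from typing import Callable, Optional


def pick_payload_builder(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the first available interpreter, preferring python3 over node."""
    for candidate in ("python3", "node"):
        if which(candidate):
            return candidate
    # Neither interpreter exists: fail explicitly so the caller can
    # fall back to the direct Telegram path.
    raise RuntimeError("no payload builder available: need python3 or node")
```

Passing `which` as a parameter makes the node-only case easy to exercise without editing `PATH`, which mirrors the `PATH=node-only` dry run listed in the local verification.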
**Local verification**
```text
python -m py_compile
  apps/api/src/services/platform_operator_service.py
  apps/api/src/api/v1/platform/operator_runs.py
  apps/api/tests/test_awooop_operator_timeline_labels.py
  -> OK
ruff check --select F,E9,I
  touched backend files
  -> OK
pytest
  test_awooop_operator_timeline_labels.py
  test_ollama_endpoint_resolver.py
  test_ollama_failover_manager.py
  -> 76 passed
jq empty apps/web/messages/zh-TW.json apps/web/messages/en.json -> OK
pnpm --filter @awoooi/web typecheck -> OK
pnpm --dir apps/web exec next lint --file src/app/[locale]/awooop/runs/page.tsx
  -> exit 0; this page's pre-existing literal-string warnings remain, and the section added in this round goes through i18n
NEXT_PUBLIC_API_URL=https://awoooi.wooo.work pnpm --filter @awoooi/web build -> OK
bash -n scripts/ci/notify-awoooi-cicd.sh -> OK
AWOOI_CICD_DRY_RUN=1 ... notify-awoooi-cicd.sh | jq
  -> receiver=awoooi-cicd, alertname=CI_post_deploy_success, status=success
PATH=node-only AWOOI_CICD_DRY_RUN=1 ... notify-awoooi-cicd.sh | jq
  -> receiver=awoooi-cicd, alertname=CI_post_deploy_success, status=success
git diff --check -> OK
```
**Commit / Deploy**
```text
56a8085d feat(awooop): surface ai provider route status
570b99e9 chore(cd): deploy 56a8085 [skip ci]
170f927b fix(ci): build cicd notification payload without python
815dcf37 chore(cd): deploy 170f927 [skip ci]
```
**Gitea Actions**
```text
2445 Code Review for 56a8085d -> success
2444 CD for 56a8085d -> success
  tests -> success
  build-and-deploy -> success
  post-deploy-checks -> success
2449 Code Review for 170f927b -> success
2450 CD workflow_dispatch for 170f927b -> success
  tests -> success
  build-and-deploy -> success
  post-deploy-checks -> success
```
**Production verification**
```text
K8s image after T80:
  awoooi-api    192.168.0.110:5000/awoooi/api:170f927bc677da492d222d561504d6fe4b82c0f1
  awoooi-worker 192.168.0.110:5000/awoooi/api:170f927bc677da492d222d561504d6fe4b82c0f1
  awoooi-web    192.168.0.110:5000/awoooi/web:170f927bc677da492d222d561504d6fe4b82c0f1
GET https://awoooi.wooo.work/api/v1/health
  -> healthy, prod, mock_mode=false
GET /api/v1/platform/ai-route-status?workload_type=deep_rca
  -> selected_provider=ollama_gcp_a
  -> selected_url=http://34.143.170.20:11434
  -> selected_model=gemma3:4b
  -> policy_order=ollama_gcp_a → ollama_gcp_b → ollama_local → gemini
  -> fallback_chain=ollama_gcp_b → ollama_local → gemini
Production Playwright smoke on /zh-TW/awooop/runs:
  -> AI Provider 路由 visible
  -> ollama_gcp_a / ollama_gcp_b / ollama_local / gemini visible
  -> Primary=ollama_gcp_a visible
  -> route error not visible
CD post-deploy notification after T80:
  -> AwoooP-mirrored CI/CD notification sent via http://192.168.0.125:32334/api/v1/webhooks/alertmanager
  -> CI/CD success notification mirrored through AWOOI API
  -> no python3 missing fallback
```
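
A consistency check on the route-status response could look something like the sketch below. It assumes the JSON body carries `policy_order` and `fallback_chain` as lists of provider names, which this log does not confirm, and `check_route_status` is a hypothetical helper, not part of the repository.

```python
from typing import Dict, List


def check_route_status(body: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the body looks consistent."""
    problems: List[str] = []
    if body.get("schema_version") != "awooop_ai_route_status_v1":
        problems.append("unexpected schema_version")
    # Assumption: policy_order and fallback_chain are lists of provider names.
    order = body.get("policy_order") or []
    selected = body.get("selected_provider")
    if selected not in order:
        problems.append("selected_provider is not in policy_order")
    elif body.get("fallback_chain") != order[order.index(selected) + 1:]:
        problems.append("fallback_chain does not follow policy_order")
    return problems
```

With the production values recorded above (`ollama_gcp_a` selected, followed by `ollama_gcp_b`, `ollama_local`, `gemini`), the fallback chain is exactly the tail of the policy order after the selected provider, which is what the check encodes.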
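**Boundaries / tech debt**
- T79 is route-status visualization only; it does not trigger inference, auto-remediation, approval, or incident state changes.
- T80 fixes success notifications falling back to Telegram because of `python3 missing`; the direct Telegram fallback is kept as a safety net for when the API is offline.
- The Gitea act runner still occasionally emits cleanup warnings (`__pycache__` permission / symlink cleanup); the job conclusion is currently success. This is runner-hygiene tech debt and does not affect this round's delivery.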
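**Current overall progress**
- AwoooP alert observability chain: about 96.5%.
- Low-risk auto-remediation closed loop: about 95%.
- Frontend AI automation management UI sync: about 92.5%.
- CI/CD notification AwoooP primary path: about 99%.
- Complete AI automation management productization: about 89%.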
---
## 2026-05-19 | T72 Homepage live status and flow-pipeline stabilization
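**Background**: The homepage `https://awoooi.wooo.work/zh-TW` can already load production data, but from the on-call view there are still three clear gaps: the flywheel KPI cards keep trying to connect to the `/api/v1/stats/flywheel/ws` WebSocket, which is not wired up in production, and this creates console noise; every IncidentCard fetches its own CSRF token, which multiplies homepage network requests when there are many active incidents; and the Crayfish (小龍蝦) / OpenClaw flow pipeline looks only at `incident.status` without taking `decision.state` / proposal evidence into account, so incidents that already have an AI proposal or are awaiting authorization still appear stuck at early detection.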