docs(api): record ollama route order rollout [skip ci]
112
docs/LOGBOOK.md
@@ -1962,6 +1962,118 @@
**Current overall progress**:
- Alertmanager low-risk auto-remediation mainline: about 96%.
- Full AI automation management productization: about 86%.

---

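## 2026-05-19 | T75/T76 Ollama global route-order correction and direct-caller convergence

**Background**:

- Telegram alerts appeared to be handled by the Ollama instance on 111. The commander ruled that every Ollama-type path must follow the fixed order `GCP-A → GCP-B → 111 local → Gemini`.
- A live check found that the production env URL order itself was correct, but the failover manager still waited for the 111 health check even when GCP-A was healthy, which risked hitting the 90s webhook timeout.
- In addition, some direct callers made only a single `resolve_ollama_endpoint()` call and never tried the full GCP-A/GCP-B/111 sequence.
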
**Completed changes**:

1. `ollama_failover_manager.select_provider()` now returns the primary immediately when GCP-A is healthy. The fallback chain stays `GCP-B → 111 → Gemini`, and the call no longer waits for the 111 health check.
2. `ollama_endpoint_resolver` gains `resolve_ollama_order()`. Every workload (including `local_required` / `privacy_sensitive` / `dr`) now gets the same order: `GCP-A → GCP-B → 111`.
3. High-risk direct callers were wired to the ordered fallback first:
   - `EmbeddingService`
   - `KnowledgeExtractorService`
   - `KnowledgeRAGService`
   - `LocalCodeReviewService`
4. The Gemini fallback for code review stays gated by the existing `LOCAL_CODE_REVIEW_ALLOW_GEMINI_FALLBACK` flag. No unconditional direct Gemini call was added, so cost governance cannot be bypassed.

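The ordered fallback described in items 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the code from the commits: only the name `resolve_ollama_order()` comes from the notes above, while its signature, `call_with_ordered_fallback()`, the `send` / `gemini` callables, the `allow_gemini` parameter (standing in for the `LOCAL_CODE_REVIEW_ALLOW_GEMINI_FALLBACK` gate), and the placeholder URLs are all assumptions.

```python
from typing import Callable, Optional, Tuple

# Hypothetical (name, base_url) pairs; real values would come from the production env.
Endpoint = Tuple[str, str]

DEFAULT_ORDER: Tuple[Endpoint, ...] = (
    ("ollama_gcp_a", "http://gcp-a.example:11434"),
    ("ollama_gcp_b", "http://gcp-b.example:11434"),
    ("ollama_local", "http://local.example:11434"),
)


def resolve_ollama_order(workload: str) -> Tuple[Endpoint, ...]:
    """Return the same ordered endpoint list for every workload type."""
    # The workload argument is accepted but deliberately does not change the order.
    return DEFAULT_ORDER


def call_with_ordered_fallback(
    workload: str,
    send: Callable[[str], str],
    gemini: Optional[Callable[[], str]] = None,
    allow_gemini: bool = False,
) -> Tuple[str, str]:
    """Try each endpoint in order; use Gemini only when explicitly allowed."""
    errors = []
    for name, base_url in resolve_ollama_order(workload):
        try:
            return name, send(base_url)
        except Exception as exc:  # sketch only: real code should catch narrower errors
            errors.append(f"{name}: {exc}")
    if allow_gemini and gemini is not None:
        return "gemini", gemini()
    raise RuntimeError("all Ollama endpoints failed: " + "; ".join(errors))
```

Keeping the Gemini step behind an explicit flag mirrors the intent of item 4: a caller that does not opt in gets an error rather than a silent paid fallback.
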
**Commit / Deploy**:

```text
36aeea80 fix(api): avoid local ollama health blocking gcp route
5fa0e145 chore(cd): deploy 36aeea8 [skip ci]
45cd55b2 fix(api): enforce global ollama endpoint order
1b09a64e chore(cd): deploy 45cd55b [skip ci]
```

**Local validation**:

```text
python -m py_compile
  apps/api/src/services/ollama_endpoint_resolver.py
  apps/api/src/services/knowledge_extractor_service.py
  apps/api/src/services/knowledge_rag_service.py
  apps/api/src/services/local_code_review_service.py
  apps/api/src/services/embedding_service.py
  -> OK

ruff check
  apps/api/src/services/ollama_endpoint_resolver.py
  apps/api/src/services/knowledge_extractor_service.py
  apps/api/src/services/knowledge_rag_service.py
  apps/api/src/services/local_code_review_service.py
  apps/api/src/services/embedding_service.py
  apps/api/tests/test_ollama_endpoint_resolver.py
  apps/api/tests/test_local_code_review_cloud_fallback.py
  -> OK

DATABASE_URL=postgresql+asyncpg://test:test@localhost/test pytest
  apps/api/tests/test_ollama_endpoint_resolver.py
  apps/api/tests/test_local_code_review_cloud_fallback.py
  -> 6 passed

DATABASE_URL=postgresql+asyncpg://test:test@localhost/test pytest
  apps/api/tests/test_ollama_failover_manager.py
  apps/api/tests/test_ai_router_failover_integration.py
  -> 43 passed
```

**Gitea Actions**:

```text
2423 Code Review for 36aeea80 -> success
2422 CD for 36aeea80 -> success
2426 Code Review for 45cd55b2 -> success
2425 CD for 45cd55b2 -> success
tests -> success
build-and-deploy -> success
post-deploy-checks -> success
```

**Production validation**:

```text
K8s image:
  awoooi-web 192.168.0.110:5000/awoooi/web:45cd55b2dad45d7c60a247bfa58db4c412fab752
  awoooi-api 192.168.0.110:5000/awoooi/api:45cd55b2dad45d7c60a247bfa58db4c412fab752
  awoooi-worker 192.168.0.110:5000/awoooi/api:45cd55b2dad45d7c60a247bfa58db4c412fab752

GET https://awoooi.wooo.work/api/v1/health
  -> healthy, prod, mock_mode=false

In-pod resolver smoke:
  interactive / deep_rca / embedding / rag / code_review / local_required / privacy_sensitive / dr
  -> ollama_gcp_a:http://34.143.170.20:11434
  -> ollama_gcp_b:http://34.21.145.224:11434
  -> ollama_local:http://192.168.0.111:11434

In-pod failover manager smoke:
  primary=ollama_gcp_a
  fallback_chain=ollama_gcp_b -> ollama_local -> gemini
  latency_ms=1.5
  health_gcp_b=null / health_local=null (no blocking health checks while GCP-A is healthy)
```

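The failover manager smoke output above (`health_gcp_b=null`, `health_local=null`) is consistent with a short-circuit along the following lines. This is only a sketch under assumptions: the real `select_provider()` signature is not shown in this log, so the `is_healthy` callable, the `Selection` dataclass, and the behaviour when GCP-A is unhealthy are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

ORDER = ["ollama_gcp_a", "ollama_gcp_b", "ollama_local"]


@dataclass
class Selection:
    primary: str
    fallback_chain: List[str]
    # None means "not probed", like the null health values in the smoke output.
    health: Dict[str, Optional[bool]] = field(default_factory=dict)


def select_provider(is_healthy: Callable[[str], bool]) -> Selection:
    """Probe providers in order and stop at the first healthy one."""
    health: Dict[str, Optional[bool]] = {name: None for name in ORDER}
    for index, name in enumerate(ORDER):
        health[name] = is_healthy(name)
        if health[name]:
            # Later providers are not probed, so a slow check cannot block the caller.
            return Selection(name, ORDER[index + 1:] + ["gemini"], health)
    # No Ollama provider is healthy: Gemini is the last resort (assumed behaviour).
    return Selection("gemini", [], health)
```

With a probe that reports GCP-A healthy, only one health check runs, which is the property that keeps a slow 111 check out of the webhook path.
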
**Interpretation**:

- This fix confirms that all Ollama workloads that go through the resolver no longer try 111 first.
- The main alert route is back to `GCP-A → GCP-B → 111 → Gemini`, and when GCP-A is healthy the slow 111 health check no longer pushes the webhook past its timeout.
- This does not yet claim that every legacy direct HTTP caller has been 100% converged. The next phase continues scanning for `resolve_ollama_endpoint` / `settings.OLLAMA_URL` / `/api/generate` and gradually moves the remaining callers (ChatManager, log summary, intent classifier, RAG debug, and others) to the ordered fallback or the AI Router choke point.

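The scan planned for the next phase could be sketched as a small script like the one below. The three patterns come from the notes above; the function name `find_direct_callers()` and the choice to scan only `*.py` files are assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple

# Patterns named in the notes above as signs of a direct, unordered Ollama call.
DIRECT_CALL_PATTERN = re.compile(
    r"resolve_ollama_endpoint|settings\.OLLAMA_URL|/api/generate"
)


def find_direct_callers(root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, stripped line) for every match under root."""
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if DIRECT_CALL_PATTERN.search(line):
                yield path, lineno, line.strip()
```

Calling `find_direct_callers(Path("apps/api/src"))` would list candidate call sites. Matches inside the resolver module itself (for example the definition of `resolve_ollama_endpoint`) are expected and would need to be filtered by hand.
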
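**Current overall progress**:

- This round's WIP (T73-T76 alert closed loop and Ollama routing fixes): about 99.8%.
- AwoooP alert observability chain: about 95%.
- Low-risk auto-remediation closed loop: about 95%.
- Frontend AI automation management UI sync: about 91%.
- Full AI automation management productization: about 87%.
- T21 pushed verifier coverage / freshness from the backend source-of-truth chain to the frontend. The suggested next step, T22, is to break down the causes of the 9 non-success verifications and route degraded/failed/timeout cases to the work chain and to Ticket / PlayBook / KM remediation items.
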
## 2026-05-14 | T20 Governance SLO frontend status semantics integrated; low-sample cases no longer masquerade as red lights
