docs(api): record direct ollama fallback rollout [skip ci]
107
docs/LOGBOOK.md
@@ -2074,6 +2074,113 @@ health_gcp_b=null / health_local=null (checks are not blocked while GCP-A is healthy)
- Low-risk auto-remediation closed loop: about 95%.
- Frontend AI automation management UI sync: about 91%.
- Full AI automation management productization: about 87%.

---
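
## 2026-05-19 | T77: converge remaining direct Ollama callers onto ordered fallback

**Background**:

- T75/T76 introduced the global `resolve_ollama_order()` and fixed KB/RAG/Embedding/Code Review first.
- The commander reconfirmed that every Ollama-type path must use the fixed order `GCP-A → GCP-B → 111 local → Gemini`; no path may hit 111 first because of a workload name or a stale comment.
- The goal of this stage is to wire every caller that still calls Ollama directly at the application layer into the ordered fallback loop, while keeping Gemini as the last-resort fallback and preserving the cost-governance boundary.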
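
The fixed-order requirement above can be sketched roughly as follows. This is an illustration only: `OLLAMA_ORDER` and `resolve_order_sketch` are hypothetical names, and the real `resolve_ollama_order()` may have a different signature and read its endpoints from configuration. The endpoint URLs are the ones reported by the production smoke check in this entry.

```python
from typing import List, Tuple

# Hypothetical endpoint table for illustration; the real resolver may
# load these values from configuration instead of hard-coding them.
OLLAMA_ORDER: List[Tuple[str, str]] = [
    ("ollama_gcp_a", "http://34.143.170.20:11434"),
    ("ollama_gcp_b", "http://34.21.145.224:11434"),
    ("ollama_local", "http://192.168.0.111:11434"),
]


def resolve_order_sketch(workload: str) -> List[Tuple[str, str]]:
    """Return the same fixed endpoint order for every workload name."""
    # The workload argument is accepted so call sites can pass their label,
    # but it deliberately has no influence on the order.
    return list(OLLAMA_ORDER)
```

Returning a copy keeps callers from reordering the shared list by accident. Gemini is left out of the list on purpose, since it is a separate last-resort step rather than an Ollama endpoint.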

**Completed changes**:

- `ChatManager`: OpenClaw / NemoClaw chat now tries the `interactive` ordered endpoints in sequence.
- `Hermes NL Gateway`: the natural-language gateway now uses the `hermes` ordered endpoints.
- `IntentClassifier`: the LLM fallback classifier now uses the `hermes` ordered endpoints; if every endpoint fails, it returns the keyword fallback.
- `LogSummaryService`: Pod log summarization now uses the `deep_rca` ordered endpoints.
- `ImageAnalysisService`: llava image analysis now uses the `image_analysis` ordered endpoints.
- `routes/agent.py`: the agent thinking SSE stream now tries each endpoint in turn and reports "all endpoints unavailable" only after every attempt has failed.
- `api/v1/rag.py`: the RAG debug embedding check now checks the ordered endpoints.
- `DecisionFusion` / `DecisionFusionAdapter`: LLM scoring for Hermes / Elephant / governance fusion now uses the `deep_rca` ordered endpoints.
- `AlertRuleEngine`: Ollama generation for auto rule generation now uses ordered endpoints; Gemini remains the last-resort fallback, used only when every Ollama endpoint has failed and an existing key is available.
- `OllamaToolProvider`: tool calling `/v1/chat/completions` health / tool call / chat now use the `hermes` ordered endpoints.
- Removed the old, no-longer-used 111 helper from `DriftNarratorService` so it cannot mislead readers.
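
The callers listed above share one pattern: try each endpoint in order, return the first success, and fall back only after every endpoint has failed. A minimal sketch of that loop follows. `call_with_ordered_fallback` is a hypothetical helper, not the function used in the codebase, and the broad `except` is a simplification.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def call_with_ordered_fallback(
    endpoints: Iterable[str],
    call: Callable[[str], T],
    last_resort: Optional[Callable[[], T]] = None,
) -> T:
    """Try call(endpoint) in order; use last_resort only if all endpoints fail."""
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            # First endpoint that answers wins; later ones are never contacted.
            return call(endpoint)
        except Exception as exc:  # sketch: real code would catch narrower errors
            errors.append(f"{endpoint}: {exc}")
    if last_resort is not None:
        # For example, a keyword classifier in the IntentClassifier case.
        return last_resort()
    raise RuntimeError("all Ollama endpoints unavailable: " + "; ".join(errors))
```

Under this sketch, a caller such as the intent classifier would pass its keyword matcher as `last_resort`, while a streaming route would omit it and turn the `RuntimeError` into an "all endpoints unavailable" response.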
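
**Preserved boundaries**:

- No unconditional direct Gemini calls were added.
- `decision_manager.py`, a red-zone core file, was not modified; it still contains legacy direct `OLLAMA_URL` calls, which need a dedicated, explicitly scoped red-zone stage next.
- Health checks, version probing, the OpenClaw provider registry, and the AI provider classes keep their own endpoint semantics; they are not targets of this direct-caller convergence.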
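
The "no unconditional Gemini call" boundary can be expressed as a small guard. This is a sketch under assumptions: `may_use_gemini` and its parameters are hypothetical, and the real cost-governance checks may involve more conditions than the two shown.

```python
from typing import Optional, Sequence


def may_use_gemini(
    ollama_results: Sequence[bool],
    gemini_api_key: Optional[str],
) -> bool:
    """Decide whether the last-resort Gemini call is permitted.

    ollama_results holds one success flag per Ollama endpoint attempted,
    in the order the endpoints were tried.
    """
    # Gemini is allowed only after at least one Ollama attempt was made
    # and none of the attempts succeeded.
    all_ollama_failed = len(ollama_results) > 0 and not any(ollama_results)
    # An existing, non-empty key is also required; nothing is provisioned here.
    return all_ollama_failed and bool(gemini_api_key)
```

Treating an empty attempt list as "not allowed" means a code path that skips Ollama entirely cannot reach Gemini through this guard.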

**Commit / Deploy**:

```text
35fe37c8 fix(api): route direct ollama callers through ordered fallback
4de626fc chore(cd): deploy 35fe37c [skip ci]
```

**Local verification**:

```text
python -m py_compile
  chat_manager.py log_summary_service.py image_analysis_service.py
  hermes/nl_gateway.py intent_classifier.py decision_fusion_adapter.py
  decision_fusion.py routes/agent.py api/v1/rag.py alert_rule_engine.py
  nvidia_provider.py drift_narrator_service.py
  -> OK

ruff check --select F,E9,I
  touched backend files + test_chat_manager_ollama_routing.py
  -> OK

DATABASE_URL=postgresql+asyncpg://test:test@localhost/test pytest
  test_chat_manager_ollama_routing.py
  test_intent_classifier.py
  test_ollama_endpoint_resolver.py
  test_local_code_review_cloud_fallback.py
  test_nvidia_provider.py
  test_governance_dispatcher.py
  -> 75 passed, 7 skipped

git diff --check -> OK
```

**Gitea Actions**:

```text
2430 Code Review for 35fe37c8 -> success
2429 CD for 35fe37c8 -> success
  tests -> success
    2123 passed, 23 skipped
  B5 integration -> 5 passed
  build-and-deploy -> success
  post-deploy-checks -> success
```

**Production verification**:

```text
K8s image:
  awoooi-web 192.168.0.110:5000/awoooi/web:35fe37c82af3e20e205ff379c7f9c7277511702b
  awoooi-api 192.168.0.110:5000/awoooi/api:35fe37c82af3e20e205ff379c7f9c7277511702b
  awoooi-worker 192.168.0.110:5000/awoooi/api:35fe37c82af3e20e205ff379c7f9c7277511702b

GET https://awoooi.wooo.work/api/v1/health
  -> healthy, prod, mock_mode=false

In-pod resolver smoke:
  interactive / hermes / deep_rca / embedding / rag / code_review / image_analysis
  -> ollama_gcp_a:http://34.143.170.20:11434
  -> ollama_gcp_b:http://34.21.145.224:11434
  -> ollama_local:http://192.168.0.111:11434

In-pod OllamaToolProvider smoke:
  -> http://34.143.170.20:11434
  -> http://34.21.145.224:11434
  -> http://192.168.0.111:11434
```
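
**Current overall progress**:

- Current WIP round (T73-T77 alert closed loop and Ollama direct-caller convergence): about 99.95%.
- AwoooP alert observability chain: about 95%.
- Low-risk auto-remediation closed loop: about 95%.
- Frontend AI automation management UI sync: about 91%.
- Full AI automation management productization: about 88%.

- T21 pushed verifier coverage / freshness from the backend truth chain to the frontend; the suggested next stage, T22, is to break down the causes of the 9 non-success verifications and route degraded/failed/timeout results into the work chain and into Ticket / PlayBook / KM remediation items.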

## 2026-05-14 | T20 Governance SLO frontend status semantics wired in; low sample counts no longer masquerade as red