docs(api): record direct ollama fallback rollout [skip ci]

2026-05-19 13:10:40 +08:00
parent 4de626fcd5
commit a0ca2ccb7f
1 changed files with 107 additions and 0 deletions
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -2074,6 +2074,113 @@ health_gcp_b=null / health_local=null (the check is not blocked while GCP-A is healthy)
 - Low-risk auto-remediation closed loop: about 95%.
 - Frontend AI automation management UI sync: about 91%.
 - Full AI automation management productization: about 87%.
+
+---
+
+## 2026-05-19 | T77 Converging the remaining direct Ollama callers onto ordered fallback
+
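+**Background**:
+
+- T75/T76 established the global `resolve_ollama_order()` and fixed KB/RAG/Embedding/Code Review first.
+- The commander reconfirmed that every Ollama-type path must follow the fixed order `GCP-A → GCP-B → 111 local → Gemini`, and must never hit 111 first because of a workload name or a stale comment.
+- The goal of this stage is to wire the callers that still call Ollama directly at the application layer into the ordered fallback loop, while keeping Gemini as the last-resort fallback and preserving the cost-governance boundary.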
+
+**Completed changes**:
+
+- `ChatManager`: OpenClaw / NemoClaw chat now tries the `interactive` ordered endpoints in sequence.
+- `Hermes NL Gateway`: the natural-language gateway now uses the `hermes` ordered endpoints.
+- `IntentClassifier`: the LLM fallback classifier now uses the `hermes` ordered endpoints; when every endpoint fails, it returns the keyword fallback.
+- `LogSummaryService`: Pod log summarization now uses the `deep_rca` ordered endpoints.
+- `ImageAnalysisService`: llava image analysis now uses the `image_analysis` ordered endpoints.
+- `routes/agent.py`: the agent thinking SSE stream now tries each endpoint in turn and only then reports that all endpoints are unavailable.
+- `api/v1/rag.py`: the RAG debug embedding check now checks the ordered endpoints.
+- `DecisionFusion` / `DecisionFusionAdapter`: Hermes / Elephant / governance fusion LLM scoring now uses the `deep_rca` ordered endpoints.
+- `AlertRuleEngine`: the Ollama generation step of auto rule generation now uses ordered endpoints; Gemini remains the last-resort fallback, used only when every Ollama endpoint fails and an existing key is available.
+- `OllamaToolProvider`: health / tool call / chat against the tool-calling `/v1/chat/completions` route now use the `hermes` ordered endpoints.
+- Removed the old, no-longer-used 111 helper from `DriftNarratorService` so that it cannot mislead readers.
+
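+**Preserved boundaries**:
+
+- No unconditional direct Gemini calls were added.
+- The `decision_manager.py` red-zone core was not modified; that file still contains legacy direct `OLLAMA_URL` calls, which need to be handled in a separate, explicitly scoped red-zone mini-stage.
+- Health checks, version probing, the OpenClaw provider registry, and the AI provider classes keep their own endpoint semantics; they are not targets of this stage's direct-caller convergence.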
+
+**Commit / Deploy**:
+
+```text
+35fe37c8 fix(api): route direct ollama callers through ordered fallback
+4de626fc chore(cd): deploy 35fe37c [skip ci]
+```
+
+**Local verification**:
+
+```text
+python -m py_compile
+  chat_manager.py log_summary_service.py image_analysis_service.py
+  hermes/nl_gateway.py intent_classifier.py decision_fusion_adapter.py
+  decision_fusion.py routes/agent.py api/v1/rag.py alert_rule_engine.py
+  nvidia_provider.py drift_narrator_service.py
+  -> OK
+
+ruff check --select F,E9,I
+  touched backend files + test_chat_manager_ollama_routing.py
+  -> OK
+
+DATABASE_URL=postgresql+asyncpg://test:test@localhost/test pytest
+  test_chat_manager_ollama_routing.py
+  test_intent_classifier.py
+  test_ollama_endpoint_resolver.py
+  test_local_code_review_cloud_fallback.py
+  test_nvidia_provider.py
+  test_governance_dispatcher.py
+  -> 75 passed, 7 skipped
+
+git diff --check -> OK
+```
+
+**Gitea Actions**:
+
+```text
+2430 Code Review for 35fe37c8 -> success
+2429 CD for 35fe37c8 -> success
+  tests -> success
+    2123 passed, 23 skipped
+    B5 integration -> 5 passed
+  build-and-deploy -> success
+  post-deploy-checks -> success
+```
+
+**Production verification**:
+
+```text
+K8s image:
+awoooi-web    192.168.0.110:5000/awoooi/web:35fe37c82af3e20e205ff379c7f9c7277511702b
+awoooi-api    192.168.0.110:5000/awoooi/api:35fe37c82af3e20e205ff379c7f9c7277511702b
+awoooi-worker 192.168.0.110:5000/awoooi/api:35fe37c82af3e20e205ff379c7f9c7277511702b
+
+GET https://awoooi.wooo.work/api/v1/health
+  -> healthy, prod, mock_mode=false
+
+In-Pod resolver smoke:
+interactive / hermes / deep_rca / embedding / rag / code_review / image_analysis
+  -> ollama_gcp_a:http://34.143.170.20:11434
+  -> ollama_gcp_b:http://34.21.145.224:11434
+  -> ollama_local:http://192.168.0.111:11434
+
+In-Pod OllamaToolProvider smoke:
+  -> http://34.143.170.20:11434
+  -> http://34.21.145.224:11434
+  -> http://192.168.0.111:11434
+```
+
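+**Current overall progress**:
+
+- This round's WIP (T73-T77 alert closed loop and Ollama direct-caller convergence): about 99.95%.
+- AwoooP alert observability chain: about 95%.
+- Low-risk auto-remediation closed loop: about 95%.
+- Frontend AI automation management UI sync: about 91%.
+- Full AI automation management productization: about 88%.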
+
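 - T21 pushed verifier coverage / freshness from the backend truth chain to the frontend; for the next stage, T22 is suggested to break down the causes of the 9 non-success verifications and route degraded/failed/timeout cases to the work pipeline and to Ticket / PlayBook / KM remediation items.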

 ## 2026-05-14 | T20 Governance SLO frontend status semantics wired in; low sample counts no longer masquerade as a red light