# Phase 0 Reconnaissance Report: Operation Ollama-First v5.0

> **Date**: 2026-05-03
> **Produced by**: A1 onboarder (LLM/MCP audit) + A2 web-researcher (replacement verification)
> **Status**: Phase 0 complete; serves as the factual baseline for Phase 1+

---

## TL;DR: Three must-read conclusions

1. **At least 34 LLM call sites were found** (the original campaign list had 26; 8 missed sites were added). AIGenerationHistory coverage is only **11.8%** (4/34); the remaining 88% have no structured record at all.
2. **A2's three traffic-light verdicts**: Tavily+Exa 🟢 / Qwen replacement 🟡 / DeepSeek-R1 🔴 (use qwen3:14b instead)
3. **Four P0 risks**: AiderHeal hard-coded to 111, Code Review Hermes hard-coded to 111, bge-m3 `:latest` tag drift, OllamaService multi-worker race condition

---

## Section 1: Complete inventory of LLM call sites (34)

### 1.1 Host labelling rules

| Label | Definition |
|---|---|
| `gcp_ollama` | Defaults to GCP (34.21.145.224:11434); automatically falls back to `ollama_111` on failure |
| `ollama_111` | Hard-coded `192.168.0.111:11434` (e.g. AiderHeal, Code Review Hermes) |
| `gemini` | `google.generativeai` SDK |
| `nim` | NVIDIA NIM `https://integrate.api.nvidia.com/v1` |
| `nim_via_elephant` | `services/elephant_service.py` via the NIM endpoint |

### 1.2 Full call-site table

| ID | Function | file:line | Model | Host | Cron trigger | History? |
|----|------|-----------|------|------|-----------|----------|
| 1 | Hermes competitive-pricing analysis (batch threats) | `services/hermes_analyst_service.py:411-426` | `hermes3:latest` (keep_alive 24h) | gcp_ollama → 111 | Every 4h | ❌ |
| 2 | Hermes L1 intent classification (Telegram NLP) | `services/hermes_analyst_service.py:151-167` | `hermes3:latest` | gcp_ollama → 111 | Event-driven | ❌ |
| 3 | KM Embedding (worker queue) | `services/openclaw_learning_service.py:111` + `services/ollama_service.py:592-639` | `bge-m3:latest` | EMBEDDING_HOST → resolve | Polled every 60s | ❌ |
| 4 | KM Embedding (real-time RAG query) | `services/openclaw_learning_service.py:399` | `bge-m3:latest` | Same as above | Event-driven | ❌ |
| 5 | **AiderHeal Code Repair** ⚠️ | `services/aider_heal_executor.py:48-49` | `qwen2.5-coder:7b` | **Hard-coded 111** (violates ADR-027) | Triggered by Code Review | ❌ |
| 6 | MCP L1/L2 Gemini Grounding | `services/mcp_collector_service.py:163-167, 185-186` | `gemini-2.0-flash` → `gemini-1.5-flash` | gemini | 6 topics / 24h | ❌ |
| 7 | MCP L3 Ollama Fallback | `services/mcp_collector_service.py:205-214` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Only when both Gemini attempts fail | ❌ |
| 8 | OpenClaw daily report | `services/openclaw_strategist_service.py:1093` → `_call_gemini` (L668) → `_call_nvidia_nim` (L694) | `gemini-2.5-flash` → `meta/llama-3.3-70b-instruct` | gemini → nim | Daily 09:00 | ❌ |
| 9 | OpenClaw weekly report | `services/openclaw_strategist_service.py:759` | Same as above | Same as above | Mondays 06:00 | ❌ |
| 10 | OpenClaw monthly report | `services/openclaw_strategist_service.py:1267` | Same as above | Same as above | 1st of each month 07:00 | ❌ |
| 11 | OpenClaw Meta self-review | `services/openclaw_strategist_service.py:1503` | Same as above | Same as above | Every 6h | ❌ |
| 12 | OpenClaw Q&A (Telegram NLP) | `services/openclaw_strategist_service.py:56` | Same as above | gemini → nim | Event-driven | ❌ |
| 13 | **NemoTron action dispatch** | `services/nemoton_dispatcher_service.py:101-102` | `meta/llama-3.1-8b-instruct` | nim (quota: 80 calls/day) | Every 4h | ❌ |
| 14 | **Code Review: Hermes scan** ⚠️ | `services/code_review_pipeline_service.py:218-225` | `hermes3:latest` | **Hard-coded HERMES_URL (111)** | CD deploy | ❌ |
| 15 | Code Review: OpenClaw assessment | `services/code_review_pipeline_service.py:278-286` | `gemini-2.5-flash` | gemini | CD deploy | ❌ |
| 16 | Code Review: ElephantAlpha fallback | `services/code_review_pipeline_service.py:293-299` → `services/elephant_service.py:24-30` | `nvidia/llama-3.3-nemotron-super-49b-v1.5` (chain) | nim | CD deploy | ❌ |
| 17 | EA Autonomous Engine | `services/elephant_alpha_autonomous_engine.py:540` | ElephantService | nim | Daemon thread | ❌ |
| 18 | EA HITL pre-fetch (Hermes pre-run) | `services/elephant_alpha_orchestrator.py` (line TBD) | `hermes3:latest` | gcp_ollama → 111 | EA escalation event | ❌ |
| 19 | PPT Gemini analysis | `routes/openclaw_bot_routes.py:2464-2477` `_call_gemini` | `gemini-2.0-flash` | gemini | Telegram command | ❌ |
| 20 | PPT Ollama Fallback | `routes/openclaw_bot_routes.py:2479-2500` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Primary path failure | ❌ |
| 21 | **PPT NIM (deepseek-v3.2)** ⚠️ | `routes/openclaw_bot_routes.py:2513-2528` | `deepseek-ai/deepseek-v3.2` (not in the ELEPHANT_FALLBACK list) | nim | Same as above | ❌ |
| 22 | Sales Copy | `routes/ai_routes.py:650` + `services/ollama_service.py:219-308` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 23 | Trend product matching | `routes/ai_routes.py:503` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 24 | Trend Web Search Q&A | `routes/trend_routes.py:293-294` + `routes/ai_routes.py:1129` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | Partial ✅ |
| 25 | Product Insights | `routes/ai_routes.py:1219` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 26 | Trend Keywords | `routes/ai_routes.py:1307` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 27 | Telegram Bot `/copy` | `services/telegram_bot_service.py:347-362` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 28 | Telegram Bot, second call site | `services/telegram_bot_service.py:1204-1206` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 29 | OpenClaw Bot Q&A primary chain (Ollama) | `routes/openclaw_bot_routes.py:6784-6824` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 30 | OpenClaw Bot Q&A Gemini fallback | `routes/openclaw_bot_routes.py:~6843+` | `gemini-2.0-flash` | gemini | Fallback | ❌ |
| 31 | OpenClaw Bot Q&A NIM fallback | `routes/openclaw_bot_routes.py` | `deepseek-ai/deepseek-v3.2` | nim | Fallback | ❌ |
| 32 | bot_api_routes copywriting | `routes/bot_api_routes.py:673-693` | `llama3.1:8b` | gcp_ollama → 111 | Internal HTTP | ❌ |
| 33 | trend_crawler_service Ollama | `services/trend_crawler_service.py:35` | `llama3.1:8b` | gcp_ollama → 111 | Trend crawler flow | ❌ |
| 34 | ai_provider abstraction layer | `services/ai_provider.py:74` | `llama3.1:8b` | gcp_ollama → 111 | Triggered by the caller | ❌ |

### 1.3 The 8 missed sites not on the campaign list

- #27/#28: two call sites in `telegram_bot_service.py`
- #32 `routes/bot_api_routes.py:673`
- #33 `services/trend_crawler_service.py:35`
- #34 `services/ai_provider.py:74`
- #17 EA Engine and #18 EA HITL pre-fetch are two independent chains
- The Code Review pipeline internally **calls three independent LLMs: Hermes (#14) + Gemini (#15) + ElephantAlpha (#16)**

### 1.4 AIGenerationHistory coverage

- Only 4 places, all in `routes/ai_routes.py` (L361/1163/1252/1339)
- **Coverage 4/34 ≈ 11.8%**
- Phase 1 must create a unified `ai_calls` table and wire in the remaining 30 call sites

---

## Section 2: Traffic-light ratings for 13 MCP servers

| # | MCP Server | Rating | Assessment |
|---|-----------|--------|------|
| 1 | mcp-omnisearch (Tavily/Exa) | 🟢 Adopt now | Removes the single-point dependency on Gemini Grounding |
| 2 | firecrawl-mcp (self-hosted) | 🟢 Adopt now | Strengthens handling of SPAs and anti-scraping measures; **mem_limit:2g + chrome-reaper are mandatory** |
| 3 | postgres-mcp | 🟢 Adopt now | RBAC restricted to SELECT on hot tables such as ai_insights/daily_sales/competitor_prices |
| 4 | playwright-mcp | 🟡 Evaluate later | Overlaps with firecrawl; pick one |
| 5 | memory-mcp (Anthropic KG) | 🔴 Reject | Violates ADR-002 (pgvector only) |
| 6 | fetch-mcp | 🟡 Evaluate later | Plain HTTP; a one-line `requests.get` is enough |
| 7 | sequential-thinking-mcp | 🟡 Evaluate later | Re-evaluate after Phase 11 RAG is complete |
| 8 | filesystem-mcp | 🟢 Adopt now | Development efficiency across 188/110/MacBook |
| 9 | git-mcp | 🟢 Adopt now | momo uses Gitea, so choose git-mcp (github-mcp does not apply) |
| 10 | time-mcp | 🟡 Evaluate later | TAIPEI_TZ handling already exists; low priority |
| 11 | sentry-mcp | 🔴 Reject | momo does not use Sentry; rely on the existing ADR-013 AutoHeal closed loop |
| 12 | slack-mcp | 🔴 Reject | The commander uses Telegram |
| 13 | gdrive-mcp | 🟡 Evaluate later | Consider once PPT v3 is stable |

### 2.1 Phase 10 adoption order (the 5 🟢)

1. **postgres-mcp** (highest ROI: the commander runs SQL queries every day)
2. **mcp-omnisearch** (Tavily primary + Exa backup; Tavily gives 1000 free/month; avoid Brave)
3. **filesystem-mcp** (cross-host development efficiency)
4. **firecrawl-mcp** (crawler resilience)
5. **git-mcp** (Gitea compatible)

---

## Section 3: BGE-M3 consistency status report

### 3.1 Model parameter inventory

| Item | Actual state |
|------|------|
| Main call location | `services/ollama_service.py:592-639` `generate_embedding` |
| Default model | `bge-m3:latest` (floating tag, a **risk**) |
| API endpoint | Primary: `POST /api/embed`; fallback: `POST /api/embeddings` |
| Host resolution | `host` argument > `EMBEDDING_HOST` env > `resolve_ollama_host()` |
| Timeout | env `OLLAMA_EMBED_TIMEOUT` or `EMBEDDING_TIMEOUT`, default 45s |
| **normalize parameter** | ❌ **Not passed explicitly** (relies on the server-side default) |
| **pooling strategy** | ❌ **Not passed explicitly** (relies on the server-side default, mean) |
| Dimension | 1024 (locked by the pgvector column) |
| HNSW index | `vector_cosine_ops` (cosine distance) |

### 3.2 Risk warnings

🔴 **HIGH risk 1: normalize is not enforced**
- The bge-m3 server-side default is normalize=True, but nothing in the code locks that in as a contract
- **Guardrail**: record an `embedding_signature` (hash of model+normalize+dim) when writing to ai_insights

🟡 **MED risk 2: `bge-m3:latest` floating tag**
- `:latest` can resolve to a different model version whenever the model is re-pulled (for example during an Ollama upgrade), so **RAG recall can degrade silently**
- **Guardrail**: pin to a specific digest or a fixed tag

🟢 **LOW risk 3: dim=1024 consistency**
- Code and schema both lock 1024; no conflict

### 3.3 ai_insights.embedding statistics (**pending confirmation via SSH to 188**)

```sql
SELECT
    COUNT(*) AS total,
    COUNT(embedding) AS with_embedding,
    COUNT(*) - COUNT(embedding) AS missing,
    MIN(created_at) FILTER (WHERE embedding IS NOT NULL) AS earliest,
    MAX(created_at) FILTER (WHERE embedding IS NOT NULL) AS latest,
    COUNT(DISTINCT array_length(embedding::real[], 1)) AS distinct_dims
FROM ai_insights;
```

> **These statistics are needed before Phase 11 starts**

### 3.4 Embedding worker liveness check (**pending SSH to 188**)

```bash
docker logs momo-scheduler 2>&1 | grep "OCLearn"
```

If the worker is dead, new ai_insights rows keep accumulating with `embedding IS NULL`, and RAG recall degrades without any alert.

---

## Section 4: A2 replacement-verification traffic lights

| Task | Verdict | Tactic |
|------|------|------|
| OpenClaw Q&A: Gemini → Qwen | 🟡 Yellow | qwen3:14b + a prompt that enforces Traditional Chinese + Gemini fallback chain + **golden test set A/B is mandatory** |
| Nemotron: NIM → DeepSeek-R1 | 🔴 Red | **Use qwen3:14b instead** (DeepSeek-R1's tool_calls support in Ollama is nominal only; GitHub Issue #10935 is unresolved) |
| Phase 10 Search API | 🟢 Green | Tavily primary (1000 free/month) + Exa backup (1000 free), monthly cost $0; **avoid Brave** (free tier removed on 2026-02-12) |

### 4.1 Three major warnings

1. **Qwen's Traditional Chinese weakness has academic backing** (TMMLU+ paper): the golden-set A/B must be run
2. **DeepSeek-R1 support in Ollama is "fake support"**: the tools capability is officially listed, but the chat template lacks the corresponding jinja
3. **Brave changed its policy substantially**: since 2026-02-12 new users must attach a credit card

---

## Section 5: Recommendations for the commander's decisions

### 5.1 Phase 1 LLM Logger: top 5 integration priorities

| Priority | Call site | Rationale |
|-----|--------|------|
| **#1** | NemoTron dispatch (#13) | NIM hard cap of 80 calls/day + structured output; quota management is essential |
| **#2** | OpenClaw reports (#8/#9/#10/#11, four call sites merged) | Gemini's main workload; full trace of prompt+output+token |
| **#3** | Hermes competitive-pricing analysis (#1) | Runs every 4h with ~300 products per run; needed to trace back why SKUs were missed |
| **#4** | Code Review three chains (#14/#15/#16) | ElephantAlpha 49B cost is significant and needs tracking |
| **#5** | OpenClaw Bot Q&A three-tier fallback (#29/#30/#31) | Front line of the Telegram user experience |

### 5.2 Proposed unified interface

```python
# Proposed interface; the provider/model values are those of call site #13.
@llm_call_logger(provider="nim", model="meta/llama-3.1-8b-instruct", callsite="#13")
def some_llm_call(prompt: str) -> str:
    # Captured automatically: prompt/output/tokens_in/tokens_out/duration/host/error/cost
    # Dual-write: ai_calls table + structured log
    ...
```

AiderHeal (#5) is not wired to the logger for now (it runs a CLI over SSH, outside the Python process).

### 5.3 Phase 11 RAG consistency guardrails (must be completed before Phase 11 starts)

1. **Lock the bge-m3 model signature**: pin the digest and add an `embedding_signature` column to ai_insights
2. **Confirm embedding worker liveness**: SSH to 188 and verify that the retry queue worker is actually running

### 5.4 Campaign-level risk disclosure (v5.1 revision)

🔴 **New Phase 2 fixes**:
- AiderHeal `services/aider_heal_executor.py:48` is hard-coded to 111 → switch to resolve_ollama_host
- Code Review Hermes `services/code_review_pipeline_service.py:218` is hard-coded to 111 → same as above

🟡 **New Phase 3 watch items**:
- PPT NIM uses deepseek-v3.2, which is not in ELEPHANT_FALLBACK_MODELS → the two NIM chains use different models, so quota usage is easy to undercount
- OllamaService global singleton + monkey-patch race risk (gunicorn with multiple workers)

---

## Appendix: Key file paths (repo-relative)

```
services/ollama_service.py
services/hermes_analyst_service.py
services/openclaw_strategist_service.py
services/openclaw_learning_service.py
services/mcp_collector_service.py
services/nemoton_dispatcher_service.py
services/elephant_service.py
services/elephant_alpha_autonomous_engine.py
services/elephant_alpha_orchestrator.py
services/code_review_pipeline_service.py
services/aider_heal_executor.py
services/ai_history_service.py
services/telegram_bot_service.py
services/trend_crawler_service.py
services/ai_provider.py
routes/openclaw_bot_routes.py
routes/ai_routes.py
routes/trend_routes.py
routes/bot_api_routes.py
scheduler.py
run_scheduler.py
migrations/009_pgvector_embedding.sql
migrations/011_embedding_retry_queue.sql
```

---

## Sources (A2 web research)

- [Qwen3 Technical Report (arXiv)](https://arxiv.org/pdf/2505.09388)
- [Ollama qwen3 registry](https://ollama.com/library/qwen3)
- [TMMLU+ Traditional Chinese Eval (arXiv)](https://arxiv.org/html/2403.01858v1)
- [DeepSeek-R1-0528 Release Notes](https://api-docs.deepseek.com/news/news250528)
- [Ollama Issue #10935: R1 missing tool calling](https://github.com/ollama/ollama/issues/10935)
- [Tavily Pricing](https://www.tavily.com/pricing)
- [Brave Free Tier Removal](https://www.implicator.ai/brave-drops-free-search-api-tier-puts-all-developers-on-metered-billing/)
- [Exa API Pricing](https://exa.ai/pricing)
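
The `@llm_call_logger(provider, model, callsite)` interface proposed in Section 5.2 could be prototyped roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the in-memory `AI_CALLS` list is a hypothetical stand-in for the planned `ai_calls` table, `fake_dispatch` is a placeholder for a real LLM call, and token counts, host, and cost are left out because how to obtain them depends on each provider's response format.

```python
import functools
import json
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("llm_calls")

# Hypothetical in-memory stand-in for the proposed ai_calls table.
AI_CALLS: list[dict[str, Any]] = []


def llm_call_logger(provider: str, model: str, callsite: str) -> Callable:
    """Record one row per call, whether the call returns or raises."""

    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        @functools.wraps(func)
        def wrapper(prompt: str, *args: Any, **kwargs: Any) -> str:
            row: dict[str, Any] = {
                "provider": provider,
                "model": model,
                "callsite": callsite,
                "prompt": prompt,
                "output": None,
                "error": None,
            }
            start = time.perf_counter()
            try:
                row["output"] = func(prompt, *args, **kwargs)
                return row["output"]
            except Exception as exc:
                row["error"] = repr(exc)
                raise  # logging must never swallow the caller's error
            finally:
                row["duration_s"] = time.perf_counter() - start
                AI_CALLS.append(row)  # write 1: table stand-in
                logger.info(json.dumps(row, ensure_ascii=False))  # write 2: structured log

        return wrapper

    return decorator


@llm_call_logger(provider="nim", model="meta/llama-3.1-8b-instruct", callsite="#13")
def fake_dispatch(prompt: str) -> str:
    # Placeholder for a real LLM call; it only upper-cases the prompt.
    return prompt.upper()
```

Writing the row in `finally` means failed calls are recorded too, which matters for quota-limited call sites such as #13 where an errored request may still count against the daily cap.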
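
The `embedding_signature` guardrail from Sections 3.2 and 5.3 is described as a hash of model+normalize+dim. One possible way to compute it is sketched below; the function name, the JSON serialization, SHA-256, and the 16-character truncation are all assumptions made for illustration, since the report does not specify them.

```python
import hashlib
import json


def embedding_signature(model: str, normalize: bool, dim: int) -> str:
    """Return a short, stable hash of the settings that shape a stored vector."""
    # Sorted keys and fixed separators keep the serialization deterministic.
    payload = json.dumps(
        {"model": model, "normalize": normalize, "dim": dim},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Values taken from the Section 3.1 inventory.
current = embedding_signature("bge-m3:latest", True, 1024)

# A change to any one input (here: normalize) produces a different signature.
changed = embedding_signature("bge-m3:latest", False, 1024)
```

Storing this value next to each vector would let a query compare the signature of the query-time settings with the stored one and flag rows embedded under different settings. Note that hashing the tag string alone does not detect a re-pulled `:latest`; the model argument would need to carry the pinned digest for that.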