ewoooc/docs/phase0_audit_report_20260503.md

# Phase 0 Reconnaissance Report: Operation Ollama-First v5.0

> **Date**: 2026-05-03
> **Produced by**: A1 onboarder (LLM/MCP audit) + A2 web-researcher (verification of alternatives)
> **Status**: Phase 0 complete; serves as the factual baseline for Phase 1+

---

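## TL;DR: Three Must-Read Conclusions

1. **At least 34 LLM call sites were found by inspection** (the original campaign list had 26; this audit adds 8 that were missed). AIGenerationHistory covers only **11.8%** (4/34); the remaining 88% have no structured record at all.
2. **A2 traffic-light verdicts on three items**: Tavily+Exa 🟢 / Qwen replacement 🟡 / DeepSeek-R1 🔴 (use qwen3:14b instead)
3. **Four P0 risks**: AiderHeal hard-coded to 111, Code Review Hermes hard-coded to 111, bge-m3 `:latest` tag drift, OllamaService race condition under multiple workers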

---

## Section 1: Complete Inventory of LLM Call Sites (34)

### 1.1 Host Labels

| Label | Definition |
|---|---|
| `gcp_ollama` | Defaults to GCP (34.21.145.224:11434); automatically falls back to `ollama_111` on failure |
| `ollama_111` | Hard-coded to `192.168.0.111:11434` (e.g. AiderHeal, Code Review Hermes) |
| `gemini` | `google.generativeai` SDK |
| `nim` | NVIDIA NIM `https://integrate.api.nvidia.com/v1` |
| `nim_via_elephant` | NIM endpoint reached through `services/elephant_service.py` |
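
The `gcp_ollama` fallback rule in the table can be illustrated with a minimal sketch. This is not the project's `resolve_ollama_host()`; the function name `pick_ollama_host` and the injected `is_alive` probe are hypothetical, and only the two addresses come from the table above.

```python
from typing import Callable, Sequence

# Addresses taken from the host label table above.
GCP_OLLAMA = "http://34.21.145.224:11434"
OLLAMA_111 = "http://192.168.0.111:11434"


def pick_ollama_host(
    is_alive: Callable[[str], bool],
    candidates: Sequence[str] = (GCP_OLLAMA, OLLAMA_111),
) -> str:
    """Return the first candidate host that passes the health probe.

    `is_alive` is a hypothetical probe supplied by the caller (for example an
    HTTP GET with a short timeout). It is injected so the selection logic can
    be exercised without network access.
    """
    for host in candidates:
        if is_alive(host):
            return host
    raise RuntimeError("no Ollama host reachable: " + ", ".join(candidates))


# Example: GCP is down, so the call falls back to the 111 host.
chosen = pick_ollama_host(lambda host: host == OLLAMA_111)
print(chosen)
```

Call sites that are currently hard-coded to 111 could take the same candidate list instead of a fixed URL, which is the direction the Phase 2 fixes in Section 5.4 point to.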

### 1.2 Full Call-Site Table

| ID | Function | file:line | Model | Host | Cron / Trigger | History? |
|----|------|-----------|------|------|-----------|----------|
| 1 | Hermes competitive pricing analysis (batch threats) | `services/hermes_analyst_service.py:411-426` | `hermes3:latest` (keep_alive 24h) | gcp_ollama → 111 | Every 4h | ❌ |
| 2 | Hermes L1 intent classification (Telegram NLP) | `services/hermes_analyst_service.py:151-167` | `hermes3:latest` | gcp_ollama → 111 | Event-driven | ❌ |
| 3 | KM Embedding (worker queue) | `services/openclaw_learning_service.py:111` + `services/ollama_service.py:592-639` | `bge-m3:latest` | EMBEDDING_HOST → resolve | Polls every 60s | ❌ |
| 4 | KM Embedding (live RAG query) | `services/openclaw_learning_service.py:399` | `bge-m3:latest` | Same as above | Event-driven | ❌ |
| 5 | **AiderHeal Code Repair** ⚠️ | `services/aider_heal_executor.py:48-49` | `qwen2.5-coder:7b` | **Hard-coded to 111** (violates ADR-027) | Triggered by Code Review | ❌ |
| 6 | MCP L1/L2 Gemini Grounding | `services/mcp_collector_service.py:163-167, 185-186` | `gemini-2.0-flash` → `gemini-1.5-flash` | gemini | 6 topics / 24h | ❌ |
| 7 | MCP L3 Ollama Fallback | `services/mcp_collector_service.py:205-214` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Only when both Gemini models fail | ❌ |
| 8 | OpenClaw daily report | `services/openclaw_strategist_service.py:1093` → `_call_gemini` (L668) → `_call_nvidia_nim` (L694) | `gemini-2.5-flash` → `meta/llama-3.3-70b-instruct` | gemini → nim | Daily 09:00 | ❌ |
| 9 | OpenClaw weekly report | `services/openclaw_strategist_service.py:759` | Same as above | Same as above | Mondays 06:00 | ❌ |
| 10 | OpenClaw monthly report | `services/openclaw_strategist_service.py:1267` | Same as above | Same as above | 1st of each month 07:00 | ❌ |
| 11 | OpenClaw meta self-review | `services/openclaw_strategist_service.py:1503` | Same as above | Same as above | Every 6h | ❌ |
| 12 | OpenClaw Q&A (Telegram NLP) | `services/openclaw_strategist_service.py:56` | Same as above | gemini → nim | Event-driven | ❌ |
| 13 | **NemoTron action dispatch** | `services/nemoton_dispatcher_service.py:101-102` | `meta/llama-3.1-8b-instruct` | nim (quota: 80 calls/day) | Every 4h | ❌ |
| 14 | **Code Review – Hermes scan** ⚠️ | `services/code_review_pipeline_service.py:218-225` | `hermes3:latest` | **Hard-coded HERMES_URL (111)** | CD deploy | ❌ |
| 15 | Code Review – OpenClaw evaluation | `services/code_review_pipeline_service.py:278-286` | `gemini-2.5-flash` | gemini | CD deploy | ❌ |
| 16 | Code Review – ElephantAlpha fallback | `services/code_review_pipeline_service.py:293-299` → `services/elephant_service.py:24-30` | `nvidia/llama-3.3-nemotron-super-49b-v1.5` (chain) | nim | CD deploy | ❌ |
| 17 | EA Autonomous Engine | `services/elephant_alpha_autonomous_engine.py:540` | ElephantService | nim | daemon thread | ❌ |
| 18 | EA HITL pre-fetch (Hermes pre-run) | `services/elephant_alpha_orchestrator.py` (line TBD) | `hermes3:latest` | gcp_ollama → 111 | EA escalation event | ❌ |
| 19 | PPT Gemini analysis | `routes/openclaw_bot_routes.py:2464-2477` `_call_gemini` | `gemini-2.0-flash` | gemini | Telegram command | ❌ |
| 20 | PPT Ollama fallback | `routes/openclaw_bot_routes.py:2479-2500` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Primary path failure | ❌ |
| 21 | **PPT NIM (deepseek-v3.2)** ⚠️ | `routes/openclaw_bot_routes.py:2513-2528` | `deepseek-ai/deepseek-v3.2` (not in the ELEPHANT_FALLBACK list) | nim | Same as above | ❌ |
| 22 | Sales Copy | `routes/ai_routes.py:650` + `services/ollama_service.py:219-308` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 23 | Trend product matching | `routes/ai_routes.py:503` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 24 | Trend Web Search Q&A | `routes/trend_routes.py:293-294` + `routes/ai_routes.py:1129` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | Partial ✅ |
| 25 | Product Insights | `routes/ai_routes.py:1219` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 26 | Trend Keywords | `routes/ai_routes.py:1307` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 27 | Telegram Bot `/copy` | `services/telegram_bot_service.py:347-362` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 28 | Telegram Bot, second call site | `services/telegram_bot_service.py:1204-1206` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 29 | OpenClaw Bot Q&A, primary chain (Ollama) | `routes/openclaw_bot_routes.py:6784-6824` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 30 | OpenClaw Bot Q&A, Gemini fallback | `routes/openclaw_bot_routes.py:~6843+` | `gemini-2.0-flash` | gemini | fallback | ❌ |
| 31 | OpenClaw Bot Q&A, NIM fallback | `routes/openclaw_bot_routes.py` | `deepseek-ai/deepseek-v3.2` | nim | fallback | ❌ |
| 32 | bot_api_routes sales copy | `routes/bot_api_routes.py:673-693` | `llama3.1:8b` | gcp_ollama → 111 | Internal HTTP | ❌ |
| 33 | trend_crawler_service Ollama | `services/trend_crawler_service.py:35` | `llama3.1:8b` | gcp_ollama → 111 | Trend crawler pipeline | ❌ |
| 34 | ai_provider abstraction layer | `services/ai_provider.py:74` | `llama3.1:8b` | gcp_ollama → 111 | Triggered by caller | ❌ |

### 1.3 The 8 Call Sites Missing from the Campaign List

- #27/#28: two call sites in `telegram_bot_service.py`
- #32 `routes/bot_api_routes.py:673`
- #33 `services/trend_crawler_service.py:35`
- #34 `services/ai_provider.py:74`
- #17 EA Engine and #18 EA HITL pre-fetch are two independent chains
- The Code Review pipeline internally calls **three independent LLMs: Hermes (#14) + Gemini (#15) + ElephantAlpha (#16)**

### 1.4 AIGenerationHistory Coverage

- Only 4 call sites in `routes/ai_routes.py` write history (L361/1163/1252/1339)
- **Coverage is 4/34 ≈ 11.8%**
- Phase 1 must create a unified `ai_calls` table and wire in the remaining 30 call sites

---

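## Section 2: Traffic-Light Assessment of 13 MCP Servers

| # | MCP Server | Verdict | Assessment |
|---|-----------|--------|------|
| 1 | mcp-omnisearch (Tavily/Exa) | 🟢 Adopt now | Removes the single-point dependency on Gemini Grounding |
| 2 | firecrawl-mcp (self-hosted) | 🟢 Adopt now | Strengthens crawling of SPA and anti-bot sites; **mem_limit:2g + chrome-reaper are mandatory** |
| 3 | postgres-mcp | 🟢 Adopt now | RBAC restricted to SELECT on hot tables such as ai_insights/daily_sales/competitor_prices |
| 4 | playwright-mcp | 🟡 Evaluate later | Overlaps with firecrawl; pick one |
| 5 | memory-mcp (Anthropic KG) | 🔴 Reject | Violates ADR-002 (pgvector only) |
| 6 | fetch-mcp | 🟡 Evaluate later | Plain HTTP; a one-line `requests.get` does the job |
| 7 | sequential-thinking-mcp | 🟡 Evaluate later | Re-evaluate after Phase 11 RAG is complete |
| 8 | filesystem-mcp | 🟢 Adopt now | Development efficiency across 188/110/MacBook |
| 9 | git-mcp | 🟢 Adopt now | momo uses Gitea, so git-mcp is chosen (github-mcp does not apply) |
| 10 | time-mcp | 🟡 Evaluate later | TAIPEI_TZ handling already exists; low priority |
| 11 | sentry-mcp | 🔴 Reject | momo does not use Sentry; relies on the existing ADR-013 AutoHeal loop |
| 12 | slack-mcp | 🔴 Reject | The commander uses Telegram |
| 13 | gdrive-mcp | 🟡 Evaluate later | Reconsider once PPT v3 is stable |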

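### 2.1 Phase 10 Adoption Order (the 5 🟢 servers)

1. **postgres-mcp** (highest ROI: the commander runs SQL queries every day)
2. **mcp-omnisearch** (Tavily primary + Exa backup; Tavily offers 1000 free calls/month; avoids Brave)
3. **filesystem-mcp** (cross-host development efficiency)
4. **firecrawl-mcp** (crawler resilience)
5. **git-mcp** (Gitea-compatible)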
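
The "Tavily primary + Exa backup" arrangement is an ordered fallback chain. A minimal sketch of that pattern follows. The provider functions are stand-ins rather than real Tavily or Exa client calls, and `search_with_fallback` is a hypothetical name.

```python
from typing import Callable, List, Sequence, Tuple

SearchFn = Callable[[str], List[str]]


def search_with_fallback(
    query: str, providers: Sequence[Tuple[str, SearchFn]]
) -> Tuple[str, List[str]]:
    """Try each (name, search_fn) pair in order; return the first non-empty result."""
    errors = []
    for name, search in providers:
        try:
            results = search(query)
        except Exception as exc:  # one provider failing must not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if results:
            return name, results
        errors.append(f"{name}: empty result")
    raise RuntimeError("all search providers failed: " + "; ".join(errors))


# Stand-in providers; real client calls would replace these two functions.
def fake_tavily(query: str) -> List[str]:
    raise TimeoutError("monthly free quota exhausted")


def fake_exa(query: str) -> List[str]:
    return [f"https://example.com/?q={query}"]


provider, hits = search_with_fallback(
    "momo price", [("tavily", fake_tavily), ("exa", fake_exa)]
)
print(provider, hits)
```

Returning the provider name alongside the results makes it possible to count how often the backup is used, which matters when both providers have a monthly free allowance.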

---

## Section 3: BGE-M3 Consistency Status Report

### 3.1 Model Parameter Inventory

| Item | Current state |
|------|------|
| Primary call site | `services/ollama_service.py:592-639` `generate_embedding` |
| Default model | `bge-m3:latest` (floating tag, **risk**) |
| API endpoint | Primary: `POST /api/embed`; fallback: `POST /api/embeddings` |
| Host resolution | `host` argument > `EMBEDDING_HOST` env > `resolve_ollama_host()` |
| Timeout | env `OLLAMA_EMBED_TIMEOUT` or `EMBEDDING_TIMEOUT`; default 45s |
| **normalize parameter** | ❌ **Not passed explicitly** (relies on the server-side default) |
| **pooling strategy** | ❌ **Not passed explicitly** (relies on the server-side default, mean) |
| Dimension | 1024 (locked by the pgvector column) |
| HNSW index | `vector_cosine_ops` (cosine distance) |
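
Because normalization is left to the server-side default, one option is to verify or enforce it on the client before a vector is written. The sketch below assumes nothing about the real `generate_embedding`; `check_embedding` and `l2_normalize` are hypothetical helpers, and the tolerance value is an assumption.

```python
import math
from typing import List, Sequence

EXPECTED_DIM = 1024  # matches the pgvector column dimension in the table above


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale `vec` to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def check_embedding(vec: Sequence[float], tol: float = 1e-3) -> None:
    """Raise ValueError if `vec` has the wrong dimension or is not unit-length."""
    if len(vec) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dims, got {len(vec)}")
    norm = math.sqrt(sum(x * x for x in vec))
    if abs(norm - 1.0) > tol:
        raise ValueError(f"embedding is not unit-length (norm={norm:.4f})")


# Example: an unnormalized vector is normalized, then passes the check.
raw = [2.0] * EXPECTED_DIM
check_embedding(l2_normalize(raw))
print("embedding accepted")
```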

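### 3.2 Risk Warnings

🔴 **HIGH risk 1: normalize is not enforced**
- The bge-m3 server-side default is normalize=True, but no contract in code pins it
- **Guardrail**: record an `embedding_signature` (hash of model + normalize + dim) on every ai_insights write

🟡 **MED risk 2: `bge-m3:latest` floating tag**
- `:latest` can move to a new version on any Ollama upgrade, so **RAG recall can degrade silently**
- **Guardrail**: pin to a specific digest or a fixed tag

🟢 **LOW risk 3: dim=1024 consistency**
- Both the code and the schema lock 1024; no conflict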
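
The `embedding_signature` guardrail (a hash over model + normalize + dim) could look like the sketch below. The field set follows the guardrail description. The JSON encoding, the 16-character truncation, and the placeholder tag `bge-m3:<pinned-tag>` are assumptions made for illustration.

```python
import hashlib
import json


def embedding_signature(model: str, normalize: bool, dim: int) -> str:
    """Return a short, stable hash of the settings that define an embedding space."""
    payload = json.dumps(
        {"model": model, "normalize": normalize, "dim": dim},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Two rows are comparable only if their signatures match.
sig_floating = embedding_signature("bge-m3:latest", True, 1024)
sig_pinned = embedding_signature("bge-m3:<pinned-tag>", True, 1024)  # placeholder tag
print(sig_floating != sig_pinned)
```

A signature built from the tag name alone cannot detect a floating tag that silently points to new weights, which is why the guardrail for risk 2 calls for a digest or fixed tag as the `model` input.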

### 3.3 ai_insights.embedding Statistics (**pending confirmation via SSH to 188**)

```sql
SELECT
  COUNT(*) AS total,
  COUNT(embedding) AS with_embedding,
  COUNT(*) - COUNT(embedding) AS missing,
  MIN(created_at) FILTER (WHERE embedding IS NOT NULL) AS earliest,
  MAX(created_at) FILTER (WHERE embedding IS NOT NULL) AS latest,
  COUNT(DISTINCT array_length(embedding::real[], 1)) AS distinct_dims
FROM ai_insights;
```

> **These statistics are needed before Phase 11 starts**

### 3.4 Embedding Worker Liveness Check (**pending SSH to 188**)

```bash
docker logs momo-scheduler 2>&1 | grep "OCLearn"
```

If the worker has died, new ai_insights rows keep accumulating with `embedding IS NULL`, and RAG recall degrades without any alert.
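
One way to close the "no alert" gap is to turn the `total` and `with_embedding` counts from the Section 3.3 query into a threshold check. This is a sketch only: the 5% threshold and the names `EmbeddingStats` and `should_alert` are hypothetical, and fetching the counts from the database is left out.

```python
from dataclasses import dataclass


@dataclass
class EmbeddingStats:
    total: int           # COUNT(*) from ai_insights
    with_embedding: int  # COUNT(embedding) from ai_insights

    @property
    def missing(self) -> int:
        return self.total - self.with_embedding


def should_alert(stats: EmbeddingStats, max_missing_ratio: float = 0.05) -> bool:
    """Return True when the share of rows without an embedding exceeds the threshold."""
    if stats.total == 0:
        return False  # nothing to embed yet, so nothing to alert on
    return stats.missing / stats.total > max_missing_ratio


# Example counts: 100 of 1000 rows lack an embedding, which is above 5%.
print(should_alert(EmbeddingStats(total=1000, with_embedding=900)))
```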

---

## Section 4: A2 Traffic-Light Verdicts on Alternatives

| Task | Verdict | Tactic |
|------|------|------|
| OpenClaw Q&A: Gemini → Qwen | 🟡 Yellow | qwen3:14b + prompt forcing Traditional Chinese + Gemini fallback chain + **golden test set A/B run is mandatory** |
| Nemotron: NIM → DeepSeek-R1 | 🔴 Red | **Use qwen3:14b instead** (DeepSeek-R1 tool_calls support in Ollama is nominal only; GitHub Issue #10935 is unresolved) |
| Phase 10 Search API | 🟢 Green | Tavily primary (1000 free/month) + Exa backup (1000 free); monthly cost $0; **avoid Brave** (free tier removed on 2026-02-12) |

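### 4.1 Three Major Warnings

1. **Qwen's weakness in Traditional Chinese has academic support** (TMMLU+ paper): the golden-set A/B run is mandatory
2. **DeepSeek-R1 support in Ollama is nominal only**: the tools capability is officially listed, but the chat template lacks the corresponding jinja
3. **Major Brave policy change**: since 2026-02-12 new users must attach a credit card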
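
The golden-set A/B run can be framed as scoring each candidate model against the same list of checked prompts. The harness below is a sketch under stated assumptions: the two models are stubs, the two cases are illustrative, and a real golden set and real model calls would replace them.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]
GoldenCase = Tuple[str, Callable[[str], bool]]  # (prompt, check applied to the answer)


def pass_rate(model: Model, cases: List[GoldenCase]) -> float:
    """Fraction of golden cases whose answer passes its check."""
    if not cases:
        raise ValueError("golden set is empty")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def ab_compare(models: Dict[str, Model], cases: List[GoldenCase]) -> Dict[str, float]:
    """Score every candidate model on the same golden set."""
    return {name: pass_rate(model, cases) for name, model in models.items()}


# Illustrative cases: the answer must contain the Traditional Chinese form.
golden_set: List[GoldenCase] = [
    ("How do you say 'price'?", lambda answer: "價格" in answer),
    ("How do you say 'product'?", lambda answer: "商品" in answer),
]


def model_a(prompt: str) -> str:  # stub that always answers in Traditional Chinese
    return "價格" if "price" in prompt else "商品"


def model_b(prompt: str) -> str:  # stub that slips into Simplified Chinese once
    return "价格" if "price" in prompt else "商品"


scores = ab_compare({"model_a": model_a, "model_b": model_b}, golden_set)
print(scores)
```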

---

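## Section 5: Recommendations for the Commander's Decision

### 5.1 Top 5 Priority Integration Points for the Phase 1 LLM Logger

| Priority | Call site | Rationale |
|-----|--------|------|
| **#1** | NemoTron dispatch (#13) | NIM hard cap of 80 calls/day + structured output; quota management is a hard requirement |
| **#2** | OpenClaw reports and meta self-review (#8/#9/#10/#11, four merged) | Main Gemini consumer; needs a full prompt + output + token trace |
| **#3** | Hermes competitive pricing analysis (#1) | Runs every 4h over ~300 products each time; needed to trace back why SKUs were missed |
| **#4** | Code Review three chains (#14/#15/#16) | ElephantAlpha 49B cost is significant and needs tracking |
| **#5** | OpenClaw Bot Q&A three-tier fallback (#29/#30/#31) | Front line of the Telegram user experience |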
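
The 80 calls/day cap behind priority #1 reduces to a per-day counter. A minimal in-memory sketch follows. The class name `DailyQuota` is hypothetical, and a real version would need to persist counts, for example in the planned `ai_calls` table, so that they survive restarts and are shared between processes.

```python
import datetime as dt
from collections import Counter


class DailyQuota:
    """Track calls per calendar day against a fixed daily limit."""

    def __init__(self, limit: int = 80) -> None:
        self.limit = limit
        self._used: Counter = Counter()

    def try_consume(self, day: dt.date) -> bool:
        """Record one call for `day`; return False if the limit is already reached."""
        if self._used[day] >= self.limit:
            return False
        self._used[day] += 1
        return True

    def remaining(self, day: dt.date) -> int:
        return self.limit - self._used[day]


quota = DailyQuota(limit=80)
today = dt.date(2026, 5, 3)
for _ in range(80):
    quota.try_consume(today)
print(quota.remaining(today), quota.try_consume(today))
```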

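### 5.2 Proposed Unified Interface

```python
# Proposed interface only; the argument values are examples.
@llm_call_logger(provider="nim", model="meta/llama-3.1-8b-instruct", callsite="nemotron_dispatch")
def some_llm_call(prompt: str) -> str:
    # Captured automatically: prompt/output/tokens_in/tokens_out/duration/host/error/cost
    # Dual-write: ai_calls table + structured log
    ...
```

AiderHeal (#5) is not wired into the logger for now (it runs a CLI over SSH, outside the Python process).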
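
A decorator with this shape can be sketched in a few lines. The version below is an illustration, not the planned implementation: it appends records to an in-memory list in place of the `ai_calls` table, assumes the prompt is the first positional argument, and leaves out token counts, host, and cost because those depend on each provider's response format.

```python
import functools
import time
from typing import Any, Callable, Dict, List

CALL_LOG: List[Dict[str, Any]] = []  # in-memory stand-in for the ai_calls table


def llm_call_logger(provider: str, model: str, callsite: str) -> Callable:
    """Decorator factory that records one entry per wrapped LLM call."""

    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> str:
            record: Dict[str, Any] = {
                "provider": provider,
                "model": model,
                "callsite": callsite,
                "prompt": args[0] if args else None,  # assumption: prompt comes first
                "output": None,
                "error": None,
            }
            start = time.perf_counter()
            try:
                record["output"] = fn(*args, **kwargs)
                return record["output"]
            except Exception as exc:
                record["error"] = repr(exc)
                raise  # logging must never swallow the caller's exception
            finally:
                record["duration_s"] = time.perf_counter() - start
                CALL_LOG.append(record)

        return wrapper

    return decorator


@llm_call_logger(provider="nim", model="meta/llama-3.1-8b-instruct", callsite="nemotron_dispatch")
def dispatch(prompt: str) -> str:
    return "stub answer for: " + prompt  # stands in for the real NIM request


dispatch("hello")
print(CALL_LOG[-1]["callsite"], CALL_LOG[-1]["error"])
```

Writing the record in `finally` means failed calls are logged too, which is what quota accounting and error tracing both need.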

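### 5.3 Phase 11 RAG Consistency Guardrails (must be done before Phase 11 starts)

1. **Lock the bge-m3 model signature**: pin the digest and add an `embedding_signature` column to ai_insights
2. **Confirm the embedding worker is alive**: SSH to 188 and verify that the retry queue worker is actually running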

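### 5.4 Campaign-Level Risk Disclosure (v5.1 revision)

🔴 **New Phase 2 fixes**:
- AiderHeal `services/aider_heal_executor.py:48` is hard-coded to 111 → switch to resolve_ollama_host
- Code Review Hermes `services/code_review_pipeline_service.py:218` is hard-coded to 111 → same fix

🟡 **New Phase 3 watch items**:
- PPT NIM uses deepseek-v3.2, which is not in ELEPHANT_FALLBACK_MODELS → the two NIM chains use different models, so quota usage is easy to undercount
- OllamaService global singleton + monkey-patch race risk (gunicorn with multiple workers)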

---

## Appendix: Key File Paths

```
services/ollama_service.py
services/hermes_analyst_service.py
services/openclaw_strategist_service.py
services/openclaw_learning_service.py
services/mcp_collector_service.py
services/nemoton_dispatcher_service.py
services/elephant_service.py
services/elephant_alpha_autonomous_engine.py
services/elephant_alpha_orchestrator.py
services/code_review_pipeline_service.py
services/aider_heal_executor.py
services/ai_history_service.py
services/telegram_bot_service.py
services/trend_crawler_service.py
services/ai_provider.py
routes/openclaw_bot_routes.py
routes/ai_routes.py
routes/trend_routes.py
routes/bot_api_routes.py
scheduler.py
run_scheduler.py
migrations/009_pgvector_embedding.sql
migrations/011_embedding_retry_queue.sql
```

---

## Sources (A2 web research)

- [Qwen3 Technical Report — arXiv](https://arxiv.org/pdf/2505.09388)
- [Ollama qwen3 registry](https://ollama.com/library/qwen3)
- [TMMLU+ Traditional Chinese Eval — arXiv](https://arxiv.org/html/2403.01858v1)
- [DeepSeek-R1-0528 Release Notes](https://api-docs.deepseek.com/news/news250528)
- [Ollama Issue #10935 — R1 missing tool calling](https://github.com/ollama/ollama/issues/10935)
- [Tavily Pricing](https://www.tavily.com/pricing)
- [Brave Free Tier Removal](https://www.implicator.ai/brave-drops-free-search-api-tier-puts-all-developers-on-metered-billing/)
- [Exa API Pricing](https://exa.ai/pricing)