# Phase 0 Discovery Report: Operation Ollama-First v5.0
> **Date**: 2026-05-03
> **Produced by**: A1 onboarder (LLM/MCP audit) + A2 web-researcher (alternatives verification)
> **Status**: Phase 0 complete; serves as the factual baseline for Phase 1+
---
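## TL;DR: Three must-read conclusions
1. **At least 34 LLM call sites were measured** (the campaign list originally had 26; this audit adds 8 that were missed). AIGenerationHistory coverage is only **11.8%** (4/34); the remaining 88% have no structured record at all.
2. **A2's three traffic-light verdicts**: Tavily+Exa 🟢 / Qwen replacement 🟡 / DeepSeek-R1 🔴 (use qwen3:14b instead)
3. **Four P0 risks**: AiderHeal hardcoded to 111, Code Review Hermes hardcoded to 111, bge-m3 `:latest` tag drift, OllamaService multi-worker race condition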
---
## Section 1: Full inventory of LLM call sites (34)
### 1.1 Host labeling rules
| Label | Definition |
|---|---|
| `gcp_ollama` | Defaults to GCP (34.21.145.224:11434); on failure, automatically falls back to `111_ollama` |
| `ollama_111` | Hardcoded to `192.168.0.111:11434` (e.g. AiderHeal, Code Review Hermes) |
| `gemini` | `google.generativeai` SDK |
| `nim` | NVIDIA NIM `https://integrate.api.nvidia.com/v1` |
| `nim_via_elephant` | `services/elephant_service.py` calling the NIM endpoint |
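
The `gcp_ollama` fallback rule in the table above can be sketched as follows. This is a minimal illustration, not the project's actual `resolve_ollama_host()`: the function name `pick_ollama_host`, the `is_alive` probe parameter, and the constant names are all hypothetical; only the two addresses come from the table.

```python
from typing import Callable, Sequence

# Addresses taken from the table above; the constant names are hypothetical.
GCP_OLLAMA = "34.21.145.224:11434"
OLLAMA_111 = "192.168.0.111:11434"


def pick_ollama_host(
    is_alive: Callable[[str], bool],
    candidates: Sequence[str] = (GCP_OLLAMA, OLLAMA_111),
) -> str:
    """Return the first candidate that passes the health probe, in priority order."""
    for host in candidates:
        if is_alive(host):
            return host
    raise RuntimeError("no Ollama host reachable: " + ", ".join(candidates))


# Example: simulate GCP being down so the call falls back to 111.
chosen = pick_ollama_host(lambda host: host != GCP_OLLAMA)
print(chosen)
```

Passing the probe as a parameter keeps the priority logic testable without network access; a call site labeled `ollama_111` bypasses this logic entirely, which is why the report flags it.
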
### 1.2 Full call-site table
| ID | Feature | file:line | Model | Host | Cron trigger | History? |
|----|------|-----------|------|------|-----------|----------|
| 1 | Hermes competitive pricing analysis (batch threats) | `services/hermes_analyst_service.py:411-426` | `hermes3:latest` (keep_alive 24h) | gcp_ollama → 111 | Every 4h | ❌ |
| 2 | Hermes L1 intent classification (Telegram NLP) | `services/hermes_analyst_service.py:151-167` | `hermes3:latest` | gcp_ollama → 111 | Event-driven | ❌ |
| 3 | KM Embedding (worker queue) | `services/openclaw_learning_service.py:111` + `services/ollama_service.py:592-639` | `bge-m3:latest` | EMBEDDING_HOST → resolve | Polled every 60s | ❌ |
| 4 | KM Embedding (real-time RAG query) | `services/openclaw_learning_service.py:399` | `bge-m3:latest` | Same as above | Event-driven | ❌ |
| 5 | **AiderHeal Code Repair** ⚠️ | `services/aider_heal_executor.py:48-49` | `qwen2.5-coder:7b` | **Hardcoded to 111** (violates ADR-027) | Triggered by Code Review | ❌ |
| 6 | MCP L1/L2 Gemini Grounding | `services/mcp_collector_service.py:163-167, 185-186` | `gemini-2.0-flash` → `gemini-1.5-flash` | gemini | 6 topics / 24h | ❌ |
| 7 | MCP L3 Ollama Fallback | `services/mcp_collector_service.py:205-214` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Only after both Gemini attempts fail | ❌ |
| 8 | OpenClaw daily report | `services/openclaw_strategist_service.py:1093` → `_call_gemini` (L668) → `_call_nvidia_nim` (L694) | `gemini-2.5-flash` → `meta/llama-3.3-70b-instruct` | gemini → nim | Daily 09:00 | ❌ |
| 9 | OpenClaw weekly report | `services/openclaw_strategist_service.py:759` | Same as above | Same as above | Mondays 06:00 | ❌ |
| 10 | OpenClaw monthly report | `services/openclaw_strategist_service.py:1267` | Same as above | Same as above | 1st of each month, 07:00 | ❌ |
| 11 | OpenClaw Meta self-review | `services/openclaw_strategist_service.py:1503` | Same as above | Same as above | Every 6h | ❌ |
| 12 | OpenClaw Q&A (Telegram NLP) | `services/openclaw_strategist_service.py:56` | Same as above | gemini → nim | Event-driven | ❌ |
| 13 | **NemoTron action dispatch** | `services/nemoton_dispatcher_service.py:101-102` | `meta/llama-3.1-8b-instruct` | nim (80 calls/day quota) | Every 4h | ❌ |
| 14 | **Code Review Hermes scan** ⚠️ | `services/code_review_pipeline_service.py:218-225` | `hermes3:latest` | **Hardcoded HERMES_URL (111)** | CD deploy | ❌ |
| 15 | Code Review OpenClaw evaluation | `services/code_review_pipeline_service.py:278-286` | `gemini-2.5-flash` | gemini | CD deploy | ❌ |
| 16 | Code Review ElephantAlpha fallback | `services/code_review_pipeline_service.py:293-299` → `services/elephant_service.py:24-30` | `nvidia/llama-3.3-nemotron-super-49b-v1.5` (chain) | nim | CD deploy | ❌ |
| 17 | EA Autonomous Engine | `services/elephant_alpha_autonomous_engine.py:540` | ElephantService | nim | daemon thread | ❌ |
| 18 | EA HITL pre-fetch (Hermes pre-run) | `services/elephant_alpha_orchestrator.py` (line TBD) | `hermes3:latest` | gcp_ollama → 111 | EA escalation event | ❌ |
| 19 | PPT Gemini analysis | `routes/openclaw_bot_routes.py:2464-2477` `_call_gemini` | `gemini-2.0-flash` | gemini | Telegram command | ❌ |
| 20 | PPT Ollama Fallback | `routes/openclaw_bot_routes.py:2479-2500` | `qwen2.5-coder:7b` | gcp_ollama → 111 | Primary path failure | ❌ |
| 21 | **PPT NIM (deepseek-v3.2)** ⚠️ | `routes/openclaw_bot_routes.py:2513-2528` | `deepseek-ai/deepseek-v3.2` (not in the ELEPHANT_FALLBACK list) | nim | Same as above | ❌ |
| 22 | Sales Copy | `routes/ai_routes.py:650` + `services/ollama_service.py:219-308` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 23 | Trend product matching | `routes/ai_routes.py:503` | `llama3.1:8b` | gcp_ollama → 111 | HTTP API | ✅ |
| 24 | Trend Web Search Q&A | `routes/trend_routes.py:293-294` + `routes/ai_routes.py:1129` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | Partial ✅ |
| 25 | Product Insights | `routes/ai_routes.py:1219` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 26 | Trend Keywords | `routes/ai_routes.py:1307` | `llama3.1:8b` | gcp_ollama → 111 | HTTP | ✅ |
| 27 | Telegram Bot `/copy` | `services/telegram_bot_service.py:347-362` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 28 | Telegram Bot, second call site | `services/telegram_bot_service.py:1204-1206` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 29 | OpenClaw Bot Q&A primary chain (Ollama) | `routes/openclaw_bot_routes.py:6784-6824` | `llama3.1:8b` | gcp_ollama → 111 | Telegram | ❌ |
| 30 | OpenClaw Bot Q&A backup (Gemini) | `routes/openclaw_bot_routes.py:~6843+` | `gemini-2.0-flash` | gemini | fallback | ❌ |
| 31 | OpenClaw Bot Q&A backup (NIM) | `routes/openclaw_bot_routes.py` | `deepseek-ai/deepseek-v3.2` | nim | fallback | ❌ |
| 32 | bot_api_routes copywriting | `routes/bot_api_routes.py:673-693` | `llama3.1:8b` | gcp_ollama → 111 | Internal HTTP | ❌ |
| 33 | trend_crawler_service Ollama | `services/trend_crawler_service.py:35` | `llama3.1:8b` | gcp_ollama → 111 | Trend crawler flow | ❌ |
| 34 | ai_provider abstraction layer | `services/ai_provider.py:74` | `llama3.1:8b` | gcp_ollama → 111 | Triggered by the caller | ❌ |
### 1.3 The 8 call sites missing from the campaign list
- #27/#28: two sites in `telegram_bot_service.py`
- #32 `routes/bot_api_routes.py:673`
- #33 `services/trend_crawler_service.py:35`
- #34 `services/ai_provider.py:74`
- #17 EA Engine and #18 EA HITL pre-fetch are two independent chains
- The Code Review pipeline actually **calls three independent LLMs: Hermes (#14) + Gemini (#15) + ElephantAlpha (#16)**
### 1.4 AIGenerationHistory coverage
- Only 4 sites, all in `routes/ai_routes.py` (L361/1163/1252/1339)
- **Coverage 4/34 ≈ 11.8%**
- Phase 1 must create a unified `ai_calls` table and wire in the remaining 30 call sites
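
The coverage figure can be reproduced from the call-site table. The sketch below assumes that the four rows carrying a full ✅ in the History? column (#22, #23, #25, #26) are the four logged sites and that the partially covered #24 counts as not logged; that mapping is an interpretation of the table, not something the report states.

```python
# Call-site IDs assumed to be fully logged (full ✅ in the History? column).
logged_ids = {22, 23, 25, 26}
all_ids = set(range(1, 35))  # call sites #1..#34

coverage = len(logged_ids) / len(all_ids)
remaining = sorted(all_ids - logged_ids)

print(f"coverage = {coverage:.1%}")  # coverage = 11.8%
print(f"call sites still to wire in: {len(remaining)}")  # 30
```
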
---
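## Section 2: Traffic-light ratings for 13 MCP servers
| # | MCP Server | Rating | Assessment |
|---|-----------|--------|------|
| 1 | mcp-omnisearch (Tavily/Exa) | 🟢 Adopt now | Replaces the single-point dependency on Gemini Grounding |
| 2 | firecrawl-mcp (self-hosted) | 🟢 Adopt now | Hardens crawling against SPAs and anti-bot measures; **mem_limit:2g + chrome-reaper are mandatory** |
| 3 | postgres-mcp | 🟢 Adopt now | RBAC restricted to SELECT on hot tables such as ai_insights/daily_sales/competitor_prices |
| 4 | playwright-mcp | 🟡 Evaluate first | Overlaps with firecrawl; pick one |
| 5 | memory-mcp (Anthropic KG) | 🔴 Reject | Violates ADR-002 (pgvector only) |
| 6 | fetch-mcp | 🟡 Evaluate first | Plain HTTP; a one-line `requests.get` is enough |
| 7 | sequential-thinking-mcp | 🟡 Evaluate first | Re-evaluate once Phase 11 RAG is complete |
| 8 | filesystem-mcp | 🟢 Adopt now | Development efficiency across 188/110/MacBook |
| 9 | git-mcp | 🟢 Adopt now | momo uses Gitea, so choose git-mcp (github-mcp does not apply) |
| 10 | time-mcp | 🟡 Evaluate first | TAIPEI_TZ handling already exists; low priority |
| 11 | sentry-mcp | 🔴 Reject | momo does not use Sentry; it relies on the existing ADR-013 AutoHeal closed loop |
| 12 | slack-mcp | 🔴 Reject | The Commander uses Telegram |
| 13 | gdrive-mcp | 🟡 Evaluate first | Consider once PPT v3 is stable |
### 2.1 Phase 10 adoption order (5 🟢)
1. **postgres-mcp** (highest ROI: the Commander runs SQL queries every day)
2. **mcp-omnisearch** (Tavily primary + Exa backup; Tavily offers 1000 free calls/month; avoid Brave)
3. **filesystem-mcp** (cross-host development efficiency)
4. **firecrawl-mcp** (crawler resilience)
5. **git-mcp** (Gitea compatible)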
---
## Section 3: BGE-M3 consistency status report
### 3.1 Model parameter inventory
| Item | Actual state |
|------|------|
| Main call site | `services/ollama_service.py:592-639` `generate_embedding` |
| Default model | `bge-m3:latest` (floating tag: **risk**) |
| API endpoint | Primary: `POST /api/embed`; fallback: `POST /api/embeddings` |
| Host resolution | `host` argument > `EMBEDDING_HOST` env > `resolve_ollama_host()` |
| Timeout | env `OLLAMA_EMBED_TIMEOUT` / `EMBEDDING_TIMEOUT`, default 45s |
| **normalize parameter** | ❌ **Not passed explicitly** (relies on the server-side default) |
| **Pooling strategy** | ❌ **Not passed explicitly** (relies on the server-side default, mean) |
| Dimension | 1024 (locked by the pgvector column) |
| HNSW index | `vector_cosine_ops` (cosine distance) |
### 3.2 Risk warnings
🔴 **HIGH risk 1: normalize is not enforced**
- The bge-m3 server-side default is normalize=True, but no contract in code locks it in
- **Guardrail**: record an `embedding_signature` (hash of model+normalize+dim) whenever a row is written to ai_insights
🟡 **MED risk 2: `bge-m3:latest` floating tag**
- `:latest` can move to a different version on any Ollama upgrade, and **RAG recall would silently degrade**
- **Guardrail**: pin to a specific digest or a fixed tag
🟢 **LOW risk 3: dim=1024 consistency**
- Code and schema both lock 1024; no conflict
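
The `embedding_signature` guardrail from risk 1 could look roughly like the sketch below. It is only one possible shape: the choice of SHA-256 over canonical JSON, the function name, and the field order are assumptions made for illustration, and the real column may well be computed differently (for example inside the database).

```python
import hashlib
import json


def embedding_signature(model: str, normalize: bool, dim: int) -> str:
    """Hash the settings that must match for two embeddings to be comparable."""
    # Canonical JSON (sorted keys, no whitespace) keeps the hash stable across callers.
    payload = json.dumps(
        {"dim": dim, "model": model, "normalize": normalize},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


current = embedding_signature("bge-m3:latest", True, 1024)
# A row written under different settings produces a different signature,
# so rows with mixed signatures can be detected before they affect RAG recall.
drifted = embedding_signature("bge-m3:latest", False, 1024)
print(current != drifted)  # True
```

Note that hashing the tag `bge-m3:latest` does not detect a silent version change behind the tag; that is why risk 2 separately calls for pinning a digest or fixed tag.
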
### 3.3 ai_insights.embedding statistics (**pending confirmation via SSH to 188**)
```sql
SELECT
    COUNT(*) AS total,
    COUNT(embedding) AS with_embedding,
    COUNT(*) - COUNT(embedding) AS missing,
    MIN(created_at) FILTER (WHERE embedding IS NOT NULL) AS earliest,
    MAX(created_at) FILTER (WHERE embedding IS NOT NULL) AS latest,
    COUNT(DISTINCT array_length(embedding::real[], 1)) AS distinct_dims
FROM ai_insights;
```
> **These statistics are needed before Phase 11 kicks off**
### 3.4 Embedding worker liveness check (**pending SSH to 188**)
```bash
docker logs momo-scheduler 2>&1 | grep "OCLearn"
```
If the worker has died, new ai_insights rows keep accumulating with `embedding IS NULL`, and RAG recall degrades without any alert.
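
One way to close the "no alert" gap is to turn the `total` and `with_embedding` counts from the 3.3 query into a threshold check. The sketch below is hypothetical throughout: the class and function names, the 5% threshold, and the sample numbers are illustrative assumptions, not measured values from the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingStats:
    total: int           # COUNT(*) from ai_insights
    with_embedding: int  # COUNT(embedding)

    @property
    def missing(self) -> int:
        return self.total - self.with_embedding


def backlog_alert(stats: EmbeddingStats, max_missing_ratio: float = 0.05) -> bool:
    """Return True when the share of rows without an embedding exceeds the threshold."""
    if stats.total == 0:
        return False  # nothing to embed yet, so nothing to alert on
    return stats.missing / stats.total > max_missing_ratio


# Made-up inputs for illustration only.
print(backlog_alert(EmbeddingStats(total=1000, with_embedding=990)))  # False (1% missing)
print(backlog_alert(EmbeddingStats(total=1000, with_embedding=800)))  # True (20% missing)
```
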
---
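## Section 4: A2 alternatives verification ratings
| Task | Verdict | Tactic |
|------|------|------|
| OpenClaw Q&A: Gemini → Qwen | 🟡 Yellow | qwen3:14b + a prompt that forces Traditional Chinese + Gemini fallback chain + **golden test set A/B is mandatory** |
| Nemotron: NIM → DeepSeek-R1 | 🔴 Red | **Use qwen3:14b instead** (DeepSeek-R1's tool_calls support in Ollama is nominal only; GitHub Issue #10935 is unresolved) |
| Phase 10 Search API | 🟢 Green | Tavily primary (1000 free/month) + Exa backup (1000 free/month), cost $0; **avoid Brave** (free tier removed 2026-02-12) |
### 4.1 Three major warnings
1. **Qwen's weakness in Traditional Chinese has academic backing** (the TMMLU+ paper): the golden-set A/B must be run
2. **DeepSeek-R1 on Ollama is "fake support"**: it is officially tagged with the tools capability, but the chat template lacks the corresponding jinja
3. **Major Brave policy change**: after 2026-02-12, new users must attach a credit card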
---
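## Section 5: Recommendations for the Commander's decision
### 5.1 Top 5 priority integration points for the Phase 1 LLM logger
| Priority | Call site | Rationale |
|-----|--------|------|
| **#1** | NemoTron dispatch (#13) | NIM hard cap of 80 calls/day + structured output; quota management is essential |
| **#2** | OpenClaw three major reports (#8/#9/#10/#11, 4 merged) | Gemini is the workhorse; needs a full prompt+output+token trace |
| **#3** | Hermes competitive pricing analysis (#1) | Runs every 4h with ~300 products per run; needed to trace back why SKUs were missed |
| **#4** | Code Review three chains (#14/#15/#16) | ElephantAlpha 49B cost is significant and needs tracking |
| **#5** | OpenClaw Bot Q&A three-tier fallback (#29/#30/#31) | Front line of the Telegram user experience |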
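
The quota-management need behind priority #1 can be illustrated with a small per-day counter. This is a sketch under stated assumptions: `DailyQuota` and its methods are hypothetical names, and the counter lives in process memory, so a real deployment with several gunicorn workers would need shared storage instead.

```python
from collections import Counter
from datetime import date


class DailyQuota:
    """Count calls per calendar day and refuse calls beyond a hard cap."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._used = Counter()  # maps a date to the number of calls made that day

    def try_acquire(self, today: date) -> bool:
        """Reserve one call for `today`; return False once the cap is reached."""
        if self._used[today] >= self.limit:
            return False
        self._used[today] += 1
        return True

    def remaining(self, today: date) -> int:
        return self.limit - self._used[today]


nim_quota = DailyQuota(limit=80)  # 80 calls/day is the NIM cap cited in the table
day = date(2026, 5, 3)
granted = sum(nim_quota.try_acquire(day) for _ in range(85))
print(granted, nim_quota.remaining(day))  # 80 0
```
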
### 5.2 Proposed unified interface
```python
@llm_call_logger(provider, model, callsite)
def some_llm_call(*args, **kwargs):
    # Captured automatically: prompt/output/tokens_in/tokens_out/duration/host/error/cost
    # Dual-write: ai_calls + structured log
    ...
```
AiderHeal (#5) is not wired into the logger for now (it runs a CLI over SSH, outside the Python process).
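
A decorator with the proposed `llm_call_logger(provider, model, callsite)` signature might be structured as below. This is a minimal sketch, not the planned implementation: the in-memory `CALL_LOG` list is a hypothetical stand-in for the `ai_calls` table and the structured log, `fake_dispatch` is a dummy call, and token and cost capture are omitted because they depend on each provider's response format.

```python
import functools
import time
from typing import Any, Callable

# Hypothetical in-memory sink standing in for the ai_calls table and structured log.
CALL_LOG: list[dict[str, Any]] = []


def llm_call_logger(provider: str, model: str, callsite: str) -> Callable:
    """Decorator factory that records one entry per wrapped LLM call."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            record: dict[str, Any] = {
                "provider": provider,
                "model": model,
                "callsite": callsite,
                "output": None,
                "error": None,
            }
            start = time.perf_counter()
            try:
                record["output"] = func(*args, **kwargs)
                return record["output"]
            except Exception as exc:
                record["error"] = repr(exc)
                raise  # logging must never swallow the caller's exception
            finally:
                record["duration_s"] = time.perf_counter() - start
                CALL_LOG.append(record)

        return wrapper

    return decorator


@llm_call_logger("nim", "meta/llama-3.1-8b-instruct", "nemotron_dispatch")
def fake_dispatch(prompt: str) -> str:
    return f"echo: {prompt}"  # stand-in for a real NIM request


fake_dispatch("hello")
print(CALL_LOG[-1]["callsite"], CALL_LOG[-1]["error"])  # nemotron_dispatch None
```

Writing the record in `finally` means failed calls are logged too, which matters for the fallback chains in the call-site table where an error is what triggers the next provider.
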
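### 5.3 Phase 11 RAG consistency guardrails (must be finished before Phase 11 starts)
1. **Lock the bge-m3 model signature**: pin the digest and add an `embedding_signature` column to ai_insights
2. **Confirm the embedding worker is alive**: SSH to 188 and verify that the retry queue worker is actually running
### 5.4 Campaign-level risk disclosure (v5.1 revision)
🔴 **New Phase 2 fixes**
- AiderHeal `services/aider_heal_executor.py:48` hardcodes 111 → switch to resolve_ollama_host
- Code Review Hermes `services/code_review_pipeline_service.py:218` hardcodes 111 → same fix
🟡 **New Phase 3 watch items**
- PPT NIM uses deepseek-v3.2, which is not in ELEPHANT_FALLBACK_MODELS → the two NIM chains use different models, so quota is easy to undercount
- OllamaService global singleton + monkey-patch race risk (gunicorn multi-worker)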
---
## Appendix: Key file paths
```
services/ollama_service.py
services/hermes_analyst_service.py
services/openclaw_strategist_service.py
services/openclaw_learning_service.py
services/mcp_collector_service.py
services/nemoton_dispatcher_service.py
services/elephant_service.py
services/elephant_alpha_autonomous_engine.py
services/elephant_alpha_orchestrator.py
services/code_review_pipeline_service.py
services/aider_heal_executor.py
services/ai_history_service.py
services/telegram_bot_service.py
services/trend_crawler_service.py
services/ai_provider.py
routes/openclaw_bot_routes.py
routes/ai_routes.py
routes/trend_routes.py
routes/bot_api_routes.py
scheduler.py
run_scheduler.py
migrations/009_pgvector_embedding.sql
migrations/011_embedding_retry_queue.sql
```
---
## Sources (A2 web research)
- [Qwen3 Technical Report — arXiv](https://arxiv.org/pdf/2505.09388)
- [Ollama qwen3 registry](https://ollama.com/library/qwen3)
- [TMMLU+ Traditional Chinese Eval — arXiv](https://arxiv.org/html/2403.01858v1)
- [DeepSeek-R1-0528 Release Notes](https://api-docs.deepseek.com/news/news250528)
- [Ollama Issue #10935 — R1 missing tool calling](https://github.com/ollama/ollama/issues/10935)
- [Tavily Pricing](https://www.tavily.com/pricing)
- [Brave Free Tier Removal](https://www.implicator.ai/brave-drops-free-search-api-tier-puts-all-developers-on-metered-billing/)
- [Exa API Pricing](https://exa.ai/pricing)