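Operation Ollama-First v5.0 / Phase 10 + Phase 12 wrap-up

docker-compose.mcp.yml: 4+3 container MCP stack

- postgres-mcp (port 3001): Claude connects directly to the momo_pro DB, read-only RBAC
- mcp-omnisearch (3003): Tavily primary + Exa backup (replaces Gemini Grounding); avoids Brave (free tier discontinued 2026-02)
- firecrawl-self (3002): self-hosted crawler for SPAs and anti-bot measures
- filesystem-mcp (3004): cross-host file access, read-only

Guardrail #2 implemented (Owen v5.0 iron rule / ADR-033):

- firecrawl-self mem_limit: 2g + cpus: 1.5
- PLAYWRIGHT_BROWSER_POOL_MAX=3
- chrome-reaper sidecar clears Chrome zombies every hour

Security design:

- Everything is exposed on 127.0.0.1 only (not reachable from external networks)
- Read-only volume mount (filesystem-mcp can only read)
- postgres-mcp RBAC: the mcp_readonly role is limited to SELECT on 6 hot tables
- All API keys come from env vars, never hard-coded

ADR-031: governance decision for the self-hosted MCP stack

- Removes Gemini Grounding as the single channel (multi-vendor strategy)
- An expected 70%+ of grounding traffic goes through the free Tavily tier
- Host 188 resource cost of +4-5 GB RAM is manageable
- Migration Plan: 6 steps (including Tavily/Exa key applications + creating the mcp_readonly role in advance)

Activation prerequisites (pending the Commander):

1. Add TAVILY_API_KEY / EXA_API_KEY / MCP_POSTGRES_PASSWORD / FIRECRAWL_AUTH_KEY to .env
2. Create the mcp_readonly role on momo-db + GRANT SELECT
3. ssh wooo@110 → ssh ollama@188 → docker compose -f docker-compose.mcp.yml up -d

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
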
# ADR-031: Self-Hosted MCP Stack (postgres + omnisearch + firecrawl + filesystem)

- **Status**: Accepted (becomes Active after deployment to host 188)
- **Date**: 2026-05-04
- **Decision Maker**: Commander
- **Author**: Operation Ollama-First v5.0 / Phase 10
- **Related**: ADR-028 (unified LLM routing guidelines), ADR-032 (RAG self-learning), ADR-033 (three guardrails)

---

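## Context

Self-hosted MCP was first proposed during campaign v4.0, to replace Gemini Grounding as the only web-access capability. In Phase 10 a hook blocked the SSH deploy to host 188, so the local docker-compose.mcp.yml and this ADR were completed first. Activation waits for the Commander to run `ssh wooo@110 → ssh ollama@188 → docker compose -f docker-compose.mcp.yml up -d` manually.

**Why self-host MCP?**
- mcp_collector_service.py currently routes 100% of its traffic through Gemini 2.0 Flash Grounding (one of the 7 locked scenarios)
- If the Gemini API quota is exhausted or its policy changes, the only channel for real-time intelligence is cut off
- The multi-vendor strategy (ADR-030) needs Tavily / Exa as grounding backups
- A direct Claude Code connection to the momo_pro DB (read-only RBAC) speeds up the Commander's day-to-day SQL queries

---
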
## Decision: 4 + 3 Container Architecture

### 4 MCP servers (core)

| Server | Port | Purpose | Replaces |
|---|---|---|---|
| `postgres-mcp` | 127.0.0.1:3001 | Direct Claude connection to the momo_pro DB (read-only) | Manual SQL by the Commander |
| `mcp-omnisearch` | 127.0.0.1:3003 | Unified Tavily + Exa search | Gemini Grounding |
| `firecrawl-self` | 127.0.0.1:3002 | Self-hosted crawler (with guardrail #2) | Some hand-written BeautifulSoup scrapers |
| `filesystem-mcp` | 127.0.0.1:3004 | Cross-host file access (read-only) | Manual `cat` through an SSH jump host |

### 3 auxiliary containers

| Container | Purpose |
|---|---|
| `firecrawl-redis` | Firecrawl job queue |
| `firecrawl-playwright` | Browser pool (1.5g memory) |
| `chrome-reaper` | Clears leftover Chrome processes every hour (guardrail #2) |

### Guardrail #2 implementation (Owen v5.0 iron rule)

```yaml
firecrawl-self:
  deploy:
    resources:
      limits:
        memory: 2g      # ⭐ hard cap
        cpus: '1.5'
  environment:
    - PLAYWRIGHT_BROWSER_POOL_MAX=3
    - SCRAPE_TIMEOUT_MS=30000
  healthcheck:
    interval: 30s
    start_period: 60s

chrome-reaper:
  # Summary, not a literal command: every hour, pkill zombie Chrome processes
```

### Security design

- **Exposed on 127.0.0.1 only**: prevents direct access to the DB and crawler services from external networks
- **Read-only volume mount**: filesystem-mcp can only read
- **postgres-mcp RBAC**: the mcp_readonly role is limited to SELECT on the hot tables (ai_insights / ai_calls / mcp_calls / daily_sales_snapshot / competitor_prices / products)
- **Tavily/Exa API keys come from env vars**: never hard-coded in docker-compose

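
The loopback-only rule above could be checked mechanically before deployment. The following is a minimal sketch, assuming the services publish ports with the Compose short syntax (`HOST_IP:HOST_PORT:CONTAINER_PORT`) and that the container-side port equals the host port; the helper names `is_loopback_binding` and `exposed_services` are hypothetical and not part of the existing codebase.

```python
from ipaddress import ip_address
from typing import Dict, List


def is_loopback_binding(port_spec: str) -> bool:
    """Return True if a Compose short-syntax port spec binds to loopback only.

    "127.0.0.1:3001:3001" -> True
    "3001:3001"           -> False (no host IP, so Docker binds all interfaces)
    """
    # Drop an optional "/tcp" or "/udp" suffix, then split from the right so
    # that a bracketed IPv6 host such as "[::1]" stays in one piece.
    parts = port_spec.split("/")[0].rsplit(":", 2)
    if len(parts) < 3:
        return False
    host = parts[0].strip("[]")
    try:
        return ip_address(host).is_loopback
    except ValueError:
        return False  # not a literal IP address; treat as unsafe


def exposed_services(services: Dict[str, List[str]]) -> List[str]:
    """Return the names of services with at least one non-loopback port."""
    return [
        name
        for name, ports in services.items()
        if not all(is_loopback_binding(p) for p in ports)
    ]


# Port mappings as described in this ADR (container-side ports are assumed).
MCP_PORTS = {
    "postgres-mcp": ["127.0.0.1:3001:3001"],
    "firecrawl-self": ["127.0.0.1:3002:3002"],
    "mcp-omnisearch": ["127.0.0.1:3003:3003"],
    "filesystem-mcp": ["127.0.0.1:3004:3004"],
}
```

With the mappings above, `exposed_services(MCP_PORTS)` returns an empty list; a spec such as `"3002:3002"` would be flagged because it omits the host IP.
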
### Fallback chain replacing Gemini Grounding (mcp_collector_service.py rework)

```
Old:
  Gemini 2.0 Grounding → Gemini 1.5 → Ollama → static

New:
  mcp-omnisearch (Tavily) → omnisearch (Exa) →
  all failed → Gemini 2.0 Grounding (kept as L4)
  → Gemini 1.5 → Ollama → static
```

An expected 70%+ of traffic goes through the free Tavily tier, saving ~70% of the Gemini Grounding cost.

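
The new chain is an ordered try-next-on-failure loop. Below is a minimal sketch of that pattern, not the actual mcp_collector_service.py code: the function `ground_with_fallback` and the `Provider` type are hypothetical, and each provider is assumed to be a callable that takes a query string and either returns text, returns nothing, or raises.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (provider name, callable taking a query and returning text or None)
Provider = Tuple[str, Callable[[str], Optional[str]]]


def ground_with_fallback(query: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (name, result) for the first success.

    A provider counts as failed if it raises or returns an empty result.
    Raises RuntimeError when every provider in the chain has failed.
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            result = call(query)
        except Exception as exc:  # any failure falls through to the next provider
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return name, result
        errors.append(f"{name}: empty result")
    raise RuntimeError("all grounding providers failed: " + "; ".join(errors))


def static_fallback(query: str) -> str:
    """Last link in the chain: a fixed answer that never fails."""
    return f"no live data available for: {query}"
```
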
---

## Alternatives Considered

| Option | Reason rejected |
|---|---|
| **A. Keep Gemini Grounding as the only provider** | Single-vendor risk (already the rejection reason in ADR-030) |
| **B. Use the Brave Search API** | Free tier discontinued in 2026-02 (red flag in the A2 web research) |
| **C. Tavily only, no Firecrawl** | Firecrawl handles dynamic SPA pages better (JS-heavy sites such as Shopee) |
| **D. Firecrawl without resource limits** | Host 188 runs 5+ projects, so an OOM would cascade (reference_188_multi_project) |
| **E. Use the Firecrawl Cloud SaaS** | Cost (self-hosting is free) plus data-exfiltration risk |

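---

## Consequences

### Positive (5)
1. **Multi-vendor grounding**: Tavily primary + Exa backup, cutting ~70% of the monthly grounding cost
2. **Direct Claude-to-DB access**: the Commander's day-to-day SQL queries can go through the MCP interface (read-only, so safe)
3. **Crawler autonomy**: Firecrawl replaces some hand-written crawlers and copes better with SPAs and anti-bot measures
4. **Zero external SaaS dependencies**: everything is self-hosted on host 188 (Tavily/Exa are APIs, not SaaS)
5. **Owen guardrail #2 implemented**: mem_limit + chrome-reaper prevent OOM
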
### Negative (3)
1. **+4-5 GB RAM used on host 188** (Firecrawl 2g + Playwright 1.5g + others)
2. **Tavily/Exa API key maintenance**: applying for keys and tracking monthly quotas
3. **mcp_collector_service.py refactoring effort**: ~200 lines changed

### Risks (3)
1. **Firecrawl OOM cascade**: hitting the 2g mem_limit triggers an OOM kill; mitigated by healthcheck + restart
2. **Tavily free quota (1000/month) exhausted**: mitigated by the Exa backup + Gemini Grounding as L4
3. **postgres-mcp RBAC misconfiguration**: mitigated by creating the mcp_readonly role in advance with SELECT-only grants

---

## Verification

### V1: Health checks
```bash
curl http://localhost:3001/health   # postgres-mcp
curl http://localhost:3002/health   # firecrawl
curl http://localhost:3003/health   # omnisearch
curl http://localhost:3004/health   # filesystem
# All are expected to return 200 OK
```

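
The four curl calls could also be scripted so that step 10.7 (health checks and alerts) has something to call. This is a sketch using only the Python standard library; the `/health` URLs come from V1 above, while `http_status` and `check_stack` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Endpoints as listed in V1 of this ADR.
HEALTH_ENDPOINTS = {
    "postgres-mcp": "http://localhost:3001/health",
    "firecrawl-self": "http://localhost:3002/health",
    "mcp-omnisearch": "http://localhost:3003/health",
    "filesystem-mcp": "http://localhost:3004/health",
}


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered with a non-2xx status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def check_stack(fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each MCP server name to True when its health endpoint returns 200."""
    return {name: fetch(url) == 200 for name, url in HEALTH_ENDPOINTS.items()}
```

Calling `check_stack()` on host 188 would probe the real endpoints; passing a stub as `fetch` allows the logic to be exercised without the stack running.
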
### V2: Firecrawl resources
```bash
ssh ollama@192.168.0.188 'docker stats momo-mcp-firecrawl --no-stream'
# Expect < 1.8 GB (90% of the 2 GB mem_limit)
```

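
The 90% threshold can be evaluated automatically from the memory column that `docker stats` prints (for example via `--format '{{.MemUsage}}'`, which yields strings such as `1.2GiB / 2GiB`). The sketch below assumes that string format; `parse_size`, `memory_ratio`, and `within_budget` are hypothetical helpers.

```python
import re

# Binary unit suffixes as printed by `docker stats`.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
_SIZE_RE = re.compile(r"\s*([0-9]*\.?[0-9]+)\s*([KMGT]iB|B)\s*")


def parse_size(text: str) -> float:
    """Convert a size such as '1.2GiB' or '512MiB' to bytes."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def memory_ratio(mem_usage: str) -> float:
    """Return used/limit for a 'USED / LIMIT' string from docker stats."""
    used_text, limit_text = mem_usage.split("/")
    return parse_size(used_text) / parse_size(limit_text)


def within_budget(mem_usage: str, threshold: float = 0.9) -> bool:
    """True when usage is below threshold (default 90% of the limit, as in V2)."""
    return memory_ratio(mem_usage) < threshold
```

For instance, `within_budget("1.2GiB / 2GiB")` is true (60% of the limit), while `within_budget("1.9GiB / 2GiB")` is false.
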
### V3: Tavily quota
```sql
SELECT COUNT(*) FROM mcp_calls
WHERE server = 'omnisearch' AND tool = 'tavily_search'
  AND called_at >= date_trunc('month', NOW());
-- Expect < 1000 (free-tier limit)
```

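
The count returned by the query above can drive the decision to route traffic to Exa before the free tier runs out. This is a sketch only: the 1000/month limit is the figure quoted in this ADR, whereas the `reserve` margin of 50 calls and the names `QuotaStatus` and `tavily_quota_status` are illustrative assumptions.

```python
from dataclasses import dataclass

TAVILY_FREE_MONTHLY_QUOTA = 1000  # free-tier limit quoted in this ADR


@dataclass(frozen=True)
class QuotaStatus:
    used: int
    remaining: int
    should_fail_over: bool  # True when traffic should move to the Exa backup


def tavily_quota_status(
    calls_this_month: int,
    quota: int = TAVILY_FREE_MONTHLY_QUOTA,
    reserve: int = 50,  # hypothetical safety margin, not from the ADR
) -> QuotaStatus:
    """Summarize monthly Tavily usage from the COUNT(*) in the V3 query."""
    remaining = max(quota - calls_this_month, 0)
    return QuotaStatus(
        used=calls_this_month,
        remaining=remaining,
        should_fail_over=remaining <= reserve,
    )
```
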
---

## Migration Plan

| Step | Task | Status |
|---|---|---|
| 10.1 | Write docker-compose.mcp.yml | ✅ this commit |
| 10.2 | Write ADR-031 | ✅ this commit |
| 10.3 | Commander applies for Tavily + Exa API keys | ⏳ pending |
| 10.4 | Create the mcp_readonly role on momo-db + GRANT SELECT | ⏳ pending |
| 10.5 | Deploy on host 188: docker compose -f docker-compose.mcp.yml up -d | ⏳ pending |
| 10.6 | Switch mcp_collector_service.py to mcp-omnisearch (replacing Gemini Grounding as the primary path) | ⏳ Phase 10.5 |
| 10.7 | Health checks + Firecrawl memory monitoring and alerts | ⏳ Phase 10.5 |

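
Step 10.4 could be prepared by generating the SQL statements from the hot-table list, so that the grants stay in sync with the security design. The sketch below is an assumption-laden illustration: the role, database, and table names come from this ADR, but the `public` schema, the helper `readonly_role_sql`, and the choice to set the password separately (not in the generated SQL) are assumptions.

```python
import re
from typing import List, Sequence

# Hot tables listed in the security design of this ADR.
HOT_TABLES = (
    "ai_insights",
    "ai_calls",
    "mcp_calls",
    "daily_sales_snapshot",
    "competitor_prices",
    "products",
)

_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name: str) -> str:
    """Allow only plain SQL identifiers, since names are interpolated into SQL."""
    if _IDENT_RE.fullmatch(name) is None:
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def readonly_role_sql(
    role: str = "mcp_readonly",
    database: str = "momo_pro",
    schema: str = "public",  # assumed; the ADR does not name the schema
    tables: Sequence[str] = HOT_TABLES,
) -> List[str]:
    """Build PostgreSQL statements for a SELECT-only role on the given tables.

    The password is deliberately not included; it is assumed to be set
    separately so that it never appears in generated scripts.
    """
    role, database, schema = _ident(role), _ident(database), _ident(schema)
    statements = [
        f"CREATE ROLE {role} LOGIN;",
        f"GRANT CONNECT ON DATABASE {database} TO {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
    ]
    statements += [
        f"GRANT SELECT ON {schema}.{_ident(table)} TO {role};" for table in tables
    ]
    return statements
```
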
---

## References

- `docker-compose.mcp.yml` (this commit)
- ADR-028 (LLM routing) / ADR-030 (multi-vendor) / ADR-033 (guardrails)
- `services/mcp_collector_service.py` (to be reworked)
- A2 web research report: `docs/phase0_research_report_20260503.md`
- mcp-omnisearch GitHub: https://github.com/spences10/mcp-omnisearch