From 390c32b05de92951729350d53605ca5482bf5410 Mon Sep 17 00:00:00 2001
From: OoO
Date: Mon, 4 May 2026 10:54:12 +0800
Subject: [PATCH] feat(p21): Caller × Context dynamic Model Router + ADR-034
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Operation Ollama-First v5.0 / Phase 21: dynamic routing governance

services/llm_model_router.py (149 lines)
- Pure rule engine, zero LLM cost (Python lambda predicates)
- 6 callers, 12 routing rules in total:
  • sales_copy: short copy (< 100 characters) → gemma3:4b / long copy → llama3.1:8b
  • hermes_analyst: gap > 20% or sales delta < -50% → qwen3:14b / default hermes3
  • aider_heal: diff > 200 lines → qwen2.5-coder:32b / default 7b
  • openclaw_qa: query > 200 characters or multi_turn → qwen3:14b / default qwen2.5:7b-instruct
  • ppt_vision: minicpm unhealthy → llava / default minicpm-v
  • ea_engine: require_chain_of_thought → deepseek-r1:14b / default Gemini
- Feature flag MODEL_ROUTER_ENABLED defaults to OFF (backward compatible)
- Fail-safe: a predicate that raises is skipped and evaluation moves to the next rule

tests/test_llm_model_router.py (18 tests, all passing)
- T1 flag OFF: no routing
- T2 sales_copy short/long copy routing
- T3 hermes simple/complex SKU
- T4 aider_heal simple fix/refactor
- T5 ppt_vision primary/fallback
- T6 ea_engine CoT routing
- T7 predicate exception tolerance
- T8 utility functions

ADR-034: Caller × Context dynamic Model Router
- Routing rule table for the 6 callers
- 4 rejected alternatives (LLM-based / hardcode / config file / blanket upgrade)
- Phase 21.2-21.6 strategic migration plan
- V1-V3 verification, including SQL to observe the model distribution after caller integration

Related: both GCP hosts (Primary + Secondary) already hold the 10 models
(67GB each, symmetric) needed by every routing rule; caller integration can
proceed in phases (Phase 21.2-21.5).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/adr/ADR-034-dynamic-model-router.md | 176 ++++++++++++++++
 docs/adr/README.md                       |   1 +
 services/llm_model_router.py             | 149 +++++++++++++
 tests/test_llm_model_router.py           | 254 +++++++++++++++++++++++
 4 files changed, 580 insertions(+)
 create mode 100644 docs/adr/ADR-034-dynamic-model-router.md
 create mode 100644 services/llm_model_router.py
 create mode 100644 tests/test_llm_model_router.py

diff --git a/docs/adr/ADR-034-dynamic-model-router.md b/docs/adr/ADR-034-dynamic-model-router.md
new file mode 100644
index 0000000..e2c6c76
--- /dev/null
+++ b/docs/adr/ADR-034-dynamic-model-router.md
@@ -0,0 +1,176 @@
+# ADR-034: Caller × Context Dynamic Model Router
+
+- **Status**: Accepted (becomes Active once integrated into callers)
+- **Date**: 2026-05-04
+- **Decision Maker**: Commander
+- **Author**: Operation Ollama-First v5.0 / Phase 21
+- **Related**: ADR-028 (LLM routing), ADR-029 (dual-tower division of labor), ADR-030 (multi-vendor)
+
+---
+
+## Context
+
+Campaign v5.0 has so far provisioned two GCP hosts, Primary and Secondary, each with 10 Ollama models (~67GB). Existing callers, however, mostly hardcode a single model (e.g. sales_copy always uses `llama3.1:8b`) and cannot pick the best model dynamically from context.
+
+**Pain points**:
+1. **Wasted resources**: short sales_copy output (< 100 characters) also runs on the 8B model → should go to `gemma3:4b` (4GB vs 5GB, latency -50%)
+2. **Quality bottleneck**: Hermes competitive pricing still uses `hermes3:latest` (8B) on complex SKUs (gap > 20%) → should upgrade to `qwen3:14b`
+3. **Refactoring gap**: `qwen2.5-coder:7b` is not enough for large AiderHeal refactors (diff > 200 lines) → should upgrade to `qwen2.5-coder:32b`
+4. **Missing reasoning path**: EA HITL has no deepseek-r1 path when it needs chain-of-thought
+
+**Prerequisites already in place**:
+- Primary and Secondary each hold the same 10 models (fully symmetric)
+- `services/llm_caller_registry.py` centralizes 30+ callers
+- `services/cost_throttle_service.py` guards cost
+
+This ADR covers the design of the **dynamic routing rules**.
+
+---
+
+## Decision
+
+### 1. Pure rule engine, zero LLM cost
+
+```python
+# services/llm_model_router.py
+ROUTING_RULES: Dict[str, list] = {
+    'sales_copy': [
+        (lambda ctx: 0 < ctx.get('expected_length', 0) < 100, 'gemma3:4b'),
+        (lambda ctx: True, 'llama3.1:8b'),
+    ],
+    'hermes_analyst': [
+        (lambda ctx: ctx.get('max_gap_pct', 0) > 20 or ctx.get('min_sales_delta', 0) < -50,
+         'qwen3:14b'),
+        (lambda ctx: True,
+         'hermes3:latest'),
+    ],
+    # ... 6 callers, 12 rules in total
+}
+```
+
+### 2. Routing rule table
+
+| Caller | Context trigger | Model when triggered | Default model |
+|---|---|---|---|
+| `sales_copy` | 0 < expected_length < 100 characters | `gemma3:4b` | `llama3.1:8b` |
+| `hermes_analyst` | max_gap_pct > 20% or sales delta < -50% | `qwen3:14b` | `hermes3:latest` |
+| `aider_heal` | diff_lines > 200 | `qwen2.5-coder:32b` | `qwen2.5-coder:7b` |
+| `openclaw_qa` | query_length > 200 or multi_turn | `qwen3:14b` | `qwen2.5:7b-instruct` |
+| `ppt_vision` | minicpm_unhealthy | `llava:latest` | `minicpm-v:latest` |
+| `ea_engine` | require_chain_of_thought | `deepseek-r1:14b` | (falls back to default = Gemini) |
+
+### 3. Feature flag gradual rollout
+
+- `MODEL_ROUTER_ENABLED` defaults to OFF
+- Callers invoke `select_model(caller, context, default='<existing model>')`
+- Flag OFF → default is returned directly (rules are not evaluated) → behavior is identical to before the campaign
+
+### 4. Fail-safe
+
+- Predicate raises → log a warning and skip to the next rule
+- Caller not in ROUTING_RULES → return default
+- No rule matches → return default
+
+### 5. Integration (phased rollout recommended)
+
+```python
+# Caller example (e.g. ollama_service.generate_sales_copy):
+from services.llm_model_router import select_model
+
+def generate_sales_copy(self, product_name, ...):
+    model = select_model(
+        caller='sales_copy',
+        context={'expected_length': len(product_name) * 3},
+        default='llama3.1:8b',
+    )
+    return self.generate(prompt=..., model=model, ...)
+```
+
+**Strategic migration**:
+- Phase 21.1: model_router service + tests landed (this commit) ✅
+- Phase 21.2: sales_copy integration (low-risk pilot) ⏳
+- Phase 21.3: aider_heal integration (medium risk, needs diff_lines)
+- Phase 21.4: hermes_analyst integration (high risk, touches the main tactical flow)
+- Phase 21.5-21.6: remaining callers migrated → MODEL_ROUTER_ENABLED defaults to ON after one week of observation
+
+---
+
+## Alternatives Considered
+
+| Option | Reason rejected |
+|---|---|
+| **A. LLM-based routing** (an LLM decides which model to use) | Circular spend, plus added latency |
+| **B. Each caller hardcodes several models** (not centralized) | Rules drift with no single source of truth |
+| **C. Upgrade everything to a large model** (e.g. qwen3:14b everywhere) | Wastes resources; short copy does not need 14B |
+| **D. YAML/JSON config file** (read at runtime) | Over-engineering; Python lambdas are flexible enough |
+
+---
+
+## Consequences
+
+### Positive (5)
+1. **Resource savings**: short sales_copy runs on the 4GB gemma3 instead of the 5GB llama3.1, latency -50%
+2. **Quality gains**: complex cases are upgraded to larger models automatically (hermes 14B / aider 32B)
+3. **Zero LLM cost**: pure Python lambda rules
+4. **Fail-safe**: a rule exception does not block the main flow
+5. **Centralized governance**: a rule change only needs a PR to `llm_model_router.py`, with no caller changes
+
+### Negative (3)
+1. **Rule maintenance cost**: new callers / new context conditions require rule updates (which is exactly what ADR governance is for)
+2. **Context burden**: callers must compute context (e.g. diff_lines) before calling the router
+3. **Debugging complexity**: finding which rule matched requires reading logger.debug output
+
+### Risks (3)
+1. **Badly tuned rules**: thresholds (20% / 200 lines) may be off → mitigated by gradual-rollout observation in Phase 21.2-21.5
+2. **Model not pulled on the GCP host**: select_model returns a model that does not exist → mitigated by the model-pull prerequisite (10 symmetric models already in place)
+3. **Incomplete caller integration**: some callers still hardcode → mitigated by a documented migration plan
+
+---
+
+## Verification
+
+### V1: unit tests
+```bash
+pytest tests/test_llm_model_router.py -v
+# expect all 18 tests to pass
+```
+
+### V2: observe ai_calls after caller integration
+```sql
+SELECT model, COUNT(*), AVG(duration_ms)
+FROM ai_calls
+WHERE caller = 'sales_copy' AND called_at > NOW() - INTERVAL '7 days'
+GROUP BY model;
+-- Expected: gemma3:4b (short copy) at 60%+, llama3.1:8b (long copy) at 40% or less
+-- Average duration: gemma3 about 50% lower than llama3.1
+```
+
+### V3: cost throttle integration
+```python
+# Planned for Phase 22: switch to a cheaper model automatically when cost_throttle fires
+# e.g. claude throttled → select_model falls back to default Gemini Flash
+```
+
+---
+
+## Migration Plan
+
+| Phase | Work | Status |
+|---|---|---|
+| 21.1 | services/llm_model_router.py + 18 tests | ✅ this commit |
+| 21.2 | sales_copy integration (add select_model to generate_sales_copy) | ⏳ |
+| 21.3 | aider_heal integration (needs diff_lines context) | ⏳ |
+| 21.4 | hermes_analyst integration (needs max_gap_pct context) | ⏳ |
+| 21.5 | openclaw_qa / ppt_vision / ea_engine | ⏳ |
+| 21.6 | MODEL_ROUTER_ENABLED defaults to ON (after 1 week of observation) | ⏳ |
+
+---
+
+## References
+
+- `services/llm_model_router.py` (this commit)
+- `tests/test_llm_model_router.py` (18 tests)
+- `docs/llm_model_full_evaluation_20260504.md`, routing optimization recommendations
+- ADR-028 (unified LLM routing guidelines)
+- ADR-029 (Hermes-First dual-tower division of labor)
+- ADR-030 (Frontier multi-vendor strategy)
diff --git a/docs/adr/README.md b/docs/adr/README.md
index 3a08672..80fe8c8 100644
--- a/docs/adr/README.md
+++ b/docs/adr/README.md
@@ -55,6 +55,7 @@
 | [031](ADR-031-mcp-self-hosted-stack.md) | Self-hosted MCP stack (postgres + omnisearch + firecrawl + filesystem; includes Owen guardrail #2, the Firecrawl 2g limit) | Accepted | 2026-05-04 |
 | [032](ADR-032-rag-autonomous-learning-loop.md) | RAG autonomous learning loop: Distiller + PromotionGate + feedback loop (Phase 11) | Accepted | 2026-05-03 |
 | [033](ADR-033-rag-three-guardrails.md) | Three RAG governance guardrails: Promotion Gate / Firecrawl resources / BGE-M3 consistency (Owen v5.0 iron rule) | Accepted | 2026-05-03 |
+| [034](ADR-034-dynamic-model-router.md) | Caller × Context dynamic Model Router (short copy gemma3 / complex SKU qwen3:14b / refactor coder:32b) | Accepted | 2026-05-04 |
 
 ## Conventions
 
diff --git a/services/llm_model_router.py b/services/llm_model_router.py
new file mode 100644
index 0000000..e0cf064
--- /dev/null
+++ b/services/llm_model_router.py
@@ -0,0 +1,149 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+"""
+services/llm_model_router.py
+Operation Ollama-First v5.0 / Phase 21: Caller × Context dynamic Model Router
+
+Design principles:
+- Each caller picks the best model for its context dynamically (same provider)
+  e.g. sales_copy short copy → gemma3:4b / long copy → llama3.1:8b / Hermes complex SKU → qwen3:14b
+- Pure rule engine, zero LLM cost
+- Callers get the model name via select_model(caller, context)
+- Feature flag MODEL_ROUTER_ENABLED defaults to OFF (existing defaults unaffected)
+- Fallback: no rule matches → return the caller's default model (backward compatible)
+
+Maps to the ADR-028 caller whitelist and ADR-034 dynamic routing (added in this commit).
+Primary and Secondary on GCP already hold the 10 models needed by all routing rules.
+"""
+
+from __future__ import annotations
+import os
+import logging
+from typing import Dict, Any, Optional, Callable
+
+logger = logging.getLogger(__name__)
+
+
+def is_model_router_enabled() -> bool:
+    """Runtime check (avoids freezing the value at import time)"""
+    return os.getenv('MODEL_ROUTER_ENABLED', 'false').strip().lower() in ('true', '1', 'yes', 'on')
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Routing rules (ADR-034 spec)
+# ─────────────────────────────────────────────────────────────────────────────
+# Structure: caller → list of (predicate(context), model_name) tuples
+# The first predicate that returns True wins; no match → the caller's default
+# ─────────────────────────────────────────────────────────────────────────────
+
+ROUTING_RULES: Dict[str, list] = {
+    # Sales Copy: short copy goes to gemma3:4b (light and fast), long copy to llama3.1:8b
+    'sales_copy': [
+        (lambda ctx: int(ctx.get('expected_length', 0) or 0) > 0
+         and int(ctx.get('expected_length', 0)) < 100,
+         'gemma3:4b'),
+        (lambda ctx: True,  # default
+         'llama3.1:8b'),
+    ],
+
+    # Hermes competitive pricing: simple price comparison uses hermes3; complex analysis (gap > 20% or a sharp sales drop) upgrades to qwen3:14b
+    'hermes_analyst': [
+        (lambda ctx: float(ctx.get('max_gap_pct', 0) or 0) > 20
+         or float(ctx.get('min_sales_delta', 0) or 0) < -50,
+         'qwen3:14b'),
+        (lambda ctx: True,
+         'hermes3:latest'),
+    ],
+
+    # AiderHeal: simple syntax fixes use qwen2.5-coder:7b; refactor-scale changes (diff > 200 lines) upgrade to 32b
+    'aider_heal': [
+        (lambda ctx: int(ctx.get('diff_lines', 0) or 0) > 200,
+         'qwen2.5-coder:32b'),
+        (lambda ctx: True,
+         'qwen2.5-coder:7b'),
+    ],
+
+    # OpenClaw Q&A: simple questions use qwen2.5:7b-instruct, complex ones qwen3:14b
+    'openclaw_qa': [
+        (lambda ctx: int(ctx.get('query_length', 0) or 0) > 200
+         or bool(ctx.get('multi_turn', False)),
+         'qwen3:14b'),
+        (lambda ctx: True,
+         'qwen2.5:7b-instruct'),
+    ],
+
+    # PPT vision: minicpm-v by default; switch to llava when the host marks it unhealthy
+    'ppt_vision': [
+        (lambda ctx: bool(ctx.get('minicpm_unhealthy', False)),
+         'llava:latest'),
+        (lambda ctx: True,
+         'minicpm-v:latest'),
+    ],
+
+    # Reasoning-heavy scenarios (EA HITL strategic decisions; not enabled yet, reserved)
+    'ea_engine': [
+        (lambda ctx: bool(ctx.get('require_chain_of_thought', False)),
+         'deepseek-r1:14b'),
+        (lambda ctx: True,
+         None),  # None → caller uses its default (gemini-2.0-flash)
+    ],
+}
+
+
+def select_model(
+    caller: str,
+    context: Optional[Dict[str, Any]] = None,
+    default: Optional[str] = None,
+) -> Optional[str]:
+    """Main entry point: pick a model by caller × context.
+
+    Args:
+        caller: routed only if it is a key in ROUTING_RULES; otherwise default is returned
+        context: inputs for the routing decision (e.g. expected_length / diff_lines / max_gap_pct)
+        default: returned when the caller has no rules or no rule matches
+
+    Returns:
+        model name string / None (None means the caller keeps its existing default)
+
+    When the flag is OFF, default is returned directly (rules not evaluated, backward compatible)
+    """
+    if not is_model_router_enabled():
+        return default
+
+    if caller not in ROUTING_RULES:
+        return default
+
+    ctx = context or {}
+    for predicate, model_name in ROUTING_RULES[caller]:
+        try:
+            if predicate(ctx):
+                if model_name is None:
+                    return default  # rule matched but asks for the default
+                logger.debug("[ModelRouter] %s ctx=%s → %s", caller, ctx, model_name)
+                return model_name
+        except Exception as exc:
+            logger.warning("[ModelRouter] %s rule eval failed: %s", caller, exc)
+            continue
+
+    # no match → default
+    return default
+
+
+def list_routes_for_caller(caller: str) -> list:
+    """Debug helper: list the models in all routing rules for a caller"""
+    rules = ROUTING_RULES.get(caller, [])
+    return [model for _, model in rules]
+
+
+def all_callers_with_routes() -> list:
+    """All callers that have dynamic routing rules"""
+    return list(ROUTING_RULES.keys())
+
+
+__all__ = [
+    'select_model',
+    'is_model_router_enabled',
+    'list_routes_for_caller',
+    'all_callers_with_routes',
+    'ROUTING_RULES',
+]
diff --git a/tests/test_llm_model_router.py b/tests/test_llm_model_router.py
new file mode 100644
index 0000000..039cfe9
--- /dev/null
+++ b/tests/test_llm_model_router.py
@@ -0,0 +1,254 @@
+"""
+tests/test_llm_model_router.py
+─────────────────────────────────────────────────────────────────
+Operation Ollama-First v5.0 / Phase 21: Caller × Context dynamic routing verification
+"""
+
+import pytest
+
+
+@pytest.fixture(autouse=True)
+def _reset_env(monkeypatch):
+    monkeypatch.delenv('MODEL_ROUTER_ENABLED', raising=False)
+    yield
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T1: no routing when the feature flag is OFF (backward compatible)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_flag_off_returns_default():
+    from services.llm_model_router import select_model
+
+    # flag OFF returns default directly (rules are not evaluated)
+    result = select_model(
+        caller='sales_copy',
+        context={'expected_length': 50},
+        default='llama3.1:8b',
+    )
+    assert result == 'llama3.1:8b'
+
+
+def test_flag_off_unknown_caller_returns_default():
+    from services.llm_model_router import select_model
+
+    result = select_model(caller='nonexistent', default='hermes3:latest')
+    assert result == 'hermes3:latest'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T2: sales_copy routing (short vs long copy)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_sales_copy_short_text_routes_to_gemma3(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # 50-character short copy → lightweight gemma3:4b
+    result = select_model(
+        caller='sales_copy',
+        context={'expected_length': 50},
+        default='llama3.1:8b',
+    )
+    assert result == 'gemma3:4b'
+
+
+def test_sales_copy_long_text_routes_to_llama(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='sales_copy',
+        context={'expected_length': 200},
+        default='llama3.1:8b',
+    )
+    assert result == 'llama3.1:8b'
+
+
+def test_sales_copy_no_length_falls_back_to_default(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # no expected_length → rule 1 does not fire → rule 2 is always True → returns llama3.1:8b
+    result = select_model(
+        caller='sales_copy',
+        context={},
+        default='llama3.1:8b',
+    )
+    assert result == 'llama3.1:8b'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T3: Hermes competitive pricing (simple vs complex SKU)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_hermes_simple_routes_to_hermes3(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='hermes_analyst',
+        context={'max_gap_pct': 5.2, 'min_sales_delta': -10.0},
+        default='hermes3:latest',
+    )
+    assert result == 'hermes3:latest'
+
+
+def test_hermes_high_gap_routes_to_qwen3(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # gap > 20% → upgrade to qwen3:14b
+    result = select_model(
+        caller='hermes_analyst',
+        context={'max_gap_pct': 25.0, 'min_sales_delta': -5.0},
+        default='hermes3:latest',
+    )
+    assert result == 'qwen3:14b'
+
+
+def test_hermes_sales_crash_routes_to_qwen3(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # sales delta < -50% → upgrade to qwen3:14b
+    result = select_model(
+        caller='hermes_analyst',
+        context={'max_gap_pct': 5.0, 'min_sales_delta': -60.0},
+        default='hermes3:latest',
+    )
+    assert result == 'qwen3:14b'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T4: AiderHeal (simple fix vs refactor)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_aider_heal_small_diff_routes_to_7b(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='aider_heal',
+        context={'diff_lines': 50},
+        default='qwen2.5-coder:7b',
+    )
+    assert result == 'qwen2.5-coder:7b'
+
+
+def test_aider_heal_large_refactor_routes_to_32b(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # diff > 200 lines → refactor-grade 32b
+    result = select_model(
+        caller='aider_heal',
+        context={'diff_lines': 350},
+        default='qwen2.5-coder:7b',
+    )
+    assert result == 'qwen2.5-coder:32b'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T5: PPT vision (primary/fallback)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_ppt_vision_normal_routes_to_minicpm(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='ppt_vision',
+        context={},
+        default='minicpm-v:latest',
+    )
+    assert result == 'minicpm-v:latest'
+
+
+def test_ppt_vision_minicpm_unhealthy_routes_to_llava(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='ppt_vision',
+        context={'minicpm_unhealthy': True},
+        default='minicpm-v:latest',
+    )
+    assert result == 'llava:latest'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T6: EA engine (reasoning required → deepseek-r1)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_ea_engine_no_cot_returns_default(monkeypatch):
+    """Rule matches but model_name=None → returns default (caller keeps its existing Gemini)"""
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='ea_engine',
+        context={'require_chain_of_thought': False},
+        default='gemini-2.0-flash',
+    )
+    assert result == 'gemini-2.0-flash'
+
+
+def test_ea_engine_cot_routes_to_deepseek_r1(monkeypatch):
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    result = select_model(
+        caller='ea_engine',
+        context={'require_chain_of_thought': True},
+        default='gemini-2.0-flash',
+    )
+    assert result == 'deepseek-r1:14b'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T7: rule exceptions do not block (fault tolerance)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_predicate_exception_skipped_to_next_rule(monkeypatch):
+    """A predicate that raises should be skipped in favor of the next rule (never raised to the caller)"""
+    monkeypatch.setenv('MODEL_ROUTER_ENABLED', 'true')
+    from services.llm_model_router import select_model
+
+    # a non-numeric context value makes int() raise
+    # rule 1 expects expected_length to be int-convertible; 'abc' blows up
+    # but the router should catch it and skip to rule 2 (always True → llama3.1:8b)
+    result = select_model(
+        caller='sales_copy',
+        context={'expected_length': 'abc'},  # deliberately bad value
+        default='llama3.1:8b',
+    )
+    # result: rule 1 fails (int('abc') raises) → skipped → rule 2 matches → 'llama3.1:8b'
+    assert result == 'llama3.1:8b'
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# T8: utility functions
+# ═══════════════════════════════════════════════════════════════════════════
+
+def test_list_routes_for_known_caller():
+    from services.llm_model_router import list_routes_for_caller
+
+    sales_routes = list_routes_for_caller('sales_copy')
+    assert 'gemma3:4b' in sales_routes
+    assert 'llama3.1:8b' in sales_routes
+
+
+def test_list_routes_for_unknown_caller():
+    from services.llm_model_router import list_routes_for_caller
+
+    assert list_routes_for_caller('nonexistent') == []
+
+
+def test_all_callers_with_routes():
+    from services.llm_model_router import all_callers_with_routes
+
+    callers = all_callers_with_routes()
+    expected = {'sales_copy', 'hermes_analyst', 'aider_heal',
+                'openclaw_qa', 'ppt_vision', 'ea_engine'}
+    assert expected.issubset(set(callers))
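
Reviewer note (not part of the patch): the first-match evaluation with fail-safe skipping that `select_model` performs can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the patch's code: `RULES`, `pick_model`, the `demo` caller, and the model names `small-model` / `large-model` are hypothetical, and the `MODEL_ROUTER_ENABLED` flag check is left out.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# (predicate, model) pairs, the same shape the patch uses in ROUTING_RULES.
Rule = Tuple[Callable[[Dict[str, Any]], bool], Optional[str]]

# Hypothetical rule table for illustration only.
RULES: Dict[str, List[Rule]] = {
    "demo": [
        # Fires only for a positive length strictly below 100.
        (lambda ctx: 0 < int(ctx.get("expected_length", 0) or 0) < 100, "small-model"),
        # Catch-all rule.
        (lambda ctx: True, "large-model"),
    ],
}


def pick_model(
    caller: str,
    context: Optional[Dict[str, Any]] = None,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the model of the first rule whose predicate is True."""
    ctx = context or {}
    for predicate, model in RULES.get(caller, []):
        try:
            matched = predicate(ctx)
        except Exception:
            # Fail-safe: a broken predicate is skipped, never raised to the caller.
            continue
        if matched:
            # A matching rule with model=None means "use the caller's default".
            return model if model is not None else default
    # Unknown caller or no rule matched.
    return default
```

Two boundary cases follow from the strict comparisons: a length of exactly 100 and a missing length both fall through to the catch-all rule, and a non-numeric length is skipped by the fail-safe rather than raised.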