V10.415 protect Hermes fallback routing

2026-05-24 14:22:06 +08:00
parent 1b94177828
commit b73dc6df3f
10 changed files with 157 additions and 14 deletions
--- a/.env.example
+++ b/.env.example
@@ -127,7 +127,7 @@ GDRIVE_FILE_PATTERN=即時業績_當日
 # ==========================================
 # Hermes 3 competitive pricing intelligence analysis (Module 2 / ADR-012)
 # ==========================================
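-# [Optional] Hermes Ollama endpoint; when left empty, routes automatically through GCP-A → GCP-B → 111 (ADR-028)
+# [Optional] Hermes Ollama endpoint; when left empty, routes automatically through GCP-A → GCP-B (by default 111 does not take Hermes batch analysis)
 # Only http://34.143.170.20:11434, http://34.21.145.224:11434, and http://192.168.0.111:11434 are allowed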
 HERMES_URL=

@@ -135,6 +135,8 @@ HERMES_URL=
 HERMES_TIMEOUT=120
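 # [Default 5m] Hermes runner warm residency; do not revert to 24h, to avoid sustained high load on GCP-B/111
 HERMES_KEEP_ALIVE=5m
+# [Default false] Allow the Hermes LLM to fall through to 111 only in emergencies; normally, failures go to the rule/DB fallback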
+HERMES_ALLOW_111_FALLBACK=false

 # [Optional] Embedding service host; when left empty, uses the same three-host Ollama cascade
 EMBEDDING_HOST=
--- a/config.py
+++ b/config.py
@@ -325,7 +325,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
 # ==========================================
 # System version and paths
 # ==========================================
-SYSTEM_VERSION = "V10.414"
+SYSTEM_VERSION = "V10.415"
 LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
 public_url = PUBLIC_URL  # used for template display

--- a/docs/AI_INTELLIGENCE_MODULE_SOT.md
+++ b/docs/AI_INTELLIGENCE_MODULE_SOT.md
@@ -13,7 +13,7 @@
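 - Gemini may serve only as a backup after the Ollama primary path fails; formerly locked scenarios such as MCP Grounding, PPT/vision, weekly/monthly reports, Code Review, EA HITL, and complex SKU escalation must also go through GCP-A → GCP-B → 111 first.
 - 188 `192.168.0.188` is only the container host for the App / DB / scheduler / Telegram bot and the AutoHeal target; it must not be used as an Ollama node.
 - General AI copywriting, keywords, product insights, and the first response for Telegram Q&A must not be Gemini-first.
-- Hermes intent / analyst paths must not hand-roll `/api/generate` calls or resolve the host only once; they must go through `OllamaService` so that a single request can retry GCP-A → GCP-B → 111 in order.
+- Hermes intent / analyst paths must not hand-roll `/api/generate` calls or resolve the host only once; they must go through `OllamaService`. With the default `HERMES_ALLOW_111_FALLBACK=false`, a single request runs only GCP-A → GCP-B; when both fail, it falls back to the rule engine or DB evidence and does not shift batch price analysis onto 111. Only in an emergency may it be explicitly set to true so that 111 can take over.
 - The `/api/chat` tool-calling path of NemoTron qwen3 dispatch must likewise try up to three Ollama hosts within a single request; when the first host fails, call `mark_unhealthy()` before trying the next, and fall back to NIM only as the last step.
 - Special Ollama paths such as PPT vision, the PPT copywriting final fallback, and the MCP offline final fallback must not hit only a single host either; any use of `/api/generate` must go through `OllamaService.generate()`.
 - The Code Review pipeline must also be Ollama-first: both the Hermes scan and the OpenClaw assessment use the `OllamaService` three-host retry; Gemini telemetry may appear only as `code_review_openclaw_gemini`, indicating it was enabled only after Ollama and the optional Claude backup both failed.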
@@ -28,7 +28,7 @@
 - In `docker-compose.yml`, `momo-app`, `scheduler`, and `telegram-bot` must explicitly set `GEMINI_API_HARD_DISABLED=${GEMINI_API_HARD_DISABLED:-true}` and `GEMINI_FALLBACK_ENABLED=${GEMINI_FALLBACK_ENABLED:-false}`; `.env` may keep `GEMINI_API_KEY`, but the mere presence of the key must not cause the core containers to make paid outbound Gemini calls.
 - No status panel or router may recommend Gemini as the primary provider: `AIProviderService._get_recommended_provider()` must not return `gemini` and may only show it in fallback status; if the `ea_engine` in `llm_model_router` receives a `gemini-*` default, it must switch back to `hermes3:latest`, escalating to the local `deepseek-r1:14b` only when deep reasoning is needed.
 - The ElephantAlpha prompt / agent registry must no longer describe OpenClaw as Gemini-primary; OpenClaw is an Ollama-first strategist on `qwen2.5-coder:7b` / `qwen3:14b`, and Gemini may serve as an emergency fallback only after the guard is explicitly unlocked.
-- 111 `192.168.0.111` is only the last-resort Mac fallback and does not keep 7B+, vision, or long-context models resident; when `OllamaService.generate()` falls through to 111, it downgrades `qwen3`, `deepseek-r1`, `hermes3`, `qwen2.5*`, `gemma3`, `llava`, `minicpm-v`, and 7B+ models per `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` to `OLLAMA_111_MODEL_FALLBACK=llama3.2:latest`, capped by `OLLAMA_111_KEEP_ALIVE=5m`, `OLLAMA_111_MAX_TIMEOUT=20`, `OLLAMA_111_NUM_CTX=4096`, and `OLLAMA_111_NUM_PREDICT=512`. The business keep-alive for Hermes / OpenClaw report-type paths also defaults to `5m`; Code Review additionally skips 111 by default via `CODE_REVIEW_ALLOW_111_FALLBACK=false`, so that the 16GB RAM host and GCP-B are not pushed to high load by resident runners, long outputs, and a 24h keep-alive.
+- 111 `192.168.0.111` is only the last-resort Mac fallback and does not keep 7B+, vision, or long-context models resident; when `OllamaService.generate()` falls through to 111, it downgrades `qwen3`, `deepseek-r1`, `hermes3`, `qwen2.5*`, `gemma3`, `llava`, `minicpm-v`, and 7B+ models per `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` to `OLLAMA_111_MODEL_FALLBACK=llama3.2:latest`, capped by `OLLAMA_111_KEEP_ALIVE=5m`, `OLLAMA_111_MAX_TIMEOUT=20`, `OLLAMA_111_NUM_CTX=4096`, and `OLLAMA_111_NUM_PREDICT=512`. The business keep-alive for OpenClaw report-type paths defaults to `5m`; Code Review (via `CODE_REVIEW_ALLOW_111_FALLBACK=false`) and Hermes (via `HERMES_ALLOW_111_FALLBACK=false`) skip 111 by default, so that the 16GB RAM host and GCP-B are not pushed to high load by resident runners, long outputs, and a 24h keep-alive.
 - ElephantAlpha's `price_drop_alert` / `market_opportunity` Telegram HITL alerts must present same-product evidence separately, including at least `match_type`, `price_basis`, `alert_tier`, and `match_score`; without high-confidence same-product evidence and comparable total-price evidence, the PChome/MOMO price gap must not be written up as a recommendation to match the price directly.

 ## I. Four-AI-Agent Routing Architecture
--- a/docs/memory/history_logs.md
+++ b/docs/memory/history_logs.md
@@ -13,6 +13,7 @@
 ## 📅 Detailed changelog (archaeological archive)

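 ### 2026-05-24: PChome near-threshold identity recovery, round two
+- **V10.415 Hermes skips 111 by default + matching protection**: `OllamaService.generate()` gains an `allow_111_fallback` parameter whose default keeps three-host compatibility; Hermes intent / competitor analyst now use `HERMES_ALLOW_111_FALLBACK=false` to run only GCP-A → GCP-B by default, and when both are unavailable they hand off to the rule engine or the DB evidence fallback, so batch price analysis and intent classification are no longer shifted onto 111. In the same release, the marketplace matcher treats the sunscreen category as variant-sensitive and stops spec tokens such as SPF/PA/UVA/UVB from being mistaken for model numbers, preventing a mismatch between 「兒童防曬乳」 (children's sunscreen lotion) and 「海洋友善保濕防曬乳」 (ocean-friendly moisturizing sunscreen lotion); the Recipe Box 兒童防曬氣墊粉餅 (children's sunscreen cushion compact) keeps a precise same-product-line exception; a new `pack_quantity_difference` reason lets the Beauty Foot foot mask 5-pack vs 4-pack go through unit comparable instead of being stuck at low confidence.
 - **V10.414 MCP fetch run readiness gate**: Added the `mcp_fetch_run_readiness` read-only builder, GET/POST endpoints, a UI run readiness review panel, and a deployment readiness smoke target; after the run package step it checks the command preview, receipt path, artifact path, throttling/timeout/dry-run-first, and the operator shell-only boundary. The API/UI does not execute the CLI, fetch external sites, write files, open the DB, or attach to the scheduler; it only clears the way to a manual shell dry-run and the subsequent receipt gate.
 - **V10.413 Code Review protects the 111 fallback by default**: Production `ai_calls` showed that when GCP-A is unreachable, Code Review OpenClaw first burns through the primary timeout, then holds GCP-B for up to 60s, and finally succeeds on 111 with `llama3.2`, driving high load on both 111 and GCP-B. Added the `CODE_REVIEW_ALLOW_111_FALLBACK=false` default: the Code Review Hermes LLM scan / OpenClaw assessment run only GCP-A → GCP-B; post-deployment heavy analysis is handed to 111 only when the flag is explicitly set to true. If both GCP-A and GCP-B fail and Claude/Gemini are not explicitly enabled, it reverts to a deterministic local degraded summary, does not call Gemini, and no longer uses 111 for non-real-time heavy analysis.
 - **V10.412 MCP fetch run package gate**: Added the `mcp_fetch_run_package` read-only builder, a standalone route extension, GET/POST endpoints, a UI run package review panel, and a deployment readiness smoke target, turning an approved target review into a command argv preview and receipt path contract that the operator can re-verify. The API/UI does not execute the CLI, fetch external sites, write files, open the DB, or attach to the scheduler; it only clears the way to the subsequent run readiness review.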
--- a/services/hermes_analyst_service.py
+++ b/services/hermes_analyst_service.py
@@ -35,6 +35,9 @@ from config import HERMES_TIMEOUT

 HERMES_MODEL = "hermes3:latest"
 HERMES_KEEP_ALIVE = os.getenv("HERMES_KEEP_ALIVE", "5m")
+HERMES_ALLOW_111_FALLBACK = os.getenv("HERMES_ALLOW_111_FALLBACK", "false").strip().lower() in (
+    "1", "true", "yes", "on",
+)
 TOP_N = 20  # Output the top N threats to limit NemoTron quota use per run


@@ -280,6 +283,7 @@ class HermesAnalystService:
                    temperature=0.1,
                    timeout=HERMES_TIMEOUT,
                     keep_alive=HERMES_KEEP_ALIVE,  # ADR-012: avoid cold-start timeouts
+                    allow_111_fallback=HERMES_ALLOW_111_FALLBACK,
                )
                _ctx.set_provider(get_provider_tag(resp.host or ''))
                _ctx.set_model(resp.model or HERMES_MODEL)
@@ -585,6 +589,7 @@ class HermesAnalystService:
                    temperature=0.1,
                    timeout=HERMES_TIMEOUT,
                    keep_alive=HERMES_KEEP_ALIVE,
+                    allow_111_fallback=HERMES_ALLOW_111_FALLBACK,
                )
                _ctx.set_provider(get_provider_tag(resp.host or ''))
                _ctx.set_model(resp.model or HERMES_MODEL)
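
The flag parsing added in `services/hermes_analyst_service.py` can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical helper named `env_flag`; the commit itself inlines the expression rather than defining a helper.

```python
import os

# Values the commit treats as "true"; anything else (including an empty
# string) leaves the 111 fallback disabled.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: str = "false") -> bool:
    """Hypothetical helper mirroring the inline expression in the diff."""
    return os.getenv(name, default).strip().lower() in _TRUTHY


# Unset variable: the default "false" applies.
os.environ.pop("HERMES_ALLOW_111_FALLBACK", None)
unset_value = env_flag("HERMES_ALLOW_111_FALLBACK")

# Emergency override: whitespace and case are normalised before comparison.
os.environ["HERMES_ALLOW_111_FALLBACK"] = " TRUE "
emergency_value = env_flag("HERMES_ALLOW_111_FALLBACK")

# Unrecognised spellings do not enable the fallback.
os.environ["HERMES_ALLOW_111_FALLBACK"] = "enabled"
unrecognised_value = env_flag("HERMES_ALLOW_111_FALLBACK")

# `HERMES_URL=`-style empty assignments also resolve to False.
os.environ["HERMES_ALLOW_111_FALLBACK"] = ""
empty_value = env_flag("HERMES_ALLOW_111_FALLBACK")
```

Because the module-level constant is evaluated once at import time, a change to the environment variable presumably requires a process restart to take effect.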
--- a/services/marketplace_product_matcher.py
+++ b/services/marketplace_product_matcher.py
@@ -523,6 +523,11 @@ VARIANT_SENSITIVE_KEYWORDS = {
    "粉底棒",
    "遮瑕棒",
    "修容打亮棒",
+    "防曬",
+    "防曬乳",
+    "防曬霜",
+    "防曬噴霧",
+    "防曬棒",
 }

 VARIANT_OPTION_COLOR_WORDS = {
@@ -876,11 +881,21 @@ def _extract_model_tokens(text: str) -> set[str]:
    tokens: set[str] = set()
    for match in re.finditer(r"(?<![a-z0-9])([a-z]{1,4}-?[a-z]{0,3}\d{2,}[a-z0-9-]*)(?![a-z0-9])", text, re.I):
        compact = re.sub(r"[^a-z0-9]", "", match.group(1).lower())
+        if _is_spec_like_latin_token(compact):
+            continue
        if len(compact) >= 4 and re.search(r"[a-z]", compact) and re.search(r"\d", compact):
            tokens.add(compact)
    return tokens


+def _is_spec_like_latin_token(token: str) -> bool:
+    return bool(
+        re.fullmatch(r"spf\d{1,3}[a-z]?", token)
+        or re.fullmatch(r"pa\d*", token)
+        or token in {"uva", "uvb", "uv", "spf"}
+    )
+
+
 def _brand_alias_present(text: str, alias_norm: str, text_tokens: set[str]) -> bool:
    if not alias_norm:
        return False
@@ -946,7 +961,7 @@ def _leading_brand_tokens(original: str, normalized: str) -> set[str]:
        if re.fullmatch(r"[\u4e00-\u9fff]{2,6}", first_token) and first_token not in GENERIC_TOKENS:
            tokens.add(first_token)
    for token in _tokenize(leading):
-        if re.fullmatch(r"[a-z][a-z0-9\-']{2,}", token):
+        if re.fullmatch(r"[a-z][a-z0-9\-']{2,}", token) and not _is_spec_like_latin_token(token):
            tokens.add(token)
    return tokens

@@ -1260,6 +1275,30 @@ def _has_exact_count_alignment(left: ProductIdentity, right: ProductIdentity) ->
    return left_counts == right_counts


+def _has_pack_quantity_difference(left: ProductIdentity, right: ProductIdentity) -> bool:
+    if not left.counts or not right.counts or _has_exact_count_alignment(left, right):
+        return False
+
+    if left.total_piece_count and right.total_piece_count:
+        return left.total_piece_count != right.total_piece_count
+
+    left_by_unit: dict[str, set[int]] = {}
+    right_by_unit: dict[str, set[int]] = {}
+    for count, unit in left.counts:
+        family = _count_unit_family(unit)
+        if family in COUNT_UNITS or unit in COUNT_UNITS:
+            left_by_unit.setdefault(family, set()).add(count)
+    for count, unit in right.counts:
+        family = _count_unit_family(unit)
+        if family in COUNT_UNITS or unit in COUNT_UNITS:
+            right_by_unit.setdefault(family, set()).add(count)
+
+    for unit in set(left_by_unit) & set(right_by_unit):
+        if left_by_unit[unit] != right_by_unit[unit]:
+            return True
+    return False
+
+
 def _spec_score(left: ProductIdentity, right: ProductIdentity) -> tuple[float, bool, tuple[str, ...]]:
    volume_score, volume_conflict = _spec_component(left.volumes_ml, right.volumes_ml)
    weight_score, weight_conflict = _spec_component(left.weights_g, right.weights_g)
@@ -1565,6 +1604,7 @@ def _is_unit_comparable_candidate(
        "multi_component_conflict",
        "count_conflict",
        "component_count_conflict",
+        "pack_quantity_difference",
    })
    if not pack_difference:
        return False
@@ -1644,6 +1684,8 @@ def _model_line_tokens(identity: ProductIdentity) -> set[str]:
    for token in identity.core_tokens:
        if token in GENERIC_TOKENS:
            continue
+        if _is_spec_like_latin_token(token):
+            continue
        if re.fullmatch(r"[a-z][a-z0-9-]{2,}", token):
            tokens.add(token)
        for match in re.finditer(r"([\u4e00-\u9fff]{2,})(?:系列)", token):
@@ -1707,6 +1749,7 @@ def _build_evidence_flags(
        "variant_selection_review",
        "variant_option_conflict",
        "variant_descriptor_conflict",
+        "pack_quantity_difference",
        "count_conflict",
        "bundle_offer_conflict",
        "multi_component_conflict",
@@ -1834,7 +1877,17 @@ def score_marketplace_match(
    catalog_count_omission = _allow_catalog_count_omission(left, right)
    if catalog_count_omission:
        reasons.append("catalog_count_omission")
+    if _has_pack_quantity_difference(left, right):
+        reasons.append("pack_quantity_difference")
    variant_descriptor_conflict = _has_variant_descriptor_conflict(left, right, shared_anchor)
+    sun_protection_line_conflict = (
+        variant_descriptor_conflict
+        and left.product_type == right.product_type == "防曬"
+        and not shared_anchor
+    )
+    if sun_protection_line_conflict:
+        reasons.append("variant_descriptor_conflict")
+        reasons.append("sun_protection_line_conflict")
    variant_option_conflict = _has_explicit_variant_option_conflict(left, right, shared_anchor)
    if variant_option_conflict:
        reasons.append("variant_option_conflict")
@@ -1861,6 +1914,8 @@ def score_marketplace_match(
        hard_veto = True
    if left.product_type and right.product_type and left.product_type != right.product_type:
        hard_veto = True
+    if sun_protection_line_conflict:
+        hard_veto = True
    if variant_option_conflict:
        hard_veto = True

@@ -2557,7 +2612,10 @@ def _shared_model_tokens(left: ProductIdentity, right: ProductIdentity) -> set[s
    return {
        token
        for token in left.core_tokens & right.core_tokens
-        if len(token) >= 4 and re.search(r"[a-z]", token) and re.search(r"\d", token)
+        if len(token) >= 4
+        and re.search(r"[a-z]", token)
+        and re.search(r"\d", token)
+        and not _is_spec_like_latin_token(token)
    }


@@ -2694,6 +2752,15 @@ def _has_baan_baby_lip_catalog_alignment(left: ProductIdentity, right: ProductId
    )


+def _has_recipe_box_child_sunscreen_cushion_alignment(left: ProductIdentity, right: ProductIdentity) -> bool:
+    brand_tokens = left.brand_tokens | right.brand_tokens
+    return (
+        {"recipe", "box"} <= brand_tokens
+        and "兒童防曬氣墊粉餅" in left.searchable_name
+        and "兒童防曬氣墊粉餅" in right.searchable_name
+    )
+
+
 def _has_pavaruni_40_scent_oil_alignment(left: ProductIdentity, right: ProductIdentity) -> bool:
    left_text = left.searchable_name
    right_text = right.searchable_name
@@ -3068,6 +3135,8 @@ def _has_variant_descriptor_conflict(left: ProductIdentity, right: ProductIdenti
        return False
    if _has_baan_baby_lip_catalog_alignment(left, right):
        return False
+    if _has_recipe_box_child_sunscreen_cushion_alignment(left, right):
+        return False
    if _has_pavaruni_40_scent_oil_alignment(left, right):
        return False
    if _has_pavaruni_20_scent_candle_alignment(left, right):
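
The effect of `_is_spec_like_latin_token` on model-token extraction can be sketched standalone. The regular expressions below are copied from the hunks above; the function names without the leading underscore and the sample strings are illustrative, and the real matcher may normalize text before this step.

```python
import re

# Same pattern as _extract_model_tokens in the diff.
MODEL_PATTERN = re.compile(
    r"(?<![a-z0-9])([a-z]{1,4}-?[a-z]{0,3}\d{2,}[a-z0-9-]*)(?![a-z0-9])", re.I
)


def is_spec_like_latin_token(token: str) -> bool:
    """True for sunscreen spec tokens such as spf50, pa, uva."""
    return bool(
        re.fullmatch(r"spf\d{1,3}[a-z]?", token)
        or re.fullmatch(r"pa\d*", token)
        or token in {"uva", "uvb", "uv", "spf"}
    )


def extract_model_tokens(text: str) -> set:
    tokens = set()
    for match in MODEL_PATTERN.finditer(text):
        compact = re.sub(r"[^a-z0-9]", "", match.group(1).lower())
        if is_spec_like_latin_token(compact):
            continue  # SPF/PA ratings are specs, not model numbers
        if len(compact) >= 4 and re.search(r"[a-z]", compact) and re.search(r"\d", compact):
            tokens.add(compact)
    return tokens


# "SPF50" matches the model pattern but is filtered out as a spec token.
sunscreen_tokens = extract_model_tokens("兒童防曬乳35ml(SPF50+ PA+++)")
# A made-up model code still passes through.
model_tokens = extract_model_tokens("Acme HD15 dryer")
```

Without the filter, two sunscreens from different product lines would share the token `spf50` and could look like the same model.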
--- a/services/ollama_service.py
+++ b/services/ollama_service.py
@@ -373,7 +373,8 @@ class OllamaService:
                 system_prompt: str = None, temperature: float = 0.7,
                 timeout: int = None, keep_alive: str = None,
                 options: Optional[Dict[str, Any]] = None,
-                 images: Optional[List[str]] = None) -> OllamaResponse:
+                 images: Optional[List[str]] = None,
+                 allow_111_fallback: bool = True) -> OllamaResponse:
        """
         Generate text, with automatic three-host retry (HOTFIX 2026-05-04)

@@ -400,17 +401,26 @@ class OllamaService:
        attempted_hosts: List[str] = []
        last_error: Optional[str] = None
        canonical_hosts = _canonical_host_chain()
+        allowed_hosts = [
+            host for host in canonical_hosts
+            if allow_111_fallback or not _is_111_fallback_host(host)
+        ]
+        max_attempts = len(canonical_hosts) if allow_111_fallback else max(1, len(allowed_hosts))

-        for attempt in range(3):
+        for attempt in range(max_attempts):
            current_host = _normalize_host(self.host)  # property 每次 lazy resolve
+            if not allow_111_fallback and _is_111_fallback_host(current_host):
+                last_error = "111 fallback disabled; no approved GCP Ollama host available"
+                logger.warning("[Ollama] %s", last_error)
+                break
            if current_host in attempted_hosts:
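                 # If this host was already tried, the chain is the standard three-host
                 # chain, and the caller did not pin a host, move to the next untried host.
                 # Otherwise, when request timeout (60s) exceeds the unhealthy TTL (30s), the
                 # third round resolves back to primary and the 111 final fallback is never hit.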
                next_host = None
-                if self._explicit_host is None and current_host in canonical_hosts:
-                    next_host = next((host for host in canonical_hosts if host not in attempted_hosts), None)
+                if self._explicit_host is None and current_host in allowed_hosts:
+                    next_host = next((host for host in allowed_hosts if host not in attempted_hosts), None)
                if not next_host:
                     # Non-standard or explicit hosts keep the original behavior: break to avoid an infinite loop.
                    break
@@ -434,8 +444,9 @@ class OllamaService:
                payload["keep_alive"] = keep_alive

            logger.info(
-                "[Ollama] 嘗試 #%s/3 host=%s model=%s timeout=%ss keep_alive=%s",
+                "[Ollama] 嘗試 #%s/%s host=%s model=%s timeout=%ss keep_alive=%s",
                attempt + 1,
+                max_attempts,
                current_host,
                effective_model,
                effective_timeout,
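
The host filtering and attempt budget introduced in `OllamaService.generate()` can be sketched outside the service. This is a simplified illustration: `is_111_host` stands in for `_is_111_fallback_host`, whose body is not shown in the diff, and the host order (GCP-A, GCP-B, 111) is assumed from the addresses in `.env.example`.

```python
from typing import List, Tuple

# Addresses from .env.example; the ordering is an assumption.
CANONICAL_HOSTS = [
    "http://34.143.170.20:11434",   # assumed GCP-A
    "http://34.21.145.224:11434",   # GCP-B (secondary in the tests)
    "http://192.168.0.111:11434",   # 111 fallback
]


def is_111_host(host: str) -> bool:
    # Stand-in for _is_111_fallback_host; the real check may differ.
    return "192.168.0.111" in host


def plan_attempts(hosts: List[str], allow_111_fallback: bool = True) -> Tuple[List[str], int]:
    """Return the hosts a request may use and how many attempts it gets."""
    allowed = [h for h in hosts if allow_111_fallback or not is_111_host(h)]
    # Same expression as the diff: keep at least one attempt even if
    # filtering leaves no host, so the loop can record the error.
    max_attempts = len(hosts) if allow_111_fallback else max(1, len(allowed))
    return allowed, max_attempts


default_plan = plan_attempts(CANONICAL_HOSTS)
hermes_plan = plan_attempts(CANONICAL_HOSTS, allow_111_fallback=False)
only_111_plan = plan_attempts(CANONICAL_HOSTS[2:], allow_111_fallback=False)
```

With the flag off, a request gets two attempts across the GCP hosts; if the resolver still hands back 111, the loop in the diff breaks with the "111 fallback disabled" error instead of posting to it.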
--- a/tests/test_hermes_ollama_cascade.py
+++ b/tests/test_hermes_ollama_cascade.py
@@ -77,7 +77,7 @@ def test_hermes_intent_uses_ollama_service_and_logs_actual_host(monkeypatch, res
        monkeypatch,
        content='{"intent":"query_sales","confidence":0.9,"complexity_score":0.8,'
                '"requires_data_fetch":true,"preliminary_answer":""}',
-        host='http://192.168.0.111:11434',
+        host='http://34.21.145.224:11434',
    )

    svc = hermes_mod.HermesAnalystService()
@@ -88,12 +88,13 @@ def test_hermes_intent_uses_ollama_service_and_logs_actual_host(monkeypatch, res
    call_kwargs = fake_service.instances[0].generate_calls[0]
    assert call_kwargs['model'] == hermes_mod.HERMES_MODEL
    assert call_kwargs['keep_alive'] == hermes_mod.HERMES_KEEP_ALIVE
+    assert call_kwargs['allow_111_fallback'] is False

    assert _wait_for(reset_ai_logger, 1)
    rec = reset_ai_logger[0]
    assert rec['caller'] == 'hermes_intent'
-    assert rec['provider'] == 'ollama_111'
-    assert rec['meta']['host_label'] == '111 備援'
+    assert rec['provider'] == 'ollama_secondary'
+    assert rec['meta']['host_label'] == 'GCP-SSD-2'


 def test_hermes_batch_analyze_uses_ollama_service_and_logs_secondary(monkeypatch, reset_ai_logger):
@@ -136,6 +137,7 @@ def test_hermes_batch_analyze_uses_ollama_service_and_logs_secondary(monkeypatch
    call_kwargs = fake_service.instances[0].generate_calls[0]
    assert call_kwargs['system_prompt'] == svc.SYSTEM_PROMPT
    assert call_kwargs['keep_alive'] == hermes_mod.HERMES_KEEP_ALIVE
+    assert call_kwargs['allow_111_fallback'] is False

    assert _wait_for(reset_ai_logger, 1)
    rec = reset_ai_logger[0]
@@ -158,3 +160,7 @@ def test_hermes_candidate_sql_only_joins_direct_price_alert_matches():

 def test_hermes_keep_alive_defaults_to_short_runner_residency():
    assert hermes_mod.HERMES_KEEP_ALIVE == "5m"
+
+
+def test_hermes_disables_111_fallback_by_default():
+    assert hermes_mod.HERMES_ALLOW_111_FALLBACK is False
--- a/tests/test_marketplace_product_matcher.py
+++ b/tests/test_marketplace_product_matcher.py
@@ -151,6 +151,23 @@ def test_unit_price_comparison_builds_normalized_evidence():
    assert comparison["unit_gap_pct"] < 0


+def test_marketplace_matcher_routes_same_base_different_piece_pack_to_unit_comparable():
+    from services.marketplace_product_matcher import score_marketplace_match
+
+    diagnostics = score_marketplace_match(
+        "【日本Beauty Foot】去角質足膜25mlx2枚入 5入組(一般尺寸、大尺寸可選)",
+        "【日本Beauty Foot 】煥膚足膜(25ml*2枚入)四入組",
+        momo_price=1290,
+        competitor_price=989,
+    )
+
+    assert diagnostics.comparison_mode == "unit_comparable"
+    assert diagnostics.match_type == "same_product_different_pack"
+    assert diagnostics.price_basis == "unit_price"
+    assert "pack_quantity_difference" in diagnostics.reasons
+    assert "unit_comparable" in diagnostics.reasons
+
+
 def test_marketplace_matcher_does_not_unit_compare_multi_component_set():
    from services.marketplace_product_matcher import score_marketplace_match

@@ -1313,6 +1330,10 @@ def test_marketplace_matcher_keeps_high_variant_low_score_lines_outside_focused_
        "【Solone】持久眼線筆(眼線膠 超防暈推薦)",
        "Solone 斜角眉筆 0.35g",
    )
+    sunscreen_line_gap = score_marketplace_match(
+        "【我的心機】溫和寶貝兒童防曬乳35ml(SPF50+ PA+++)",
+        "我的心機 海洋友善保濕高效防曬乳35ml(SPF50+PA++++)",
+    )

    for diagnostics in (
        lush,
@@ -1328,10 +1349,14 @@ def test_marketplace_matcher_keeps_high_variant_low_score_lines_outside_focused_
        romand_line_gap,
        summer_eve_variant_gap,
        solone_type_gap,
+        sunscreen_line_gap,
    ):
        assert diagnostics.score < 0.76
        assert not any(reason.startswith("focused_exact_identity_") for reason in diagnostics.reasons)

+    assert sunscreen_line_gap.hard_veto is True
+    assert "variant_descriptor_conflict" in sunscreen_line_gap.reasons
+

 def test_marketplace_matcher_rejects_refill_core_vs_case_only_pack():
    from services.marketplace_product_matcher import score_marketplace_match
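
The per-unit comparison behind `pack_quantity_difference` can be illustrated with plain tuples. This sketch is deliberately simplified: it skips the `total_piece_count` shortcut and the `_count_unit_family` / `COUNT_UNITS` normalization used by `_has_pack_quantity_difference`, and the parsed counts shown for the Beauty Foot titles are assumed, not taken from the real parser.

```python
from typing import Dict, List, Set, Tuple

Counts = List[Tuple[int, str]]  # (count, unit) pairs parsed from a title


def group_by_unit(counts: Counts) -> Dict[str, Set[int]]:
    grouped: Dict[str, Set[int]] = {}
    for count, unit in counts:
        grouped.setdefault(unit, set()).add(count)
    return grouped


def has_pack_quantity_difference(left: Counts, right: Counts) -> bool:
    """True when both sides share a count unit but disagree on its quantity."""
    if not left or not right:
        return False
    left_by_unit, right_by_unit = group_by_unit(left), group_by_unit(right)
    if left_by_unit == right_by_unit:
        return False  # stands in for the exact-count-alignment check
    shared_units = set(left_by_unit) & set(right_by_unit)
    return any(left_by_unit[u] != right_by_unit[u] for u in shared_units)


# Assumed parse of "25mlx2枚入 5入組" vs "(25ml*2枚入)四入組".
five_pack = [(2, "枚"), (5, "入")]
four_pack = [(2, "枚"), (4, "入")]

differs = has_pack_quantity_difference(five_pack, four_pack)
same = has_pack_quantity_difference(five_pack, five_pack)
no_shared_unit = has_pack_quantity_difference([(5, "入")], [(3, "瓶")])
```

Flagging only shared units keeps unrelated packaging (for example, pieces versus bottles) from being routed to unit-price comparison.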
--- a/tests/test_ollama_retry_chain.py
+++ b/tests/test_ollama_retry_chain.py
@@ -163,6 +163,30 @@ def test_generate_forces_final_fallback_when_unhealthy_ttl_expires_mid_request()
    assert 'all 3 hosts failed' in (resp.error or '')


+def test_generate_can_disable_111_fallback_for_batch_llm_work():
+    """Batch LLM jobs can opt to run only GCP-A/GCP-B so that 111 does not take on long analyses."""
+    import requests
+    from services import ollama_service as oss
+    from services.ollama_service import OllamaService
+
+    svc = OllamaService()
+    hosts = [
+        oss.OLLAMA_HOST_SECONDARY,
+        oss.OLLAMA_HOST_FALLBACK,
+    ]
+
+    with patch('services.ollama_service.resolve_ollama_host', side_effect=hosts), \
+         patch('services.ollama_service.requests.post',
+               side_effect=requests.Timeout('secondary timeout')) as mock_post:
+        resp = svc.generate('test', allow_111_fallback=False)
+
+    posted_hosts = [call.args[0].split('/api/generate')[0] for call in mock_post.call_args_list]
+    assert resp.success is False
+    assert posted_hosts == [oss.OLLAMA_HOST_SECONDARY]
+    assert oss.OLLAMA_HOST_FALLBACK not in posted_hosts
+    assert '111 fallback disabled' in (resp.error or '')
+
+
 def test_generate_token_parsing_phase13():
     """Phase 13 reinforcement: OllamaResponse parses prompt_eval_count + eval_count"""
    from services.ollama_service import OllamaService