V10.415 protect Hermes fallback routing
All checks were successful
CD Pipeline / deploy (push) Successful in 1m5s

OoO
2026-05-24 14:22:06 +08:00
committed by AiderHeal Bot
parent 1b94177828
commit b73dc6df3f
10 changed files with 157 additions and 14 deletions


@@ -127,7 +127,7 @@ GDRIVE_FILE_PATTERN=即時業績_當日
# ==========================================
# Hermes 3 competitive pricing intelligence analysis (Module 2 / ADR-012)
# ==========================================
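# [Optional] Hermes Ollama endpoint; when left empty, automatically routes GCP-A → GCP-B → 111 (ADR-028)
# [Optional] Hermes Ollama endpoint; when left empty, automatically routes GCP-A → GCP-B; by default 111 does not take on Hermes batch analysis
# Only http://34.143.170.20:11434, http://34.21.145.224:11434, and http://192.168.0.111:11434 are allowed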
HERMES_URL=
@@ -135,6 +135,8 @@ HERMES_URL=
HERMES_TIMEOUT=120
# [Default 5m] Hermes runner warm residency; do not revert to 24h, to avoid sustained high load from long-resident runners on GCP-B/111
HERMES_KEEP_ALIVE=5m
# [Default false] Allow Hermes LLM calls to fall through to 111 only in emergencies; otherwise failures are handed to the rule/DB fallback
HERMES_ALLOW_111_FALLBACK=false
# [Optional] Embedding service host; when left empty, automatically uses the same three-host Ollama cascade
EMBEDDING_HOST=


@@ -325,7 +325,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
# ==========================================
# System version and paths
# ==========================================
SYSTEM_VERSION = "V10.414"
SYSTEM_VERSION = "V10.415"
LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
public_url = PUBLIC_URL # used for template display


@@ -13,7 +13,7 @@
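- Gemini may only serve as a backup after the primary Ollama path fails; formerly locked scenarios such as MCP Grounding, PPT/vision, weekly/monthly reports, Code Review, EA HITL, and complex SKU escalation must also go through GCP-A → GCP-B → 111 first.
- 188 (`192.168.0.188`) is only the container host for App / DB / scheduler / Telegram bot and the AutoHeal target; it must not be used as an Ollama node.
- General AI copywriting, keywords, product insights, and the Telegram Q&A first response must not be Gemini-first.
- Hermes intent / analyst paths must not hand-roll `/api/generate` calls or resolve a host only once; they must go through `OllamaService` so that the same request can retry GCP-A → GCP-B → 111 in order.
- Hermes intent / analyst paths must not hand-roll `/api/generate` calls or resolve a host only once; they must go through `OllamaService`. With the default `HERMES_ALLOW_111_FALLBACK=false`, a request only runs GCP-A → GCP-B; when both fail it falls back to the rule engine or DB evidence rather than shifting batch price analysis onto 111. Only in an emergency may the flag be explicitly set to true so that 111 can take over.
- The `/api/chat` tool-calling path of NemoTron qwen3 dispatch must also try up to three Ollama hosts within the same request: when the first host fails, call `mark_unhealthy()` and then try the next one, falling back to NIM only at the end.
- Special Ollama paths such as PPT vision, the PPT copy final fallback, and the MCP offline final fallback must not hit a single host only either; if `/api/generate` is needed, always go through `OllamaService.generate()`.
- The Code Review pipeline must also be Ollama-first: both the Hermes scan and the OpenClaw assessment go through the `OllamaService` three-host retry. Gemini telemetry may only appear as `code_review_openclaw_gemini`, meaning it was enabled only after Ollama and the optional Claude backup had both failed.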
@@ -28,7 +28,7 @@
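- In `docker-compose.yml`, `momo-app`, `scheduler`, and `telegram-bot` must explicitly set `GEMINI_API_HARD_DISABLED=${GEMINI_API_HARD_DISABLED:-true}` and `GEMINI_FALLBACK_ENABLED=${GEMINI_FALLBACK_ENABLED:-false}`. `.env` may keep `GEMINI_API_KEY`, but the mere presence of the key must not cause core containers to make paid outbound Gemini calls.
- Gemini must not be recommended as the primary provider by any status panel or router: `AIProviderService._get_recommended_provider()` must not return `gemini` and may only show it as a fallback status; if `llm_model_router` or `ea_engine` receives a `gemini-*` default, it must switch back to `hermes3:latest`, escalating to the local `deepseek-r1:14b` only when deep reasoning is needed.
- The ElephantAlpha prompt / agent registry must no longer describe OpenClaw as a Gemini-primary model; OpenClaw is an Ollama-first strategist on `qwen2.5-coder:7b` / `qwen3:14b`, and Gemini may only act as an emergency fallback after the guard is explicitly unlocked.
- 111 (`192.168.0.111`) is only the last-resort Mac fallback and does not keep 7B+, vision, or long-context models resident; when `OllamaService.generate()` falls through to 111, it downgrades `qwen3`, `deepseek-r1`, `hermes3`, `qwen2.5*`, `gemma3`, `llava`, `minicpm-v`, and 7B+ models per `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` to `OLLAMA_111_MODEL_FALLBACK=llama3.2:latest`, capped by `OLLAMA_111_KEEP_ALIVE=5m`, `OLLAMA_111_MAX_TIMEOUT=20`, `OLLAMA_111_NUM_CTX=4096`, and `OLLAMA_111_NUM_PREDICT=512`. The business keep-alive for Hermes / OpenClaw report-style paths defaults to `5m`; Code Review skips 111 by default via `CODE_REVIEW_ALLOW_111_FALLBACK=false`, so that the 16GB RAM host and GCP-B are not pushed to high load by long-resident runners, long outputs, and a 24h keep-alive.
- 111 (`192.168.0.111`) is only the last-resort Mac fallback and does not keep 7B+, vision, or long-context models resident; when `OllamaService.generate()` falls through to 111, it downgrades `qwen3`, `deepseek-r1`, `hermes3`, `qwen2.5*`, `gemma3`, `llava`, `minicpm-v`, and 7B+ models per `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` to `OLLAMA_111_MODEL_FALLBACK=llama3.2:latest`, capped by `OLLAMA_111_KEEP_ALIVE=5m`, `OLLAMA_111_MAX_TIMEOUT=20`, `OLLAMA_111_NUM_CTX=4096`, and `OLLAMA_111_NUM_PREDICT=512`. The business keep-alive for OpenClaw report-style paths defaults to `5m`; Code Review (via `CODE_REVIEW_ALLOW_111_FALLBACK=false`) and Hermes (via `HERMES_ALLOW_111_FALLBACK=false`) skip 111 by default, so that the 16GB RAM host and GCP-B are not pushed to high load by long-resident runners, long outputs, and a 24h keep-alive.
- ElephantAlpha's `price_drop_alert` / `market_opportunity` Telegram HITL alerts must present same-product evidence separately, including at least `match_type`, `price_basis`, `alert_tier`, and `match_score`; without high-confidence same-product evidence and comparable total-price evidence, a PChome/MOMO price gap must not be written up as a directly actionable price-matching recommendation.
## 1. Four-AI-Agent Routing Architecture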
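
The 111 capping rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the constant values are the ones quoted in the guideline, but the function name `cap_for_111`, the prefix-style pattern matching, and the way options are clamped are hypothetical and not taken from `OllamaService`. The "7B+" size check mentioned in the guideline is not modeled here.

```python
from fnmatch import fnmatch
from typing import Any, Dict, Tuple

# Values quoted from the guideline; how the real service matches patterns is an assumption.
DOWNGRADE_PATTERNS = (
    "qwen3*", "deepseek-r1*", "hermes3*", "qwen2.5*",
    "gemma3*", "llava*", "minicpm-v*",
)
MODEL_FALLBACK = "llama3.2:latest"
KEEP_ALIVE = "5m"
MAX_TIMEOUT = 20      # seconds
NUM_CTX = 4096
NUM_PREDICT = 512


def cap_for_111(model: str, timeout: int,
                options: Dict[str, Any]) -> Tuple[str, int, str, Dict[str, Any]]:
    """Hypothetical helper: return (model, timeout, keep_alive, options) capped for host 111."""
    name = model.lower()
    # Heavy model families are swapped for the small fallback model.
    if any(fnmatch(name, pattern) for pattern in DOWNGRADE_PATTERNS):
        model = MODEL_FALLBACK
    capped = dict(options)  # copy so the caller's dict is not mutated
    capped["num_ctx"] = min(int(capped.get("num_ctx", NUM_CTX)), NUM_CTX)
    capped["num_predict"] = min(int(capped.get("num_predict", NUM_PREDICT)), NUM_PREDICT)
    return model, min(timeout, MAX_TIMEOUT), KEEP_ALIVE, capped


print(cap_for_111("hermes3:latest", 120, {"num_ctx": 8192}))
```

Clamping rather than overwriting keeps any smaller limits a caller already requested, which seems the safer reading of "capped" in the guideline.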


@@ -13,6 +13,7 @@
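## 📅 Detailed Changelog (Archive)
### 2026-05-24 (PChome near-threshold identity recovery, round 2)
- **V10.415 Hermes skips 111 by default + matching safeguards**: `OllamaService.generate()` gains an `allow_111_fallback` parameter whose default keeps the three-host behavior for compatibility; Hermes intent / competitor analyst now use `HERMES_ALLOW_111_FALLBACK=false` and by default only run GCP-A → GCP-B. When both are unavailable, the work is handed to the rule engine or DB evidence fallback, and batch price analysis and intent classification are no longer shifted onto 111. In the same release, the marketplace matcher adds the sunscreen category to variant-sensitive and stops spec tokens such as SPF/PA/UVA/UVB from being mistaken for model numbers, which prevents 「兒童防曬乳」 (children's sunscreen lotion) from being mismatched with 「海洋友善保濕防曬乳」 (ocean-friendly moisturizing sunscreen lotion); the Recipe Box 兒童防曬氣墊粉餅 (children's sunscreen cushion compact) keeps its exact same-product-line exception. It also adds `pack_quantity_difference`, so the Beauty Foot foot mask 5-pack vs 4-pack goes through unit comparable instead of getting stuck at low confidence.
- **V10.414 MCP fetch run readiness gate**: Adds the `mcp_fetch_run_readiness` read-only builder, GET/POST endpoints, a UI run readiness review panel, and a deployment readiness smoke target. After the run package stage, it checks the command preview, receipt path, artifact path, throttling/timeout/dry-run-first, and the operator shell-only boundary. The API/UI do not execute the CLI, fetch external sites, write files, open the DB, or attach a scheduler; they only clear the way to a manual shell dry-run and the subsequent receipt gate.
- **V10.413 Code Review protects the 111 fallback by default**: Production `ai_calls` showed that when GCP-A was unreachable, Code Review OpenClaw first burned through the primary timeout, then let GCP-B run up to 60s, and finally succeeded on 111 with `llama3.2`, causing high load on 111 and GCP-B. Adds the `CODE_REVIEW_ALLOW_111_FALLBACK=false` default: Code Review's Hermes LLM scan / OpenClaw assessment only run GCP-A → GCP-B, and post-deployment heavy analysis is sent to 111 only when the flag is explicitly set to true. If both GCP-A and GCP-B fail and Claude/Gemini are not explicitly enabled, it reverts to a deterministic local degraded summary, without calling Gemini and without using 111 for non-real-time heavy analysis.
- **V10.412 MCP fetch run package gate**: Adds the `mcp_fetch_run_package` read-only builder, a standalone route extension, GET/POST endpoints, a UI run package review panel, and a deployment readiness smoke target. It turns an approved target review into a command argv preview and receipt path contract that the operator can re-verify. The API/UI do not execute the CLI, fetch external sites, write files, open the DB, or attach a scheduler; they only clear the way to the subsequent run readiness review.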
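
To make the `pack_quantity_difference` case in the V10.415 entry concrete, here is a small worked example of the per-pack arithmetic that a unit-comparable route implies. The prices (1290 for the 5-pack, 989 for the 4-pack) are the ones used in this commit's matcher test; the helper names and the sign convention of the gap are assumptions for illustration and may differ from what the matcher actually computes.

```python
def unit_price(total_price: float, pack_count: int) -> float:
    """Price per pack; each pack holds the same 25ml x 2 sheets, so that factor cancels."""
    if pack_count <= 0:
        raise ValueError("pack_count must be positive")
    return total_price / pack_count


def unit_gap_pct(own_unit: float, competitor_unit: float) -> float:
    """Assumed convention: negative means the competitor is cheaper per unit."""
    return (competitor_unit - own_unit) / own_unit * 100


momo_unit = unit_price(1290, 5)        # 5-pack listing
competitor_unit = unit_price(989, 4)   # 4-pack listing
gap = unit_gap_pct(momo_unit, competitor_unit)

# prints: 258.00 247.25 -4.17%
print(f"{momo_unit:.2f} {competitor_unit:.2f} {gap:.2f}%")
```

Compared on total price alone the gap looks like roughly 23%, whereas per pack it is about 4%, which is why the two listings are better treated as the same product in different pack sizes than as a low-confidence match.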


@@ -35,6 +35,9 @@ from config import HERMES_TIMEOUT
HERMES_MODEL = "hermes3:latest"
HERMES_KEEP_ALIVE = os.getenv("HERMES_KEEP_ALIVE", "5m")
HERMES_ALLOW_111_FALLBACK = os.getenv("HERMES_ALLOW_111_FALLBACK", "false").strip().lower() in (
"1", "true", "yes", "on",
)
TOP_N = 20 # output only the top N threats, to cap the quota NemoTron consumes per run
@@ -280,6 +283,7 @@ class HermesAnalystService:
temperature=0.1,
timeout=HERMES_TIMEOUT,
keep_alive=HERMES_KEEP_ALIVE, # ADR-012: avoid cold-start timeouts
allow_111_fallback=HERMES_ALLOW_111_FALLBACK,
)
_ctx.set_provider(get_provider_tag(resp.host or ''))
_ctx.set_model(resp.model or HERMES_MODEL)
@@ -585,6 +589,7 @@ class HermesAnalystService:
temperature=0.1,
timeout=HERMES_TIMEOUT,
keep_alive=HERMES_KEEP_ALIVE,
allow_111_fallback=HERMES_ALLOW_111_FALLBACK,
)
_ctx.set_provider(get_provider_tag(resp.host or ''))
_ctx.set_model(resp.model or HERMES_MODEL)


@@ -523,6 +523,11 @@ VARIANT_SENSITIVE_KEYWORDS = {
"粉底棒",
"遮瑕棒",
"修容打亮棒",
"防曬",
"防曬乳",
"防曬霜",
"防曬噴霧",
"防曬棒",
}
VARIANT_OPTION_COLOR_WORDS = {
@@ -876,11 +881,21 @@ def _extract_model_tokens(text: str) -> set[str]:
    tokens: set[str] = set()
    for match in re.finditer(r"(?<![a-z0-9])([a-z]{1,4}-?[a-z]{0,3}\d{2,}[a-z0-9-]*)(?![a-z0-9])", text, re.I):
        compact = re.sub(r"[^a-z0-9]", "", match.group(1).lower())
        if _is_spec_like_latin_token(compact):
            continue
        if len(compact) >= 4 and re.search(r"[a-z]", compact) and re.search(r"\d", compact):
            tokens.add(compact)
    return tokens


def _is_spec_like_latin_token(token: str) -> bool:
    return bool(
        re.fullmatch(r"spf\d{1,3}[a-z]?", token)
        or re.fullmatch(r"pa\d*", token)
        or token in {"uva", "uvb", "uv", "spf"}
    )


def _brand_alias_present(text: str, alias_norm: str, text_tokens: set[str]) -> bool:
if not alias_norm:
return False
@@ -946,7 +961,7 @@ def _leading_brand_tokens(original: str, normalized: str) -> set[str]:
if re.fullmatch(r"[\u4e00-\u9fff]{2,6}", first_token) and first_token not in GENERIC_TOKENS:
tokens.add(first_token)
for token in _tokenize(leading):
if re.fullmatch(r"[a-z][a-z0-9\-']{2,}", token):
if re.fullmatch(r"[a-z][a-z0-9\-']{2,}", token) and not _is_spec_like_latin_token(token):
tokens.add(token)
return tokens
@@ -1260,6 +1275,30 @@ def _has_exact_count_alignment(left: ProductIdentity, right: ProductIdentity) ->
return left_counts == right_counts
def _has_pack_quantity_difference(left: ProductIdentity, right: ProductIdentity) -> bool:
    if not left.counts or not right.counts or _has_exact_count_alignment(left, right):
        return False
    if left.total_piece_count and right.total_piece_count:
        return left.total_piece_count != right.total_piece_count
    left_by_unit: dict[str, set[int]] = {}
    right_by_unit: dict[str, set[int]] = {}
    for count, unit in left.counts:
        family = _count_unit_family(unit)
        if family in COUNT_UNITS or unit in COUNT_UNITS:
            left_by_unit.setdefault(family, set()).add(count)
    for count, unit in right.counts:
        family = _count_unit_family(unit)
        if family in COUNT_UNITS or unit in COUNT_UNITS:
            right_by_unit.setdefault(family, set()).add(count)
    for unit in set(left_by_unit) & set(right_by_unit):
        if left_by_unit[unit] != right_by_unit[unit]:
            return True
    return False


def _spec_score(left: ProductIdentity, right: ProductIdentity) -> tuple[float, bool, tuple[str, ...]]:
volume_score, volume_conflict = _spec_component(left.volumes_ml, right.volumes_ml)
weight_score, weight_conflict = _spec_component(left.weights_g, right.weights_g)
@@ -1565,6 +1604,7 @@ def _is_unit_comparable_candidate(
"multi_component_conflict",
"count_conflict",
"component_count_conflict",
"pack_quantity_difference",
})
if not pack_difference:
return False
@@ -1644,6 +1684,8 @@ def _model_line_tokens(identity: ProductIdentity) -> set[str]:
for token in identity.core_tokens:
if token in GENERIC_TOKENS:
continue
if _is_spec_like_latin_token(token):
continue
if re.fullmatch(r"[a-z][a-z0-9-]{2,}", token):
tokens.add(token)
for match in re.finditer(r"([\u4e00-\u9fff]{2,})(?:系列)", token):
@@ -1707,6 +1749,7 @@ def _build_evidence_flags(
"variant_selection_review",
"variant_option_conflict",
"variant_descriptor_conflict",
"pack_quantity_difference",
"count_conflict",
"bundle_offer_conflict",
"multi_component_conflict",
@@ -1834,7 +1877,17 @@ def score_marketplace_match(
catalog_count_omission = _allow_catalog_count_omission(left, right)
if catalog_count_omission:
reasons.append("catalog_count_omission")
if _has_pack_quantity_difference(left, right):
reasons.append("pack_quantity_difference")
variant_descriptor_conflict = _has_variant_descriptor_conflict(left, right, shared_anchor)
sun_protection_line_conflict = (
variant_descriptor_conflict
and left.product_type == right.product_type == "防曬"
and not shared_anchor
)
if sun_protection_line_conflict:
reasons.append("variant_descriptor_conflict")
reasons.append("sun_protection_line_conflict")
variant_option_conflict = _has_explicit_variant_option_conflict(left, right, shared_anchor)
if variant_option_conflict:
reasons.append("variant_option_conflict")
@@ -1861,6 +1914,8 @@ def score_marketplace_match(
hard_veto = True
if left.product_type and right.product_type and left.product_type != right.product_type:
hard_veto = True
if sun_protection_line_conflict:
hard_veto = True
if variant_option_conflict:
hard_veto = True
@@ -2557,7 +2612,10 @@ def _shared_model_tokens(left: ProductIdentity, right: ProductIdentity) -> set[s
return {
token
for token in left.core_tokens & right.core_tokens
if len(token) >= 4 and re.search(r"[a-z]", token) and re.search(r"\d", token)
if len(token) >= 4
and re.search(r"[a-z]", token)
and re.search(r"\d", token)
and not _is_spec_like_latin_token(token)
}
@@ -2694,6 +2752,15 @@ def _has_baan_baby_lip_catalog_alignment(left: ProductIdentity, right: ProductId
)
def _has_recipe_box_child_sunscreen_cushion_alignment(left: ProductIdentity, right: ProductIdentity) -> bool:
    brand_tokens = left.brand_tokens | right.brand_tokens
    return (
        {"recipe", "box"} <= brand_tokens
        and "兒童防曬氣墊粉餅" in left.searchable_name
        and "兒童防曬氣墊粉餅" in right.searchable_name
    )


def _has_pavaruni_40_scent_oil_alignment(left: ProductIdentity, right: ProductIdentity) -> bool:
left_text = left.searchable_name
right_text = right.searchable_name
@@ -3068,6 +3135,8 @@ def _has_variant_descriptor_conflict(left: ProductIdentity, right: ProductIdenti
return False
if _has_baan_baby_lip_catalog_alignment(left, right):
return False
if _has_recipe_box_child_sunscreen_cushion_alignment(left, right):
return False
if _has_pavaruni_40_scent_oil_alignment(left, right):
return False
if _has_pavaruni_20_scent_candle_alignment(left, right):
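
As a quick illustration of why the spec-token filter matters, the following standalone sketch reuses the extraction regex and the `_is_spec_like_latin_token` rules from the hunks above on the sunscreen title from this commit's test. The function names without the leading underscore and the `skip_spec_tokens` switch are added here only for the demonstration and are not part of the matcher.

```python
import re

# Same pattern as in _extract_model_tokens above.
MODEL_TOKEN_RE = re.compile(
    r"(?<![a-z0-9])([a-z]{1,4}-?[a-z]{0,3}\d{2,}[a-z0-9-]*)(?![a-z0-9])", re.I
)


def is_spec_like_latin_token(token: str) -> bool:
    """Mirror of the commit's rule: SPF/PA/UV markers are specs, not model numbers."""
    return bool(
        re.fullmatch(r"spf\d{1,3}[a-z]?", token)
        or re.fullmatch(r"pa\d*", token)
        or token in {"uva", "uvb", "uv", "spf"}
    )


def extract_model_tokens(text: str, skip_spec_tokens: bool = True) -> set[str]:
    """Demo wrapper; skip_spec_tokens=False reproduces the pre-fix behavior."""
    tokens: set[str] = set()
    for match in MODEL_TOKEN_RE.finditer(text):
        compact = re.sub(r"[^a-z0-9]", "", match.group(1).lower())
        if skip_spec_tokens and is_spec_like_latin_token(compact):
            continue
        if len(compact) >= 4 and re.search(r"[a-z]", compact) and re.search(r"\d", compact):
            tokens.add(compact)
    return tokens


title = "【我的心機】溫和寶貝兒童防曬乳35ml(SPF50+ PA+++)"
print(extract_model_tokens(title, skip_spec_tokens=False))  # {'spf50'}
print(extract_model_tokens(title))                          # set()
```

Without the filter, two different sunscreens that both carry `SPF50` would appear to share a model token, which is the kind of false anchor the changelog entry describes.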


@@ -373,7 +373,8 @@ class OllamaService:
system_prompt: str = None, temperature: float = 0.7,
timeout: int = None, keep_alive: str = None,
options: Optional[Dict[str, Any]] = None,
images: Optional[List[str]] = None) -> OllamaResponse:
images: Optional[List[str]] = None,
allow_111_fallback: bool = True) -> OllamaResponse:
"""
Generate text, with automatic three-host retry (HOTFIX 2026-05-04)
@@ -400,17 +401,26 @@ class OllamaService:
attempted_hosts: List[str] = []
last_error: Optional[str] = None
canonical_hosts = _canonical_host_chain()
allowed_hosts = [
host for host in canonical_hosts
if allow_111_fallback or not _is_111_fallback_host(host)
]
max_attempts = len(canonical_hosts) if allow_111_fallback else max(1, len(allowed_hosts))
for attempt in range(3):
for attempt in range(max_attempts):
current_host = _normalize_host(self.host) # the property lazily re-resolves on every access
if not allow_111_fallback and _is_111_fallback_host(current_host):
last_error = "111 fallback disabled; no approved GCP Ollama host available"
logger.warning("[Ollama] %s", last_error)
break
if current_host in attempted_hosts:
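# If this host was already tried, and it is part of the standard three-host chain with no
# caller-specified host, move on to the next untried host. This avoids the case where the
# request timeout (60s) exceeds the unhealthy TTL (30s), so the third round resolves back to
# primary and the 111 final fallback is never reached.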
next_host = None
if self._explicit_host is None and current_host in canonical_hosts:
next_host = next((host for host in canonical_hosts if host not in attempted_hosts), None)
if self._explicit_host is None and current_host in allowed_hosts:
next_host = next((host for host in allowed_hosts if host not in attempted_hosts), None)
if not next_host:
# Non-standard or explicit hosts keep the original behavior: break to avoid an infinite loop.
break
@@ -434,8 +444,9 @@ class OllamaService:
payload["keep_alive"] = keep_alive
logger.info(
"[Ollama] 嘗試 #%s/3 host=%s model=%s timeout=%ss keep_alive=%s",
"[Ollama] 嘗試 #%s/%s host=%s model=%s timeout=%ss keep_alive=%s",
attempt + 1,
max_attempts,
current_host,
effective_model,
effective_timeout,
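
The host filtering introduced in this hunk can be reduced to a much simpler model for reasoning about it. The sketch below is a deliberately simplified stand-in, not the service code: it walks a fixed chain instead of lazily re-resolving `self.host`, it has no unhealthy-TTL tracking, and `generate_with_fallback` and `call_host` are hypothetical names. The GCP-A/GCP-B labels are inferred from the order of hosts in the `.env` comment and are an assumption.

```python
from typing import Callable, List, Optional, Tuple

# Hosts copied from the .env comment in this commit; labels are assumed from their order.
GCP_A = "http://34.143.170.20:11434"
GCP_B = "http://34.21.145.224:11434"
MAC_111 = "http://192.168.0.111:11434"
HOST_CHAIN = [GCP_A, GCP_B, MAC_111]


def allowed_hosts(allow_111_fallback: bool) -> List[str]:
    """Drop 111 from the chain unless the caller opts in."""
    return [host for host in HOST_CHAIN if allow_111_fallback or host != MAC_111]


def generate_with_fallback(
    call_host: Callable[[str], Optional[str]],
    allow_111_fallback: bool = True,
) -> Tuple[Optional[str], List[str]]:
    """Try each allowed host once, in order; return (reply, hosts attempted)."""
    attempted: List[str] = []
    for host in allowed_hosts(allow_111_fallback):
        attempted.append(host)
        reply = call_host(host)  # None stands in for a timeout or connection error
        if reply is not None:
            return reply, attempted
    return None, attempted  # caller falls back to rules / DB evidence


def only_111_is_up(host: str) -> Optional[str]:
    """Simulated outage: both GCP hosts fail, 111 answers."""
    return "ok from 111" if host == MAC_111 else None


print(generate_with_fallback(only_111_is_up, allow_111_fallback=False))
print(generate_with_fallback(only_111_is_up, allow_111_fallback=True))
```

With the flag off, the simulated outage ends after two attempts and returns no reply, leaving the caller to use its non-LLM fallback; with the flag on, the third attempt reaches 111.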


@@ -77,7 +77,7 @@ def test_hermes_intent_uses_ollama_service_and_logs_actual_host(monkeypatch, res
monkeypatch,
content='{"intent":"query_sales","confidence":0.9,"complexity_score":0.8,'
'"requires_data_fetch":true,"preliminary_answer":""}',
host='http://192.168.0.111:11434',
host='http://34.21.145.224:11434',
)
svc = hermes_mod.HermesAnalystService()
@@ -88,12 +88,13 @@ def test_hermes_intent_uses_ollama_service_and_logs_actual_host(monkeypatch, res
call_kwargs = fake_service.instances[0].generate_calls[0]
assert call_kwargs['model'] == hermes_mod.HERMES_MODEL
assert call_kwargs['keep_alive'] == hermes_mod.HERMES_KEEP_ALIVE
assert call_kwargs['allow_111_fallback'] is False
assert _wait_for(reset_ai_logger, 1)
rec = reset_ai_logger[0]
assert rec['caller'] == 'hermes_intent'
assert rec['provider'] == 'ollama_111'
assert rec['meta']['host_label'] == '111 備援'
assert rec['provider'] == 'ollama_secondary'
assert rec['meta']['host_label'] == 'GCP-SSD-2'
def test_hermes_batch_analyze_uses_ollama_service_and_logs_secondary(monkeypatch, reset_ai_logger):
@@ -136,6 +137,7 @@ def test_hermes_batch_analyze_uses_ollama_service_and_logs_secondary(monkeypatch
call_kwargs = fake_service.instances[0].generate_calls[0]
assert call_kwargs['system_prompt'] == svc.SYSTEM_PROMPT
assert call_kwargs['keep_alive'] == hermes_mod.HERMES_KEEP_ALIVE
assert call_kwargs['allow_111_fallback'] is False
assert _wait_for(reset_ai_logger, 1)
rec = reset_ai_logger[0]
@@ -158,3 +160,7 @@ def test_hermes_candidate_sql_only_joins_direct_price_alert_matches():
def test_hermes_keep_alive_defaults_to_short_runner_residency():
assert hermes_mod.HERMES_KEEP_ALIVE == "5m"
def test_hermes_disables_111_fallback_by_default():
    assert hermes_mod.HERMES_ALLOW_111_FALLBACK is False


@@ -151,6 +151,23 @@ def test_unit_price_comparison_builds_normalized_evidence():
assert comparison["unit_gap_pct"] < 0
def test_marketplace_matcher_routes_same_base_different_piece_pack_to_unit_comparable():
    from services.marketplace_product_matcher import score_marketplace_match

    diagnostics = score_marketplace_match(
        "【日本Beauty Foot】去角質足膜25mlx2枚入 5入組(一般尺寸、大尺寸可選)",
        "【日本Beauty Foot 】煥膚足膜(25ml*2枚入)四入組",
        momo_price=1290,
        competitor_price=989,
    )

    assert diagnostics.comparison_mode == "unit_comparable"
    assert diagnostics.match_type == "same_product_different_pack"
    assert diagnostics.price_basis == "unit_price"
    assert "pack_quantity_difference" in diagnostics.reasons
    assert "unit_comparable" in diagnostics.reasons


def test_marketplace_matcher_does_not_unit_compare_multi_component_set():
from services.marketplace_product_matcher import score_marketplace_match
@@ -1313,6 +1330,10 @@ def test_marketplace_matcher_keeps_high_variant_low_score_lines_outside_focused_
"【Solone】持久眼線筆(眼線膠 超防暈推薦)",
"Solone 斜角眉筆 0.35g",
)
sunscreen_line_gap = score_marketplace_match(
"【我的心機】溫和寶貝兒童防曬乳35ml(SPF50+ PA+++)",
"我的心機 海洋友善保濕高效防曬乳35ml(SPF50+PA++++)",
)
for diagnostics in (
lush,
@@ -1328,10 +1349,14 @@ def test_marketplace_matcher_keeps_high_variant_low_score_lines_outside_focused_
romand_line_gap,
summer_eve_variant_gap,
solone_type_gap,
sunscreen_line_gap,
):
assert diagnostics.score < 0.76
assert not any(reason.startswith("focused_exact_identity_") for reason in diagnostics.reasons)
assert sunscreen_line_gap.hard_veto is True
assert "variant_descriptor_conflict" in sunscreen_line_gap.reasons
def test_marketplace_matcher_rejects_refill_core_vs_case_only_pack():
from services.marketplace_product_matcher import score_marketplace_match


@@ -163,6 +163,30 @@ def test_generate_forces_final_fallback_when_unhealthy_ttl_expires_mid_request()
assert 'all 3 hosts failed' in (resp.error or '')
def test_generate_can_disable_111_fallback_for_batch_llm_work():
    """Batch LLM jobs can opt to run only on GCP-A/GCP-B so that 111 does not take on long analyses."""
    import requests
    from services import ollama_service as oss
    from services.ollama_service import OllamaService

    svc = OllamaService()
    hosts = [
        oss.OLLAMA_HOST_SECONDARY,
        oss.OLLAMA_HOST_FALLBACK,
    ]
    with patch('services.ollama_service.resolve_ollama_host', side_effect=hosts), \
         patch('services.ollama_service.requests.post',
               side_effect=requests.Timeout('secondary timeout')) as mock_post:
        resp = svc.generate('test', allow_111_fallback=False)

    posted_hosts = [call.args[0].split('/api/generate')[0] for call in mock_post.call_args_list]
    assert resp.success is False
    assert posted_hosts == [oss.OLLAMA_HOST_SECONDARY]
    assert oss.OLLAMA_HOST_FALLBACK not in posted_hosts
    assert '111 fallback disabled' in (resp.error or '')


def test_generate_token_parsing_phase13():
"""Phase 13 hardening: OllamaResponse parses prompt_eval_count + eval_count"""
from services.ollama_service import OllamaService