feat(p3.2.3): Telegram alert on provider version change + updated Gemini quota message

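- FailoverAlerter.alert_provider_version_changed():
  - Each provider has its own dedup key (TTL 3600s) to avoid frequent repeated alerts
  - Batched notification: one message per round of changes, listing which providers changed version
  - Exceptions are caught by try/except in the tracker layer, so the probe schedule is not interrupted
- ModelVersionTracker.run_probe_cycle():
  - Calls alert_provider_version_changed() when changed_providers is non-empty
  - P3.2.3 integration complete; the alert chain probe → compare → DB → Telegram works end to end
- Gemini quota alert message updated: removed the old "188 CPU fallback" wording, now Nemotron → Claude
- 6 new tests, 1501 passed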

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Your Name
2026-04-27 20:00:03 +08:00
parent b6e4e87e57
commit 877c2651bf
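
The per-provider dedup and batched notification described in the commit message could look roughly like the sketch below. It is a minimal in-memory illustration, not the project's FailoverAlerter: the class `VersionChangeDeduper`, the function `build_batched_message`, the key format, and the message layout are all hypothetical, and the real alerter presumably keeps its dedup keys in a shared store rather than in a dict.

```python
from __future__ import annotations

import time


class VersionChangeDeduper:
    """Hypothetical in-memory stand-in for a per-provider dedup store."""

    def __init__(self, ttl_sec: int = 3600, clock=time.monotonic):
        self._ttl_sec = ttl_sec
        self._clock = clock
        self._sent_at: dict[str, float] = {}

    def filter_new(self, providers: list[str]) -> list[str]:
        """Return providers not alerted within the TTL and mark them as alerted."""
        now = self._clock()
        fresh = []
        for name in providers:
            key = f"provider_version_changed:{name}"  # hypothetical key format
            last = self._sent_at.get(key)
            if last is None or now - last >= self._ttl_sec:
                self._sent_at[key] = now
                fresh.append(name)
        return fresh


def build_batched_message(
    changes: dict[str, tuple[str, str]],
    deduper: VersionChangeDeduper,
) -> str | None:
    """Merge one round of changes into a single message.

    `changes` maps provider name to (old_version, new_version).
    Returns None when every provider in this round was deduped.
    """
    fresh = deduper.filter_new(sorted(changes))
    if not fresh:
        return None
    lines = [f"Provider version changed ({len(fresh)}):"]
    for name in fresh:
        old, new = changes[name]
        lines.append(f"- {name}: {old} -> {new}")
    return "\n".join(lines)
```

Because each provider has its own key, a change on one provider does not suppress an alert for another provider in the same hour, while a provider that keeps changing is reported at most once per TTL window.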


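@@ -3,17 +3,17 @@ Ollama automatic failover management - P1.1b
============================
Selects the Ollama routing plan based on the health status reported by OllamaHealthMonitor.
Routing logic (2026-04-25 commander directive: Gemini first, 188 as last-resort fallback):
111 HEALTHY → primary 111, fallback [Gemini, 188, Nemotron]
111 SLOW → primary Gemini, fallback [111, 188]
111 DEGRADED → primary Gemini, fallback [188, Nemotron, Claude]
111 OFFLINE → primary Gemini, fallback [188, Nemotron, Claude]
111 OFFLINE + 188 OFFLINE → primary Gemini, fallback [Nemotron, Claude]
Routing logic (2026-04-26 commander iron rule: 111 is the only Ollama host; the only fallback is Gemini):
111 HEALTHY → primary 111, fallback [Gemini]
111 SLOW/DEGRADED/OFFLINE → primary Gemini, fallback [Nemotron, Claude]
Gemini quota exceeded → primary Nemotron, fallback [Claude]
Design notes:
- 188 is CPU-only and must not serve real-time responses (0.45 tok/s); it is removed from the routing chain entirely
- The only Ollama host: 192.168.0.111 (M1 Pro, Metal acceleration)
- No direct dependency on AIProviderEnum (to be aligned during the P1.2 Engineer-A integration)
- Returns a lightweight OllamaRoutingResult (primary endpoint + fallback list)
- Checks 111 + 188 in parallel (asyncio.gather)
- Checks 111 only (188 is no longer checked in parallel)
- Writes to audit_logs with service="ollama_failover" when a switch is triggered
- clear_cache() lets OllamaAutoRecoveryService clear the routing cache after switching back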
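
The 2026-04-26 routing table in the docstring above can be read as a small decision function. The sketch below is illustrative only: `Route`, `decide_route`, the endpoint labels, and the lowercase enum values are hypothetical stand-ins for the module's own OllamaRoutingResult, OllamaEndpoint, and HealthStatus. It also assumes the quota gate applies regardless of 111's health, which this excerpt does not confirm.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class HealthStatus(Enum):
    # Member names follow the docstring; the string values are assumed.
    HEALTHY = "healthy"
    SLOW = "slow"
    DEGRADED = "degraded"
    OFFLINE = "offline"


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified stand-in for OllamaRoutingResult."""

    primary: str
    fallback_chain: tuple[str, ...]


def decide_route(status_111: HealthStatus, gemini_quota_ok: bool = True) -> Route:
    """Apply the 2026-04-26 routing table; 188 never appears in any chain."""
    if not gemini_quota_ok:
        # Assumption: the quota gate overrides the health-based route.
        return Route("nemotron", ("claude",))
    if status_111 is HealthStatus.HEALTHY:
        return Route("ollama-111", ("gemini",))
    # SLOW, DEGRADED and OFFLINE all route to Gemini first.
    return Route("gemini", ("nemotron", "claude"))
```

For example, `decide_route(HealthStatus.OFFLINE)` yields Gemini as primary with Nemotron and Claude as fallbacks, and passing `gemini_quota_ok=False` yields Nemotron with Claude as the only fallback.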
@@ -26,8 +26,9 @@ Ollama automatic failover management - P1.1b
from __future__ import annotations
import asyncio
import datetime
from dataclasses import dataclass
from dataclasses import dataclass, field
# 2026-04-25 critic-fix Part2 B4 by Claude Engineer-C2
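# Use the standard-library timezone(timedelta(hours=8)) instead of zoneinfo, so a +8 offset is always available
# Previously, when zoneinfo.ZoneInfo("Asia/Taipei") failed the tz was None → datetime.now(None) returns naive local time (UTC on a UTC-configured host)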
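
A minimal sketch of the fixed-offset approach the comment above describes. The constant name TAIPEI_TZ matches the one used later in this diff, but the exact definition in the module is not shown here, so the `name` argument is an assumption. A fixed +08:00 offset is sufficient because Taiwan does not currently observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset from the standard library; needs no tz database.
# The name argument is illustrative and only affects tzname().
TAIPEI_TZ = timezone(timedelta(hours=8), name="Asia/Taipei")

aware_now = datetime.now(TAIPEI_TZ)  # timezone-aware, always +08:00
naive_now = datetime.now(None)       # naive system-local time, no tzinfo

iso_ts = aware_now.isoformat()       # ends with "+08:00"
```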
@@ -82,7 +83,6 @@ class OllamaRoutingResult:
fallback_chain: list[OllamaEndpoint]
routing_reason: str
health_111: HealthReport
health_188: HealthReport | None = None
def all_endpoints_in_order(self) -> list[OllamaEndpoint]:
"""Return the full list of endpoints in priority order (primary first)."""
@@ -96,11 +96,12 @@ class OllamaRoutingResult:
"model": self.primary.model,
},
"fallback_chain": [
{"url": e.url, "provider": e.provider_name, "model": e.model}
{"url": e.url, "provider": e.provider_name, "model": e.model} # noqa: E501
for e in self.fallback_chain
],
"routing_reason": self.routing_reason,
"health_111": self.health_111.to_dict(),
"health_188": self.health_188.to_dict() if self.health_188 else None,
}
@@ -165,22 +166,34 @@ class OllamaFailoverManager:
async def select_provider(
self,
task_type: str = "", # noqa: ARG002
context: dict | None = None, # noqa: ARG002
task_type: str = "",
context: dict | None = None,
) -> OllamaRoutingResult:
"""
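Check the health of 111 and return the routing result.
2026-04-26 commander iron rule: 111 is the only Ollama host; 188 must not serve real-time responses.
Only 111 is checked; 188 is no longer checked in parallel.
Args:
task_type: task type (reserved; does not currently affect routing)
context: extra context (reserved)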
Returns:
OllamaRoutingResult
"""
url_111 = self._settings.OLLAMA_URL
# Check 111 only; 188 is removed from the routing chain
try:
health_111 = await self._monitor.check(url_111)
except Exception as e:
health_111 = HealthReport(status=HealthStatus.OFFLINE, reason=f"check error: {e}")
result = self._decide_route(health_111=health_111, url_111=url_111)
result = self._decide_route(
health_111=health_111,
url_111=url_111,
)
# Gemini billing circuit breaker (quota gate)
# 2026-04-25 critic-fix Part2 H7 by Claude Engineer-C2
@@ -189,18 +202,17 @@ class OllamaFailoverManager:
if not quota_ok:
quota = getattr(self._settings, "GEMINI_DAILY_QUOTA", 1000)
logger.warning(
"gemini_quota_exceeded_falling_to_188",
"gemini_quota_exceeded_fallback_to_nemotron",
quota=quota,
health_111=health_111.status.value,
)
# 2026-04-26 commander iron rule: 188 removed; quota exceeded → Nemotron → Claude
result = self._build_quota_exceeded_route(health_111=health_111)
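# 2026-04-26 P1.5 integration point 3 by Claude Opus 4.7: Telegram alert on quota exhaustion
# The alerter dedups internally for 24h (QUOTA_DEDUP_TTL_SEC), so even if it is called on every
# quota-exceeded event, only one alert is sent that day. Failures fail open and do not block routing
# Telegram alert on quota exhaustion (24h dedup)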
try:
from src.services.failover_alerter import get_failover_alerter
from src.core.redis_client import get_redis
_current_count = quota # defaults to the quota value (once exceeded, the count is ≥ quota)
_current_count = quota
try:
_redis = get_redis()
if _redis is not None:
@@ -261,10 +273,12 @@ class OllamaFailoverManager:
111 DEGRADED → primary=Gemini, fallback=[Nemotron, Claude]
111 OFFLINE → primary=Gemini, fallback=[Nemotron, Claude]
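188 is removed entirely (CPU-only, 0.45 tok/s; not allowed for real-time responses)
188 is removed from the routing chain entirely (CPU-only, 0.45 tok/s; not allowed for real-time responses)
When the Gemini quota is exceeded, _build_quota_exceeded_route() takes over.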
"""
model_111 = self._settings.OLLAMA_HEALTH_CHECK_MODEL
ep_111 = OllamaEndpoint(url=url_111, provider_name="ollama", model=model_111)
now_ts = datetime.datetime.now(TAIPEI_TZ).isoformat()
if health_111.status == HealthStatus.HEALTHY:
@@ -283,6 +297,7 @@ class OllamaFailoverManager:
health_111=health_111,
)
# DEGRADED / OFFLINE
status_label = health_111.status.value
return OllamaRoutingResult(
primary=_GEMINI_ENDPOINT,
@@ -368,7 +383,10 @@ class OllamaFailoverManager:
self,
health_111: HealthReport,
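) -> OllamaRoutingResult:
"""Gemini quota exhausted → fall back to Nemotron. 2026-04-26 commander iron rule: 188 removed."""
"""
Fallback route when the Gemini quota is exhausted: primary=Nemotron, fallback=[Claude].
2026-04-26 commander iron rule: 188 removed; when the quota is exceeded, go straight to Nemotron → Claude.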
"""
return OllamaRoutingResult(
primary=_NEMOTRON_ENDPOINT,
fallback_chain=[_CLAUDE_ENDPOINT],