V10.425 add 111 Ollama usage guard
All checks were successful
CD Pipeline / deploy (push) Successful in 1m6s
@@ -325,7 +325,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
 # ==========================================
 # System version and paths
 # ==========================================
-SYSTEM_VERSION = "V10.424"
+SYSTEM_VERSION = "V10.425"
 LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
 public_url = PUBLIC_URL  # Used for template display

@@ -2,7 +2,7 @@
 > **Last updated**: 2026-05-24 (Taipei time)
 > **Status**: 🟢 The four-AI-agent automation loop is live; the LLM routing red line is upgraded to an Ollama-first three-host cascade, with Gemini fallback disabled by default
-> **Applies to version**: V10.424
+> **Applies to version**: V10.425

 ---

@@ -31,6 +31,7 @@
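 - No status panel or router may recommend Gemini as the primary provider: `AIProviderService._get_recommended_provider()` must not return `gemini`, which may only be shown with fallback status; if `ea_engine` in `llm_model_router` receives a `gemini-*` default, it must revert to `hermes3:latest` and escalate to the local `deepseek-r1:14b` only when deep reasoning is needed.
 - The ElephantAlpha prompt / agent registry must no longer describe OpenClaw as using Gemini as its primary model; OpenClaw is an Ollama-first strategist on `qwen2.5-coder:7b` / `qwen3:14b`, and Gemini may serve only as an emergency fallback after the guard explicitly unlocks it.
 - 111 (`192.168.0.111`) is only the last-resort Mac fallback and does not keep 7B+, vision, or long-context models resident. When `OllamaService.generate()` falls through to 111, it downgrades `qwen3`, `deepseek-r1`, `hermes3`, `qwen2.5*`, `gemma3`, `llava`, `minicpm-v`, and 7B+ models per `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` to `OLLAMA_111_MODEL_FALLBACK=llama3.2:latest`, and caps requests with `OLLAMA_111_KEEP_ALIVE=5m`, `OLLAMA_111_MAX_TIMEOUT=20`, `OLLAMA_111_NUM_CTX=4096`, and `OLLAMA_111_NUM_PREDICT=512`. The business keep-alive for OpenClaw report-style paths defaults to `5m`; Code Review (`CODE_REVIEW_ALLOW_111_FALLBACK=false`) and Hermes (`HERMES_ALLOW_111_FALLBACK=false`) skip 111 by default, so that the 16GB-RAM host and GCP-B are not pushed to high load by resident runners, long outputs, and 24h keep-alive.
+- The scheduler runs `run_ollama_111_usage_guard_check()` every 15 minutes. It only reads `ai_calls` to count GCP-A / GCP-B / 111 calls in the recent window; by default it pushes to Telegram only when, within 60 minutes, there are at least 20 Ollama calls, at least 3 calls to 111, and 111's share is >= 5%. This is an observational guard: it does not change routing, write to the DB, or restart services automatically.
 - LAN access to 111 must go through the allowlist proxy `scripts/ops/ollama111_allow_proxy.py`: the real Ollama binds `127.0.0.1:11434`, the proxy binds `192.168.0.111:11434`, and by default only 111 itself and the 188 production host are allowed. 110 / 121 / other LAN clients cannot reach 111 directly, which prevents cross-project CI or VMs from bypassing the momo-pro router and loading 7B+ runners. On 111, `scripts/ops/install_ollama111_allow_proxy.sh` installs a user LaunchAgent; the installer copies the proxy script to `~/.local/share/momo-pro-system/ollama111_allow_proxy.py` so the LaunchAgent does not depend on the iCloud repo mount path, and so the proxy and `OLLAMA_HOST=127.0.0.1:11434` are restored automatically after login or reboot.
 - ElephantAlpha's `price_drop_alert` / `market_opportunity` Telegram HITL alerts must present same-product evidence separately, including at least `match_type`, `price_basis`, `alert_tier`, and `match_score`; without high-confidence same-product evidence and comparable total prices, a PChome/MOMO price gap must not be written up as a directly actionable price-matching recommendation.
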
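The 111 downgrade rule described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in `OllamaService.generate()`, which this diff does not show. It assumes `OLLAMA_111_MODEL_DOWNGRADE_PATTERNS` is a comma-separated list of shell-style globs (the real format is not visible here), it does not cover the separate 7B+ size rule, and the helper names `resolve_model_for_111` and `build_111_request` are hypothetical. The `keep_alive`, `num_ctx`, and `num_predict` fields are standard Ollama generate-request parameters.

```python
import os
from fnmatch import fnmatchcase
from typing import Any, Dict, Mapping, Optional

# Assumption: glob-style patterns mirroring the model families named in the doc.
ASSUMED_DEFAULT_PATTERNS = (
    "qwen3*,deepseek-r1*,hermes3*,qwen2.5*,gemma3*,llava*,minicpm-v*"
)


def resolve_model_for_111(requested: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the fallback model when the requested one matches a downgrade pattern."""
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_111_MODEL_DOWNGRADE_PATTERNS", ASSUMED_DEFAULT_PATTERNS)
    patterns = [p.strip().lower() for p in raw.split(",") if p.strip()]
    fallback = env.get("OLLAMA_111_MODEL_FALLBACK", "llama3.2:latest")
    name = requested.strip().lower()
    if any(fnmatchcase(name, pattern) for pattern in patterns):
        return fallback
    return requested


def build_111_request(
    requested: str, prompt: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, Any]:
    """Build a generate payload with the 111 caps applied (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "model": resolve_model_for_111(requested, env),
        "prompt": prompt,
        "stream": False,
        "keep_alive": env.get("OLLAMA_111_KEEP_ALIVE", "5m"),
        "options": {
            "num_ctx": int(env.get("OLLAMA_111_NUM_CTX", "4096")),
            "num_predict": int(env.get("OLLAMA_111_NUM_PREDICT", "512")),
        },
    }
```

Passing an explicit `env` mapping keeps the sketch testable without touching the process environment; the request timeout (`OLLAMA_111_MAX_TIMEOUT`) would be applied by the HTTP client rather than in the payload.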
@@ -13,6 +13,7 @@
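 ## 📅 Detailed changelog (historical archive)

 ### 2026-05-24: PChome near-threshold identity recovery, round two
+- **V10.425 111 fallback usage guard**: Every 15 minutes the scheduler reads `ai_calls` (read-only) to check the 111 Ollama fallback usage rate. By default it pushes to Telegram only when, within 60 minutes, Ollama calls >= 20, 111 calls >= 3, and 111's share >= 5%, and it lists the top 5 callers on 111. The guard only observes and alerts; it does not change routing, write to the DB, or restart services, so that 111 absorbing abnormally high load is detected early.
 - **V10.424 111 proxy LaunchAgent install path stabilization**: `install_ollama111_allow_proxy.sh` copies the proxy script to `~/.local/share/momo-pro-system/ollama111_allow_proxy.py` before writing the LaunchAgent, so the proxy does not break when 111 reboots or the iCloud repo path is not mounted; it also clears the old stderr log so the post-install state is easier to read.
 - **V10.423 12-agent decision envelope**: `triaged_alert()` supports a structured `decision_envelope` block, so that Hermes / NemoTron / OpenClaw / ElephantAlpha and subsequent 12-role decisions uniformly emit `severity`, `evidence`, `recommended_action`, `expected_impact`, `confidence`, `guardrails`, and `trace`; when evidence is missing, data quality and the HITL boundary must be marked explicitly, to avoid further vague benefit forecasts or untraceable alerts.
 - **V10.422 111 proxy LaunchAgent persistence**: Adds `scripts/ops/install_ollama111_allow_proxy.sh`, which installs `com.momo.ollama111-allow-proxy` on 111 as a user LaunchAgent; at startup it sets `OLLAMA_HOST=127.0.0.1:11434`, restarts Ollama, and loads the allowlist proxy, so 111 does not revert to being fully open on the LAN after a reboot or re-login.
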
125
run_scheduler.py
@@ -53,6 +53,7 @@ logging.basicConfig(
 logger = logging.getLogger(__name__)

 _AI_CALLS_ERROR_SPIKE_LAST_PUSH_TS = 0.0
+_OLLAMA_111_USAGE_LAST_PUSH_TS = 0.0


 def _env_flag(name: str, default: bool = False) -> bool:
@@ -204,6 +205,10 @@ def _register_schedules():
     schedule.every(30).minutes.do(run_ai_calls_error_spike_check)
     logger.info("📅 每 30 分鐘:ai_calls_error_spike_check(錯誤率 ≥ 30% 推 Telegram)")

+    # Phase 57: 111 Ollama usage guard, so the final fallback does not silently absorb heavy load
+    schedule.every(15).minutes.do(run_ollama_111_usage_guard_check)
+    logger.info("📅 每 15 分鐘:ollama_111_usage_guard_check(111 fallback 使用率告警)")
+
     # Phase 44: daily 09:30 observability health summary push
     schedule.every().day.at("09:30").do(run_observability_daily_summary)
     logger.info("📅 每日 09:30:observability_daily_summary(早晨報三主機/AI/Cost/PPT)")
@@ -724,6 +729,126 @@ def run_ai_calls_error_spike_check():
         )


+def run_ollama_111_usage_guard_check():
+    """Phase 57 — final fallback 111 使用率告警。
+
+    111 是最後防線;這個 guard 只觀測 ai_calls,不改路由。
+    預設條件:最近 60 分鐘 Ollama 呼叫 >= 20、111 呼叫 >= 3、111 占比 >= 5%。
+    """
+    if not _env_flag("OLLAMA_111_USAGE_ALERT_ENABLED", True):
+        return
+
+    try:
+        from sqlalchemy import text as _sa
+        from database.manager import DatabaseManager
+
+        window_minutes = int(os.getenv("OLLAMA_111_USAGE_ALERT_WINDOW_MINUTES", "60"))
+        threshold_pct = float(os.getenv("OLLAMA_111_USAGE_ALERT_PCT", "5"))
+        min_total = int(os.getenv("OLLAMA_111_USAGE_ALERT_MIN_TOTAL", "20"))
+        min_111 = int(os.getenv("OLLAMA_111_USAGE_ALERT_MIN_111", "3"))
+        dedup_sec = int(os.getenv("OLLAMA_111_USAGE_ALERT_DEDUP_SEC", "3600"))
+
+        session = DatabaseManager().get_session()
+        try:
+            row = session.execute(
+                _sa("""
+                    SELECT
+                        COUNT(*) FILTER (
+                            WHERE provider IN ('gcp_ollama','ollama_secondary','ollama_111')
+                        ) AS total_ollama,
+                        COUNT(*) FILTER (WHERE provider = 'gcp_ollama') AS gcp_a,
+                        COUNT(*) FILTER (WHERE provider = 'ollama_secondary') AS gcp_b,
+                        COUNT(*) FILTER (WHERE provider = 'ollama_111') AS host_111
+                    FROM ai_calls
+                    WHERE called_at >= NOW() - (:window_minutes || ' minutes')::interval
+                """),
+                {"window_minutes": window_minutes},
+            ).fetchone()
+
+            total_ollama = int(row[0] or 0)
+            gcp_a = int(row[1] or 0)
+            gcp_b = int(row[2] or 0)
+            host_111 = int(row[3] or 0)
+
+            if total_ollama < min_total or host_111 < min_111:
+                return
+
+            rate_pct = (host_111 / total_ollama * 100.0) if total_ollama else 0.0
+            if rate_pct < threshold_pct:
+                return
+
+            top_callers = session.execute(
+                _sa("""
+                    SELECT caller,
+                           COALESCE(model, '') AS model,
+                           COUNT(*) AS calls,
+                           COALESCE(SUM(input_tokens + output_tokens), 0) AS tokens,
+                           COUNT(*) FILTER (WHERE status NOT IN ('ok','cache_only')) AS errors
+                    FROM ai_calls
+                    WHERE called_at >= NOW() - (:window_minutes || ' minutes')::interval
+                      AND provider = 'ollama_111'
+                    GROUP BY caller, model
+                    ORDER BY calls DESC, tokens DESC
+                    LIMIT 5
+                """),
+                {"window_minutes": window_minutes},
+            ).fetchall()
+        finally:
+            session.close()
+
+        global _OLLAMA_111_USAGE_LAST_PUSH_TS
+        now_ts = time.time()
+        if now_ts - _OLLAMA_111_USAGE_LAST_PUSH_TS < dedup_sec:
+            logger.info("[Ollama111Guard] skip duplicate alert within %ss window", dedup_sec)
+            return
+
+        from services.telegram_templates import send_telegram_with_result
+
+        lines = [
+            "<b>⚠️ 111 Ollama 使用率偏高</b>",
+            "",
+            f"過去 {window_minutes} 分鐘 Ollama 呼叫:<b>{total_ollama}</b> 次",
+            f"111 fallback:<b>{host_111}</b> 次(<b>{rate_pct:.1f}%</b>)",
+            f"GCP-A:<b>{gcp_a}</b> 次 · GCP-B:<b>{gcp_b}</b> 次",
+            "",
+        ]
+        if top_callers:
+            lines.append("<b>111 caller Top 5:</b>")
+            for caller, model, calls, tokens, errors in top_callers:
+                model_part = f" / <code>{model}</code>" if model else ""
+                err_part = f" · err {errors}" if int(errors or 0) else ""
+                lines.append(
+                    f"• <code>{caller}</code>{model_part}:{calls} 次 · {int(tokens or 0):,} tokens{err_part}"
+                )
+            lines.append("")
+        lines.extend([
+            "建議先看 GCP-A/GCP-B health probe 與近期 unhealthy mark;",
+            "若 GCP 正常,檢查是否有 fallback flag 或重任務意外打到 111。",
+        ])
+
+        reply_markup = {
+            "inline_keyboard": [
+                [{"text": "🏥 主機健康", "callback_data": "cmd:obs_health"},
+                 {"text": "📊 AI 呼叫", "callback_data": "cmd:obs_ai_calls"}],
+            ],
+        }
+        send_telegram_with_result("\n".join(lines), reply_markup=reply_markup, parse_mode="HTML")
+        _OLLAMA_111_USAGE_LAST_PUSH_TS = now_ts
+        logger.warning(
+            "[Ollama111Guard] alert pushed: total=%s gcp_a=%s gcp_b=%s host_111=%s rate=%.1f%%",
+            total_ollama, gcp_a, gcp_b, host_111, rate_pct,
+        )
+    except Exception as e:
+        logger.error(f"[Ollama111Guard] failed: {e}", exc_info=True)
+        _notify_scheduler_failure(
+            "run_ollama_111_usage_guard_check",
+            e,
+            source="Scheduler.Ollama111Guard",
+            event_type="ollama_111_usage_guard_failure",
+            title="111 Ollama 使用率護欄失敗",
+        )
+
+
 def run_observability_daily_summary():
     """Phase 44 — 每日 09:30 推送觀測台健康摘要(早晨報)。

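The alert decision inside `run_ollama_111_usage_guard_check()` can be isolated as a pure function, which makes the three default thresholds and the dedup window easy to check without a database. The sketch below mirrors the conditions in the hunk above; `GuardThresholds`, `should_alert`, and `in_cooldown` are hypothetical names that do not exist in the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardThresholds:
    """Defaults taken from the OLLAMA_111_USAGE_ALERT_* values in the diff."""
    min_total: int = 20   # minimum Ollama calls in the window
    min_111: int = 3      # minimum calls that landed on 111
    pct: float = 5.0      # minimum share of 111 calls, in percent


def should_alert(total_ollama: int, host_111: int,
                 thresholds: GuardThresholds = GuardThresholds()) -> bool:
    """Return True only when all three guard conditions hold."""
    if total_ollama <= 0:
        return False
    if total_ollama < thresholds.min_total or host_111 < thresholds.min_111:
        return False
    rate_pct = host_111 / total_ollama * 100.0
    return rate_pct >= thresholds.pct


def in_cooldown(now_ts: float, last_push_ts: float, dedup_sec: int = 3600) -> bool:
    """Return True while a previous alert is still inside the dedup window."""
    return now_ts - last_push_ts < dedup_sec
```

For example, 100 Ollama calls with 5 on 111 sits exactly on the 5% boundary and alerts, while 4 of 100 does not. Note that the commit keeps the last push timestamp in a module-level global initialised to `0.0`, so the cooldown starts over whenever the scheduler process restarts.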
@@ -16,7 +16,7 @@ INSTALL_SCRIPT_PATH="${INSTALL_DIR}/ollama111_allow_proxy.py"
 PYTHON_BIN="${PYTHON_BIN:-/usr/bin/python3}"
 OLLAMA_APP="${OLLAMA_APP:-/Applications/Ollama.app}"
 OLLAMA_HOST_VALUE="${OLLAMA_HOST_VALUE:-127.0.0.1:11434}"
-ALLOWED_CIDRS="${OLLAMA111_PROXY_ALLOWED_CIDRS:-127.0.0.1/32,192.168.0.80/32,192.168.0.111/32,192.168.0.188/32}"
+ALLOWED_CIDRS="${OLLAMA111_PROXY_ALLOWED_CIDRS:-127.0.0.1/32,192.168.0.111/32,192.168.0.188/32}"
 GUI_DOMAIN="gui/$(id -u)"

 if [[ ! -f "${PROJECT_DIR}/scripts/ops/ollama111_allow_proxy.py" ]]; then

@@ -15,7 +15,7 @@ import ipaddress
 import logging
 import os
 import signal
-from typing import Iterable
+import sys


 LISTEN_HOST = os.getenv("OLLAMA111_PROXY_LISTEN_HOST", "192.168.0.111")
@@ -26,7 +26,7 @@ ALLOWED_CIDRS = tuple(
     item.strip()
     for item in os.getenv(
         "OLLAMA111_PROXY_ALLOWED_CIDRS",
-        "127.0.0.1/32,192.168.0.80/32,192.168.0.111/32,192.168.0.188/32",
+        "127.0.0.1/32,192.168.0.111/32,192.168.0.188/32",
     ).split(",")
     if item.strip()
 )
@@ -93,6 +93,7 @@ async def _main() -> None:
     logging.basicConfig(
         level=os.getenv("OLLAMA111_PROXY_LOG_LEVEL", "INFO"),
         format="%(asctime)s %(levelname)s %(message)s",
+        stream=sys.stdout,
     )
     server = await asyncio.start_server(_handle_client, LISTEN_HOST, LISTEN_PORT)
     sockets = ", ".join(str(sock.getsockname()) for sock in (server.sockets or []))

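The effect of dropping `192.168.0.80/32` from the default allowlist can be illustrated with the standard `ipaddress` module. This is a sketch under assumptions: the proxy's actual check inside `_handle_client` is not part of this diff, and `parse_cidrs` and `is_allowed` are hypothetical helpers, not functions from `ollama111_allow_proxy.py`.

```python
import ipaddress
from typing import Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# New default from this commit: loopback, 111 itself, and the 188 production host.
DEFAULT_ALLOWED_CIDRS = "127.0.0.1/32,192.168.0.111/32,192.168.0.188/32"


def parse_cidrs(raw: str) -> Tuple[Network, ...]:
    """Parse a comma-separated CIDR string, skipping empty items."""
    return tuple(
        ipaddress.ip_network(item.strip(), strict=False)
        for item in raw.split(",")
        if item.strip()
    )


def is_allowed(client_ip: str, networks: Tuple[Network, ...]) -> bool:
    """Return True when the client address falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # An unparsable peer address is treated as not allowed.
        return False
    return any(addr in net for net in networks)
```

With the new default, `is_allowed("192.168.0.188", parse_cidrs(DEFAULT_ALLOWED_CIDRS))` is true, while `192.168.0.80`, `192.168.0.110`, and `192.168.0.121` are all rejected unless `OLLAMA111_PROXY_ALLOWED_CIDRS` is overridden.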
23
tests/test_ollama111_proxy_contract.py
Normal file
@@ -0,0 +1,23 @@
+from pathlib import Path
+
+
+ROOT = Path(__file__).resolve().parents[1]
+
+
+def test_ollama111_proxy_default_allowlist_stays_production_only():
+    proxy_source = (ROOT / "scripts/ops/ollama111_allow_proxy.py").read_text()
+    installer_source = (ROOT / "scripts/ops/install_ollama111_allow_proxy.sh").read_text()
+
+    assert "192.168.0.188/32" in proxy_source
+    assert "192.168.0.188/32" in installer_source
+    assert "192.168.0.111/32" in proxy_source
+    assert "192.168.0.111/32" in installer_source
+    assert "192.168.0.80/32" not in proxy_source
+    assert "192.168.0.80/32" not in installer_source
+
+
+def test_ollama111_proxy_logs_to_stdout_for_launchagent_collection():
+    proxy_source = (ROOT / "scripts/ops/ollama111_allow_proxy.py").read_text()
+
+    assert "import sys" in proxy_source
+    assert "stream=sys.stdout" in proxy_source
@@ -146,6 +146,7 @@ def test_v2_cron_blind_spot_list_has_failure_notifications(monkeypatch):
         "run_cost_throttle_reset_if_new_month",
         "run_ppt_vision_audit",
         "run_embed_consistency_check",
+        "run_ollama_111_usage_guard_check",
     ]:
         source = inspect.getsource(getattr(run_scheduler, fn_name))
         assert "_notify_scheduler_failure(" in source
@@ -161,6 +162,20 @@ def test_roi_ai_smoke_and_daily_report_schedules_stay_staggered():
     assert 'schedule.every().day.at("09:05").do(run_roi_monthly_report_if_new_month)' in source
     assert 'schedule.every().day.at("09:10").do(run_ai_smoke_daily_summary_task)' in source
     assert "schedule.every(6).hours.do(run_action_plan_hygiene_task)" in source
+    assert "schedule.every(15).minutes.do(run_ollama_111_usage_guard_check)" in source
+
+
+def test_ollama_111_usage_guard_stays_observational(monkeypatch):
+    run_scheduler = _load_run_scheduler(monkeypatch)
+    source = inspect.getsource(run_scheduler.run_ollama_111_usage_guard_check)
+
+    assert "OLLAMA_111_USAGE_ALERT_ENABLED" in source
+    assert "provider = 'ollama_111'" in source
+    assert "send_telegram_with_result" in source
+    assert "_notify_scheduler_failure(" in source
+    assert "只觀測 ai_calls,不改路由" in source
+    assert "UPDATE" not in source
+    assert "DELETE" not in source


 def test_legacy_edm_and_seasonal_promo_schedules_are_opt_in(monkeypatch):