fix(alerts): 3 silent flywheel nodes: DIAGNOSE routing + heartbeat disabled + notification format

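1. openclaw.py: remove require_local=True from DIAGNOSE
   - v4.3 already decided that NIM is the primary provider and poses no privacy concern
   - require_local=True caused every provider to be privacy_skip'd → alerts always failed
   - After the fix, DIAGNOSE goes through _full_fallback_chain (NIM → Gemini → Claude)

2. ai_router.py: require_local failure notification switched to ADR-075 TYPE-1 format
   - Plain-text raw notifications are forbidden (commander's iron rule: every message must follow the format template)
   - Now uses a ├─ / └─ tree structure + semantic labels

3. main.py: disable Telegram heartbeat monitoring
   - Heartbeats are already forwarded to a separate Telegram group, so there is no need to repeat them in this channel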

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OG T
2026-04-15 19:49:29 +08:00
parent 2d85b49cc0
commit 3ce5025ca7
3 changed files with 24 additions and 18 deletions

@@ -229,15 +229,14 @@ async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
     register_all_providers()
     logger.info("mcp_providers_registered")
-    # Phase 6.5: Telegram heartbeat monitor (prevents silent blind spots)
-    # - Send a heartbeat every 30 minutes to prove the alert pipeline is healthy
-    # - Alert if there has been no message for more than 2 hours
-    if settings.OPENCLAW_TG_BOT_TOKEN:
-        await telegram_gw.start_heartbeat_monitor(
-            heartbeat_interval_minutes=30,
-            silence_threshold_hours=2,
-        )
-        logger.info("telegram_heartbeat_monitor_started")
+    # Phase 6.5: Telegram heartbeat monitor
+    # 2026-04-15 ogt: disabled; heartbeats are already forwarded to a separate Telegram group, no need to repeat them in this channel
+    # if settings.OPENCLAW_TG_BOT_TOKEN:
+    #     await telegram_gw.start_heartbeat_monitor(
+    #         heartbeat_interval_minutes=30,
+    #         silence_threshold_hours=2,
+    #     )
+    logger.info("telegram_heartbeat_monitor_disabled", reason="forwarded_to_separate_group")
     # Reboot Recovery: Warm-up Redis Working Memory from PostgreSQL
     # 2026-04-05 ogt: Redis is empty after a reboot; restore unresolved incidents from the DB
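
The removed monitor is configured by two numbers: a 30-minute heartbeat interval and a 2-hour silence threshold. Below is a minimal sketch of the timing checks such a monitor might perform. The helper names `heartbeat_due` and `is_silent` are hypothetical, and the real `start_heartbeat_monitor` implementation is not shown in this diff.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_due(last_heartbeat_at: datetime, now: datetime,
                  heartbeat_interval_minutes: float = 30) -> bool:
    """Return True once a full heartbeat interval has elapsed (hypothetical helper)."""
    return now - last_heartbeat_at >= timedelta(minutes=heartbeat_interval_minutes)


def is_silent(last_message_at: datetime, now: datetime,
              silence_threshold_hours: float = 2) -> bool:
    """Return True when no message has been seen for longer than the threshold."""
    return now - last_message_at > timedelta(hours=silence_threshold_hours)


if __name__ == "__main__":
    now = datetime(2026, 4, 15, 12, 0, tzinfo=timezone.utc)
    # Last message three hours ago exceeds the 2-hour threshold.
    print(is_silent(now - timedelta(hours=3), now))
    # Last heartbeat ten minutes ago: not yet due.
    print(heartbeat_due(now - timedelta(minutes=10), now))
```

With the monitor disabled in this channel, both checks have to run wherever the heartbeats are forwarded, otherwise the silence alert is lost along with the heartbeat.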

@@ -1001,19 +1001,23 @@ class AIRouterExecutor:
         pass
     # 2026-04-04 ogt: Phase 25 P0: Telegram notification when every require_local attempt fails (privacy boundary)
+    # 2026-04-15 ogt: switched to ADR-075 TYPE-1 format; plain-text raw notifications are forbidden
     if require_local:
         try:
             from src.services.telegram_gateway import get_telegram_gateway
             tg = get_telegram_gateway()
             import asyncio as _asyncio
             # 2026-04-14 Claude Sonnet 4.6: the send_text method does not exist; use send_notification instead
+            tried_str = ", ".join(provider_order)
+            formatted = (
+                "⚠️ <b>TYPE-1 | AI Provider unavailable</b>\n"
+                "──────────────────────\n"
+                f"├─ Tried: <code>{tried_str}</code>\n"
+                "└─ Reason: require_local=True, no local provider available\n"
+                "\n"
+                "Manual intervention required"
+            )
             _asyncio.create_task(
-                tg.send_notification(
-                    "⚠️ <b>DIAGNOSE local provider unavailable</b>\n"
-                    f"Tried: {', '.join(provider_order)}\n"
-                    "Manual intervention required; cloud providers will not be called (privacy boundary).",
-                    parse_mode="HTML",
-                )
+                tg.send_notification(formatted, parse_mode="HTML")
             )
         except Exception as _tg_e:
             logger.warning("diagnose_reject_telegram_failed", error=str(_tg_e))
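
The tree layout in the hunk above (a header, a rule, then `├─` for every field except the last, which gets `└─`) can be built by a small helper instead of hand-written string literals. This is a minimal sketch. `format_type1` is a hypothetical name, and the ADR-075 TYPE-1 template is known here only from the strings in this diff, so the layout below is an assumption drawn from them.

```python
import html


def format_type1(title: str, fields: list[tuple[str, str]], footer: str = "") -> str:
    """Build a TYPE-1 style HTML message (hypothetical helper, layout assumed from the diff)."""
    # Escape dynamic text so provider names cannot break Telegram's HTML parse mode.
    lines = [f"⚠️ <b>TYPE-1 | {html.escape(title)}</b>", "─" * 22]
    for index, (label, value) in enumerate(fields):
        # The last field closes the tree with └─, all earlier ones use ├─.
        branch = "└─" if index == len(fields) - 1 else "├─"
        lines.append(f"{branch} {html.escape(label)}: {html.escape(value)}")
    if footer:
        lines += ["", html.escape(footer)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_type1(
        "AI Provider unavailable",
        [("Tried", "ollama, nim"), ("Reason", "require_local=True")],
        footer="Manual intervention required",
    ))
```

Escaping matters because `provider_order` is interpolated into an HTML message. A name containing `<` or `&` would otherwise make Telegram reject the notification.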

@@ -893,8 +893,11 @@ class OpenClawService:
     except Exception as _e:
         logger.warning("ai_control_override_failed", error=str(_e))
-    # Step 3: D7 privacy: force local for DIAGNOSE/CODE_REVIEW
-    require_local = decision.intent in (IntentType.DIAGNOSE, IntentType.CODE_REVIEW)
+    # Step 3: D7 privacy: force local for CODE_REVIEW
+    # 2026-04-15 ogt: DIAGNOSE no longer sets require_local (v4.3 decision: NIM is the primary provider, no privacy concern)
+    # ai_router.py v4.3 already states "NIM has been the primary provider since Phase 22, no privacy concern"
+    # require_local=True on DIAGNOSE only makes every provider privacy_skip, so it always fails
+    require_local = decision.intent in (IntentType.CODE_REVIEW,)
     result = await executor.execute(
         prompt=prompt,
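
The failure mode fixed in this hunk can be reproduced in isolation: when an intent requires a local provider and nothing in the chain counts as local, every candidate is skipped and the call can never succeed. This is a minimal sketch under stated assumptions. `Provider`, `LOCAL_ONLY_INTENTS` and `eligible_providers` are hypothetical names, the `IntentType` enum is a stand-in for the project's own, and treating NIM, Gemini and Claude as non-local is inferred from the commit message rather than from the router code.

```python
from dataclasses import dataclass
from enum import Enum


class IntentType(Enum):
    # Stand-in for the project's enum; only the two members named in the diff.
    DIAGNOSE = "diagnose"
    CODE_REVIEW = "code_review"


@dataclass(frozen=True)
class Provider:
    name: str
    is_local: bool  # assumption: the router can tell local from remote providers


# After this commit only CODE_REVIEW is restricted to local providers.
LOCAL_ONLY_INTENTS = frozenset({IntentType.CODE_REVIEW})


def eligible_providers(intent: IntentType, chain: list[Provider]) -> list[str]:
    """Return provider names that survive the privacy filter, in fallback order."""
    require_local = intent in LOCAL_ONLY_INTENTS
    return [p.name for p in chain if p.is_local or not require_local]


if __name__ == "__main__":
    chain = [Provider("NIM", False), Provider("Gemini", False), Provider("Claude", False)]
    print(eligible_providers(IntentType.DIAGNOSE, chain))     # full fallback chain
    print(eligible_providers(IntentType.CODE_REVIEW, chain))  # empty: nothing local
```

In this sketch CODE_REVIEW still yields an empty chain, so it keeps the same always-fails behavior until a local provider is registered.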