fix(auto_execute): add target==alertname check to the guard so the LLM cannot pass an alert name as the deployment name
All checks were successful
CD Pipeline / build-and-deploy (push) Successful in 13m33s

For host-level alerts such as HostHighCpuLoad, NemoTron Tool Calling may
put the alertname into deployment_name, which leads to running
'kubectl rollout restart deployment HostHighCpuLoad'.

New guard: when _target == _alertname, refuse to execute and notify a human to step in.
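
The combined guard condition can be sketched as a standalone predicate. This is a minimal illustration only: the helper name should_block and its flat string arguments are assumptions for the example; the commit itself inlines the check in DecisionManager and reads alertname from incident.signals[0].labels.

```python
import re

# Matches a leftover <...> or {...} placeholder, same pattern as in the diff.
_PLACEHOLDER = re.compile(r"[<{][^>}]+[>}]")


def should_block(action: str, target: str, alertname: str) -> bool:
    """Hypothetical helper: True when the auto-execute action must not run."""
    # An empty alertname never counts as a match, mirroring bool(_alertname and ...).
    target_is_alertname = bool(alertname and target == alertname)
    return (
        "unknown" in action
        or _PLACEHOLDER.search(action) is not None
        or target_is_alertname
    )


if __name__ == "__main__":
    # The LLM filled deployment_name with the alert name: blocked.
    print(should_block(
        "kubectl rollout restart deployment HostHighCpuLoad",
        "HostHighCpuLoad",
        "HostHighCpuLoad",
    ))
    # A real deployment name passes the guard.
    print(should_block("kubectl rollout restart deployment api", "api", "HostHighCpuLoad"))
```

Comparing the target against the alert name is an exact string match, so a target that merely contains the alert name would still pass this check.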

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OG T
2026-04-11 01:13:24 +08:00
parent 8a8c6a4eb1
commit 68a3858ae4


@@ -679,14 +679,16 @@ class DecisionManager:
action = _re.sub(r"<[^>]+>", _target, action)
# Safety guard: if the action still contains "unknown" or an unreplaced <...>/{...} placeholder after substitution, refuse to execute
# Host-level alerts (HostHighCpuLoad, etc.) have no deployment name and must not be executed blindly
-if "unknown" in action or _re.search(r"[<{][^>}]+[>}]", action):
+# Also: if target equals alertname, the LLM put the alert name into deployment_name, so refuse that too
+_alertname = incident.signals[0].labels.get("alertname", "") if incident.signals else ""
+_target_is_alertname = bool(_alertname and _target == _alertname)
+if "unknown" in action or _re.search(r"[<{][^>}]+[>}]", action) or _target_is_alertname:
logger.warning(
"auto_execute_blocked_unresolved_placeholder",
incident_id=incident.incident_id,
action=action,
target=_target,
-reason="action contains an unresolved placeholder (unknown); execution refused",
+reason="action contains an unresolved placeholder (unknown), or target==alertname; execution refused",
)
token.state = DecisionState.ERROR
token.error = f"Auto-execute blocked: unresolved placeholder in action: {action[:80]}"