fix(classify): add HostBackupFailed to backup/TYPE-1 via exact match (tests pass)

The previous fix used 'backup' in alertname_lower, which was too broad: a
BackupJobFailed warning was classified as TYPE-1, breaking
test_backup_keyword_warning_not_type1.
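
To illustrate the regression, here is a minimal sketch of why the substring check was too broad. The helper `is_backup_broad` is a hypothetical stand-in for the earlier condition, not the project's actual code.

```python
def is_backup_broad(alertname: str) -> bool:
    """Hypothetical stand-in for the previous check: any name containing 'backup'."""
    alertname_lower = alertname.lower()
    return "backup" in alertname_lower


# Both names contain "backup", so the substring check cannot tell them apart.
intended = is_backup_broad("HostBackupFailed")
unintended = is_backup_broad("BackupJobFailed")
```

Both calls return the same result, which is why a BackupJobFailed warning ended up on the TYPE-1 path.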

Switch to an exact whitelist:
  _BACKUP_TYPE1_NAMES = {HostBackupFailed, HostBackupStale, HostBackupMissing,
                         BackupRestoreTestFailed, BackupRestoreTestStale}
  + alertname.startswith('HostBackup') as a fallback
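
The whitelist plus prefix fallback can be sketched as a standalone predicate. The function name `is_backup_type1` and the alert name `HostBackupQuotaExceeded` are hypothetical; in the actual change the condition is inlined in `classify_alert_early` next to the watchdog/Heartbeat checks.

```python
_BACKUP_TYPE1_NAMES = {
    "HostBackupFailed", "HostBackupStale", "HostBackupMissing",
    "BackupRestoreTestFailed", "BackupRestoreTestStale",
}


def is_backup_type1(alertname: str) -> bool:
    """Hypothetical helper mirroring the exact whitelist + 'HostBackup' prefix fallback."""
    return alertname in _BACKUP_TYPE1_NAMES or alertname.startswith("HostBackup")


# Whitelisted names and HostBackup-prefixed names match.
assert is_backup_type1("HostBackupFailed")
assert is_backup_type1("BackupRestoreTestStale")
assert is_backup_type1("HostBackupQuotaExceeded")  # hypothetical name, caught by the prefix fallback

# Names that merely contain "Backup" no longer match.
assert not is_backup_type1("BackupJobFailed")
assert not is_backup_type1("VeleroBackupFailed")
```

Note that both the set lookup and `str.startswith` are case-sensitive, so unlike the earlier lowercase substring check, a name such as `hostbackupfailed` would not match.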

Result: 664 passed, 0 failed

2026-04-12 ogt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OG T
2026-04-12 20:03:46 +08:00
parent f25d82a88a
commit 0a4b7e9609

@@ -164,8 +164,19 @@ def classify_alert_early(alertname: str, severity: str, labels: dict | None = No
         return "info", "TYPE-1"
     # 5. Backup / Heartbeat: purely informational, not sent to the LLM
     # VeleroBackup is handled by the K8s prefix rule; only watchdog/heartbeat are caught here
-    if "watchdog" in alertname_lower or alertname == "Heartbeat":
+    # HostBackupFailed must be caught before the Host prefix rule; otherwise it is classified as host_resource/TYPE-3
+    # 2026-04-12 ogt: match only known host-backup monitoring alertnames, not a broad keyword
+    # BackupJobFailed with severity=warning still goes to TYPE-3; see test_backup_keyword_warning_not_type1
+    _BACKUP_TYPE1_NAMES = {
+        "HostBackupFailed", "HostBackupStale", "HostBackupMissing",
+        "BackupRestoreTestFailed", "BackupRestoreTestStale",
+    }
+    if (
+        "watchdog" in alertname_lower
+        or alertname == "Heartbeat"
+        or alertname in _BACKUP_TYPE1_NAMES
+        or alertname.startswith("HostBackup")
+    ):
         return "backup", "TYPE-1"
     # 6. Host resources (split out from infrastructure per ADR-075, commander's decision)