fix(critic-review): PR-R1 fix 4 Major issues: wildcard filtering + confirmation prompt + unverified flag

critic PR review 681b5ac9 surfaced 4 Major issues (no Critical); all are fixed.

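## Major #1: generic_fallback wildcard pollutes the RAG corpus
Location: rule_to_playbook_migrator.py:128 `_build_symptom_pattern`

Problem: the generic_fallback rule's `alert_names=["*"]` was written verbatim into the PlaybookRecord.
Once playbook_rag vectorizes the text, "告警: *" ("alert: *") becomes an ordinary token, so every query is scored for similarity against it
→ RAG top-k may return the fallback DRAFT and mislead recommendations.

Fix: filter `["*"]` in `_build_symptom_pattern` (same treatment as keywords).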
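
A minimal sketch of the filter described above, assuming the rule exposes its alert names as either a single string or a list. The helper name `filter_alert_names` is hypothetical; in the actual change the logic sits inline in `_build_symptom_pattern`.

```python
from typing import Any


def filter_alert_names(alertnames: Any) -> list[str]:
    """Normalize to a list, then drop empty values and the "*" wildcard."""
    # Hypothetical helper: mirrors the two lines added in the diff.
    raw_names = alertnames if isinstance(alertnames, list) else [alertnames]
    return [n for n in raw_names if n and n != "*"]


print(filter_alert_names(["*"]))                  # []
print(filter_alert_names(["PodCrashLoop", "*"]))  # ['PodCrashLoop']
print(filter_alert_names("HighCPU"))              # ['HighCPU']
```

With the wildcard removed, a generic_fallback record ends up with an empty `alert_names` list, so no "*" token reaches the vectorized text.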

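## Major #2: CLI --commit has no confirmation step
Location: scripts/migrate_rules_to_playbooks.py

Problem: `--commit` writes 25 DRAFT records straight to the prod DB; an accidental run cannot be undone.

Fix:
- Add a `--yes` flag (for CI / automation)
- Without `--yes`, prompt on stdin: "Type 'yes' to confirm"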
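
The confirmation gate can be sketched as below. This is an illustration under assumptions, not the script itself: `confirmed` and its injectable `ask` parameter are hypothetical names, and the sketch treats "no `--commit`" as a dry run.

```python
import argparse
from typing import Callable


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--commit", action="store_true", default=False)
    parser.add_argument("--yes", action="store_true", default=False)
    return parser.parse_args(argv)


def confirmed(args: argparse.Namespace, ask: Callable[[str], str] = input) -> bool:
    """Return True when the write may proceed."""
    if not args.commit or args.yes:
        return True  # dry run, or confirmation explicitly skipped
    answer = ask("Type 'yes' to confirm: ").strip().lower()
    return answer == "yes"


args = parse_args(["--commit"])
print(confirmed(args, ask=lambda prompt: "n"))       # False
print(confirmed(args, ask=lambda prompt: "YES"))     # True (answer is lower-cased first)
print(confirmed(parse_args(["--commit", "--yes"])))  # True, no prompt shown
```

Passing the prompt function as a parameter keeps the gate testable without touching real stdin.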

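## Major #3: yaml_rule kubectl_command does not pass through the SPF-2 action_parser
Location: rule_to_playbook_migrator.py:153 `_build_repair_steps`

Problem: DRAFTs are never auto-promoted (threshold 0.9), but the manual review path has no safety interceptor.
If someone promotes with one click in the UI → a dangerous command containing a {target} placeholder goes straight to prod.

Fix: add metadata to the step dict:
- unverified_command: True
- needs_action_parser_review: True
- source: "yaml_rule_migration"
(The promote flow must be forced through action_parser; to be implemented when SPF-2 lands)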
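
One way a promote flow could consume these flags is sketched below. Everything here is hypothetical: the commit only adds the metadata, the real gate is deferred to SPF-2, and both the function name `blocking_steps` and the `command` key are assumptions made for illustration.

```python
from typing import Any


def blocking_steps(repair_steps: list[dict[str, Any]]) -> list[int]:
    """Return indexes of steps whose command still needs action_parser review."""
    blocked = []
    for i, step in enumerate(repair_steps):
        meta = step.get("metadata") or {}
        if meta.get("unverified_command") or meta.get("needs_action_parser_review"):
            blocked.append(i)
    return blocked


steps = [
    {
        "command": "kubectl rollout restart deployment/{target}",  # assumed key name
        "metadata": {
            "unverified_command": True,
            "needs_action_parser_review": True,
            "source": "yaml_rule_migration",
        },
    },
    {"command": "kubectl get pods", "metadata": {}},
]
print(blocking_steps(steps))  # [0]
```

A promote handler could refuse to proceed while this list is non-empty, which would close the one-click path described in the problem statement.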

## Minor fixes
- Remove dead import `import re` (unused)
- Replace `if idx >= 4: break` with `enumerate([:3], start=2)` (the boundary check was easy to misread)
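
The slice-based cap can be sketched in isolation as follows; `number_steps` is a hypothetical helper used only to show the numbering.

```python
def number_steps(optimization: list[str]) -> list[tuple[int, str]]:
    """Pair at most three entries with step numbers 2, 3 and 4."""
    # Slicing before enumerate() bounds the loop up front,
    # so no break statement is needed inside the body.
    return [(idx, opt) for idx, opt in enumerate(optimization[:3], start=2)]


print(number_steps(["a", "b", "c", "d", "e"]))  # [(2, 'a'), (3, 'b'), (4, 'c')]
print(number_steps([]))                         # []
```

Because the slice is applied before iteration, an entry that the loop body later skips (for example an empty command) still occupies one of the three slots.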

## Verification
- All 23 PR-R1 tests pass (the fixes do not break existing behavior)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Your Name
2026-04-29 10:56:32 +08:00
parent 681b5ac949
commit 8d24f15183
2 changed files with 36 additions and 6 deletions


@@ -66,6 +66,14 @@ def parse_args() -> argparse.Namespace:
         default=False,
         help="Simulate ENABLE_RULE_MIGRATION_DRAFT=false (tests the feature-flag-off path)",
     )
+    # 2026-04-29 ogt + Claude Opus 4.7: fix for critic Major #2
+    # --commit writes to the prod DB and must be confirmed; an accidental run creates 25 DRAFTs in prod
+    parser.add_argument(
+        "--yes",
+        action="store_true",
+        default=False,
+        help="Skip the --commit confirmation prompt (for CI / automation)",
+    )
     return parser.parse_args()
@@ -97,6 +105,16 @@ async def _run(args: argparse.Namespace) -> int:
         print(f"[ERROR] yaml not found: {yaml_path}", file=sys.stderr)
         return 1
+    # 2026-04-29 fix for critic Major #2: --commit asks for confirmation; --yes skips it
+    if not dry_run and not args.yes:
+        ans = input(
+            "⚠️ About to write to the prod DB (up to 25 DRAFT Playbooks)\n"
+            "   Type 'yes' to confirm (or 'n' to abort): "
+        ).strip().lower()
+        if ans != "yes":
+            print("[ABORTED] Cancelled by user (type 'yes' to confirm)", file=sys.stderr)
+            return 1
     report = await migrate_yaml_rules_to_playbooks(
         yaml_path=yaml_path,
         dry_run=dry_run,


@@ -18,7 +18,6 @@ W1 PR-R1: rule → Playbook migration
 """
 from __future__ import annotations
-import re
 from dataclasses import dataclass, field
 from pathlib import Path
 from typing import Any
@@ -125,8 +124,13 @@ def _build_symptom_pattern(rule: dict[str, Any]) -> dict[str, Any]:
     # Filter out the wildcard (generic_fallback has "*")
     keywords = [k for k in keywords if k != "*"][:15]
+    # 2026-04-29 ogt + Claude Opus 4.7: fix for critic Major #1
+    # Filter the "*" wildcard (generic_fallback) out of alert_names; otherwise, once vectorized
+    # for RAG, it becomes "告警: *" and pollutes the corpus: every query is scored against it
+    raw_names = alertnames if isinstance(alertnames, list) else [alertnames]
+    filtered_names = [n for n in raw_names if n and n != "*"]
     return {
-        "alert_names": alertnames if isinstance(alertnames, list) else [alertnames],
+        "alert_names": filtered_names,
         "affected_services": [],
         "severity_range": severity_range,
         "keywords": keywords,
@@ -159,6 +163,14 @@ def _build_repair_steps(rule: dict[str, Any]) -> list[dict[str, Any]]:
             "expected_result": resp.get("action_title", ""),
             "risk_level": risk_level,
             "requires_approval": risk_level == "CRITICAL" or suggested_action in ("RESTART_DEPLOYMENT", "DELETE_POD", "SCALE_DEPLOYMENT"),
+            # 2026-04-29 ogt + Claude Opus 4.7: fix for critic Major #3
+            # kubectl_command values from yaml_rule have not been validated by the SPF-2 action_parser
+            # The promote flow (DRAFT → APPROVED) must be forced through action_parser; otherwise dangerous commands reach prod directly
+            "metadata": {
+                "unverified_command": True,
+                "needs_action_parser_review": True,
+                "source": "yaml_rule_migration",
+            },
         })
     else:
         # NO_ACTION: record the diagnostic description as a manual step so RAG at least has a symptom to match
@@ -172,8 +184,10 @@ def _build_repair_steps(rule: dict[str, Any]) -> list[dict[str, Any]]:
             "requires_approval": True,
         })
-    # Append optimization steps (at most 3; step_number starts at 2)
-    for idx, opt in enumerate(resp.get("optimization", []) or [], start=2):
+    # Append optimization steps (at most 3; step_number 2/3/4)
+    # 2026-04-29 critic Minor fix: the original `if idx >= 4: break` sat after the append and was easy to misread
+    # Use a [:3] slice to cap the count at 3 explicitly
+    for idx, opt in enumerate((resp.get("optimization", []) or [])[:3], start=2):
         opt_cmd = (opt.get("command", "") or "").strip()
         if not opt_cmd or opt_cmd.startswith("#"):
             continue
@@ -185,8 +199,6 @@ def _build_repair_steps(rule: dict[str, Any]) -> list[dict[str, Any]]:
             "risk_level": "LOW",
             "requires_approval": False,
         })
-        if idx >= 4:  # at most 3 optimization steps
-            break
     return steps