feat(adr-081): Phase 1 sensory depth: 8D evidence gathering + post-execution verification

Deliverables:
- IncidentEvidence DB model (8D sensory data + pre/post-execution state)
- EvidenceSnapshot dataclass (build_summary → LLM context)
- SanitizationService (zero tolerance for prompt injection, 12 patterns)
- MCPToolRegistry (dynamic tool registration; suggest_tools does not hard-code alert types)
- PreDecisionInvestigator (8D parallel sensing, P99 < 8s, 30s Redis cache)
- PostExecutionVerifier (10s warmup → post-state evaluated as success/degraded/failed)
- decision_manager + approval_execution wiring (guarded by feature flags)
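
The parallel sensing listed for PreDecisionInvestigator can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from this commit: `collect_all`, the `Collector` type, the stub collectors, and the per-dimension timeout are all hypothetical names, and only two of the eight dimensions are stubbed in.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical collector signature: each dimension returns a dict of findings.
Collector = Callable[[], Awaitable[Dict[str, Any]]]


async def collect_all(
    collectors: Dict[str, Collector],
    per_dimension_timeout: float,
) -> Dict[str, Dict[str, Any]]:
    """Run every collector concurrently; a slow or failing one degrades to a status entry."""

    async def guarded(fn: Collector) -> Dict[str, Any]:
        try:
            return await asyncio.wait_for(fn(), timeout=per_dimension_timeout)
        except asyncio.TimeoutError:
            return {"status": "timeout"}
        except Exception as exc:  # one bad dimension must not sink the others
            return {"status": "failed", "reason": type(exc).__name__}

    names = list(collectors)
    results = await asyncio.gather(*(guarded(collectors[n]) for n in names))
    return dict(zip(names, results))


async def _fast() -> Dict[str, Any]:
    # Stand-in for a dimension that answers immediately.
    return {"status": "ok", "pods_ready": 3}


async def _slow() -> Dict[str, Any]:
    # Stand-in for a dimension that exceeds its time budget.
    await asyncio.sleep(1.0)
    return {"status": "ok"}


def demo() -> Dict[str, Dict[str, Any]]:
    return asyncio.run(
        collect_all({"d1_fast": _fast, "d2_slow": _slow}, per_dimension_timeout=0.05)
    )
```

Bounding each dimension separately means the total latency is governed by the slowest allowed collector rather than by the sum of all of them, which is one way a fixed latency budget could be approached.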

Gate 1 fixes: added sanitize_dict_values to D4/D5/D7/D8; removed the bare "error"
failure signal to prevent false matches on the error_rate key; evidence_snapshot now
warns when rowcount is zero.
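
The false match behind the "error" fix can be illustrated with a small sketch. It assumes the verifier looks for failure signals as substrings of a serialized state dict, which the commit message suggests but does not show; the signal lists and `has_failure_signal` are hypothetical.

```python
import json
from typing import Any, Dict, Iterable

# Hypothetical signal lists; the real ones in the commit are not shown.
NAIVE_SIGNALS = ("error", "crashloopbackoff", "oomkilled")
SPECIFIC_SIGNALS = ("crashloopbackoff", "oomkilled", "imagepullbackoff")


def has_failure_signal(state: Dict[str, Any], signals: Iterable[str]) -> bool:
    """Return True if any signal occurs as a substring of the serialized state."""
    text = json.dumps(state).lower()
    return any(signal in text for signal in signals)


# A healthy post-state still contains the metric key "error_rate".
healthy = {"error_rate": 0.0, "pod_phase": "Running"}
crashing = {"error_rate": 0.4, "pod_reason": "CrashLoopBackOff"}
```

With the bare "error" signal, the healthy state is flagged purely because of the `error_rate` key name; restricting the list to specific signals avoids that while still catching the crashing state.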

Tests: 130 passed (+111 new)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OG T
2026-04-15 13:08:38 +08:00
parent db9e304a14
commit f1cbf6db7d
14 changed files with 2936 additions and 3 deletions

@@ -270,6 +270,17 @@ class ApprovalExecutionService:
)
)
# ADR-081 Phase 1: post-execution verification (fire-and-forget)
# PostExecutionVerifier waits for K8s to converge, captures the post-state,
# and back-fills the EvidenceSnapshot
from src.core.feature_flags import aiops_flags
if aiops_flags.is_sub_flag_enabled("AIOPS_P1_POST_EXECUTION_VERIFIER"):
    asyncio.create_task(
        self._run_post_execution_verify(
            approval=approval,
            action_taken=f"{operation_type.value}:{resource_name}",
        )
    )
# 2026-04-07 Claude Code: Sprint 4 B3 - record the human-approved remediation type
try:
anomaly_key = await self._get_anomaly_key_from_approval(approval)
@@ -487,6 +498,63 @@ class ApprovalExecutionService:
self._write_execution_result_to_km(approval, success, error_message)
)
async def _run_post_execution_verify(
    self,
    approval: "ApprovalRequest",
    action_taken: str,
) -> None:
    """
    ADR-081 Phase 1: post-execution verification (fire-and-forget wrapper)
    1. Look up the Incident by incident_id
    2. Fetch the latest EvidenceSnapshot from incident_evidence
    3. Call PostExecutionVerifier.verify() to back-fill the post-state and verification result
    4. Pass the result to learning_service to update the Playbook trust_score (Phase 3)
    """
    if not approval.incident_id:
        return
    try:
        from src.services.incident_service import get_incident_service
        from src.services.post_execution_verifier import get_post_execution_verifier
        from src.services.evidence_snapshot import EvidenceSnapshot
        incident_svc = get_incident_service()
        incident = await incident_svc.get_incident(approval.incident_id)
        if incident is None:
            logger.warning(
                "post_verify_incident_not_found",
                approval_id=str(approval.id),
                incident_id=approval.incident_id,
            )
            return
        # Fetch the latest EvidenceSnapshot (only present if the Phase 1 flag is enabled)
        snapshot = await EvidenceSnapshot.get_latest_snapshot(approval.incident_id)
        verifier = get_post_execution_verifier()
        verification_result = await verifier.verify(
            incident=incident,
            snapshot=snapshot,
            action_taken=action_taken,
        )
        logger.info(
            "post_verify_complete",
            approval_id=str(approval.id),
            incident_id=approval.incident_id,
            result=verification_result,
            action=action_taken,
        )
    except Exception as _e:
        # A verification failure must not affect the execution result
        logger.warning(
            "post_verify_failed",
            approval_id=str(approval.id),
            error=str(_e),
        )
async def _write_execution_result_to_km(
self,
approval: "ApprovalRequest",