feat(adr-081): Phase 1 sensory depth - 8D evidence collection + post-execution verification
Deliverables:
- IncidentEvidence DB model (8D sensor data + pre/post-execution state)
- EvidenceSnapshot dataclass (build_summary → LLM context)
- SanitizationService (zero tolerance for prompt injection, 12 patterns)
- MCPToolRegistry (dynamic tool registration; suggest_tools does not hard-code alert types)
- PreDecisionInvestigator (8D parallel sensors, P99 < 8s, 30s Redis cache)
- PostExecutionVerifier (10s warmup → post-state evaluation: success/degraded/failed)
- decision_manager + approval_execution wiring (guarded by a feature flag)

Gate 1 fixes: added sanitize_dict_values to D4/D5/D7/D8; removed the bare "error" failure signal so that an error_rate key is not misclassified as a failure; added a warning when the evidence_snapshot rowcount is zero.

Tests: 130 passed (+111 new)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
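
The parallel 8D collection described for PreDecisionInvestigator could look roughly like the sketch below. It is not the code from this commit: `collect_evidence`, the `Sensor` alias, and the shape of the returned dict are hypothetical, and the Redis cache is left out. It only illustrates how per-sensor timeouts and failures might map onto the `mcp_health`, `sensors_attempted`, `sensors_succeeded`, and `collection_duration_ms` fields of the model.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable

# Hypothetical sensor type: an async callable with no arguments.
Sensor = Callable[[], Awaitable[Any]]


async def collect_evidence(sensors: dict[str, Sensor], timeout_s: float = 8.0) -> dict[str, Any]:
    """Run all sensors concurrently; a failed or timed-out sensor yields None."""
    started = time.monotonic()

    async def run_one(sensor: Sensor) -> Any:
        # Each sensor gets its own timeout so one slow call cannot stall the rest.
        return await asyncio.wait_for(sensor(), timeout=timeout_s)

    names = list(sensors)
    # return_exceptions=True turns failures into values instead of cancelling the batch.
    results = await asyncio.gather(
        *(run_one(sensors[name]) for name in names), return_exceptions=True
    )

    data: dict[str, Any] = {}
    mcp_health: dict[str, bool] = {}
    for name, result in zip(names, results):
        ok = not isinstance(result, BaseException)
        mcp_health[name] = ok
        data[name] = result if ok else None

    return {
        "data": data,
        "mcp_health": mcp_health,
        "sensors_attempted": len(names),
        "sensors_succeeded": sum(mcp_health.values()),
        "collection_duration_ms": int((time.monotonic() - started) * 1000),
    }
```

Because every dimension is allowed to fail independently, the nullable 8D columns in the model below can store whatever subset of sensors actually returned data.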
@@ -737,3 +737,99 @@ class KnowledgeEntryRecord(Base):
        # 2026-04-04 ogt: Phase 25 P1: Anti-Pattern quick lookup
        Index("ix_knowledge_symptoms_hash", "symptoms_hash"),
    )


# IncidentEvidence: persistence for the ADR-081 Phase 1 EvidenceSnapshot
# 2026-04-15 ogt + Claude Sonnet 4.6: initial creation for AI autonomy flywheel Phase 1
class IncidentEvidence(Base):
    """
    Immutable incident evidence snapshot table.

    Before each decision, PreDecisionInvestigator captures one EvidenceSnapshot
    and writes it to this table for:
    - Decision traceability (the full evidence context behind the LLM's reasoning)
    - Training (high-value data for the Phase 3 fine-tune pipeline)
    - Anomaly verification (pre-execution vs post-execution state diff)

    ADR-081: PreDecisionInvestigator + EvidenceSnapshot
    Design principle: append-only writes, UPDATE is forbidden (aligned with event sourcing)
    """
    __tablename__ = "incident_evidence"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)

    # Relations
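    # Indexed by ix_incident_evidence_incident_id in __table_args__; index=True here would
    # generate a second index with the same default name.
    incident_id: Mapped[str] = mapped_column(String(30), nullable=False)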
    # Populated in Phase 3: matched_playbook_id is always null for now and will be filled in Phase 3
    matched_playbook_id: Mapped[str | None] = mapped_column(String(36), nullable=True)

    # Schema version (lets the fine-tune pipeline filter for compatible versions)
    schema_version: Mapped[str] = mapped_column(String(10), default="v1", nullable=False)

    # 8D sensor data (every dimension is nullable: some may be missing when an MCP call fails)
    k8s_state: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="D1: kubectl describe pod + events"
    )
    recent_logs: Mapped[str | None] = mapped_column(
        Text, nullable=True, comment="D2: container stderr tail-50, sanitized by SanitizationService"
    )
    metrics_snapshot: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="D3: Prometheus 5min vs 1h baseline comparison"
    )
    recent_deployments: Mapped[list | None] = mapped_column(
        JSON, nullable=True, comment="D4: ArgoCD/Gitea deployment diffs from the past 1h"
    )
    business_metrics: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="D5: order volume / login success rate / P0 SLI"
    )
    historical_context: Mapped[str | None] = mapped_column(
        Text, nullable=True, comment="D6: summary of how the same alertname was handled over the past 30 days"
    )
    peer_health: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="D7: health of the other replicas in the same Deployment"
    )
    dependency_topology: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="D8: Istio/Service Mesh upstream and downstream latency/error rate"
    )

    # Sensor quality metrics
    mcp_health: Mapped[dict] = mapped_column(
        JSON, default=dict, nullable=False,
        comment="Success or failure of each MCP call {tool_name: bool}, used to adjust decision_fusion weights"
    )
    collection_duration_ms: Mapped[int | None] = mapped_column(
        Integer, nullable=True, comment="Total evidence collection time (ms), P99 target < 8000"
    )
    sensors_attempted: Mapped[int] = mapped_column(
        default=0, nullable=False, comment="Number of sensors attempted"
    )
    sensors_succeeded: Mapped[int] = mapped_column(
        default=0, nullable=False, comment="Number of sensors that returned data"
    )

    # LLM input summary (at most 8K tokens, compressed by the Investigator)
    evidence_summary: Mapped[str | None] = mapped_column(
        Text, nullable=True, comment="Final evidence summary fed to the LLM (UTF-8, < 8K tokens)"
    )

    # Pre/post-execution state (PostExecutionVerifier fills in post_execution_state)
    pre_execution_state: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="Environment state snapshot before execution (baseline for PostExecutionVerifier)"
    )
    post_execution_state: Mapped[dict | None] = mapped_column(
        JSON, nullable=True, comment="Environment state after execution (captured by PostExecutionVerifier, wired up in Phase 1)"
    )
    verification_result: Mapped[str | None] = mapped_column(
        String(20), nullable=True, comment="success / degraded / failed / timeout (filled in by PostExecutionVerifier)"
    )

    # Timestamp (Taipei time zone)
    collected_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=taipei_now, nullable=False
    )

    __table_args__ = (
        Index("ix_incident_evidence_incident_id", "incident_id"),
        Index("ix_incident_evidence_collected_at", "collected_at"),
        Index("ix_incident_evidence_playbook_id", "matched_playbook_id"),
    )
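
The append-only rule stated in the class docstring (no UPDATE) is a convention in the model itself; nothing in this diff enforces it. One possible way to enforce it at the database level is a trigger. The following is a minimal sketch, assuming SQLite and a cut-down three-column stand-in for `incident_evidence`; the trigger name and both helper functions are hypothetical and not part of this commit.

```python
import sqlite3


def open_append_only_db() -> sqlite3.Connection:
    """Create an in-memory table whose rows can be inserted but never updated."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incident_evidence ("
        " id TEXT PRIMARY KEY,"
        " incident_id TEXT NOT NULL,"
        " verification_result TEXT)"
    )
    # The trigger fires once per affected row and aborts the UPDATE statement.
    conn.execute(
        "CREATE TRIGGER incident_evidence_no_update"
        " BEFORE UPDATE ON incident_evidence"
        " BEGIN SELECT RAISE(ABORT, 'incident_evidence is append-only'); END"
    )
    return conn


def try_update(conn: sqlite3.Connection) -> bool:
    """Attempt an UPDATE; return False if the database rejected it."""
    try:
        conn.execute("UPDATE incident_evidence SET verification_result = 'failed'")
    except sqlite3.DatabaseError:
        return False
    return True
```

Under such a rule, later results (for example a verification outcome) would have to be recorded as new rows rather than by modifying an existing snapshot.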