Files
awoooi/apps/api/src/db/models.py
Your Name 8629ac709b
Some checks failed
run-migration / migrate (push) Failing after 59s
Code Review / ai-code-review (push) Successful in 1m8s
Type Sync Check / check-type-sync (push) Successful in 2m27s
feat(awooop): Phase 1-8 完整實作 — AwoooP Agent Platform 六平面架構
## Phase 1-3: Control Plane + Contract System
- awooop_phase1_control_plane_2026-05-04.sql: 12 張核心表 + RLS
- awooop_phase1_batch1_rls_2026-05-04.sql: 全部 FORCE RLS + GRANT
- packages/awooop-contracts/: 六合約 JSON Schema + golden fixtures
- src/models/awooop_contracts.py: Pydantic v2 contract models(extra=forbid)
- src/repositories/contract_repository.py: contract lifecycle(draft→published→active)
- src/services/contract_service.py: HMAC publish sig + Redis multi-sig activate
- src/services/schema_validator.py: LLM output validator(retry×3, E-SCHEMA-001)

## Phase 2: Tenant Isolation
- awooop_phase2_budget_ledger_2026-05-04.sql: budget_ledger + RLS
- src/services/budget_service.py: Token Budget Hard Kill 三層防線
- src/core/context.py: PROJECT_ID ContextVar(31 background loop 自動繼承)
- src/db/base.py + models.py: project_id 欄位 + RLS set_config 注入
- src/hermes/nl_gateway.py: project_id Redis key 前綴(Phase A 雙寫)
- src/services/anomaly_counter.py: per-project 改造(Phase A fallback)

## Phase 4: Platform Shell in Shadow Mode
- awooop_phase4_run_state_2026-05-04.sql: run_state + step_journal + idempotency
- src/services/run_state_machine.py: 8-state FSM + SKIP LOCKED + stale reaper
- src/services/platform_runtime.py: UUID v7 + W3C trace_id + shadow_execute
- src/services/audit_sink.py: PII/secret redaction 9 patterns
- src/api/v1/platform/runs.py: POST/GET /v1/platform/runs(Router→Service 架構)
- src/workers/platform_worker.py: SKIP LOCKED worker + heartbeat + reaper loop
- src/main.py: platform router + lifespan worker start/stop

## Phase 5: MCP Gateway 五閘門
- awooop_phase5_mcp_gateway_2026-05-04.sql: 4 表 + RLS
- src/plugins/mcp/gateway.py: McpGateway(Gate 1~5, E-MCP-GATE-001~009)
- src/plugins/mcp/redaction_middleware.py: 雙層 redaction + 16K 截斷
- src/plugins/mcp/registry.py: __provider name mangling(ADR-116)
- src/plugins/mcp/credential_resolver.py: k8s secret ref 解析
- tests/test_mcp_credential_isolation.py: 10 個迴歸測試(secret leak 防再現)

## Phase 6-8: EwoooC + Channel Hub + Approval Token
- awooop_phase6_ewoooc_onboarding_2026-05-04.sql: ewoooc tenant + 4 read-only MCP tools
- awooop_phase7_channel_hub_2026-05-04.sql: conversation_event + outbound_message
- src/services/provider_proxy.py: ProviderProxy + PlatformEnvelope(ADR-115)
- src/services/channel_hub.py: Telegram inbound mirror + Progressive Feedback(30s)
- src/services/awooop_approval_token.py: HS256 + jti NX replay 防護 + suggest mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 19:31:53 +08:00

1688 lines
64 KiB
Python
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
"""
Database Models
===============
CTO-201: Approval & AuditLog persistence
Schema 設計原則:
- UUID 主鍵 (PostgreSQL 相容)
- JSON 欄位儲存複雜結構
- 完整時間戳記
- 索引優化查詢
"""
from datetime import datetime
from decimal import Decimal
from typing import Any
from uuid import UUID, uuid4
from sqlalchemy import (
JSON,
BigInteger,
Boolean,
CheckConstraint,
Date,
DateTime,
Float,
ForeignKey,
Index,
Integer,
Numeric,
String,
Text,
text,
)
from sqlalchemy import (
Enum as SQLEnum,
)
from sqlalchemy.dialects.postgresql import ENUM as PgEnum
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.dialects.postgresql import UUID as pg_UUID
from sqlalchemy.orm import Mapped, mapped_column
from src.db.base import Base
from src.models.approval import ApprovalStatus, RiskLevel
from src.models.incident import IncidentStatus, Severity
from src.models.knowledge import EntrySource, EntryStatus, EntryType
# =============================================================================
# Helper Functions
# =============================================================================
def taipei_now() -> datetime:
"""取得台北時區當前時間 (UTC+8)
🔴 HARD RULE: 全系統使用台北時區,禁止 UTC
2026-04-02 Claude Code: C1 時區統一遷移 (首席架構師審查)
"""
from src.utils.timezone import now_taipei
return now_taipei()
def generate_uuid() -> str:
"""Generate UUID string"""
return str(uuid4())
# =============================================================================
# ApprovalRecord - 授權記錄持久化
# =============================================================================
class ApprovalRecord(Base):
"""
授權記錄 - 對應 Pydantic ApprovalRequest
Note: 與 in-memory TrustEngine 的 ApprovalRequest 同步
"""
__tablename__ = "approval_records"
# Primary Key
id: Mapped[str] = mapped_column(
String(36),
primary_key=True,
default=generate_uuid,
)
# Core Fields
action: Mapped[str] = mapped_column(String(500), nullable=False)
description: Mapped[str] = mapped_column(Text, nullable=False)
status: Mapped[str] = mapped_column(
SQLEnum(ApprovalStatus),
default=ApprovalStatus.PENDING,
nullable=False,
)
risk_level: Mapped[str] = mapped_column(
SQLEnum(RiskLevel),
nullable=False,
)
# Signature Tracking
required_signatures: Mapped[int] = mapped_column(Integer, default=1)
current_signatures: Mapped[int] = mapped_column(Integer, default=0)
signatures: Mapped[dict[str, Any]] = mapped_column(JSON, default=list)
# Blast Radius (JSON)
blast_radius: Mapped[dict[str, Any]] = mapped_column(JSON, default=dict)
# Dry-Run Checks (JSON)
dry_run_checks: Mapped[list[dict[str, Any]]] = mapped_column(JSON, default=list)
# Metadata
requested_by: Mapped[str] = mapped_column(String(100), nullable=False)
rejection_reason: Mapped[str | None] = mapped_column(Text, nullable=True)
extra_metadata: Mapped[dict[str, Any] | None] = mapped_column(JSON, nullable=True)
# ==========================================================================
# 戰略 B: 告警風暴收斂 (Alert Storm Convergence)
# ==========================================================================
# 告警指紋 - 根據 namespace + deployment + alert_name 產生的唯一 Hash
fingerprint: Mapped[str | None] = mapped_column(
String(64),
nullable=True,
index=True,
comment="SHA256 hash of alert identity (namespace:deployment:alert_name)",
)
# 聚合次數 - 相同指紋告警的累計觸發次數
hit_count: Mapped[int] = mapped_column(
Integer,
default=1,
nullable=False,
comment="Number of times this alert pattern was triggered",
)
# 最後觸發時間 - 同指紋告警最近一次出現的時間
last_seen_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
nullable=False,
comment="Last time this alert pattern was seen",
)
# Sprint 5.1 MultiSig 雙簽核支援 (2026-04-08 Claude Sonnet 4.6 Asia/TaipeiADR-062 Q3)
approval_level: Mapped[str] = mapped_column(
String(20),
default="standard",
nullable=False,
comment="standard=1票審核, critical=2票MultiSig",
)
approval_votes: Mapped[list[dict[str, Any]]] = mapped_column(
JSON,
default=list,
nullable=False,
comment="[{user_id, voted_at, action}]",
)
required_votes: Mapped[int] = mapped_column(
Integer,
default=1,
nullable=False,
comment="standard=1, critical=2",
)
# 2026-04-06 ogt: Phase 26 — 關聯 Incident ID
# Playbook 萃取和 KM 寫入必須知道 incident_id不能靠文字解析
incident_id: Mapped[str | None] = mapped_column(
String(64),
nullable=True,
index=True,
comment="Associated Incident ID (INC-YYYYMMDD-XXXXXX)",
)
# 2026-04-09 Claude Sonnet 4.6: Telegram 訊息持久化
# Redis tg_msg:{id} TTL 24h 過期後仍可查詢,支援跨 Session 狀態更新
telegram_message_id: Mapped[int | None] = mapped_column(
Integer,
nullable=True,
comment="Telegram message_id of the approval card sent to operator",
)
telegram_chat_id: Mapped[int | None] = mapped_column(
BigInteger,
nullable=True,
comment="Telegram chat_id where the approval card was sent (BIGINT: 支援群組負數 ID)",
)
# B2 fix 2026-04-24 ogt + Claude Sonnet 4.6: Playbook 學習閉環斷鏈修復
# 原欄位缺失 → 人工審核後 matched_playbook_id 永遠 NULL → EWMA 無法更新
# 2026-04-25 db-expert-fix by Claude Engineer-B: 移除 index=True 避免自動生成 full index
# Partial index 改在 __table_args__ 宣告WHERE matched_playbook_id IS NOT NULL
matched_playbook_id: Mapped[str | None] = mapped_column(
String(36),
nullable=True,
comment="匹配的 Playbook ID學習服務用以更新 EWMA trust score",
)
# 2026-04-26 P2-DB-Fix by Claude — db-expert P0 三修P0.3: P2.1 DecisionFusionEngine 欄位
# composite_score / complexity_tier / decision_fusion_details
# 僅在 AIOPS_P2_FUSION_ENABLED=True 且 fusion 成功時填入nullable=True
composite_score: Mapped[float | None] = mapped_column(
Float,
nullable=True,
comment="P2.1 DecisionFusion 合成分數0.0-1.0),方法 III 加權結果",
)
complexity_tier: Mapped[str | None] = mapped_column(
String(16),
nullable=True,
comment="P2.1 告警複雜度分層low / medium / high / critical",
)
decision_fusion_details: Mapped[dict | None] = mapped_column(
JSONB,
nullable=True,
comment=(
"P2.1 DecisionFusionEngine: openclaw_score / hermes_score / "
"playbook_score / mcp_health_score / elephant_score"
),
)
# Timestamps
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
onupdate=taipei_now,
)
expires_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
resolved_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
# Indexes
__table_args__ = (
Index("ix_approval_status", "status"),
Index("ix_approval_risk_level", "risk_level"),
Index("ix_approval_created_at", "created_at"),
Index("ix_approval_requested_by", "requested_by"),
Index("ix_approval_fingerprint", "fingerprint"), # 戰略 B: 指紋查詢優化
# 2026-04-25 db-expert-fix by Claude Engineer-B: 改為 partial index只索引非 NULL 值
# 原 full index 與 index=True 三重宣告衝突已修復(一個來源真相:此處)
Index(
"ix_approval_matched_playbook",
"matched_playbook_id",
postgresql_where=text("matched_playbook_id IS NOT NULL"),
),
# 2026-04-26 P2-DB-Fix by Claude — db-expert P0 三修P0.3: P2 DecisionFusion 欄位
# partial indexfusion fill rate 預期 <50%,只索引有值的行
Index(
"ix_approval_composite_score",
"composite_score",
postgresql_where=text("composite_score IS NOT NULL"),
),
Index(
"ix_approval_complexity_tier",
"complexity_tier",
postgresql_where=text("complexity_tier IS NOT NULL"),
),
CheckConstraint(
"complexity_tier IN ('low','medium','high','critical') OR complexity_tier IS NULL",
name="chk_complexity_tier",
),
)
# =============================================================================
# AuditLog - 稽核日誌
# =============================================================================
class TimelineEvent(Base):
"""
時間軸事件 - Phase 4 Action Timeline
事件類型:
- system: 系統告警接收
- agent: OpenClaw AI 分析
- security: 權限阻擋
- human: 人類授權
- exec: 執行完成
"""
__tablename__ = "timeline_events"
# Primary Key
id: Mapped[str] = mapped_column(
String(36),
primary_key=True,
default=generate_uuid,
)
# Event Type & Status
event_type: Mapped[str] = mapped_column(
String(20),
nullable=False,
comment="system, agent, security, human, exec",
)
status: Mapped[str] = mapped_column(
String(20),
nullable=False,
default="info",
comment="info, success, warning, error",
)
# Content
title: Mapped[str] = mapped_column(String(500), nullable=False)
description: Mapped[str | None] = mapped_column(Text, nullable=True)
# Actor
actor: Mapped[str | None] = mapped_column(String(100), nullable=True)
actor_role: Mapped[str | None] = mapped_column(String(50), nullable=True)
# Context
risk_level: Mapped[str | None] = mapped_column(String(20), nullable=True)
approval_id: Mapped[str | None] = mapped_column(String(36), nullable=True, index=True)
# P1.6 fix 2026-04-24 ogt + Claude Sonnet 4.6: pre_decision_investigator raw SQL 寫不存在欄位
# 原本 INSERT INTO timeline_events (incident_id, ...) 失敗 → 每天+1 錯誤靜默吞
incident_id: Mapped[str | None] = mapped_column(
String(64),
nullable=True,
index=True,
comment="關聯的 Incident IDMCP 事件稽核用)",
)
# Timestamp
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
)
# Indexes
__table_args__ = (
Index("ix_timeline_event_type", "event_type"),
Index("ix_timeline_created_at", "created_at"),
Index("ix_timeline_incident_id", "incident_id"), # P1.6 fix
)
class AuditLog(Base):
"""
稽核日誌 - 記錄所有執行結果
每次 K8s 操作完成後寫入一筆記錄
"""
__tablename__ = "audit_logs"
# Primary Key
id: Mapped[str] = mapped_column(
String(36),
primary_key=True,
default=generate_uuid,
)
# Reference to Approval
approval_id: Mapped[str] = mapped_column(
String(36),
nullable=False,
index=True,
)
# Operation Details
operation_type: Mapped[str] = mapped_column(
String(50),
nullable=False,
comment="e.g., RESTART_DEPLOYMENT, DELETE_POD",
)
target_resource: Mapped[str] = mapped_column(
String(200),
nullable=False,
comment="e.g., deployment/api-backend, pod/nginx-xxx",
)
namespace: Mapped[str] = mapped_column(
String(63),
default="default",
nullable=False,
)
# AwoooP Phase 2.3 (2026-05-04 ogt): 多租戶隔離欄位,配合 Batch 1 RLS migration
project_id: Mapped[str] = mapped_column(
String(64),
default="awoooi",
nullable=False,
index=True,
)
# Execution Result
success: Mapped[bool] = mapped_column(default=False, nullable=False)
error_message: Mapped[str | None] = mapped_column(Text, nullable=True)
# K8s Response (Raw)
k8s_response: Mapped[dict[str, Any] | None] = mapped_column(
JSON,
nullable=True,
comment="Raw Kubernetes API response",
)
# Execution Context
executed_by: Mapped[str] = mapped_column(
String(100),
nullable=False,
comment="Who triggered the execution",
)
execution_duration_ms: Mapped[int | None] = mapped_column(
Integer,
nullable=True,
comment="Execution time in milliseconds",
)
# Dry-Run Result (pre-execution validation)
dry_run_passed: Mapped[bool] = mapped_column(
default=True,
nullable=False,
)
dry_run_message: Mapped[str | None] = mapped_column(Text, nullable=True)
# ==========================================================================
# Phase 18: 失敗自動修復閉環欄位 (2026-03-26)
# ==========================================================================
# 授權來源追蹤
authorization_channel: Mapped[str | None] = mapped_column(
String(20),
nullable=True,
comment="Authorization source: web, telegram, auto",
)
# 重試與修復追蹤
retry_count: Mapped[int] = mapped_column(
Integer,
default=0,
nullable=False,
comment="Number of retry attempts",
)
failure_classification: Mapped[str | None] = mapped_column(
String(50),
nullable=True,
comment="Failure type: TIMEOUT, K8S_ERROR, NETWORK_ERROR, PERMISSION_DENIED",
)
source_approval_id: Mapped[str | None] = mapped_column(
String(36),
nullable=True,
index=True,
comment="Original approval ID if this is a repair attempt",
)
# 自動修復狀態
auto_repair_attempted: Mapped[bool] = mapped_column(
default=False,
nullable=False,
comment="Whether auto-repair was attempted",
)
auto_repair_result: Mapped[str | None] = mapped_column(
Text,
nullable=True,
comment="Auto-repair result: AI analysis and repair outcome",
)
# Timestamps
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
)
# Indexes
__table_args__ = (
Index("ix_audit_approval_id", "approval_id"),
Index("ix_audit_operation_type", "operation_type"),
Index("ix_audit_success", "success"),
Index("ix_audit_created_at", "created_at"),
Index("ix_audit_authorization_channel", "authorization_channel"), # Phase 18
Index("ix_audit_failure_classification", "failure_classification"), # Phase 18
)
# =============================================================================
# MCP Audit / Snapshots — 2026-05-01 ghost-loop + MCP governance
# =============================================================================
class MCPAuditLog(Base):
"""Durable audit trail for every MCP tool call."""
__tablename__ = "mcp_audit_log"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
session_id: Mapped[str] = mapped_column(String(36), nullable=False)
flywheel_node: Mapped[str | None] = mapped_column(String(20), nullable=True)
mcp_server: Mapped[str] = mapped_column(String(80), nullable=False)
tool_name: Mapped[str] = mapped_column(String(120), nullable=False)
input_params: Mapped[dict | None] = mapped_column(JSONB, nullable=True)
output_result: Mapped[dict | list | str | None] = mapped_column(JSONB, nullable=True)
duration_ms: Mapped[int | None] = mapped_column(Integer, nullable=True)
success: Mapped[bool | None] = mapped_column(Boolean, nullable=True)
error_message: Mapped[str | None] = mapped_column(Text, nullable=True)
incident_id: Mapped[str | None] = mapped_column(String(64), nullable=True)
agent_role: Mapped[str | None] = mapped_column(String(40), nullable=True)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now)
__table_args__ = (
Index("idx_mcp_audit_session", "session_id"),
Index("idx_mcp_audit_incident", "incident_id"),
Index("idx_mcp_audit_node", "flywheel_node", "created_at"),
Index("idx_mcp_audit_server_tool", "mcp_server", "tool_name", "created_at"),
Index("idx_mcp_audit_agent_role", "agent_role", "created_at"),
)
class MCPDailyStats(Base):
"""Daily aggregate for MCP provider/tool success rate and latency."""
__tablename__ = "mcp_daily_stats"
date: Mapped[datetime] = mapped_column(Date, primary_key=True)
mcp_server: Mapped[str] = mapped_column(String(80), primary_key=True)
tool_name: Mapped[str] = mapped_column(String(120), primary_key=True)
call_count: Mapped[int] = mapped_column(Integer, default=0, nullable=False)
success_count: Mapped[int] = mapped_column(Integer, default=0, nullable=False)
avg_duration_ms: Mapped[float | None] = mapped_column(Float, nullable=True)
class K8sStateSnapshot(Base):
"""Pre/post Kubernetes resource state snapshots for remediation verification."""
__tablename__ = "k8s_state_snapshots"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
incident_id: Mapped[str | None] = mapped_column(String(64), nullable=True)
snapshot_type: Mapped[str] = mapped_column(String(40), nullable=False)
namespace: Mapped[str | None] = mapped_column(String(63), nullable=True)
resource_type: Mapped[str | None] = mapped_column(String(80), nullable=True)
resource_name: Mapped[str | None] = mapped_column(String(253), nullable=True)
state_json: Mapped[dict | None] = mapped_column(JSONB, nullable=True)
captured_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now)
__table_args__ = (
Index("idx_k8s_snapshot_incident", "incident_id"),
Index("idx_k8s_snapshot_resource", "namespace", "resource_type", "resource_name"),
Index("idx_k8s_snapshot_captured", "captured_at"),
)
class PrometheusSnapshot(Base):
"""Prometheus query snapshots for detect/verify flywheel stages."""
__tablename__ = "prometheus_snapshots"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
incident_id: Mapped[str | None] = mapped_column(String(64), nullable=True)
query: Mapped[str] = mapped_column(Text, nullable=False)
result_json: Mapped[dict | list | str | None] = mapped_column(JSONB, nullable=True)
snapshot_type: Mapped[str | None] = mapped_column(String(40), nullable=True)
captured_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now)
__table_args__ = (
Index("idx_prom_snapshot_incident", "incident_id"),
Index("idx_prom_snapshot_type", "snapshot_type", "captured_at"),
)
# =============================================================================
# AutoRepairExecution - Phase 10 操作記錄
# 2026-04-08 Claude Code: 統帥指令「所有操作都必須被記錄,寫入資料庫」
# =============================================================================
class AutoRepairExecution(Base):
"""
自動修復執行記錄
每次 evaluate_auto_repair 觸發並執行 (成功或失敗) 都寫入此表。
不依賴 approval_id自動修復不需人工批准
"""
__tablename__ = "auto_repair_executions"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)
# 關聯
incident_id: Mapped[str] = mapped_column(String(30), nullable=False, index=True)
playbook_id: Mapped[str] = mapped_column(String(36), nullable=False, index=True)
playbook_name: Mapped[str] = mapped_column(String(200), nullable=False)
# 執行結果
success: Mapped[bool] = mapped_column(default=False, nullable=False)
executed_steps: Mapped[list] = mapped_column(JSON, default=list, nullable=False)
error_message: Mapped[str | None] = mapped_column(Text, nullable=True)
# 執行上下文
triggered_by: Mapped[str] = mapped_column(
String(50), default="auto_repair", nullable=False,
comment="auto_repair / cold_start_trust",
)
similarity_score: Mapped[float | None] = mapped_column(nullable=True)
risk_level: Mapped[str | None] = mapped_column(String(20), nullable=True)
execution_time_ms: Mapped[int | None] = mapped_column(Integer, nullable=True)
# 時間戳 (台北時區)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now)
__table_args__ = (
Index("ix_are_created_at", "created_at"),
Index("ix_are_success", "success"),
)
# =============================================================================
# AlertOperationLog - Phase 11 告警操作溯源 (Event Sourcing)
# 2026-04-08 Claude Code: 統帥指令「所有操作都必須被記錄,寫入資料庫」
# 不可變 — 只 INSERT不 UPDATE/DELETE
# =============================================================================
class AlertOperationLog(Base):
"""
告警操作完整溯源
Event Sourcing 模式:每個告警生命週期的每個事件都寫一筆。
不可變 (Immutable)。
event_type 值:
ALERT_RECEIVED / TELEGRAM_SENT / USER_ACTION /
AUTO_REPAIR_TRIGGERED / EXECUTION_STARTED / EXECUTION_COMPLETED /
TELEGRAM_RESULT_SENT / RESOLVED / SILENCED / ESCALATED
"""
__tablename__ = "alert_operation_log"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)
# 關聯 (允許 NULL不同事件有不同關聯)
incident_id: Mapped[str | None] = mapped_column(String(30), nullable=True, index=True)
approval_id: Mapped[str | None] = mapped_column(String(36), nullable=True, index=True)
audit_log_id: Mapped[str | None] = mapped_column(String(36), nullable=True)
auto_repair_id: Mapped[str | None] = mapped_column(String(36), nullable=True)
# 事件核心
# 2026-04-08 Claude Sonnet 4.6: Sprint 5.1 — 修正 enum 型別不符 (String→PgEnum, create_type=False)
event_type: Mapped[str] = mapped_column(
PgEnum(
"ALERT_RECEIVED", "TELEGRAM_SENT", "USER_ACTION", "AUTO_REPAIR_TRIGGERED",
"EXECUTION_STARTED", "EXECUTION_COMPLETED", "TELEGRAM_RESULT_SENT",
"RESOLVED", "SILENCED", "ESCALATED", "GUARDRAIL_BLOCKED",
"PRE_FLIGHT_PASSED", "PRE_FLIGHT_FAILED", "BACKUP_TRIGGERED",
"BACKUP_COMPLETED", "BACKUP_FAILED", "APPROVAL_ESCALATED", "CHANGE_APPLIED",
name="alert_event_type", create_type=False,
),
nullable=False, index=True,
)
actor: Mapped[str | None] = mapped_column(String(100), nullable=True, index=True)
action_detail: Mapped[str | None] = mapped_column(String(200), nullable=True)
# 執行結果 (NULL = 不適用)
success: Mapped[bool | None] = mapped_column(nullable=True)
error_message: Mapped[str | None] = mapped_column(Text, nullable=True)
# 結構化上下文
context: Mapped[dict] = mapped_column(JSON, default=dict, nullable=False)
# 時間戳 (台北時區,不可變)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now)
__table_args__ = (
Index("ix_aol_created_at", "created_at"),
)
# =============================================================================
# IncidentRecord - Phase 6.2 Episodic Memory (PostgreSQL)
# =============================================================================
class IncidentRecord(Base):
"""
事件記錄 - 對應 Pydantic Incident Schema v0.3
Phase 6.2: Episodic Memory (長期記憶)
- 從 Working Memory (Redis) 遷移過來
- 永久保留,供 RAG 檢索
- 複雜結構使用 JSONB 欄位
三層記憶架構:
- Working Memory (Redis): 7 天 TTL
- Episodic Memory (PostgreSQL): 此表,永久保留
- Semantic Memory (Vector DB): Phase 6.3+
"""
__tablename__ = "incidents"
# === 主鍵 ===
incident_id: Mapped[str] = mapped_column(
String(30),
primary_key=True,
comment="事件唯一識別碼 (如 INC-20260322-A1B2C3)",
)
# AwoooP Phase 2.3 (2026-05-04 ogt): 多租戶隔離欄位,配合 Batch 1 RLS migration
project_id: Mapped[str] = mapped_column(
String(64),
default="awoooi",
nullable=False,
index=True,
)
# === 狀態與嚴重度 ===
status: Mapped[str] = mapped_column(
SQLEnum(IncidentStatus),
default=IncidentStatus.INVESTIGATING,
nullable=False,
comment="事件狀態 (investigating, mitigating, resolved, closed, escalated)",
)
severity: Mapped[str] = mapped_column(
SQLEnum(Severity),
nullable=False,
comment="事件嚴重度 (P0, P1, P2, P3)",
)
# === 感知層 (Signals) - JSONB ===
signals: Mapped[list[dict[str, Any]]] = mapped_column(
JSON,
default=list,
nullable=False,
comment="關聯的告警信號列表 (JSONB)",
)
affected_services: Mapped[list[str]] = mapped_column(
JSON,
default=list,
nullable=False,
comment="受影響的服務列表",
)
# === 認知層 (AI Decision Chain) - JSONB ===
decision_chain: Mapped[dict[str, Any] | None] = mapped_column(
JSON,
nullable=True,
comment="AI 決策鏈 (完整推論過程)",
)
# === 決策層 (Proposals) ===
proposal_ids: Mapped[list[str]] = mapped_column(
JSON,
default=list,
nullable=False,
comment="關聯的 ApprovalRequest ID 列表",
)
# === 結果層 (Outcome) - JSONB ===
outcome: Mapped[dict[str, Any] | None] = mapped_column(
JSON,
nullable=True,
comment="事件結果與人類回饋",
)
# === ADR-073 Phase 2 欄位 (2026-04-12 ogt) ===
alertname: Mapped[str | None] = mapped_column(
String(100),
nullable=True,
comment="告警名稱 (從 signals labels 抽取)",
)
notification_type: Mapped[str | None] = mapped_column(
String(10),
nullable=True,
comment="通知類型 TYPE-1/2/3/4/4D (早期分診)",
)
alert_category: Mapped[str | None] = mapped_column(
String(50),
nullable=True,
comment="告警類別 config_drift/info/backup/infrastructure/kubernetes/database/general",
)
# === 頻率快照 (Phase 27, 2026-04-10 ogt) ===
# frequency_stats 原本只存記憶體/Redis(TTL=35天)Pod重啟或超期即失
# 此欄位在 incident 建立時寫入快照,永久保存當時的頻率統計
frequency_snapshot: Mapped[dict[str, Any] | None] = mapped_column(
JSON,
nullable=True,
comment="建立時刻的 AnomalyFrequency 快照,永久保存 (Phase 27)",
)
# === 時間軸 ===
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
nullable=False,
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
onupdate=taipei_now,
nullable=False,
)
resolved_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
closed_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
# === 記憶管理 ===
ttl_days: Mapped[int] = mapped_column(
Integer,
default=7,
nullable=False,
comment="Working Memory TTL (天)",
)
vectorized: Mapped[bool] = mapped_column(
default=False,
nullable=False,
comment="是否已向量化到 Vector DB (Semantic Memory)",
)
# === 索引 ===
__table_args__ = (
Index("ix_incident_status", "status"),
Index("ix_incident_severity", "severity"),
Index("ix_incident_created_at", "created_at"),
Index("ix_incident_resolved_at", "resolved_at"),
)
# =============================================================================
# KnowledgeEntry - Knowledge Base Phase 1
# =============================================================================
class KnowledgeEntryRecord(Base):
"""
知識庫條目 - Knowledge Base Phase 1
兩層架構:
- KnowledgeEntry: 知識條目 (此表)
- Playbook: 獨立 Redis透過 related_playbook_id 關聯
建立時間: 2026-04-02 (台北時區)
建立者: Claude Code (Knowledge Base Phase 1)
"""
__tablename__ = "knowledge_entries"
# Primary Key
id: Mapped[str] = mapped_column(
String(36),
primary_key=True,
default=generate_uuid,
)
# AwoooP Phase 2.3 (2026-05-04 ogt): 多租戶隔離欄位,配合 Batch 1 RLS migration
project_id: Mapped[str] = mapped_column(
String(64),
default="awoooi",
nullable=False,
index=True,
)
# Core Fields
title: Mapped[str] = mapped_column(String(255), nullable=False)
content: Mapped[str] = mapped_column(Text, nullable=False)
entry_type: Mapped[str] = mapped_column(
SQLEnum(EntryType),
nullable=False,
comment="incident_case / runbook / best_practice / postmortem",
)
category: Mapped[str] = mapped_column(
String(100),
nullable=False,
comment="分類樹節點 (基礎設施/應用層/AI系統/安全合規)",
)
tags: Mapped[list[str]] = mapped_column(
JSON,
default=list,
nullable=False,
comment="標籤列表 (JSONB string array)",
)
# Source & Status
source: Mapped[str] = mapped_column(
SQLEnum(EntrySource),
nullable=False,
comment="ai_extracted / human",
)
status: Mapped[str] = mapped_column(
SQLEnum(EntryStatus),
default=EntryStatus.DRAFT,
nullable=False,
comment="draft / review / approved / archived",
)
# Relations (soft references, not FK)
related_incident_id: Mapped[str | None] = mapped_column(
String(30),
nullable=True,
comment="關聯 Incident ID",
)
related_playbook_id: Mapped[str | None] = mapped_column(
String(255),
nullable=True,
comment="關聯 Playbook Redis Key",
)
# 2026-04-04 ogt: Phase 25 P1 — Anti-Pattern 閉環攔截用症狀 hash (SymptomPattern.compute_hash())
symptoms_hash: Mapped[str | None] = mapped_column(
String(16),
nullable=True,
comment="症狀模式 hash (16字元 SHA256 前綴)Anti-Pattern 閉環攔截使用",
)
# P1-1 2026-04-28 ogt + Claude Sonnet 4.6: M4 補反查鏈
# phase26_incident_km_integration.sql 已建立欄位與 partial index
# KMWriter.write() 會自動填入並回填 Path A 條目approval → KM 雙向追蹤)
related_approval_id: Mapped[str | None] = mapped_column(
String(36),
nullable=True,
comment="關聯 ApprovalRequest IDP1-1 反查鏈修復approval → KM 追蹤)",
)
# P1-1 M3 2026-04-28 ogt + Claude Sonnet 4.6: 冪等 key 的一部分
# migration: p1_1_km_idempotent_path_type.sql
# unique index: uix_knowledge_incident_path (related_incident_id, path_type) WHERE both NOT NULL
path_type: Mapped[str | None] = mapped_column(
String(50),
nullable=True,
comment="KMWriter 路徑類型,與 related_incident_id 構成冪等 key",
)
# Metrics
view_count: Mapped[int] = mapped_column(
Integer,
default=0,
nullable=False,
)
# Metadata
created_by: Mapped[str | None] = mapped_column(String(100), nullable=True)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=taipei_now,
onupdate=taipei_now,
)
# Indexes
__table_args__ = (
Index("ix_knowledge_entry_type", "entry_type"),
Index("ix_knowledge_category", "category"),
Index("ix_knowledge_status", "status"),
Index("ix_knowledge_created_at", "created_at"),
# 2026-04-04 ogt: Phase 25 P1 — Anti-Pattern 快速查詢
Index("ix_knowledge_symptoms_hash", "symptoms_hash"),
# P1-1 2026-04-28 ogt + Claude Sonnet 4.6: M4 反查鏈 partial index配合 phase26 migration
Index(
"ix_knowledge_related_approval",
"related_approval_id",
postgresql_where=text("related_approval_id IS NOT NULL"),
),
# P1-1 M3 2026-04-28 ogt + Claude Sonnet 4.6: 冪等 unique index
# migration: p1_1_km_idempotent_path_type.sql
Index(
"uix_knowledge_incident_path",
"related_incident_id",
"path_type",
unique=True,
postgresql_where=text(
"related_incident_id IS NOT NULL AND path_type IS NOT NULL"
),
),
)
# IncidentEvidence — ADR-081 Phase 1 EvidenceSnapshot 持久化
# 2026-04-15 ogt + Claude Sonnet 4.6: AI 自主化飛輪 Phase 1 初始建立
class IncidentEvidence(Base):
"""
不可變事件證據快照表
每次決策前 PreDecisionInvestigator 拍攝一次 EvidenceSnapshot
寫入此表以供:
- 決策溯源LLM 推理過程的完整情報上下文)
- 學習訓練Phase 3 fine-tune pipeline 金礦資料)
- 異常驗證(執行前 vs 執行後 state diff
ADR-081: PreDecisionInvestigator + EvidenceSnapshot
設計原則:只追加寫入,禁止 UPDATEevent sourcing 對齊)
"""
__tablename__ = "incident_evidence"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)
# 關聯
incident_id: Mapped[str] = mapped_column(String(30), nullable=False) # index via __table_args__
# Phase 3 填充matched_playbook_id 目前永久 nullPhase 3 修復
matched_playbook_id: Mapped[str | None] = mapped_column(String(36), nullable=True)
# Schema 版本(方便 fine-tune pipeline 過濾相容版本)
schema_version: Mapped[str] = mapped_column(String(10), default="v1", nullable=False)
# 8D 感官數據(各維度 nullable — MCP 失敗時部分缺失)
k8s_state: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="D1: kubectl describe pod + events"
)
recent_logs: Mapped[str | None] = mapped_column(
Text, nullable=True, comment="D2: container stderr tail-50經 SanitizationService 清洗"
)
metrics_snapshot: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="D3: Prometheus 5min vs 1h baseline 對比"
)
recent_deployments: Mapped[list | None] = mapped_column(
JSON, nullable=True, comment="D4: ArgoCD/Gitea 過去 1h 部署 diff"
)
business_metrics: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="D5: 訂單量 / 登入成功率 / P0 SLI"
)
historical_context: Mapped[str | None] = mapped_column(
Text, nullable=True, comment="D6: 過去 30 天同 alertname 處置歷史摘要"
)
peer_health: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="D7: 同 Deployment 其他 replica 健康度"
)
dependency_topology: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="D8: Istio/Service Mesh 上下游 latency/error rate"
)
# Phase 4 ADR-084: 動態異常偵測增強感官DynamicBaseline + LogAnomaly + TrendPredictor
# 2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 4 8D 升級
anomaly_context: Mapped[dict | None] = mapped_column(
JSON, nullable=True,
comment="Phase 4 動態異常上下文baseline_anomalies / log_patterns / trend_breaches"
)
# 感官品質指標
mcp_health: Mapped[dict] = mapped_column(
JSON, default=dict, nullable=False,
comment="各 MCP 呼叫成敗 {tool_name: bool},用於 decision_fusion 權重調整"
)
collection_duration_ms: Mapped[int | None] = mapped_column(
Integer, nullable=True, comment="情報蒐集總耗時msP99 目標 < 8000"
)
sensors_attempted: Mapped[int] = mapped_column(
default=0, nullable=False, comment="嘗試啟動的感官數"
)
sensors_succeeded: Mapped[int] = mapped_column(
default=0, nullable=False, comment="成功回傳資料的感官數"
)
# LLM 輸入摘要(不超 8K tokens由 Investigator 壓縮)
evidence_summary: Mapped[str | None] = mapped_column(
Text, nullable=True, comment="最終餵給 LLM 的情報摘要UTF-8< 8K tokens"
)
# 執行前後 StatePostExecutionVerifier 填入 post_execution_state
pre_execution_state: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="執行前環境狀態快照PostExecutionVerifier 基準線)"
)
post_execution_state: Mapped[dict | None] = mapped_column(
JSON, nullable=True, comment="執行後環境狀態PostExecutionVerifier 抓取Phase 1 接線)"
)
verification_result: Mapped[str | None] = mapped_column(
String(20), nullable=True, comment="success / degraded / failed / timeoutPostExecutionVerifier 填入)"
)
# W2 PR-V1: SelfHealingValidator 自愈品質分數 (2026-04-28 ogt + Claude Sonnet 4.6)
# 0.0-1.01.0=完全自愈,<0.5=觸發 rollback 提案Telegram 警示)
# base.py ALTER IF NOT EXISTS 補欄對應下方
self_healing_score: Mapped[float | None] = mapped_column(
Float,
nullable=True,
comment="W2 PR-V1 SelfHealingValidator 自愈品質分數0.0-1.0<0.5 觸發 rollback 提案",
)
self_healing_detail: Mapped[dict | None] = mapped_column(
JSON,
nullable=True,
comment="W2 PR-V1 SelfHealingValidator 評估明細root_cause_cleared/regressions/detail",
)
# 時間戳(台北時區)
collected_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=taipei_now, nullable=False
)
__table_args__ = (
Index("ix_incident_evidence_incident_id", "incident_id"),
Index("ix_incident_evidence_collected_at", "collected_at"),
Index("ix_incident_evidence_playbook_id", "matched_playbook_id"),
)
# =============================================================================
# PlaybookRecord — Phase 3.5 Playbook PostgreSQL 持久化 (System of Record)
# ADR-085: AI 學習成果不可存在 Cache — Playbook 是 AI 的肌肉記憶
# 2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 3.5 初始建立
#
# 核心鐵律:
# - PostgreSQL = System of Record永久保存AI 的長期記憶)
# - Redis = Warm Cache7天 TTL加速讀取DB 為 source of truth
# - trust_score, EWMA, 統計數據必須持久化 — 不能因 Redis TTL 消失
# =============================================================================
class PlaybookRecord(Base):
"""
Playbook 修復劇本 PostgreSQL ORM
與 Pydantic Playbook 模型對應。
Redis 為 warm cache7d TTLPostgreSQL 為 source of truth。
設計原則:
- AI 的學習成果trust_score、success_count、failure_count永久保存
- EWMA 信任度在 Redis TTL 後不會重置Pod 重啟後 AI 記憶不失
- 雙寫create/update 先寫 PG再更新 Redis cache
- 讀取Redis-firstcache hitmiss 時從 PG 載入並回填 Redis
"""
__tablename__ = "playbooks"
# Primary Key
playbook_id: Mapped[str] = mapped_column(
String(36), primary_key=True,
comment="Playbook 唯一識別碼 (PB-YYYYMMDD-XXXXXX)",
)
# AwoooP Phase 2.3 (2026-05-04 ogt): 多租戶隔離欄位,配合 Batch 1 RLS migration
project_id: Mapped[str] = mapped_column(
String(64),
default="awoooi",
nullable=False,
index=True,
)
# Core Fields
name: Mapped[str] = mapped_column(String(256), nullable=False)
description: Mapped[str] = mapped_column(Text, default="", nullable=False)
status: Mapped[str] = mapped_column(String(20), default="draft", nullable=False)
source: Mapped[str] = mapped_column(String(20), default="extracted", nullable=False)
# Complex structures (JSONB)
symptom_pattern: Mapped[dict[str, Any]] = mapped_column(JSON, default=dict, nullable=False)
repair_steps: Mapped[list[dict[str, Any]]] = mapped_column(JSON, default=list, nullable=False)
# Timing
estimated_duration_minutes: Mapped[int] = mapped_column(Integer, default=5, nullable=False)
# Source tracing
source_incident_ids: Mapped[list[str]] = mapped_column(JSON, default=list, nullable=False)
version: Mapped[int] = mapped_column(Integer, default=1, nullable=False)
parent_playbook_id: Mapped[str | None] = mapped_column(String(36), nullable=True, index=True)
supersedes_playbook_id: Mapped[str | None] = mapped_column(String(36), nullable=True, index=True)
version_reason: Mapped[str | None] = mapped_column(Text, nullable=True)
ai_confidence: Mapped[float] = mapped_column(default=0.0, nullable=False)
# Stats — MUST be in PG (AI learning artifacts, cannot expire)
success_count: Mapped[int] = mapped_column(Integer, default=0, nullable=False)
failure_count: Mapped[int] = mapped_column(Integer, default=0, nullable=False)
last_used_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), nullable=True)
# EWMA trust score — ADR-083 Phase 3, 絕對不能用 Redis TTL 管理
# trust_score 是 AI 累積學習的結晶TTL 到期就歸零 = AI 記憶全部消失
trust_score: Mapped[float] = mapped_column(default=0.3, nullable=False,
comment="EWMA 動態信任度 (Phase 3)。成功 α=0.1,失敗 α=0.22x 衰減)。< 0.1 → 封存")
# Approval metadata
approved_by: Mapped[str | None] = mapped_column(String(100), nullable=True)
approved_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), nullable=True)
tags: Mapped[list[str]] = mapped_column(JSON, default=list, nullable=False)
notes: Mapped[str | None] = mapped_column(Text, nullable=True)
# Sprint 5.1 護欄欄位 (2026-04-08)
requires_approval_level: Mapped[str] = mapped_column(
String(20), default="auto", nullable=False,
comment="auto=直接執行, standard=1票, critical=2票MultiSig",
)
stateful_targets: Mapped[list[str]] = mapped_column(JSON, default=list, nullable=False)
requires_pre_backup: Mapped[bool] = mapped_column(default=False, nullable=False)
# W2 PR-L1 2026-04-28 ogt + Claude Sonnet 4.6: KM→Playbook 互饋回路(飛輪 C3 修復)
# 同 symptom_pattern_hash 累積 N=5 條 KM 後LearningService 自動設 True
# 人工 review 後可重設為 False由 playbook_service 負責清除)
review_required: Mapped[bool] = mapped_column(
Boolean, default=False, nullable=False,
comment="W2 PR-L1: True=KM 累積觸發人工複審信號symptom_hash≥5 條review 後清為 False",
)
# Timestamps
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now, nullable=False)
updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now,
onupdate=taipei_now, nullable=False)
__table_args__ = (
Index("ix_playbook_status", "status"),
Index("ix_playbook_trust_score", "trust_score"),
Index("ix_playbook_created_at", "created_at"),
Index("ix_playbook_lineage", "parent_playbook_id", "version"),
# W2 PR-L1: 快速查詢需要人工 review 的 Playbook預期數量少partial index 最省空間)
Index(
"ix_playbook_review_required",
"review_required",
postgresql_where=text("review_required = true"),
),
)
# =============================================================================
# DynamicBaselineRecord — Phase 4 Holt-Winters 訓練基線持久化
# ADR-084: 動態基線不能只存 Redis — AI 每天重學「正常」不是在學習
# 2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 4 初始建立
#
# 核心鐵律:
# - 訓練好的 Holt-Winters 模型必須在 PG 長期保存
# - Redis 為 24h warm cache加速 is_anomaly() 讀取)
# - 基線消失 = AI 對「正常」的認識消失 = 每天從頭學習 = 不是 AI
# =============================================================================
class DynamicBaselineRecord(Base):
"""
動態基線訓練結果 PostgreSQL ORM
Holt-Winters 訓練完成後:
1. 先寫入 PG永久保存
2. 再寫入 Redis24h warm cache加速讀取
Redis key: baseline:{metric_name}
PG: 此表metric_name 為主鍵,最新一筆 = 有效基線
"""
__tablename__ = "dynamic_baselines"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)
# 基線識別
metric_name: Mapped[str] = mapped_column(
String(200), nullable=False, index=True,
comment="基線識別名 (e.g. cpu_usage_node_mon)",
)
# 訓練結果Holt-Winters 統計)
mean: Mapped[float] = mapped_column(nullable=False, comment="擬合值均值")
std: Mapped[float] = mapped_column(nullable=False, comment="殘差標準差")
# 24h 季節性因子JSON 陣列,長度 24
seasonal_factors: Mapped[list[float]] = mapped_column(
JSON, default=list, nullable=False,
comment="24h 週期季節性因子(乘法形式,均值 ≈ 1.0",
)
# 訓練元資料
datapoint_count: Mapped[int] = mapped_column(Integer, default=0, nullable=False)
promql: Mapped[str] = mapped_column(Text, default="", nullable=False,
comment="訓練使用的 PromQL 查詢")
lookback_hours: Mapped[int] = mapped_column(Integer, default=336, nullable=False)
# Timestamps
trained_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now, nullable=False)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now, nullable=False)
__table_args__ = (
Index("ix_dynamic_baseline_metric", "metric_name"),
Index("ix_dynamic_baseline_trained_at", "trained_at"),
)
# =============================================================================
# LogClusterRecord — Phase 4 Drain3 學習到的 Log Pattern 持久化
# ADR-084: Drain3 模板不能只存 Redis — 每次重啟 AI 把已知 pattern 當新 pattern
# 2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 4 初始建立
#
# 核心鐵律:
# - Drain3 學到的 log cluster template 必須在 PG 長期保存
# - 新 cluster 事件列表 (log_anomaly:new) 才存 Redis短期工作記憶
# - 基礎知識庫(已學到的 pattern必須在 PG
# =============================================================================
class LogClusterRecord(Base):
"""
Drain3 Log Cluster Template 持久化
每個新 pattern 首次偵測到時:
1. 寫入 PG永久保存AI 的 log 語意理解)
2. 推送到 Redis list log_anomaly:new短期工作記憶
Re-detect 相同 template 時只更新 last_seen_at + size不重複寫入 PG。
"""
__tablename__ = "log_clusters"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=generate_uuid)
# Cluster 識別MD5[:8] of template
cluster_id: Mapped[str] = mapped_column(
String(16), nullable=False, unique=True, index=True,
comment="模板 MD5[:8].upper(),穩定 ID",
)
# Drain3 模板
template: Mapped[str] = mapped_column(
Text, nullable=False,
comment="Drain3 萃取的 log 模板 (e.g. 'ERROR <*> connection failed to <*>')",
)
# 統計
size: Mapped[int] = mapped_column(Integer, default=1, nullable=False,
comment="命中次數(第一次 = 1")
source: Mapped[str] = mapped_column(String(50), default="k8s_pod", nullable=False,
comment="k8s_pod | host_syslog | app_log")
# 樣本日誌(保留首次觸發的原始行,供事後分析)
sample_log: Mapped[str | None] = mapped_column(Text, nullable=True,
comment="首次觸發的原始 log 行(前 500 字元)")
# Timestamps
first_seen_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now, nullable=False)
last_seen_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=taipei_now,
onupdate=taipei_now, nullable=False)
__table_args__ = (
Index("ix_log_cluster_first_seen", "first_seen_at"),
Index("ix_log_cluster_source", "source"),
)
# =============================================================================
# AgentSession — Phase 2 多 Agent 辯證 Audit Trail
# =============================================================================
class AgentSession(Base):
"""
ADR-082 Phase 2: 多 Agent 辯證 Immutable Event Log
每個 Agent 每次「發言」寫一行。
session_id 串連同一次 Incident 決策的所有 Agent turns。
不可刪除 — 只能新增Immutable Event Sourcing
Phase 3 學習閉環依賴此表Critic 挑戰成功作為負向學習信號)。
ADR-082: 多 Agent 協作架構
2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 2 初始建立
"""
__tablename__ = "agent_sessions"
id: Mapped[str] = mapped_column(
String(36), primary_key=True, default=lambda: str(uuid4()),
comment="行主鍵UUID"
)
session_id: Mapped[str] = mapped_column(
String(36), nullable=False,
comment="辯證 Session ID一次 Incident 決策的所有 turns 共用同一 session_id"
)
incident_id: Mapped[str] = mapped_column(
String(50), nullable=False,
comment="關聯 Incident ID"
)
agent_role: Mapped[str] = mapped_column(
String(20), nullable=False,
comment="Agent 角色diagnostician / solver / reviewer / critic / coordinator"
)
# 輸入指紋sha256[:16])— 用於查重、快取命中追蹤
input_hash: Mapped[str] = mapped_column(
String(16), nullable=False, default="",
comment="sha256(input_json)[:16],供查重與快取命中追蹤"
)
# Agent 輸出(完整 JSON供 Phase 3 學習 + 事後複盤)
output_json: Mapped[dict] = mapped_column(
JSON, nullable=False, default=dict,
comment="Agent 原始輸出DiagnosisReport / ActionPlan / 等序列化 dict"
)
# 品質指標
latency_ms: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
comment="此 Agent 的執行耗時ms"
)
vote: Mapped[str] = mapped_column(
String(20), nullable=False, default="abstain",
comment="Agent 投票approve / reject / request_revision / abstain / degraded"
)
degraded: Mapped[bool] = mapped_column(
nullable=False, default=False,
comment="True = 此 Agent 因熔斷/超時降級,輸出為 rule-based mock"
)
# 時間戳(台北時區)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=taipei_now, nullable=False
)
__table_args__ = (
Index("ix_agent_sessions_session_id", "session_id"),
Index("ix_agent_sessions_incident_id", "incident_id"),
Index("ix_agent_sessions_created_at", "created_at"),
# 查詢某 session 中特定 role 的 turnCoordinator 聚合時常用)
Index("ix_agent_sessions_session_role", "session_id", "agent_role"),
)
# =============================================================================
# AiGovernanceEvent — Phase 6 自我治理事件溯源(不可刪除)
# ADR-087: AI 自我治理閉環SLO 違反 / 信任漂移 / KB 腐爛 / 自我降級
# 2026-04-15 ogt + Claude Sonnet 4.6(亞太): Phase 6 初始建立
#
# 核心鐵律:
# - 不可變 Event Sourcing — 只 INSERT禁止 UPDATE/DELETE
# - 所有治理事件必須落地 PGSLO dashboard 依賴此表
# - resolved=True 僅由人工或下次計算時補填,不可自動翻轉未解決項目
# =============================================================================
class AiGovernanceEvent(Base):
"""
AI 自我治理事件記錄(不可變)
event_type 值:
slo_violation — SLO 計算結果違反閾值
trust_drift — Playbook 信任度分布偏態(全高或全低)
kb_stale — KB 條目引用已廢棄 K8s API / Prometheus query
self_demotion — 信心閾值自動調高(自我降級)
conservative_mode — 連續 SLO 違反,全系統切保守模式
replay_degraded — 離線回放一致率連續下降
immutable — 只 INSERT禁 UPDATE / DELETE
"""
__tablename__ = "ai_governance_events"
id: Mapped[str] = mapped_column(
String(36), primary_key=True, default=generate_uuid,
comment="主鍵UUID"
)
event_type: Mapped[str] = mapped_column(
String(40), nullable=False,
comment="slo_violation / trust_drift / kb_stale / self_demotion / conservative_mode / replay_degraded"
)
triggered_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=taipei_now, nullable=False,
comment="事件觸發時間(台北時區)"
)
details: Mapped[dict] = mapped_column(
JSON, nullable=False, default=dict,
comment="事件詳情 JSONBSLO 數值、漂移分布等)"
)
resolved: Mapped[bool] = mapped_column(
default=False, nullable=False,
comment="是否已解決(人工確認或下次計算恢復正常後補填)"
)
resolved_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True,
comment="解決時間(僅人工/系統補填,不得自動反轉未解決項目)"
)
__table_args__ = (
Index("ix_ai_governance_event_type", "event_type"),
Index("ix_ai_governance_triggered_at", "triggered_at"),
Index("ix_ai_governance_resolved", "resolved"),
)
# =============================================================================
# GovernanceRemediationDispatch — Wave 2 D 治理修復派遣表
# 2026-05-03 ogt + Claude Sonnet 4.6(亞太): db-expert spec 實作
#
# 設計原則:
# - 失敗重試 → INSERT 新 rowattempt_count+1不改舊 row審計痕跡
# - partial unique index同 event_id 不可同時有 2 筆活躍)→ migration SQL 宣告
# - 狀態機合法轉換由 Repository 層強制驗證
# =============================================================================
class GovernanceRemediationDispatch(Base):
"""
治理事件修復派遣記錄
將 5 種治理事件trust_drift / knowledge_degradation / llm_hallucination /
execution_blast_radius / governance_slo_data_gap接到修復執行器。
狀態機:
pending → dispatched | skipped | cancelled
dispatched → executing | failed | cancelled
executing → succeeded | failed | cancelled
failed → pending僅當 attempt < max_attempts且 INSERT 新 row舊 row 留 failed
succeeded / cancelled / skippedterminal
重試策略INSERT 新 rowaudit trail舊 row 保留 failed 狀態不可更改。
"""
__tablename__ = "governance_remediation_dispatch"
id: Mapped[str] = mapped_column(
String(36), primary_key=True, default=generate_uuid,
comment="主鍵UUID"
)
governance_event_id: Mapped[str] = mapped_column(
String(36),
ForeignKey("ai_governance_events.id", ondelete="RESTRICT"),
nullable=False,
index=True,
comment="關聯的治理事件 IDRESTRICT 禁止孤兒事件)"
)
event_type: Mapped[str] = mapped_column(
PgEnum(
"trust_drift", "knowledge_degradation", "llm_hallucination",
"execution_blast_radius", "governance_slo_data_gap",
name="governance_event_type", create_type=False,
),
nullable=False,
comment="治理事件類型(來自 ai_governance_events"
)
dispatch_status: Mapped[str] = mapped_column(
PgEnum(
"pending", "dispatched", "executing",
"succeeded", "failed", "skipped", "cancelled",
name="governance_dispatch_status", create_type=False,
),
nullable=False,
default="pending",
comment="派遣狀態機pending 為初始)"
)
playbook_id: Mapped[str | None] = mapped_column(
String(36),
ForeignKey("playbooks.playbook_id", ondelete="SET NULL"),
nullable=True,
index=True,
comment="關聯 Playbook可選未匹配時 NULL"
)
incident_id: Mapped[str | None] = mapped_column(
String(30),
ForeignKey("incidents.incident_id", ondelete="SET NULL"),
nullable=True,
index=True,
comment="關聯 Incident可選治理事件觸發的修復可無 incident"
)
approval_id: Mapped[str | None] = mapped_column(
String(36),
ForeignKey("approval_records.id", ondelete="SET NULL"),
nullable=True,
comment="關聯授權記錄(需人工審核時填入)"
)
decision_context: Mapped[dict] = mapped_column(
JSON, nullable=False, default=dict,
comment="派遣決策上下文 JSONBDecisionContextV1 schema 驗證後寫入)"
)
executor_type: Mapped[str] = mapped_column(
String(80), nullable=False,
comment="執行器類型(如 playbook_executor / manual / slo_repair"
)
attempt_count: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
comment="本 row 的嘗試次數(失敗重試時新 row attempt_count = 上筆 +1"
)
max_attempts: Mapped[int] = mapped_column(
Integer, nullable=False, default=3,
comment="最大重試次數上限(含首次)"
)
last_error: Mapped[str | None] = mapped_column(
Text, nullable=True,
comment="最後一次失敗的錯誤訊息"
)
dispatched_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=taipei_now, nullable=False,
comment="派遣時間(台北時區)"
)
started_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True,
comment="執行開始時間executing 狀態時填入)"
)
completed_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True,
comment="執行完成時間terminal 狀態時填入)"
)
created_by: Mapped[str | None] = mapped_column(
String(100), nullable=True, default="governance_dispatcher",
comment="建立者(系統自動派遣時為 governance_dispatcher"
)
__table_args__ = (
Index("ix_grd_status_dispatched", "dispatch_status", "dispatched_at"),
Index("ix_grd_event_status", "governance_event_id", "dispatch_status"),
Index("ix_grd_playbook_id", "playbook_id"),
Index("ix_grd_event_type_status", "event_type", "dispatch_status"),
CheckConstraint(
"attempt_count >= 0 AND attempt_count <= max_attempts",
name="ck_grd_attempts",
),
CheckConstraint(
"max_attempts > 0",
name="ck_grd_max_attempts_positive",
),
)
# =============================================================================
# TrustRecordDB - ADR-088 TrustScore 持久化
# =============================================================================
class TrustRecordDB(Base):
"""
Trust Score 持久化記錄
ADR-088: TrustScoreManager 從記憶體升級為 PostgreSQL 持久化。
Pod 重啟後分數不歸零AI 能真正累積信任達到 L4 自動放行。
score >= 5: MEDIUM → LOW (自動執行)
score >= 10: HIGH → MEDIUM (降一級)
2026-04-17 ogt + Claude Sonnet 4.6(亞太): Phase 4 信任持久化
"""
__tablename__ = "trust_records"
action_pattern: Mapped[str] = mapped_column(
String(255), primary_key=True,
comment="操作模式,例如 delete:nginx-frontend-*"
)
score: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
comment="累積信任分數。+1/approvereject 歸零"
)
total_approvals: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
)
total_rejections: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
)
last_approval_by: Mapped[str | None] = mapped_column(String(100), nullable=True)
last_approval_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True,
)
last_rejection_by: Mapped[str | None] = mapped_column(String(100), nullable=True)
last_rejection_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True,
)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), nullable=False, default=taipei_now,
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), nullable=False, default=taipei_now, onupdate=taipei_now,
)
__table_args__ = (
Index("ix_trust_records_score", "score"),
Index("ix_trust_records_updated", "updated_at"),
)
# =============================================================================
# AIProviderVersionHistory - AI Provider 版本歷史
# 2026-04-27 P3.2.2 by Claude
# =============================================================================
class AIProviderVersionHistory(Base):
"""AI Provider 版本探測歷史記錄
每次 ModelVersionTracker.run_probe_cycle() 寫入一筆。
changed=True 表示本次探測到版本或 digest 與上一筆不同。
Migration: apps/api/migrations/p3_2_provider_version_history.sql
"""
__tablename__ = "ai_provider_version_history"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
provider: Mapped[str] = mapped_column(String(40), nullable=False, index=True)
model: Mapped[str] = mapped_column(String(100), nullable=False)
version: Mapped[str | None] = mapped_column(String(200), nullable=True)
digest: Mapped[str | None] = mapped_column(String(80), nullable=True)
captured_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), nullable=False, default=taipei_now,
)
prev_version: Mapped[str | None] = mapped_column(String(200), nullable=True)
changed: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)
__table_args__ = (
Index("ix_provider_version_captured", "provider", "captured_at"),
)
# =============================================================================
# BudgetLedgerRecord — ADR-120 Token Budget Hard KillPhase 2.6
# 2026-05-04 ogt + Claude Sonnet 4.6
# =============================================================================
class BudgetLedgerRecord(Base):
"""
LLM call 費用記帳表ADR-120 D5
每次 LLM call 完成後插入一筆記錄,供:
- Tenant Budget 累計計算Redis 快取,每分鐘從此表同步)
- 儀表板消費統計
- 告警閾值觸發80% / 95% / 100%
"""
__tablename__ = "budget_ledger"
id: Mapped[UUID] = mapped_column(
pg_UUID(as_uuid=True),
primary_key=True,
server_default=text("gen_random_uuid()"),
)
project_id: Mapped[str] = mapped_column(
String(64), nullable=False, default="awoooi", index=True
)
agent_id: Mapped[str | None] = mapped_column(String(128), nullable=True)
run_id: Mapped[UUID | None] = mapped_column(pg_UUID(as_uuid=True), nullable=True)
model: Mapped[str | None] = mapped_column(String(64), nullable=True)
provider: Mapped[str | None] = mapped_column(String(32), nullable=True)
prompt_tokens: Mapped[int | None] = mapped_column(Integer, nullable=True)
completion_tokens: Mapped[int | None] = mapped_column(Integer, nullable=True)
cost_usd: Mapped[Decimal] = mapped_column(
Numeric(10, 4), nullable=False, default=Decimal("0.0000")
)
recorded_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), nullable=False, server_default=text("NOW()")
)
__table_args__ = (
Index("idx_budget_ledger_project_date", "project_id", "recorded_at"),
)