awoooi/apps/api/tests/test_ai_governance_endpoints.py
Your Name e45b055e0e
Some checks failed
Code Review / ai-code-review (push) Successful in 48s
run-migration / migrate (push) Failing after 45s
CD Pipeline / tests (push) Successful in 3m46s
Type Sync Check / check-type-sync (push) Successful in 2m8s
CD Pipeline / build-and-deploy (push) Failing after 31m14s
CD Pipeline / post-deploy-checks (push) Has been skipped
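feat(governance): deliver the AI governance event-handling chain in four tracks (C/D/B/A)
【Twelve-expert team: full-scope scan + four parallel implementation tracks】

After the commander asked whether the 12 agents were actually collaborating, the full chain was delivered according to the team rules:
onboarder + critic + db-expert + debugger + frontend-designer scanned in parallel
and found 6 major gaps, which fullstack-engineer × 4 and refactor-specialist then implemented together.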

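【Track C: trust_drift dual-write consolidation】

Two independent code paths wrote event_type=trust_drift without calling each other,
so downstream consumers received duplicate data and could not determine the source of truth.
The consolidation keeps governance_agent.check_trust_drift (the more complete path:
auto-deprecate + Telegram + PG), demotes TrustDriftDetector to a pure statistics library,
and changes the W-6 watchdog to call governance_agent. A new TestSinglePgWritePerDriftScenario
verifies that a single drift scenario triggers exactly one PG write.

  Changes:
    - apps/api/src/services/trust_drift_detector.py (lib only, no longer writes to PG)
    - apps/api/tests/test_trust_drift_watchdog.py (W-6 now mocks governance_agent)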

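【Track D: governance_remediation_dispatch dispatch table】

ai_governance_events is immutable event sourcing and cannot hold execution state. A new dispatch
table serves as the projection layer: 1 event → 0..N dispatches, with state that is mutable, retryable, and auditable.

  - PgEnum with 5 event_type values + a 7-state state machine (pending → dispatched → executing →
    succeeded/failed/cancelled/skipped)
  - A failed retry INSERTs a new row (the old row's status is not modified, preserving the audit trail)
  - Partial unique index ux_grd_one_active_per_event enforces "one active dispatch per event"
  - 4 composite indexes support worker polling, dedup queries, and observability dashboards
  - FKs to ai_governance_events / playbooks / incidents / approval_records
    use SET NULL (to avoid cascade locks), except governance_event, which uses RESTRICT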
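
The state machine and the "one active dispatch per event" rule above can be sketched in plain Python. This is a minimal illustration, not the repository code: the commit only lists the seven status names and the exception name InvalidStatusTransition, so the exact transition table (for example, which states may move to cancelled or skipped) and the set of states treated as "active" are assumptions.

```python
from enum import Enum


class DispatchStatus(str, Enum):
    PENDING = "pending"
    DISPATCHED = "dispatched"
    EXECUTING = "executing"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"
    SKIPPED = "skipped"


class InvalidStatusTransition(Exception):
    """Raised when a status change is not allowed by the state machine."""


# ASSUMPTION: the commit does not spell out the transition table; this is one
# plausible reading of "pending -> dispatched -> executing -> terminal".
ALLOWED_TRANSITIONS = {
    DispatchStatus.PENDING: {
        DispatchStatus.DISPATCHED,
        DispatchStatus.CANCELLED,
        DispatchStatus.SKIPPED,
    },
    DispatchStatus.DISPATCHED: {DispatchStatus.EXECUTING, DispatchStatus.CANCELLED},
    DispatchStatus.EXECUTING: {
        DispatchStatus.SUCCEEDED,
        DispatchStatus.FAILED,
        DispatchStatus.CANCELLED,
    },
    # Terminal states never change; a retry inserts a new row instead.
    DispatchStatus.SUCCEEDED: set(),
    DispatchStatus.FAILED: set(),
    DispatchStatus.CANCELLED: set(),
    DispatchStatus.SKIPPED: set(),
}

# ASSUMPTION: the states a partial unique index would treat as "active".
ACTIVE_STATUSES = {
    DispatchStatus.PENDING,
    DispatchStatus.DISPATCHED,
    DispatchStatus.EXECUTING,
}


def transition(current: DispatchStatus, target: DispatchStatus) -> DispatchStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidStatusTransition(f"{current.value} -> {target.value}")
    return target


def is_active(status: DispatchStatus) -> bool:
    """True while the dispatch still blocks a second dispatch for the same event."""
    return status in ACTIVE_STATUSES
```

Because terminal rows are never updated, a retry after a failure creates a fresh pending row, and the uniqueness constraint only has to consider rows that are still active.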

  Changes:
    - apps/api/src/db/models.py (GovernanceRemediationDispatch ORM class)
    - apps/api/migrations/governance_remediation_dispatch_2026-05-03.sql
    - apps/api/src/repositories/governance_remediation_dispatch_repo.py
      (6 async functions + 3 custom exceptions: DispatchAlreadyActive /
       InvalidStatusTransition / DispatchNotFound)
    - apps/api/src/models/governance_dispatch.py (4 schemas, including DecisionContextV1)
    - apps/api/tests/test_governance_remediation_dispatch.py (29 tests)

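【Track B: /governance page】

Backend PR1 adds three endpoints; frontend PR2-5 delivers the complete three-tab page.

PR1 backend:
  - GET /api/v1/ai/governance/events (events_tab, with event_type/severity/
    status/time-range filters + pagination)
  - GET /api/v1/ai/governance/queue (queue_tab, with graceful fallback:
    when the dispatch table does not exist, it returns table_pending=True instead of raising a 500)
  - GET /api/v1/ai/governance/summary (slo_tab, 30d violation time-series chart)
  - severity mapping rules are hard-coded (critic suggests moving them to settings in the future)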
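
The hard-coded severity mapping can be pictured as a small lookup. This is a sketch reconstructed only from the cases asserted in TestSeverityMapping in this commit's test file; the function name map_severity_sketch and the two set constants are hypothetical, and the real map_severity in src.models.governance may be written differently.

```python
# Event types the tests expect to map to "critical".
CRITICAL_TYPES = frozenset(
    {"slo_violation", "conservative_mode", "governance_slo_data_gap"}
)

# Event types the tests expect to map to "warning".
WARNING_TYPES = frozenset(
    {"trust_drift", "kb_stale", "knowledge_degradation", "execution_blast_radius"}
)


def map_severity_sketch(event_type: str) -> str:
    """Map an event_type to a severity; anything unlisted falls back to "info"."""
    if event_type in CRITICAL_TYPES:
        return "critical"
    if event_type in WARNING_TYPES:
        return "warning"
    return "info"
```

Keeping the two sets as module-level constants would make the critic's suggestion straightforward later: the sets could be loaded from settings without touching the function body.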

PR2-5 frontend:
  - /governance route + AppLayout + Compliance Badge banner + PageTabs
  - SLO Tab: 3 KPI cards (Syne 28px + StatusOrb + 7d sparkline) +
    30d violation stacked BarChart
  - Events Tab: filter bar + table + inline expandable rows (JSON / remediation suggestions / dispatch records)
  - Queue Tab: HITL to-do cards + trust progress bar + approve/reject buttons (console.log only in this PR)
  - Sidebar gains an "AI Governance" (AI 治理) entry (ShieldCheck icon)
  - i18n complete in both languages (governance namespace + nav.governance)
  - 7 new components: slo-kpi-card / slo-violation-chart / events-table /
    events-filter-bar / event-detail-drawer / queue-item-card / queue-history-tabs

  Changes:
    - apps/api/src/api/v1/ai_governance.py (router)
    - apps/api/src/services/governance_query_service.py
    - apps/api/src/models/governance.py (Pydantic V2 schemas)
    - apps/api/tests/test_ai_governance_endpoints.py (21 tests)
    - apps/web/src/app/[locale]/governance/ (page + 3 tabs)
    - apps/web/src/components/governance/ (7 components)
    - apps/web/messages/{zh-TW,en}.json (governance namespace)
    - apps/web/src/components/layout/sidebar.tsx (+1 line)
    - apps/api/src/main.py (router include)

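【Track A: GovernanceDispatcher decision fusion】

Connects governance events to the remediation executor through north-star decision fusion
(LLM × Playbook trust × MCP), in line with the "no hard-coded rules" iron rule.

  - Design iron rule: DecisionFusionAdapter is a new wrapper that **does not modify any Tier 3 file**
    (decision_manager / learning_service / trust_engine) and only consumes existing APIs
  - Three-dimensional fusion formula: confidence = 0.4×llm + 0.3×playbook_trust + 0.3×mcp_consistency
    (the weights carry a TODO noting that the AI will tune them through self-learning in the future)
  - Three-branch decision path:
    confidence ≥ 0.85 → auto_dispatch (status=dispatched)
    0.65 ≤ confidence < 0.85 → pending_approval (HITL)
    confidence < 0.65 → skip + log
  - decision_context JSONB records a full snapshot of the three inputs (for future fine-tuning)
  - Polls every 30s for unresolved events, following the governance loop pattern
  - Duplicate events are deduplicated (via get_active_for_event)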
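
The fusion formula and the three decision branches can be sketched as two pure functions. The weights and thresholds are taken from the list above; the names FusionInputs, fuse_confidence, and decide are hypothetical and are not claimed to match DecisionFusionAdapter's actual interface.

```python
from dataclasses import dataclass

# Weights and thresholds as stated in the commit message.
W_LLM = 0.4
W_PLAYBOOK_TRUST = 0.3
W_MCP_CONSISTENCY = 0.3
AUTO_DISPATCH_THRESHOLD = 0.85
APPROVAL_THRESHOLD = 0.65


@dataclass(frozen=True)
class FusionInputs:
    """Hypothetical container for the three input scores, each in [0, 1]."""

    llm: float
    playbook_trust: float
    mcp_consistency: float


def fuse_confidence(inputs: FusionInputs) -> float:
    """Weighted sum of the three input scores."""
    return (
        W_LLM * inputs.llm
        + W_PLAYBOOK_TRUST * inputs.playbook_trust
        + W_MCP_CONSISTENCY * inputs.mcp_consistency
    )


def decide(confidence: float) -> str:
    """Pick one of the three branches for a fused confidence value."""
    if confidence >= AUTO_DISPATCH_THRESHOLD:
        return "auto_dispatch"
    if confidence >= APPROVAL_THRESHOLD:
        return "pending_approval"
    return "skip"
```

For example, inputs of 0.9 / 0.8 / 0.7 fuse to roughly 0.81, which lands in the pending_approval (HITL) band, while three scores of 0.5 fuse to 0.5 and are skipped.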

  Changes:
    - apps/api/src/services/governance_dispatcher.py
    - apps/api/src/services/decision_fusion_adapter.py
    - apps/api/tests/test_governance_dispatcher.py (14 tests)
    - apps/api/src/main.py (lifespan task now starts run_governance_dispatcher_loop)

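【Verification】

All 1836 unit tests pass (29 skipped because of a pre-existing PG integration environment issue)

【Orchestration lessons (recorded in memory)】

- vuln-verifier should run **before** fullstack-engineer (so a parallel run does not read already-fixed code and misjudge)
- The critic's two-round review cannot be skipped (the second round caught the NaN sentinel + Prom rule chain)
- The north-star "no hard-coded rules" principle was actually implemented together with decision-fusion

【Tier 3 untouched (verified)】

git diff confirms this commit did not modify decision_manager.py / learning_service.py /
trust_engine.py at all; it only adds wrapper services that consume existing APIs.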

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 12:42:40 +08:00

368 lines
13 KiB
Python
# apps/api/tests/test_ai_governance_endpoints.py | 2026-05-02 @ Asia/Taipei
"""
Unit Tests: AI Governance Endpoints (PR 1)

Coverage:
1. events endpoint: pagination logic is correct
2. events endpoint: severity mapping is correct (critical / warning / info)
3. queue endpoint: graceful fallback (mock ProgrammingError)
4. summary endpoint: compliance_rate calculation (including the total=0 boundary)
5. summary endpoint: compliance_rate calculation (normal case with unresolved events)

Test strategy: mock the service-layer functions so the tests do not depend on a DB
and exercise only the router logic.
"""
from __future__ import annotations

from datetime import datetime, timezone, timedelta
from unittest.mock import AsyncMock, patch

import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient

from src.api.v1.ai_governance import router
from src.models.governance import (
    DailyCount,
    DispatchItem,
    GovernanceEvent,
    GovernanceEventsResponse,
    GovernanceQueueResponse,
    GovernanceSummaryResponse,
    map_severity,
)

TAIPEI = timezone(timedelta(hours=8))
NOW = datetime(2026, 5, 2, 12, 0, tzinfo=TAIPEI)


# =============================================================================
# Fixture
# =============================================================================
@pytest.fixture
def client():
    # Mount only the governance router on a bare app so no other dependency loads.
    app = FastAPI()
    app.include_router(router, prefix="/api/v1")
    return TestClient(app)


def _make_event(
    event_id: str = "evt-001",
    event_type: str = "slo_violation",
    resolved: bool = False,
) -> GovernanceEvent:
    return GovernanceEvent(
        id=event_id,
        event_type=event_type,
        severity=map_severity(event_type),
        triggered_at=NOW,
        resolved=resolved,
        resolved_at=None,
        impact="SLO violated",
        details={"message": "test"},
        remediation=None,
        dispatch_ids=[],
    )


# =============================================================================
# 1. severity mapping unit tests
# =============================================================================
class TestSeverityMapping:
    def test_critical_types(self):
        for et in ("slo_violation", "conservative_mode", "governance_slo_data_gap"):
            assert map_severity(et) == "critical", f"{et} should be critical"

    def test_warning_types(self):
        for et in ("trust_drift", "kb_stale", "knowledge_degradation", "execution_blast_radius"):
            assert map_severity(et) == "warning", f"{et} should be warning"

    def test_info_types(self):
        for et in ("replay_degraded", "self_demotion", "llm_hallucination", "unknown_event"):
            assert map_severity(et) == "info", f"{et} should be info"


# =============================================================================
# 2. events endpoint: pagination
# =============================================================================
class TestEventsEndpoint:
    def test_pagination_default(self, client):
        """Default pagination (page=1, size=20) is returned correctly."""
        fake_response = GovernanceEventsResponse(
            items=[_make_event(str(i)) for i in range(5)],
            total=5,
            page=1,
            size=20,
        )
        # Replace the service call with an AsyncMock so no DB is needed.
        with patch(
            "src.api.v1.ai_governance.query_governance_events",
            new=AsyncMock(return_value=fake_response),
        ):
            r = client.get("/api/v1/ai/governance/events")
        assert r.status_code == 200
        data = r.json()
        assert data["total"] == 5
        assert data["page"] == 1
        assert data["size"] == 20
        assert len(data["items"]) == 5

    def test_pagination_custom(self, client):
        """Custom pagination parameters are passed through to the service."""
        fake_response = GovernanceEventsResponse(
            items=[_make_event()],
            total=50,
            page=3,
            size=10,
        )
        captured: dict = {}

        async def mock_query(**kwargs):
            # Record the keyword arguments the router forwards to the service.
            captured.update(kwargs)
            return fake_response

        with patch("src.api.v1.ai_governance.query_governance_events", new=mock_query):
            r = client.get("/api/v1/ai/governance/events?page=3&size=10")
        assert r.status_code == 200
        assert captured["page"] == 3
        assert captured["size"] == 10
        data = r.json()
        assert data["total"] == 50

    def test_severity_filter_passed(self, client):
        """The severity query param is passed through to the service."""
        fake_response = GovernanceEventsResponse(items=[], total=0, page=1, size=20)
        captured: dict = {}

        async def mock_query(**kwargs):
            captured.update(kwargs)
            return fake_response

        with patch("src.api.v1.ai_governance.query_governance_events", new=mock_query):
            r = client.get("/api/v1/ai/governance/events?severity=critical")
        assert r.status_code == 200
        assert captured["severity"] == "critical"

    def test_invalid_severity_rejected(self, client):
        """An invalid severity value is rejected with 422."""
        r = client.get("/api/v1/ai/governance/events?severity=bad_value")
        assert r.status_code == 422

    def test_invalid_status_rejected(self, client):
        """An invalid status value is rejected with 422."""
        r = client.get("/api/v1/ai/governance/events?status=invalid")
        assert r.status_code == 422

    def test_severity_in_response(self, client):
        """The severity field in the response follows the event_type mapping."""
        events = [
            _make_event("e1", "slo_violation"),  # critical
            _make_event("e2", "trust_drift"),  # warning
            _make_event("e3", "self_demotion"),  # info
        ]
        fake_response = GovernanceEventsResponse(items=events, total=3, page=1, size=20)
        with patch(
            "src.api.v1.ai_governance.query_governance_events",
            new=AsyncMock(return_value=fake_response),
        ):
            r = client.get("/api/v1/ai/governance/events")
        assert r.status_code == 200
        items = r.json()["items"]
        assert items[0]["severity"] == "critical"
        assert items[1]["severity"] == "warning"
        assert items[2]["severity"] == "info"


# =============================================================================
# 3. queue endpoint graceful fallback
# =============================================================================
class TestQueueEndpoint:
    def test_graceful_fallback_on_programming_error(self, client):
        """When the dispatch table does not exist, return table_pending=true instead of a 500."""
        fallback = GovernanceQueueResponse(
            items=[], total=0, page=1, size=10, table_pending=True,
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_queue",
            new=AsyncMock(return_value=fallback),
        ):
            r = client.get("/api/v1/ai/governance/queue")
        assert r.status_code == 200
        data = r.json()
        assert data["table_pending"] is True
        assert data["items"] == []
        assert data["total"] == 0

    def test_normal_response_when_table_ready(self, client):
        """When the table is ready, items are returned normally."""
        dispatch_item = DispatchItem(
            id="d-001",
            governance_event_id="evt-001",
            event_type="slo_violation",
            dispatch_status="pending",
            proposed_action="restart deployment",
            playbook_id=None,
            playbook_trust=None,
            created_at=NOW,
            dispatched_at=None,
            completed_at=None,
            operator_note=None,
        )
        normal = GovernanceQueueResponse(
            items=[dispatch_item], total=1, page=1, size=10, table_pending=False,
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_queue",
            new=AsyncMock(return_value=normal),
        ):
            r = client.get("/api/v1/ai/governance/queue")
        assert r.status_code == 200
        data = r.json()
        assert data["table_pending"] is False
        assert len(data["items"]) == 1
        assert data["items"][0]["dispatch_status"] == "pending"

    def test_invalid_dispatch_status_rejected(self, client):
        """An invalid dispatch_status is rejected with 422."""
        r = client.get("/api/v1/ai/governance/queue?dispatch_status=unknown")
        assert r.status_code == 422


# =============================================================================
# 4. summary endpoint compliance_rate
# =============================================================================
class TestSummaryEndpoint:
    def test_compliance_rate_normal(self, client):
        """With unresolved events, the rate is 1 - unresolved/total."""
        fake = GovernanceSummaryResponse(
            compliance_rate=0.8,
            total_events=10,
            unresolved_count=2,
            daily_counts=[],
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_summary",
            new=AsyncMock(return_value=fake),
        ):
            r = client.get("/api/v1/ai/governance/summary")
        assert r.status_code == 200
        data = r.json()
        assert data["compliance_rate"] == pytest.approx(0.8)
        assert data["total_events"] == 10
        assert data["unresolved_count"] == 2

    def test_compliance_rate_all_resolved(self, client):
        """When every event is resolved, compliance_rate = 1.0."""
        fake = GovernanceSummaryResponse(
            compliance_rate=1.0,
            total_events=5,
            unresolved_count=0,
            daily_counts=[],
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_summary",
            new=AsyncMock(return_value=fake),
        ):
            r = client.get("/api/v1/ai/governance/summary?days=7")
        assert r.status_code == 200
        assert r.json()["compliance_rate"] == pytest.approx(1.0)

    def test_compliance_rate_total_zero(self, client):
        """When total_events=0, compliance_rate = 1.0 (boundary test)."""
        fake = GovernanceSummaryResponse(
            compliance_rate=1.0,
            total_events=0,
            unresolved_count=0,
            daily_counts=[],
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_summary",
            new=AsyncMock(return_value=fake),
        ):
            r = client.get("/api/v1/ai/governance/summary")
        assert r.status_code == 200
        data = r.json()
        assert data["compliance_rate"] == pytest.approx(1.0)
        assert data["total_events"] == 0

    def test_days_max_boundary(self, client):
        """The boundary value days=90 is accepted."""
        fake = GovernanceSummaryResponse(
            compliance_rate=1.0, total_events=0, unresolved_count=0, daily_counts=[],
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_summary",
            new=AsyncMock(return_value=fake),
        ):
            r = client.get("/api/v1/ai/governance/summary?days=90")
        assert r.status_code == 200

    def test_days_over_max_rejected(self, client):
        """days=91 is rejected with 422."""
        r = client.get("/api/v1/ai/governance/summary?days=91")
        assert r.status_code == 422

    def test_daily_counts_structure(self, client):
        """daily_counts has the expected structure."""
        fake = GovernanceSummaryResponse(
            compliance_rate=0.9,
            total_events=10,
            unresolved_count=1,
            daily_counts=[
                DailyCount(date="2026-05-01", total=3, by_type={"slo_violation": 2, "trust_drift": 1}),
                DailyCount(date="2026-05-02", total=7, by_type={"slo_violation": 7}),
            ],
        )
        with patch(
            "src.api.v1.ai_governance.query_governance_summary",
            new=AsyncMock(return_value=fake),
        ):
            r = client.get("/api/v1/ai/governance/summary")
        assert r.status_code == 200
        counts = r.json()["daily_counts"]
        assert len(counts) == 2
        assert counts[0]["date"] == "2026-05-01"
        assert counts[0]["by_type"]["slo_violation"] == 2


# =============================================================================
# 5. Service-layer compliance_rate pure-function tests (no HTTP)
# =============================================================================
class TestComplianceRateCalculation:
    """Checks the compliance_rate formula directly, without going through the router."""

    def test_formula_normal(self):
        """1 - 2/10 = 0.8"""
        rate = round(1.0 - 2 / 10, 4)
        assert rate == pytest.approx(0.8)

    def test_formula_zero_total(self):
        """total=0 → 1.0"""
        total = 0
        # Guard against division by zero: an empty window counts as fully compliant.
        rate = 1.0 if total == 0 else round(1.0 - 0 / total, 4)
        assert rate == pytest.approx(1.0)

    def test_formula_all_unresolved(self):
        """1 - 5/5 = 0.0"""
        rate = round(1.0 - 5 / 5, 4)
        assert rate == pytest.approx(0.0)