awoooi/apps/api/tests/test_km_playbook_feedback_loop.py
Your Name 3668d49f2f
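feat(flywheel): W2 three PRs + KMWriter critic fixes (1635 tests all green)
W2 (week two of the onboarder's four-week flywheel path, 80→90) plus all 5 critical/major
findings from the critic PR review are fixed. Every flag defaults to false, so the change is safe to ship with no blast risk.

## W2: three PRs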

### PR-R2: AOL → catalog confidence EWMA writeback (fixes flywheel broken link C2)
- New file `apps/api/src/jobs/aol_to_catalog_writeback_job.py`
- Logic: scan AOL hourly, compute an EWMA confidence (alpha=0.3), and write it back to alert_rule_catalog
- Failure threshold: N=5 consecutive low-success-rate results → review_status='draft'
- Hermes _fetch_noisy_rules SQL adds OR review_status='draft'
- ENABLE_AOL_WRITEBACK_JOB=false (default)
- 8 tests (mock path fix: lazy import → patch src.db.base.get_db_context)
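
The EWMA writeback described in the list above can be sketched as follows. This is a minimal illustration, not the code in `aol_to_catalog_writeback_job.py`: `RuleState`, `apply_window`, the `LOW_SUCCESS_CUTOFF` value, and the initial `review_status` are all hypothetical, since the commit message only states alpha=0.3, N=5, and the transition to `review_status='draft'`.

```python
from dataclasses import dataclass

ALPHA = 0.3                # smoothing factor stated in the commit message
FAILURE_STREAK_N = 5       # consecutive low-success windows before flagging
LOW_SUCCESS_CUTOFF = 0.5   # hypothetical: the commit does not define "low"


@dataclass(frozen=True)
class RuleState:
    """Hypothetical per-rule state tracked between hourly runs."""
    confidence: float
    low_streak: int = 0
    review_status: str = "approved"  # hypothetical starting status


def ewma(previous: float, observed: float, alpha: float = ALPHA) -> float:
    """Standard exponentially weighted moving average update."""
    return alpha * observed + (1.0 - alpha) * previous


def apply_window(state: RuleState, success_rate: float) -> RuleState:
    """Fold one observed success rate into the rule's state."""
    confidence = ewma(state.confidence, success_rate)
    # The streak resets as soon as one window is not "low".
    low_streak = state.low_streak + 1 if success_rate < LOW_SUCCESS_CUTOFF else 0
    status = "draft" if low_streak >= FAILURE_STREAK_N else state.review_status
    return RuleState(confidence, low_streak, status)
```

With alpha=0.3 a single bad hour moves the confidence by at most 30% of the gap, which is why the sketch tracks the consecutive-failure streak separately from the smoothed value.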

### PR-V1: self_healing_validator integration (fixes flywheel broken link C6)
- New file `apps/api/src/services/self_healing_validator.py` (pure function assess_self_healing)
- Wired into step 5 of post_execution_verifier.py (behind a feature flag gate)
- evidence_snapshot.py gains self_healing_score / self_healing_detail fields
- db/models.py + base.py ALTER IF NOT EXISTS
- score < 0.5 → sends a rollback-proposal Telegram alert (never executed automatically)
- ENABLE_SELF_HEALING_VALIDATOR=false (default)
- 7 tests
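
The "propose, never execute" rule in the list above might look like the sketch below. `maybe_propose_rollback` and the `notify` callback are hypothetical stand-ins for the Telegram alert path; only the 0.5 cutoff and the feature flag gate come from the commit message.

```python
from typing import Callable

ROLLBACK_SCORE_CUTOFF = 0.5  # from the commit message: score < 0.5


def maybe_propose_rollback(
    score: float,
    enabled: bool,
    notify: Callable[[str], None],
) -> bool:
    """Send a rollback proposal when the score is low; never roll back here.

    Returns True only when a proposal was sent.
    """
    if not enabled:  # feature flag off: skip the validator entirely
        return False
    if score >= ROLLBACK_SCORE_CUTOFF:
        return False
    notify(
        f"self_healing_score={score:.2f} is below {ROLLBACK_SCORE_CUTOFF}: "
        "rollback proposed, awaiting human approval"
    )
    return True
```

Passing the notifier in as a callable keeps the decision a pure function of its inputs, so it can be tested without a Telegram client.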

### PR-L1: KM ↔ Playbook bidirectional loop (fixes flywheel broken links C3+C4)
- learning_service.py gains three new pieces of logic:
  1. _write_playbook_evolution_km: promote/demote writes a KM evolution entry
  2. _check_and_mark_playbook_review: N=5 accumulated entries trigger review_required
  3. _demote_alert_rule_catalog_confidence: DEPRECATED → confidence ×= 0.5
- PlaybookRecord gains a review_required field (schema migration via base.py)
- ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=false (default)
- KM_PLAYBOOK_REVIEW_THRESHOLD=5 (configurable)
- 6 tests
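
The two numeric rules in the list above reduce to small pure functions. The sketch below is illustrative only: `needs_review` and `demote_confidence` are hypothetical names, and the `>=` comparison is an assumption inferred from the accompanying tests (a count of 5 triggers the UPDATE at threshold 5, a count of 3 does not).

```python
def needs_review(km_entry_count: int, threshold: int = 5) -> bool:
    """True once the KM entries for one symptoms_hash reach the threshold.

    Assumption: the comparison is inclusive (count >= threshold).
    """
    return km_entry_count >= threshold


def demote_confidence(confidence: float, factor: float = 0.5) -> float:
    """Halve the catalog confidence when a playbook becomes DEPRECATED."""
    return confidence * factor
```

Because the demotion is multiplicative, repeated deprecations shrink the confidence geometrically and never push it below zero.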

## KMWriter critic: 5 critical/major fixes (found in the earlier critic PR review)
Already fixed in the previously pushed commit c5753e1c; this commit restores the corresponding files from the stash:
- C1 km_writer.py:194 backfill contradicted its own contract (fixed: synchronous await + DLQ)
- C2 km_writer.py:391 tightened the KM_WRITE_AWAIT=false path
- M1 decision_manager.py:2178/2203 removed _fire_and_forget
- M2 incident_service.py:1099 hand-rolled path gains retry + DLQ
- M3 km_writer.py:166 idempotency claim aligned with behavior (UPSERT + partial unique index)

## Verification
- 1635 unit tests all green (+27 from 1608)
- Coexists with fb0c72db (overturning A2: Ollama primary) without conflicts
- All new jobs/services default to flag=false (no blast risk)

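## Expected impact
Flywheel broken links C2 + C3 + C4 + C6 all fixed
Flywheel autonomy score: 65 → 85 estimated (after W2 completes)

Enablement order (once fb0c72db is verified in prod and OLLAMA primary runs):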
1. ENABLE_AOL_WRITEBACK_JOB=true
2. ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=true
3. ENABLE_SELF_HEALING_VALIDATOR=true

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:44:04 +08:00

403 lines
13 KiB
Python
"""
KM → Playbook feedback-loop unit tests
======================================
W2 PR-L1: tests for the fixes to flywheel broken links C3 + C4

Test scope:
1. test_playbook_promotion_writes_km_entry
   After _promote_playbook fires, KMWriter is called to write a playbook_evolution entry.
2. test_playbook_demotion_writes_km_entry
   After _demote_playbook fires, KMWriter is called to write a playbook_evolution entry.
3. test_km_accumulation_triggers_playbook_review
   5 accumulated entries with the same symptoms_hash → UPDATE playbooks.review_required=true
4. test_km_accumulation_below_threshold_no_update
   KM entry count < threshold → no UPDATE is executed
5. test_playbook_deprecated_demotes_alert_rule_confidence
   DEPRECATED Playbook → alert_rule_catalog.confidence *= 0.5
6. test_feature_flag_disabled
   ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=false → all three pieces of logic are skipped; the DB is never called

Design principles:
- External services (DB / KMWriter / PlaybookRepository) are replaced with AsyncMock
- Each test covers exactly one main path (single responsibility)
- The feature flag is controlled by patching 'src.core.config.settings'
- get_db_context patch path: src.db.base.get_db_context (the source module of the local import)
- get_playbook_repository patch path:
  src.repositories.playbook_repository.get_playbook_repository

Created: 2026-04-28 (Taipei time zone) ogt + Claude Sonnet 4.6
"""
from __future__ import annotations

from contextlib import asynccontextmanager
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch

import pytest

# =============================================================================
# Helpers
# =============================================================================


def _make_playbook(
    playbook_id: str = "PB-20260428-AAAAAA",
    name: str = "TestPlaybook",
    trust_score: float = 0.5,
    success_count: int = 3,
    failure_count: int = 1,
    status: str = "approved",
    alert_names: list[str] | None = None,
) -> SimpleNamespace:
    """
    Build a minimal usable Playbook mock object.

    SimpleNamespace gives the same attribute access as the Pydantic model
    without pulling in real ORM / Pydantic dependencies (avoids DB connections).
    symptom_pattern.compute_hash() returns the fixed value 'abc123' for tests.
    """
    symptom = SimpleNamespace(
        alert_names=alert_names or ["HighCpuUsage"],
        affected_services=["api"],
        label_patterns={},
        compute_hash=lambda: "abc123",
    )
    from src.models.playbook import PlaybookStatus

    status_enum = PlaybookStatus(status)
    return SimpleNamespace(
        playbook_id=playbook_id,
        name=name,
        trust_score=trust_score,
        success_count=success_count,
        failure_count=failure_count,
        status=status_enum,
        symptom_pattern=symptom,
    )


def _make_learning_service():
    """
    Build a LearningService instance with every external dependency mocked.

    repository and trust_repository both use AsyncMock to prevent Redis connections.
    """
    from src.services.learning_service import LearningService

    mock_repo = AsyncMock()
    mock_trust_repo = AsyncMock()
    mock_trust_mgr = MagicMock()
    mock_trust_mgr.get_trust_record.return_value = None
    svc = LearningService(
        repository=mock_repo,
        trust_repository=mock_trust_repo,
    )
    svc._trust_manager = mock_trust_mgr
    return svc


def _make_settings(enable_loop: bool = True, threshold: int = 5) -> MagicMock:
    """
    Build a settings mock.

    Patch path: src.core.config.settings (each learning_service method
    imports it locally from this module).
    """
    m = MagicMock()
    m.ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP = enable_loop
    m.KM_PLAYBOOK_REVIEW_THRESHOLD = threshold
    m.KM_WRITE_AWAIT = True
    m.KM_WRITE_TIMEOUT_SECONDS = 5.0
    return m


def _make_db_context_factory(mock_db):
    """
    Return an async context manager factory that can be called repeatedly.

    Each factory() call returns a new async context manager instance, so the
    same cm object is never reused (an async generator can only be iterated once).
    """
    def factory():
        @asynccontextmanager
        async def _ctx():
            yield mock_db

        return _ctx()

    return factory


# =============================================================================
# 1. Promote fires → write a KM evolution entry
# =============================================================================


@pytest.mark.asyncio
async def test_playbook_promotion_writes_km_entry():
    """
    After _promote_playbook fires with ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=True,
    km_write_with_flag should be called exactly once (path_type contains 'playbook_evolution').
    """
    svc = _make_learning_service()
    playbook = _make_playbook(trust_score=0.5, status="approved")
    km_calls: list = []

    async def _mock_km_write(payload, *, timeout=None):
        # Record the payload so the assertions below can inspect it.
        km_calls.append(payload)
        from src.services.km_writer import KMWriteResult

        return KMWriteResult.SUCCESS

    mock_pb_repo = AsyncMock()
    mock_pb_repo.find_by_source_incident = AsyncMock(return_value=[playbook])
    mock_pb_repo.adjust_confidence = AsyncMock(return_value=True)
    mock_settings = _make_settings(enable_loop=True)
    with (
        patch("src.core.config.settings", mock_settings),
        patch("src.services.km_writer.km_write_with_flag", side_effect=_mock_km_write),
        patch(
            "src.repositories.playbook_repository.get_playbook_repository",
            return_value=mock_pb_repo,
        ),
    ):
        result = await svc._promote_playbook("INC-TEST-001")

    assert result is True
    assert len(km_calls) == 1, "KMWriter should be called once (one Playbook promote)"
    assert "playbook_evolution" in km_calls[0].path_type
    assert km_calls[0].metadata["evolution_type"] == "promote"
    assert km_calls[0].metadata["playbook_id"] == playbook.playbook_id
    assert km_calls[0].metadata["previous_trust"] == 0.5


# =============================================================================
# 2. Demote fires → write a KM evolution entry
# =============================================================================


@pytest.mark.asyncio
async def test_playbook_demotion_writes_km_entry():
    """
    After _demote_playbook fires with ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=True,
    km_write_with_flag should be called exactly once (evolution_type='demote').

    status='approved' (not DEPRECATED) → logic 3 does not fire, which keeps
    this test to a single responsibility.
    """
    svc = _make_learning_service()
    playbook = _make_playbook(trust_score=0.4, status="approved")
    km_calls: list = []

    async def _mock_km_write(payload, *, timeout=None):
        # Record the payload so the assertions below can inspect it.
        km_calls.append(payload)
        from src.services.km_writer import KMWriteResult

        return KMWriteResult.SUCCESS

    mock_pb_repo = AsyncMock()
    mock_pb_repo.find_by_source_incident = AsyncMock(return_value=[playbook])
    mock_pb_repo.adjust_confidence = AsyncMock(return_value=True)
    mock_settings = _make_settings(enable_loop=True)
    with (
        patch("src.core.config.settings", mock_settings),
        patch("src.services.km_writer.km_write_with_flag", side_effect=_mock_km_write),
        patch(
            "src.repositories.playbook_repository.get_playbook_repository",
            return_value=mock_pb_repo,
        ),
    ):
        result = await svc._demote_playbook("INC-TEST-002")

    assert result is True
    assert len(km_calls) == 1, "KMWriter should be called once (one Playbook demote)"
    assert "playbook_evolution" in km_calls[0].path_type
    assert km_calls[0].metadata["evolution_type"] == "demote"


# =============================================================================
# 3. KM accumulation N=5 → review_required=True
# =============================================================================


@pytest.mark.asyncio
async def test_km_accumulation_triggers_playbook_review():
    """
    When KM entries with the same symptoms_hash reach the threshold (default 5),
    _check_and_mark_playbook_review should run COUNT + UPDATE and then commit.
    """
    svc = _make_learning_service()
    symptoms_hash = "abc123"
    mock_db = AsyncMock()
    execute_call_count = {"n": 0}
    mock_count_result = MagicMock()
    mock_count_result.scalar.return_value = 5
    mock_update_result = MagicMock()
    mock_update_result.fetchall.return_value = [("PB-20260428-AAAAAA",)]

    async def _multi_execute(stmt, params=None):
        # First call is the COUNT query; every later call is the UPDATE.
        execute_call_count["n"] += 1
        if execute_call_count["n"] == 1:
            return mock_count_result
        return mock_update_result

    mock_db.execute = _multi_execute
    mock_db.commit = AsyncMock()
    mock_settings = _make_settings(enable_loop=True, threshold=5)
    with (
        patch("src.core.config.settings", mock_settings),
        patch(
            "src.db.base.get_db_context",
            side_effect=_make_db_context_factory(mock_db),
        ),
    ):
        await svc._check_and_mark_playbook_review(symptoms_hash)

    assert execute_call_count["n"] == 2, "expected two SQL statements (COUNT + UPDATE)"
    mock_db.commit.assert_called_once()


@pytest.mark.asyncio
async def test_km_accumulation_below_threshold_no_update():
    """
    KM entry count < threshold → no UPDATE is executed and nothing is committed.
    """
    svc = _make_learning_service()
    symptoms_hash = "abc123"
    mock_db = AsyncMock()
    execute_call_count = {"n": 0}
    mock_count_result = MagicMock()
    mock_count_result.scalar.return_value = 3  # < 5

    async def _single_execute(stmt, params=None):
        execute_call_count["n"] += 1
        return mock_count_result

    mock_db.execute = _single_execute
    mock_db.commit = AsyncMock()
    mock_settings = _make_settings(enable_loop=True, threshold=5)
    with (
        patch("src.core.config.settings", mock_settings),
        patch(
            "src.db.base.get_db_context",
            side_effect=_make_db_context_factory(mock_db),
        ),
    ):
        await svc._check_and_mark_playbook_review(symptoms_hash)

    assert execute_call_count["n"] == 1, "only COUNT should run; UPDATE must not"
    mock_db.commit.assert_not_called()


# =============================================================================
# 4. DEPRECATED → alert_rule_catalog.confidence *= 0.5
# =============================================================================


@pytest.mark.asyncio
async def test_playbook_deprecated_demotes_alert_rule_confidence():
    """
    After _demote_alert_rule_catalog_confidence runs for a DEPRECATED Playbook,
    one UPDATE is executed per alert_name, followed by a single commit.
    """
    svc = _make_learning_service()
    from src.models.playbook import PlaybookStatus

    playbook = _make_playbook(
        status="deprecated",
        alert_names=["HighCpuUsage", "PodCrashLooping"],
    )
    playbook.status = PlaybookStatus.DEPRECATED
    mock_db = AsyncMock()
    execute_call_count = {"n": 0}

    async def _track_execute(stmt, params=None):
        # Count each UPDATE and report one affected row.
        execute_call_count["n"] += 1
        m = MagicMock()
        m.rowcount = 1
        return m

    mock_db.execute = _track_execute
    mock_db.commit = AsyncMock()
    mock_settings = _make_settings(enable_loop=True)
    with (
        patch("src.core.config.settings", mock_settings),
        patch(
            "src.db.base.get_db_context",
            side_effect=_make_db_context_factory(mock_db),
        ),
    ):
        await svc._demote_alert_rule_catalog_confidence(playbook)

    assert execute_call_count["n"] == 2, "2 alert_names → 2 UPDATE statements"
    mock_db.commit.assert_called_once()


# =============================================================================
# 5. Feature flag disabled → all logic is skipped
# =============================================================================


@pytest.mark.asyncio
async def test_feature_flag_disabled():
    """
    With ENABLE_KM_PLAYBOOK_FEEDBACK_LOOP=False, none of
    _write_playbook_evolution_km / _check_and_mark_playbook_review /
    _demote_alert_rule_catalog_confidence should call the DB or KMWriter.
    """
    svc = _make_learning_service()
    from src.models.playbook import PlaybookStatus

    playbook = _make_playbook(trust_score=0.3, status="deprecated")
    playbook.status = PlaybookStatus.DEPRECATED
    km_write_calls: list = []
    db_execute_calls: list = []

    async def _mock_km_write(payload, *, timeout=None):
        km_write_calls.append(payload)
        from src.services.km_writer import KMWriteResult

        return KMWriteResult.SUCCESS

    mock_db = AsyncMock()

    async def _track_execute(stmt, params=None):
        db_execute_calls.append(stmt)
        return MagicMock()

    mock_db.execute = _track_execute
    mock_db.commit = AsyncMock()
    mock_settings = _make_settings(enable_loop=False)
    with (
        patch("src.core.config.settings", mock_settings),
        patch("src.services.km_writer.km_write_with_flag", side_effect=_mock_km_write),
        patch(
            "src.db.base.get_db_context",
            side_effect=_make_db_context_factory(mock_db),
        ),
    ):
        # Logic 1
        await svc._write_playbook_evolution_km(
            playbook=playbook,
            previous_trust=0.5,
            evolution_type="promote",
            incident_id="INC-TEST-FLAG",
        )
        # Logic 2
        await svc._check_and_mark_playbook_review("abc123")
        # Logic 3
        await svc._demote_alert_rule_catalog_confidence(playbook)

    assert len(km_write_calls) == 0, "KMWriter must not be called (flag=False)"
    assert len(db_execute_calls) == 0, "DB execute must not be called (flag=False)"
    mock_db.commit.assert_not_called()