Files
awoooi/apps/api/src/models/knowledge.py
Your Name c5753e1c57 fix(critic-review): KMWriter 名實統一 + Alertmanager 修抑制 + drift checker AST 化
critic PR review 揭示已 push commits 的 7 個 blocker,本 commit 全部修復。

## C1 + C2 + M1 + M2 + M3 — KMWriter 真正統一契約(critic 最嚴重 5 條)

### C1 km_writer.py:194 — backfill 自打臉修
- 裸 asyncio.create_task(_backfill_path_a_approval) → await _backfill_path_a_approval_safe()
- 同步 await + 獨立 DLQ km:backfill:dlq + try/except 不阻塞主寫入
- 新增 km_backfill_reconciler_job.py(每 5 分鐘掃 DLQ)+ ENABLE_KM_BACKFILL_RECONCILER flag
- 防 Path B 比 Path A 先完成 → related_approval_id 永遠 NULL 的 race

### C2 km_writer.py:391 — KM_WRITE_AWAIT=false 路徑收緊
- 從 ensure_future(fire-and-forget 比舊版同步寫更糟)
- 改 await writer.write(retry=1, timeout=2.0)(仍 await 但只試一次、超時短)
- docstring 明確標註「緊急回滾用,不保證可靠性」

### M1 decision_manager.py:2178/2203 — 移除 _fire_and_forget 旁路
- 兩處 _fire_and_forget(executor.write_execution_result_to_km(...))
- 改 await asyncio.shield(...) + BaseException 保護(防上層 cancel 中斷)
- KM_WRITE_AWAIT=true 在這條路徑終於真正 await

### M2 incident_service.py:1099 — 自製 path 加 retry+DLQ
- 原本 if settings.KM_WRITE_AWAIT: await asyncio.wait_for else create_task
- 改 3 次指數退避 retry + DLQ 保護(呼叫 km_writer 私有 helper)

### M3 km_writer.py:166 — 冪等聲明對齊實作
- knowledge_repository.create() 加 UPSERT 路徑(pg_insert ON CONFLICT DO UPDATE)
- KnowledgeEntryCreate / KnowledgeEntryRecord 加 path_type 欄位
- migration: ADD COLUMN path_type + partial unique index uix_knowledge_incident_path

## M4 alertmanager.yml — equal: [] 收緊(critic 防爆炸抑制)
- OllamaInstanceDown / KMConverterDown 抑制加 equal: ['cluster'] 約束
- 防多 cluster 場景下任一 Ollama down 誤抑全 AI/SLO 告警

## M5 Alertmanager 版本驗證(已確認 v0.31.1,遠超 v0.22+)

## M6 governance_agent.py — health score 區分 skipped vs ok vs violated
- check_slo_compliance 加 _meta {violated_count, skipped_count, ok_count, all_skipped, status}
- run_self_check: SLO 全 skipped 時獨立發 governance_slo_data_gap 告警
  (不污染 self_failure 計數,因為 no_data 是 emitter 未實作不是治理機制故障)

## M7 scripts/check_config_drift.py — 改 AST 解析
- regex 改 ast.parse 找 Settings ClassDef AnnAssign Field(default=...)
- 避免多行 list / default_factory= / 含跳行字串的 false negative
- 4 欄位(AI_FALLBACK_ORDER / ARGOCD_URL / PROMETHEUS_URL / OLLAMA_URL)全對齊

## 新增測試
- test_km_writer_backfill_reconciler.py: 7 cases(C1 reconciler + safe helper)
- test_km_writer_idempotent.py: 5 cases(M3 path_type 注入 + UPSERT 分支)

## 驗證
- 1585 unit tests 全綠(+13 從 1572)
- amtool check-config SUCCESS(8 inhibit_rules / 2 receivers)
- drift checker AST-based 4 欄位全對齊
- Alertmanager v0.31.1 確認支援新語法

## 期望影響
- KMWriter 名實統一:飛輪閉環 KM 寫入路徑 100% 可靠
- M4 抑制爆炸風險解除
- 治理層不再對 SLO no_data 靜默
- drift checker false negative 風險解除

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:44:39 +08:00

130 lines
4.6 KiB
Python
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
"""
Knowledge Base Models
=====================
Phase KB-1: 知識庫資料模型
兩層架構:
- KnowledgeEntry: 知識條目 (incident_case / runbook / best_practice / postmortem)
- Playbook: 獨立系統,透過 related_playbook_id 關聯
建立時間: 2026-04-02 (台北時區)
建立者: Claude Code (Knowledge Base Phase 1)
遵循 leWOOOgo 積木化原則:
- Pydantic BaseModel 定義
- PostgreSQL Episodic Memory
"""
from datetime import datetime
from enum import Enum
from pydantic import BaseModel, Field
from src.utils.timezone import now_taipei
# =============================================================================
# Enums
# =============================================================================
class EntryType(str, Enum):
"""知識條目類型"""
INCIDENT_CASE = "incident_case" # AI 從 Incident 萃取的案例分析
RUNBOOK = "runbook" # 手動建立的操作手冊
BEST_PRACTICE = "best_practice" # 最佳實踐文章
POSTMORTEM = "postmortem" # 事後分析報告
# 2026-04-04 ogt: Phase 25 P1 — Knowledge Auto-Harvesting 新增類型
AUTO_RUNBOOK = "auto_runbook" # Nemotron 自動生成的 RunbookDRAFT 待人工審核)
ANTI_PATTERN = "anti_pattern" # 修復失敗案例(直接 PUBLISHED阻斷後續重蹈覆轍
class EntrySource(str, Enum):
"""知識來源"""
AI_EXTRACTED = "ai_extracted" # AI 自動萃取
HUMAN = "human" # 人工建立
class EntryStatus(str, Enum):
"""知識條目狀態"""
DRAFT = "draft" # 草稿
REVIEW = "review" # 審核中
APPROVED = "approved" # 已批准
ARCHIVED = "archived" # 已封存
# 2026-04-04 Claude Code: Phase 25 P1 — ANTI_PATTERN 直接發布,無需審核
PUBLISHED = "published" # 已發布ANTI_PATTERN 用,無需人工審核)
# =============================================================================
# Pydantic Models
# =============================================================================
class KnowledgeEntryCreate(BaseModel):
"""建立知識條目 Request"""
title: str = Field(..., min_length=1, max_length=255)
content: str = Field(..., min_length=1)
entry_type: EntryType
category: str = Field(..., min_length=1, max_length=100)
tags: list[str] = Field(default_factory=list)
source: EntrySource = EntrySource.HUMAN
status: EntryStatus = EntryStatus.DRAFT
related_incident_id: str | None = None
related_playbook_id: str | None = None
# P1-1 2026-04-28 ogt + Claude Sonnet 4.6: M4 補反查鏈 — approval → KM 關聯
# phase26_incident_km_integration.sql 已建立 related_approval_id 欄位與 partial index
related_approval_id: str | None = None
# P1-1 M3 2026-04-28 ogt + Claude Sonnet 4.6: 冪等 key與 related_incident_id 組合)
# migration: p1_1_km_idempotent_path_type.sql
path_type: str | None = None
# 2026-04-04 ogt: Phase 25 P1 — Anti-Pattern 閉環用症狀 hash
symptoms_hash: str | None = None
created_by: str | None = None
class KnowledgeEntryUpdate(BaseModel):
"""更新知識條目 Request"""
title: str | None = None
content: str | None = None
entry_type: EntryType | None = None
category: str | None = None
tags: list[str] | None = None
status: EntryStatus | None = None
class KnowledgeEntry(BaseModel):
"""知識條目完整模型"""
id: str
title: str
content: str
entry_type: EntryType
category: str
tags: list[str] = Field(default_factory=list)
source: EntrySource
status: EntryStatus = EntryStatus.DRAFT
related_incident_id: str | None = None
related_playbook_id: str | None = None
# P1-1 2026-04-28 ogt + Claude Sonnet 4.6: M4 補反查鏈
related_approval_id: str | None = None
# P1-1 M3 2026-04-28 ogt + Claude Sonnet 4.6: 冪等 key
path_type: str | None = None
# 2026-04-04 ogt: Phase 25 P1 — Anti-Pattern 閉環攔截用的症狀 hashSymptomPattern.compute_hash()
symptoms_hash: str | None = None
view_count: int = 0
created_by: str | None = None
created_at: datetime = Field(default_factory=now_taipei)
updated_at: datetime = Field(default_factory=now_taipei)
model_config = {"from_attributes": True}
class CategoryCount(BaseModel):
"""分類統計"""
category: str
count: int
class KnowledgeListResponse(BaseModel):
"""列表回應"""
items: list[KnowledgeEntry]
total: int
categories: list[CategoryCount] = Field(default_factory=list)