awoooi/apps/api/src/utils/similarity.py
e1e3bba296
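refactor(api): Phase 22 technical debt fixes - business logic layering
P2.3: Move LearningService.get_learning_summary() business logic to the Service layer
- Repository only provides raw statistics
- Service computes best_action and learning_status

P2.6: Extract the Playbook similarity calculation logic
- Add src/utils/similarity.py
- Repository imports from utils and no longer defines the algorithm

2026-03-31 Claude Code (Chief Architect technical debt fixes)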

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-31 18:55:06 +08:00
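
As a rough illustration of the P2.3 split described in the commit message, the sketch below keeps a repository limited to returning raw counters and lets the service derive `best_action` and `learning_status`. The commit does not show the real code, so almost everything here is an assumption: the `ActionStats` shape, the `InMemoryLearningRepository` class, its `get_raw_stats` method, the "highest success rate" rule, and the `min_samples` threshold are all hypothetical. Only the names `LearningService`, `get_learning_summary`, `best_action`, and `learning_status` come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class ActionStats:
    """Raw per-action counters (hypothetical shape, not from the source)."""
    action: str
    successes: int
    attempts: int


class InMemoryLearningRepository:
    """Hypothetical repository: returns stored rows and applies no business rules."""

    def __init__(self, rows):
        self._rows = list(rows)

    def get_raw_stats(self):
        return list(self._rows)


class LearningService:
    """Service layer: derives best_action and learning_status from raw stats."""

    def __init__(self, repository, min_samples=5):
        self._repository = repository
        self._min_samples = min_samples  # illustrative threshold, an assumption

    def get_learning_summary(self):
        # Ignore actions that were never attempted to avoid division by zero.
        stats = [s for s in self._repository.get_raw_stats() if s.attempts > 0]
        total = sum(s.attempts for s in stats)
        # Assumed rule: the best action is the one with the highest success rate.
        best = max(stats, key=lambda s: s.successes / s.attempts, default=None)
        return {
            "best_action": best.action if best else None,
            "learning_status": "learned" if total >= self._min_samples else "learning",
            "total_attempts": total,
        }


repo = InMemoryLearningRepository([
    ActionStats("restart_pod", successes=4, attempts=5),
    ActionStats("scale_up", successes=1, attempts=3),
])
summary = LearningService(repo).get_learning_summary()
empty_summary = LearningService(InMemoryLearningRepository([])).get_learning_summary()
print(summary)
```

Keeping the decision rules in the service means the repository can be swapped for a test double, as done here, without touching the logic that picks `best_action`.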

88 lines
2.1 KiB
Python

"""
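Similarity Calculation Utils
=============================
Phase 22 P2: Move the similarity calculation logic out of the Repository

Design principles:
- Algorithm logic should be independent of the data access layer
- Repository is only responsible for CRUD, not for algorithms
- The Service layer can use these utility functions

Version: v1.0
Created: 2026-03-31 (Taipei time zone)
Created by: Claude Code (Chief Architect technical debt fixes)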
"""
from src.models.playbook import SymptomPattern


def calculate_jaccard_similarity(set_a: set, set_b: set) -> float:
    """
    Compute the Jaccard similarity.

    Jaccard = |A ∩ B| / |A ∪ B|

    Args:
        set_a: Set A
        set_b: Set B

    Returns:
        float: 0.0 ~ 1.0
    """
    if not set_a and not set_b:
        return 1.0  # Two empty sets are treated as identical
    if not set_a or not set_b:
        return 0.0
    intersection = len(set_a & set_b)
    union = len(set_a | set_b)
    return intersection / union if union > 0 else 0.0


def calculate_symptom_similarity(
    pattern_a: SymptomPattern,
    pattern_b: SymptomPattern,
) -> float:
    """
    Compute the symptom similarity.

    Algorithm: weighted Jaccard similarity

    Dimension weights:
    - alert_names: 0.35 (most important)
    - affected_services: 0.30
    - severity: 0.15
    - keywords: 0.20

    Returns:
        float: similarity score, 0.0 ~ 1.0
    """
    weights = {
        "alert_names": 0.35,
        "affected_services": 0.30,
        "severity": 0.15,
        "keywords": 0.20,
    }
    scores = {
        "alert_names": calculate_jaccard_similarity(
            set(pattern_a.alert_names),
            set(pattern_b.alert_names),
        ),
        "affected_services": calculate_jaccard_similarity(
            set(pattern_a.affected_services),
            set(pattern_b.affected_services),
        ),
        # Severity is a binary overlap check rather than a Jaccard score:
        # 1.0 if the two ranges share any value, otherwise 0.0.
        "severity": (
            1.0
            if set(pattern_a.severity_range) & set(pattern_b.severity_range)
            else 0.0
        ),
        "keywords": calculate_jaccard_similarity(
            set(pattern_a.keywords),
            set(pattern_b.keywords),
        ),
    }
    # The weights sum to 1.0, so the result stays within 0.0 ~ 1.0.
    return sum(weights[k] * scores[k] for k in weights)
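

To make the weighting concrete, here is a small worked example. It is a self-contained sketch, not part of the file: `SymptomPattern` below is a hypothetical stand-in dataclass that assumes only the four attributes the function reads (the real model in `src.models.playbook` is not shown here), the two functions are condensed copies of the ones in this file, and the sample alert, service, and keyword values are invented for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SymptomPattern:
    """Hypothetical stand-in for src.models.playbook.SymptomPattern (assumed fields only)."""
    alert_names: list = field(default_factory=list)
    affected_services: list = field(default_factory=list)
    severity_range: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def calculate_jaccard_similarity(set_a: set, set_b: set) -> float:
    if not set_a and not set_b:
        return 1.0  # two empty sets count as identical
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


WEIGHTS = {"alert_names": 0.35, "affected_services": 0.30, "severity": 0.15, "keywords": 0.20}


def calculate_symptom_similarity(a: SymptomPattern, b: SymptomPattern) -> float:
    scores = {
        "alert_names": calculate_jaccard_similarity(set(a.alert_names), set(b.alert_names)),
        "affected_services": calculate_jaccard_similarity(
            set(a.affected_services), set(b.affected_services)
        ),
        # Binary overlap: any shared severity value scores 1.0.
        "severity": 1.0 if set(a.severity_range) & set(b.severity_range) else 0.0,
        "keywords": calculate_jaccard_similarity(set(a.keywords), set(b.keywords)),
    }
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


pattern_a = SymptomPattern(
    alert_names=["HighCPU", "PodCrashLoop"],
    affected_services=["api"],
    severity_range=["critical", "warning"],
    keywords=["cpu", "restart", "oom"],
)
pattern_b = SymptomPattern(
    alert_names=["HighCPU"],
    affected_services=["api", "worker"],
    severity_range=["warning"],
    keywords=["cpu", "oom"],
)

# Per-dimension scores: alert_names 1/2, affected_services 1/2, severity 1.0, keywords 2/3.
# Weighted sum: 0.35*0.5 + 0.30*0.5 + 0.15*1.0 + 0.20*(2/3), roughly 0.6083.
score = calculate_symptom_similarity(pattern_a, pattern_b)

# Edge case: two patterns with every field empty.
empty_score = calculate_symptom_similarity(SymptomPattern(), SymptomPattern())

print(f"{score:.4f}")
print(math.isclose(empty_score, 0.85))
```

One consequence worth noting: with the logic as written, two patterns whose fields are all empty score about 0.85 rather than 1.0. The three Jaccard terms treat a pair of empty sets as identical, but the severity term needs a non-empty overlap and so contributes 0.0.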