fix(knowledge): C1 chief architect must-fix: 5-second hard timeout for _query_kb_context
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
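C1 fix (chief architect review 74/100 → conditional pass):
- Extract _query_kb_context_inner, which holds the actual query logic
- _query_kb_context wraps it in asyncio.wait_for(timeout=5.0)
- An Ollama hang or slow response now costs at most 5s, protecting the 30s decision SLA
- On timeout, log logger.warning("kb_rag_timeout") and degrade silently
Also removes the emoji from the LLM prompt (## 📚 → ## Knowledge Base)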
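The timeout wrapper described in the C1 fix can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `slow_lookup` and `lookup_with_timeout` are hypothetical stand-ins for `_query_kb_context_inner` and `_query_kb_context`, and the standard `logging` module is assumed in place of the project's structured logger.

```python
import asyncio
import logging

logger = logging.getLogger("kb_rag_sketch")


async def slow_lookup(delay: float) -> str:
    """Hypothetical stand-in for the KB query; sleeps to simulate a slow backend."""
    await asyncio.sleep(delay)
    return "kb context"


async def lookup_with_timeout(delay: float, timeout: float = 5.0) -> str:
    """Return the lookup result, or "" if the lookup times out or raises."""
    try:
        # wait_for cancels the inner coroutine once the timeout elapses.
        return await asyncio.wait_for(slow_lookup(delay), timeout=timeout)
    except asyncio.TimeoutError:
        logger.warning("kb_rag_timeout")
        return ""
    except Exception as exc:  # any other failure also degrades to an empty context
        logger.warning("kb_rag_failed: %s", exc)
        return ""
```

Because the caller always receives a string, the surrounding decision flow does not need to know whether the KB lookup succeeded, timed out, or failed.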
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -548,41 +548,51 @@ class DecisionManager:
             _push_decision_to_telegram(incident, token.proposal_data)
         )
 
+    async def _query_kb_context_inner(self, incident: Incident) -> str:
+        """Actual KB RAG query logic; called by _query_kb_context, which wraps it in a timeout."""
+        from src.services.knowledge_service import get_knowledge_service
+        query_parts = list(incident.affected_services)
+        if incident.signals:
+            query_parts.insert(0, getattr(incident.signals[0], "alert_name", ""))
+        query = " ".join(filter(None, query_parts))
+
+        svc = get_knowledge_service()
+        results = await svc.semantic_search(query, limit=3, threshold=0.4)
+        if not results:
+            return ""
+
+        lines = ["## Knowledge Base Related Entries (KB RAG)"]
+        for entry, score in results:
+            lines.append(
+                f"\n### [{entry.entry_type}] {entry.title} (similarity={score:.2f})"
+            )
+            lines.append(entry.content[:500])
+            if len(entry.content) > 500:
+                lines.append("... (truncated)")
+
+        logger.info(
+            "kb_rag_context_injected",
+            incident_id=incident.incident_id,
+            kb_hits=len(results),
+        )
+        return "\n".join(lines)
+
     async def _query_kb_context(self, incident: Incident) -> str:
         """
         KB Phase 2: semantically search for related KB entries and assemble them into an LLM context string
         2026-04-04 Claude Code: KB RAG integration
 
-        On failure, degrade silently without affecting the main analysis flow
+        C1 fix (chief architect review): 5-second hard timeout so slow Ollama responses cannot threaten the 30s SLA
+        On failure or timeout, degrade silently without affecting the main analysis flow
         """
         try:
-            from src.services.knowledge_service import get_knowledge_service
-            query_parts = list(incident.affected_services)
-            if incident.signals:
-                query_parts.insert(0, getattr(incident.signals[0], "alert_name", ""))
-            query = " ".join(filter(None, query_parts))
-
-            svc = get_knowledge_service()
-            results = await svc.semantic_search(query, limit=3, threshold=0.4)
-            if not results:
-                return ""
-
-            lines = ["## 📚 Knowledge Base 相關條目 (KB RAG)"]
-            for entry, score in results:
-                lines.append(
-                    f"\n### [{entry.entry_type}] {entry.title} (similarity={score:.2f})"
-                )
-                lines.append(entry.content[:500])
-                if len(entry.content) > 500:
-                    lines.append("... (truncated)")
-
-            logger.info(
-                "kb_rag_context_injected",
-                incident_id=incident.incident_id,
-                kb_hits=len(results),
+            return await asyncio.wait_for(
+                self._query_kb_context_inner(incident),
+                timeout=5.0,
             )
-            return "\n".join(lines)
-
+        except asyncio.TimeoutError:
+            logger.warning("kb_rag_timeout", incident_id=incident.incident_id)
+            return ""
         except Exception as e:
             logger.warning("kb_rag_failed", incident_id=incident.incident_id, error=str(e))
             return ""
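The context-assembly step in `_query_kb_context_inner` can be exercised on its own with a small sketch. It assumes search results arrive as plain `(entry_type, title, content, score)` tuples, a hypothetical shape chosen for illustration; the real code reads attributes from KB entry objects returned by `semantic_search`.

```python
from typing import List, Tuple

MAX_CHARS = 500  # same per-entry cut-off as in the diff


def format_kb_context(results: List[Tuple[str, str, str, float]]) -> str:
    """Build the Markdown context string from (entry_type, title, content, score) tuples."""
    if not results:
        return ""  # no hits: nothing is injected into the prompt
    lines = ["## Knowledge Base Related Entries (KB RAG)"]
    for entry_type, title, content, score in results:
        lines.append(f"\n### [{entry_type}] {title} (similarity={score:.2f})")
        lines.append(content[:MAX_CHARS])
        if len(content) > MAX_CHARS:
            lines.append("... (truncated)")  # flag that the entry was cut
    return "\n".join(lines)
```

Capping each entry at 500 characters with a limit of three hits bounds how much KB text can be added to the prompt.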