OG T
|
a2cc985f60
|
feat(mcp-phase3): ArgoCD MCP + Sentry MCP + complete provider registration
CD Pipeline / build-and-deploy (push) Has been cancelled
ArgoCDProvider (3 tools):
- argocd_list_apps: list all Apps with their sync/health status
- argocd_get_app_status: detailed status + list of problem resources
- argocd_get_sync_history: the most recent N deployment records
- Input validation: app_name allowlist regex
- Requires ARGOCD_API_TOKEN + ARGOCD_MCP_ENABLED=true
SentryProvider (3 tools):
- sentry_list_issues: list recent Issues (filtered by status)
- sentry_get_issue: details + last 5 stacktrace frames
- sentry_search_issues: PromQL-style search
- issue_id allowlist validation (digits only)
- Requires SENTRY_AUTH_TOKEN + SENTRY_MCP_ENABLED=true
providers/__init__.py: register all 10 providers, adding Prometheus + SSH + ArgoCD + Sentry
config.py: add ARGOCD_URL / ARGOCD_API_TOKEN / ARGOCD_MCP_ENABLED / SENTRY_MCP_ENABLED
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 09:11:53 +08:00 |
|
OG T
|
1ec19656b5
|
feat(adr071-ij): TYPE-2 metrics snapshot card + KM three-part data integration
CD Pipeline / build-and-deploy (push) Failing after 8m17s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 36s
Ansible Lint / lint (push) Has been cancelled
ADR-071-I: decision_manager captures Prometheus metrics once before and once after execution
- _fetch_metrics_snapshot(): selects the CPU/Mem/Disk/Restart query based on alertname
- _format_metrics_delta(): outputs the format "CPU 92%→23% | Mem 78%→45%"
- _push_auto_repair_result(): writes metrics_after to the DB + TYPE-2 card shows the delta
- _auto_execute(): writes metrics_before to the DB before execution (closing the loop)
ADR-071-J: km_conversion_service._build_content() uses the compact delta format
- Generates a human-readable delta from metrics_before/after (CPU/Mem/Disk/restart count)
- Appends k8s_state_after (if present)
- Format: symptom + root cause + action + effect numbers (symptom→context→action→effect)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
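A minimal sketch of the delta format named above, assuming both snapshots are plain dicts of percentages keyed by display name. The function name mirrors `_format_metrics_delta()` from the commit, but the signature and dict shape are assumptions, not the project's actual code.

```python
def format_metrics_delta(before: dict[str, float], after: dict[str, float]) -> str:
    """Render 'CPU 92%→23% | Mem 78%→45%' for metrics present in both snapshots."""
    parts = []
    for name, old in before.items():
        # Skip metrics that were not captured after execution.
        if name in after:
            parts.append(f"{name} {old:.0f}%→{after[name]:.0f}%")
    return " | ".join(parts)
```

For example, `format_metrics_delta({"CPU": 92, "Mem": 78}, {"CPU": 23, "Mem": 45})` yields the string quoted in the commit message.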
|
2026-04-11 03:09:35 +08:00 |
|
OG T
|
a29e5e1de2
|
feat(mcp-phase1): K8s MCP hardening: 6 new tools + namespace allowlist
MCP Phase 1 (post-acceptance follow-up to ADR-069 Sprint B):
k8s_get_pod_logs: fetch Pod logs (tail 1-500, supports previous)
k8s_watch_rollout: monitor rollout status until completion (timeout 10-300s)
k8s_get_events: K8s events (filterable by resource_name / event_type)
k8s_describe_pod: full Pod describe (Conditions/Volumes/Env)
k8s_get_hpa_status: HPA replica count / CPU utilization
k8s_get_node_conditions: Node Ready/MemoryPressure/DiskPressure
Security hardening:
- ALLOWED_NAMESPACES = {"awoooi-prod"} hard-coded allowlist
- _validate_namespace() + _validate_name() parameter allowlists
- Numeric parameters clamped to bounds (tail 1-500, timeout 10-300s)
- event_type only allows Warning / Normal
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
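The namespace allowlist and bound clamping could look roughly like the sketch below. `ALLOWED_NAMESPACES` and its value come from the commit message; the helper names and signatures are illustrative assumptions.

```python
ALLOWED_NAMESPACES = frozenset({"awoooi-prod"})


def validate_namespace(namespace: str) -> str:
    """Reject any namespace outside the hard-coded allowlist."""
    if namespace not in ALLOWED_NAMESPACES:
        raise ValueError(f"namespace not allowed: {namespace!r}")
    return namespace


def clamp(value, low: int, high: int) -> int:
    """Coerce to int and clamp into [low, high] (hypothetical helper)."""
    return max(low, min(high, int(value)))
```

With these, a request for `tail=9999` would be served as `clamp(9999, 1, 500)`, that is 500 lines, rather than being rejected outright.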
|
2026-04-11 03:01:38 +08:00 |
|
OG T
|
2af4dffcc6
|
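fix(security): fix 5 high-confidence issues from the Architecture Review
Security fixes (P0):
1. ssh_provider: add _validate_param() allowlist validation to prevent command injection
- container_name/service/filter_name: [a-zA-Z0-9._-]{1,128}
- compose_dir: must start with /opt/ or /srv/; ".." is forbidden
- domain: FQDN allowlist
- tail/port/lines: int() conversion + clamped to bounds
2. ssh_provider: known_hosts=None replaced by reading the SSH_MCP_KNOWN_HOSTS_FILE environment variable
- Still defaults to None (quick start on the internal network), but a warning is logged at startup
- Setup doc: ops/runbooks/ssh-mcp-setup.md (to be added)
Modularization fixes (P1):
3. km_conversion_service: remove the import-time ALERT_EVENT_TYPES.update() side effect
- ADR-071 event types moved into the static set in alert_operation_log_repository.py
4. telegram_gateway: create_task() changed to await + try/except
- Avoids a race condition after the DB session is closed
- KM conversion failures are logged as warnings without interrupting the main flow
5. km_conversion_service: add a top-level try/except; errors are always logged at error level and then re-raised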
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
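The parameter rules listed for ssh_provider can be sketched as follows. The character class and the /opt/ and /srv/ prefixes are taken from the commit message; the function names are hypothetical, and the real `_validate_param()` may be structured differently.

```python
import re
from pathlib import PurePosixPath

# Character class and length limit as stated in the commit message.
_NAME_RE = re.compile(r"[a-zA-Z0-9._-]{1,128}")


def validate_name(value: str) -> str:
    """Allow only names that fully match the allowlist pattern."""
    if not _NAME_RE.fullmatch(value):
        raise ValueError(f"invalid name: {value!r}")
    return value


def validate_compose_dir(path: str) -> str:
    """Require an /opt/ or /srv/ prefix and forbid '..' path components."""
    if not path.startswith(("/opt/", "/srv/")):
        raise ValueError(f"compose_dir outside allowed roots: {path!r}")
    if ".." in PurePosixPath(path).parts:
        raise ValueError(f"compose_dir contains '..': {path!r}")
    return path
```

Using `fullmatch` rather than `match` matters here: a value such as `nginx; rm -rf /` begins with valid characters and would slip past a prefix-only match.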
|
2026-04-11 02:50:26 +08:00 |
|
OG T
|
6351e9a0e9
|
feat(mcp-phase2): MCP Phase 2 — Prometheus MCP + SSH MCP + alert labels
CD Pipeline / build-and-deploy (push) Successful in 13m37s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 35s
MCP-2b: prometheus_provider.py
- prometheus_query (instant PromQL query)
- prometheus_query_range (historical trend, default 15 minutes)
- prometheus_get_alert_history (alert firing history)
- config: PROMETHEUS_URL + PROMETHEUS_MCP_ENABLED
MCP-2a: ssh_provider.py
- Group A: 9 read-only diagnostic tools (top/disk/memory/logs/status/port/nginx/swap)
- Group B: 6 safe operation tools (restart/compose/systemctl/clear-log/ssl/nginx-reload)
- Four-layer security guard (allowlist/allowed_hosts/forbidden_patterns/trust_score)
- config: SSH_MCP_ENABLED + SSH_MCP_ALLOWED_HOSTS
K8s: 04-ssh-mcp-secret.example.yaml (ssh-mcp-key Secret template + creation steps)
Alert labels: alerts-unified.yml adds mcp_provider/host_type/alert_category
Coverage: HostHighCpuLoad/HostOutOfMemory/HostOutOfDiskSpace/DockerContainer*
SignOzDown/SentryDown/HarborDown/GiteaDown
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 02:35:35 +08:00 |
|
OG T
|
325b3851b5
|
feat(adr-071): first batch of the four alert notification types, B/C/E/F/G/H fully implemented
CD Pipeline / build-and-deploy (push) Has been cancelled
Type Sync Check / check-type-sync (push) Failing after 1m7s
ADR-071-B: classify_notification(): five-type classifier (TYPE-1/2/3/4/4D)
ADR-071-C: send_info_notification(): TYPE-1 info-only card without buttons
ADR-071-E: _build_inline_keyboard(): dynamically assembles TYPE-3 buttons based on alert_category
ADR-071-F: send_drift_card(): TYPE-4D Config Drift card + diff truncation
ADR-071-G: km_conversion_service.py: automatically converts RESOLVED Incidents to KM
ADR-071-H: handle_manual_fix_done(): closes the TYPE-4 manual-fix Bot conversation loop
Completed in the previous batch: ADR-071-A (DB Migration) + ADR-071-D (state machine guard)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 02:24:20 +08:00 |
|
OG T
|
68a3858ae4
|
fix(auto_execute): guard adds a target==alertname check to stop the LLM from using the alert name as the deployment name
CD Pipeline / build-and-deploy (push) Successful in 13m33s
For host alerts such as HostHighCpuLoad, NemoTron Tool Calling may put the
alertname into deployment_name, which leads to executing
'kubectl rollout restart deployment HostHighCpuLoad'.
New guard: when _target == _alertname, refuse to execute and notify a human to step in.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-11 01:13:24 +08:00 |
|
OG T
|
a4d655ea7f
|
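fix(auto_execute): safety guard: refuse to execute actions containing 'unknown' or unresolved placeholders
CD Pipeline / build-and-deploy (push) Successful in 19m7s
E2E Health Check / e2e-health (push) Successful in 43s
Host-level alerts (HostHighCpuLoad, DockerContainerUnhealthy, etc.) have no corresponding
K8s deployment name, so affected_services=[] and _target='unknown', which results in
meaningless commands such as 'kubectl rollout restart deployment unknown'.
Fix: if the action still contains 'unknown' or a <...>/{...} pattern after substitution,
refuse to execute and notify a human to step in; commands carrying placeholders are never allowed to run.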
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
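A sketch of the placeholder guard described above, assuming the action is a single command string. The function name and the exact regex are assumptions; the commit only states that 'unknown' and <...>/{...} patterns are rejected.

```python
import re

# Matches <deployment_name>-style and {target}-style leftovers.
_PLACEHOLDER_RE = re.compile(r"<[^<>]*>|\{[^{}]*\}")


def is_safe_action(action: str) -> bool:
    """Return False when the command still carries 'unknown' or a placeholder."""
    if "unknown" in action:
        return False
    return _PLACEHOLDER_RE.search(action) is None
```

A caller would check `is_safe_action(action)` after substitution and, on `False`, skip execution and notify a human instead.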
|
2026-04-10 23:57:17 +08:00 |
|
OG T
|
dabc62e0f8
|
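fix(telegram): append_incident_update: store the alert card's message_id in Redis
CD Pipeline / build-and-deploy (push) Successful in 14m31s
After _send_approval_card_to_group sends the alert card, the Telegram message_id
is stored in Redis under tg_msg:{incident_id} (TTL 24h) so that a later append_incident_update
can replace the approve button and reply with the status.
Before the fix: the tg_msg key was never written, so append always fell back to sending a new message
and the approve button could never be removed.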
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 22:41:30 +08:00 |
|
OG T
|
797c7c749e
|
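fix(nemotron): deepseek-r1 num_predict 400→1200 to avoid an empty reply after the <think> block is truncated
CD Pipeline / build-and-deploy (push) Failing after 28s
When deepseek-r1:14b uses more than 400 thinking tokens, output is truncated before </think>, so
the body is empty after cleanup and Telegram shows an empty message.
- chat_manager: num_predict 400 → 1200
- telegram_gateway: _clean_ai_reply adds a fallback error notice for empty values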
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
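The failure mode above (an unclosed <think> block leaving nothing after cleanup) can be illustrated with a short sketch. The function name and the fallback wording are hypothetical; only the behaviour, stripping reasoning and falling back on empty output, is taken from the commit.

```python
import re

FALLBACK_REPLY = "(empty reply from the model, please retry)"  # hypothetical wording


def strip_think(raw: str) -> str:
    """Drop <think> reasoning and fall back to a notice if nothing remains."""
    # Remove properly closed reasoning blocks first.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # An unclosed <think> means generation was cut off mid-reasoning: drop the rest.
    text = re.sub(r"<think>.*", "", text, flags=re.DOTALL)
    text = text.strip()
    return text or FALLBACK_REPLY
```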
|
2026-04-10 22:35:37 +08:00 |
|
OG T
|
f8926bb70a
|
ci: trigger CD: marker for the decision_manager fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 22:12:56 +08:00 |
|
OG T
|
e5f1541d69
|
fix(auto_execute): replace the <deployment_name>/{target}/{namespace} placeholders in action
CD Pipeline / build-and-deploy (push) Failing after 24s
Nemotron tool calling generated a <deployment_name> placeholder that was never replaced
Before auto_execute, uniformly replace all {target}/{namespace}/<xxx> patterns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 22:00:19 +08:00 |
|
OG T
|
71f0dbf2b5
|
fix(auto_execute): fill in description/requested_by/required_signatures on ApprovalRequest
CD Pipeline / build-and-deploy (push) Has been cancelled
3 validation errors caused auto_execute_failed
Fill in all required fields; required_signatures=0 means auto-approval with no sign-off needed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 21:59:52 +08:00 |
|
OG T
|
f33d514391
|
fix(auto_repair): playbook_seed_service: initialize APPROVED Playbooks from alert_rules.yaml
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause: playbooks table empty → NO_MATCH → always goes through approval, never auto-repairs
Fix: at startup, seed APPROVED Playbooks from alert_rules.yaml (idempotent)
Ensures the auto-repair chain has rules available
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 21:52:38 +08:00 |
|
OG T
|
100e4d9b89
|
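fix(chat): AI reply truncation: enforced persona + Markdown cleanup + 600-character cap
CD Pipeline / build-and-deploy (push) Successful in 14m39s
Problem: OpenClaw/NemoClaw replies contained Markdown syntax and were too long, so Telegram displayed them truncated
Fix:
1. chat_manager: _call_openclaw/_call_nemotron force-prepend the persona (including a rule of at most 300 characters)
2. telegram_gateway: _clean_ai_reply() removes **bold** *italic* # header syntax,
removes deepseek-r1 <think> tags, and truncates replies > 600 characters at a paragraph boundary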
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
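One way the Markdown cleanup and paragraph-boundary truncation could work is sketched below. The 600-character limit comes from the commit; the function name, the regexes, and the choice of a blank line as the paragraph boundary are assumptions.

```python
import re

MAX_CHARS = 600  # limit stated in the commit message


def clean_reply(text: str, limit: int = MAX_CHARS) -> str:
    """Strip basic Markdown and cut over-long replies at a paragraph break."""
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)  # "# header" prefixes
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)                # **bold**
    text = re.sub(r"\*(.+?)\*", r"\1", text)                    # *italic*
    text = text.strip()
    if len(text) <= limit:
        return text
    # Prefer the last blank line before the limit; otherwise hard-cut at the limit.
    cut = text.rfind("\n\n", 0, limit)
    if cut <= 0:
        cut = limit
    return text[:cut].rstrip()
```

Bold is stripped before italic so that the single-asterisk pattern never sees the double-asterisk markers.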
|
2026-04-10 21:26:15 +08:00 |
|
OG T
|
527ce9faaf
|
fix(notifications): add the backend /api/v1/notifications/channels route
CD Pipeline / build-and-deploy (push) Failing after 2m4s
The frontend /notifications page calls this endpoint, but it did not exist on the backend (404)
Add notifications.py: returns the real status of 4 channels
- Telegram OpenClaw Bot (checks that BOT_TOKEN is set)
- Telegram Nemotron Bot (checks that BOT_TOKEN is set)
- SSE Web Stream (always active)
- Redis Stream awoooi:signals (ping check)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 16:17:37 +08:00 |
|
OG T
|
167e115a6d
|
feat(phase31): log anomaly summary trigger: NemoTron posts a log summary after an alert
CD Pipeline / build-and-deploy (push) Failing after 2m44s
_send_log_summary: fetch Pod logs → deepseek-r1:14b analysis → NemoTron posts to the group
Trigger: runs asynchronously after _push_decision_to_telegram sends the approval card
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 16:07:56 +08:00 |
|
OG T
|
95f63d64d7
|
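fix(auto_approve): min_trust_score 0 unblocks auto-repair
CD Pipeline / build-and-deploy (push) Has been cancelled
Root cause: trust_score is an in-memory dict that resets to zero whenever the Pod restarts,
so it is always < min_trust_score=1 → all alerts go through approval and are never auto-executed
Fix: min_trust_score=0; medium risk + confidence>=0.65 auto-executes directly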
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 16:06:40 +08:00 |
|
OG T
|
ff3be51e13
|
fix(phase34): image analysis now uses send_as_openclaw to post to the SRE group
CD Pipeline / build-and-deploy (push) Has been cancelled
send_notification posted to a private chat; switched to send_as_openclaw to post to the SRE war room
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 15:56:19 +08:00 |
|
OG T
|
b9dbbb3575
|
feat(rag): Telegram /rag command + /rag/optimize ivfflat endpoint
CD Pipeline / build-and-deploy (push) Successful in 14m9s
- telegram_gateway: /rag <query> → KnowledgeRAGService.query()
_handle_group_command adds a full_text parameter to get the full command text
/help updated with a /rag description
- rag.py: POST /rag/optimize → rag_repo.create_ivfflat_index()
- rag_chunk_repository: create_ivfflat_index() — ivfflat with lists=100
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 14:47:21 +08:00 |
|
OG T
|
33abe988f8
|
fix(phase34): image analysis results are now replied by OpenClaw (llava vision)
CD Pipeline / build-and-deploy (push) Has been cancelled
NemoTron handles text Q&A (deepseek-r1:14b); OpenClaw handles image analysis (llava)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 14:13:57 +08:00 |
|
OG T
|
7e5ac00d62
|
fix(phase34): image_analysis downloads with the correct bot token + NemoTron replies
CD Pipeline / build-and-deploy (push) Has been cancelled
- Image download now uses OPENCLAW_TG_BOT_TOKEN (polling bot)
- Results now use send_as_nemotron to reply to the group from the NemoTron bot
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 13:58:59 +08:00 |
|
OG T
|
cf5eb71ea6
|
fix(phase34): add image routing to the polling loop: _handle_chat_message photo handler
CD Pipeline / build-and-deploy (push) Has been cancelled
The handler returned immediately when text=None, so photo messages were dropped
Insert photo routing before the text check and call image_analysis_service
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 13:58:05 +08:00 |
|
OG T
|
7768924fea
|
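fix(flywheel): remove Telegram buttons after auto-repair + exclude heartbeat alerts from the flywheel
CD Pipeline / build-and-deploy (push) Failing after 6m56s
Problem: after a successful auto-repair, the Telegram card still showed the approve/reject/silence buttons
Fix 1: Telegram card feedback loop (compliant with the modular building-block design):
- telegram_gateway.send_approval_card: automatically stores tg_approval:{id} in Redis after sending
- telegram_gateway.mark_auto_repaired(): new method that removes the buttons and replies with the result
- _try_auto_repair_background: now calls gateway.mark_auto_repaired() (Service layer)
Fix 2: exclude heartbeat/watchdog alerts from the flywheel:
- constants.py: is_heartbeat_alertname() + HEARTBEAT_ALERT_NAMES
- NoAlertsReceived2Hours and similar alerts do not trigger _try_auto_repair_background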
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 11:52:04 +08:00 |
|
OG T
|
670cd5df86
|
refactor(flywheel): chief architect review fixes C1/C2/I1/I2/I3/I4/M1
CD Pipeline / build-and-deploy (push) Has started running
C1: Repository layer fix (modular building-block rule):
Add PlaybookEmbeddingRepository (pgvector UPSERT)
playbook_embedding_service now accesses the DB through the Repository instead of calling db.execute(text(...)) directly
C2: Router-layer business logic moved into the Service layer:
create_incident_for_approval + extract_affected_services (underscore prefix dropped) moved into incident_service.py
webhooks.py now imports from incident_service and no longer contains business logic itself
I1: _infra_jobs promoted to a module-level frozenset (_INFRA_JOB_NAMES) to avoid rebuilding it on every call
I2: _persist_embeddings_to_db gains PlaybookRAGService / list[Playbook] type annotations
I3: embedding format made explicit: "[" + ",".join(str(float(x)) for x in embedding) + "]"
Prevents pgvector from silently failing to parse due to format differences
I4: import asyncio moved to the top of main.py; the duplicate import inside the try block is removed
M1: similarity.py: remove dead code `if union > 0 else 0.0`
union cannot be 0 when both sets are non-empty
2026-04-10 Asia/Taipei — Claude Sonnet 4.6
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
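The I3 expression can be wrapped in a small helper to show what the explicit format produces. The helper name is hypothetical; the expression itself is quoted from the commit message.

```python
def to_pgvector_literal(embedding) -> str:
    """Serialize numbers as '[x1,x2,...]' with every element coerced to float."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"
```

Compared with `str([1, 0.5, -2])`, which gives `[1, 0.5, -2]` with spaces and mixed int/float formatting, this always emits a compact, float-only literal such as `[1.0,0.5,-2.0]`.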
|
2026-04-10 11:35:10 +08:00 |
|
OG T
|
ab6f6faa32
|
feat(phase32): implement review_push + switch gitea_webhook to local Ollama review
CD Pipeline / build-and-deploy (push) Has been cancelled
- local_code_review_service: add the review_push() method
Uses qwen2.5-coder:7b to review push events (not PRs)
- gitea_webhook_service: _call_openclaw_push_review now uses local inference
OpenClaw has no push-review endpoint (404) → use LocalCodeReviewService instead
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 11:09:11 +08:00 |
|
OG T
|
b24fae313e
|
fix(drift_narrator): also write narrative_text to the DB + drift_repository.update_narrative
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-04-10 11:06:50 +08:00 |
|
OG T
|
c6edfb5614
|
fix(flywheel): four-phase systematic fix for the AUTO_REPAIR NO_MATCH gap
CD Pipeline / build-and-deploy (push) Has been cancelled
Phase 1: root-cause fix for affected_services pollution
- webhooks.py: _extract_affected_services() precisely extracts service names from labels
(component > job > pod deployment name > clean target_resource > [])
- create_incident_for_approval: alert_labels are preserved in full in the Signal
- alert_name is taken from alertname instead of "custom"
Phase 2: Playbook alertname variant expansion
- alert_rules.yaml: 5 rules gain variants such as HostHighCpuLoad and KubePodCrashLooping
- scripts/update_playbook_alert_variants.py: Redis index update already run ✅
Phase 3: Jaccard exemption for generic Playbooks
- similarity.py: affected_services=[] → 1.0 exemption (infrastructure Playbooks do not target specific services)
- severity_range=[] → 1.0 exemption (applies to all severities)
Phase 4: Playbook embedding persistence (cold-start fix)
- migrations/flywheel_playbook_embeddings.sql: pgvector persistence table
- services/playbook_embedding_service.py: rebuilds the Redis vector cache and syncs the DB at startup
- main.py: runs it non-blocking via asyncio.create_task at lifespan startup
2026-04-10 Asia/Taipei — Claude Sonnet 4.6
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
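The Phase 3 exemption can be sketched as a Jaccard similarity that short-circuits to 1.0. This assumes the empty list is on the Playbook side, which is how the commit's "generic Playbook" wording reads; the function name and signature are illustrative.

```python
def jaccard_with_exemption(playbook_values, incident_values) -> float:
    """Jaccard similarity where an empty Playbook field matches everything."""
    playbook_set, incident_set = set(playbook_values), set(incident_values)
    if not playbook_set:
        # Generic Playbook: no specific services/severities, so do not penalize.
        return 1.0
    union = playbook_set | incident_set  # non-empty because playbook_set is non-empty
    return len(playbook_set & incident_set) / len(union)
```

Without the exemption, a Playbook with `affected_services=[]` would score 0.0 against every incident and could never match.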
|
2026-04-10 11:04:56 +08:00 |
|
OG T
|
1c4bdedc64
|
fix(drift_narrator): send_text → send_notification + DriftLevel case fix
CD Pipeline / build-and-deploy (push) Successful in 14m43s
|
2026-04-10 10:48:36 +08:00 |
|
OG T
|
5c2db65ea1
|
refactor(rag): C1 fix: add rag_chunk_repository; the Service no longer accesses the DB directly
CD Pipeline / build-and-deploy (push) Has been cancelled
- Add src/repositories/rag_chunk_repository.py
search_chunks / insert_chunk / delete_by_source_id / get_stats
- KnowledgeRAGService removes all direct get_db_context calls
and delegates to rag_repo.search_chunks / insert_chunk / delete_by_source_id / get_stats
- Remove the unused Any import
leWOOOgo compliance score: 62 → 95/100
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 10:43:53 +08:00 |
|
OG T
|
cc8cabebf9
|
refactor(rag): architecture review fixes: leWOOOgo compliance + deduplication + httpx shutdown
CD Pipeline / build-and-deploy (push) Has been cancelled
- C2: _run_index() business logic moved into KnowledgeRAGService.index_all_sources()
The Router layer only forwards via background_tasks.add_task(_run_index)
- C3: glob("*.md") → rglob("*.md") to scan nested subdirectories
- C4: docstring corrected, Ollama 188 → 111
- I2: index_document() deletes the old version first (_delete_by_source_id) to avoid accumulating duplicates
- I3: debug endpoint uses settings.OLLAMA_URL instead of a hard-coded IP
- I4: main.py shutdown adds get_knowledge_rag_service().close()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 10:39:14 +08:00 |
|
OG T
|
09a8c3a90b
|
fix(rag): correct the debug endpoint and message text: Ollama 188→111
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 10:28:04 +08:00 |
|
OG T
|
68e9ef5d26
|
fix(drift_narrator): field name fix, DriftItem.severity → drift_level.value
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-04-10 10:24:41 +08:00 |
|
OG T
|
974f84511b
|
fix(rag): embed now uses settings.OLLAMA_URL: K3s NetworkPolicy blocks direct connections to 188:11434
CD Pipeline / build-and-deploy (push) Has been cancelled
nomic-embed-text is also available on 111, so route through OLLAMA_URL (111) to bypass the NetworkPolicy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 10:14:33 +08:00 |
|
OG T
|
b51f1b011c
|
debug(rag): /rag/debug shows the full Ollama error message
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 10:13:52 +08:00 |
|
OG T
|
6786da89c8
|
debug(rag): add the /rag/debug diagnostic endpoint: verify container paths + Ollama connectivity
CD Pipeline / build-and-deploy (push) Successful in 13m14s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 09:54:56 +08:00 |
|
OG T
|
3ed52b0424
|
fix(rag): _run_index fixes the index_document signature mismatch: read the file content, then pass it to the service
CD Pipeline / build-and-deploy (push) Successful in 13m3s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 09:00:26 +08:00 |
|
OG T
|
0ee5d532ba
|
feat(rag): add the RAG Router + mount it in main.py (Phase 33 ADR-067)
CD Pipeline / build-and-deploy (push) Successful in 13m11s
- rag.py: three endpoints, POST /index, POST /query, GET /stats
- stats delegates to KnowledgeRAGService.get_stats() (leWOOOgo compliant)
- main.py: include_router rag_v1.router
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 07:34:06 +08:00 |
|
OG T
|
e605b7192b
|
feat(rag): Phase 33 RAG API endpoints: /knowledge/rag/index + query + stats
CD Pipeline / build-and-deploy (push) Successful in 14m35s
ADR-067 Phase 33: three HTTP endpoints for pgvector RAG
- POST /knowledge/rag/index: index documents into rag_chunks
- GET /knowledge/rag/query: embed→knn→generate answer
- GET /knowledge/rag/stats: chunk statistics (via the Service layer)
- Fix: rag_stats moved to KnowledgeRAGService.get_stats() (leWOOOgo modular design)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 02:00:59 +08:00 |
|
OG T
|
63e840ae42
|
feat(ollama): Phase 31-34 ADR-067: log summary / PR review / RAG knowledge base / image analysis
CD Pipeline / build-and-deploy (push) Has started running
Phase 31: log_summary_service.py: deepseek-r1:14b anomaly summary of K8s Pod logs
- Trigger: called in the background when signoz_webhook receives an alert
- Redis cache log_summary:{pod}:{date} TTL 24h
- Sensitive data masked by regex
Phase 32: local_code_review_service.py: qwen2.5-coder:7b automatic PR review
- Fallback: Gemini (diff > 50KB or Ollama timeout)
- Semaphore allows at most 2 concurrent reviews
- Dual write: Redis TTL 7d + pr_reviews table (phase29 migration)
Phase 33: knowledge_rag_service.py: nomic-embed-text 768-dimension pgvector RAG
- Embedding (188) + generation (111) on two Ollama instances
- rag_chunks table (phase28 migration)
- Linear search initially; ivfflat index enabled beyond 100 rows
Phase 34: image_analysis_service.py: llava:latest Telegram image analysis
- download_and_analyze: Bot API getFile → download → llava → reply
- Rate limit: 3 requests per minute per chat_id (Redis sliding window)
- telegram.py webhook adds a photo branch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:50:22 +08:00 |
|
OG T
|
89015d4527
|
feat(phase30): plain-language AI summary for Drift reports (ADR-067)
CD Pipeline / build-and-deploy (push) Has been cancelled
- Add DriftNarratorService: qwen2.5:7b-instruct (Ollama 111)
- Trigger condition: high >= 1 or medium >= 3 (HPA replicas allowlisted)
- Redis cache: drift_narrative:{report_id} TTL 1h
- Graceful fallback to structured text when the LLM fails
- drift.py _analyze_and_notify: hooks in the narrator (marked Phase 30)
- Migration: drift_reports.narrative_text TEXT (already run in prod)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:37:43 +08:00 |
|
OG T
|
a30713b292
|
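fix(chat): NemoClaw must not identify itself as DeepSeek + Traditional Chinese enforced
CD Pipeline / build-and-deploy (push) Successful in 13m36s
- Explicitly forbid revealing the underlying model identity
- Enforce Traditional Chinese (no Simplified Chinese)
- Add a definition of the SRE areas of expertise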
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:18:18 +08:00 |
|
OG T
|
88ac1c7f50
|
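feat(phase27): two-layer frequency statistics for the history button + DB frequency_snapshot persistence
CD Pipeline / build-and-deploy (push) Failing after 1m44s
- telegram_gateway: _send_incident_history switched to the Phase 27 two-layer strategy
Layer 1: DB frequency_snapshot (permanent snapshot taken at creation time)
Layer 2: Redis AnomalyCounter disposition cumulative statistics (35d TTL)
Fixes the bug where the old version called record_anomaly() and caused miscounts
- Add migration: phase27_incident_frequency_snapshot.sql (already run in prod)
- CLAUDE.md: trimmed to 123 lines to reduce token consumption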
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:06:51 +08:00 |
|
OG T
|
9846a6cc93
|
feat(incident): Phase 27 frequency_snapshot DB persistence: add a JSONB column to the incidents table
CD Pipeline / build-and-deploy (push) Has been cancelled
frequency_stats was previously stored only in Redis (TTL 35 days) and was lost on Pod restart or expiry
- incidents.frequency_snapshot JSONB: snapshot written when the incident is created and kept permanently
- incident_repository: _record_to_incident restores IncidentFrequencyStats
- _incident_to_record_data serializes the frequency_stats snapshot to the DB
- Migration: phase27_incident_frequency_snapshot.sql has been run
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:05:41 +08:00 |
|
OG T
|
ae90c36cd7
|
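fix(telegram): _send_incident_history adds a freq=None fallback: "無頻率統計資料" ("no frequency statistics available")
CD Pipeline / build-and-deploy (push) Has been cancelled
test_history_handles_no_stats requires the source code to contain a "無頻率統計資料" fallback branch,
which shows this message instead of continuing when AnomalyCounter.record_anomaly() returns None.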
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 01:01:19 +08:00 |
|
OG T
|
e59f8181b3
|
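fix(telegram): history button now reads frequency from AnomalyCounter (Redis), fixing the always-shown "無頻率統計資料" ("no frequency statistics available") message
CD Pipeline / build-and-deploy (push) Failing after 1m45s
Root cause: frequency_stats was never persisted to the DB, so get_by_id() always returned None
Fix: derive the anomaly_key with AnomalyCounter.derive_key_from_incident(),
then query real-time frequency and disposition distribution statistics directly from Redis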
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:56:23 +08:00 |
|
OG T
|
e2c6ca598e
|
fix(approval_db): update_telegram_message uses raw SQL + CAST BIGINT to avoid int32 overflow
CD Pipeline / build-and-deploy (push) Has been cancelled
telegram_chat_id is a BIGINT (5619078117 > 2^31-1), but the SQLAlchemy ORM infers it as $N::INTEGER
Switch to raw SQL + CAST(:telegram_chat_id AS BIGINT) to bypass type inference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
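The overflow claim is easy to check arithmetically. The snippet below is only an illustration of the integer ranges involved; the helper name is hypothetical and nothing here touches SQLAlchemy or asyncpg.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def fits(value: int, low: int, high: int) -> bool:
    """Return True when value lies inside the inclusive range [low, high]."""
    return low <= value <= high


chat_id = 5619078117  # value quoted in the commit message
in_int32 = fits(chat_id, INT32_MIN, INT32_MAX)  # INTEGER column/parameter
in_int64 = fits(chat_id, INT64_MIN, INT64_MAX)  # BIGINT column/parameter
```

Here `in_int32` is `False` and `in_int64` is `True`, which is why the parameter has to be bound as BIGINT.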
|
2026-04-10 00:53:50 +08:00 |
|
OG T
|
0571ad15d5
|
fix(signoz_webhook): AIDataImpact.value is uppercase → convert to DataImpact with .lower()
CD Pipeline / build-and-deploy (push) Has been cancelled
AIDataImpact enum values are uppercase ('NONE'/'READ_ONLY', etc.), while
DataImpact enum values are lowercase ('none'/'read_only', etc.);
add .lower() during conversion to avoid a ValueError.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
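A minimal sketch of the enum bridge, assuming both enums are plain `Enum` classes. Only the values 'NONE'/'READ_ONLY' and 'none'/'read_only' come from the commit; the member names, the reduced member set, and the `to_data_impact` helper are assumptions.

```python
from enum import Enum


class AIDataImpact(Enum):  # uppercase values, per the commit message
    NONE = "NONE"
    READ_ONLY = "READ_ONLY"


class DataImpact(Enum):  # lowercase values, per the commit message
    NONE = "none"
    READ_ONLY = "read_only"


def to_data_impact(ai_value: AIDataImpact) -> DataImpact:
    """Look up the DataImpact member by the lowercased AI-side value."""
    return DataImpact(ai_value.value.lower())
```

Calling `DataImpact("READ_ONLY")` directly raises `ValueError` because lookup by value is case-sensitive, which is the failure the commit fixes.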
|
2026-04-10 00:38:29 +08:00 |
|
OG T
|
5d591c4639
|
fix(drift_repository): CAST(:param AS jsonb) replaces :param::jsonb
CD Pipeline / build-and-deploy (push) Has been cancelled
asyncpg does not support mixing named params with the :: cast syntax, which caused a PostgresSyntaxError.
Switch to the CAST() function syntax, which is compatible with SQLAlchemy text() named params.
Impact: drift_reports can now be written to the DB normally
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:22:43 +08:00 |
|
OG T
|
7e498621e0
|
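fix(signoz_webhook): AIBlastRadius → BlastRadius type conversion
CD Pipeline / build-and-deploy (push) Has been cancelled
Passing an AIBlastRadius object into the blast_radius field caused a Pydantic validation error,
so the approval could not be saved to the DB (the Telegram message was still sent but could not be approved).
Fix: explicitly convert AIBlastRadius → BlastRadius, bridging the data_impact enum with .value.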
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-10 00:15:40 +08:00 |
|