OG T
|
76f7330c9d
|
feat(api): POST /playbooks/ 建立端點 + seed-repair-playbooks.py (Task 14)
CD Pipeline / build-and-deploy (push) Failing after 57s
CD Pipeline / Deploy Prometheus Alert Rules (push) Has been skipped
- playbooks.py: 新增 POST / 端點供直接建立 Playbook (seed/管理用)
- seed-repair-playbooks.py: 5個 Host Repair Playbooks (ssh_command)
sentry/harbor/gitea/alertmanager (110) + openclaw (188)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 11:53:49 +08:00 |
|
OG T
|
bf4f81412c
|
feat(api): ActionType.SSH_COMMAND + auto_repair_service SSH分支 (Task 12)
- playbook.py: 新增 SSH_COMMAND ActionType
- auto_repair_service._execute_step: SSH_COMMAND 分支,格式 layer/component
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 11:47:00 +08:00 |
|
OG T
|
e7d8da85f6
|
feat(api): HostRepairAgent — SSH 主機層修復 (Task 11)
- host_repair_agent.py: layer路由、command injection防護、asyncio SSH執行
- 測試: 12 cases 全通過 (routing/sanitize/success/fail/timeout/denied)
- SSH key: /etc/repair-ssh/id_ed25519 (K8s secret mount)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 11:22:00 +08:00 |
|
OG T
|
7a6fa6359e
|
feat(api): Sentry init 加入統一 layer/component 標籤
對齊 Prometheus 告警標籤規範 (layer/component/team)
讓 Sentry 事件與 auto_repair 路由決策保持一致
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 11:10:40 +08:00 |
|
OG T
|
2243a21b96
|
fix(ai-router): v4.3 NIM 保護 — timeout 不計 CB 失敗,每次先跑 NIM 才切 Gemini
CD Pipeline / build-and-deploy (push) Failing after 20s
需求: NIM 必須等到有回應才切換,不能因為慢就被 CB 封鎖走 Gemini
變更:
- Timeout exception 不累積 CB failure(只有真實連線錯誤才計)
- NIM CB: failure_threshold=10, recovery_timeout=30s(比預設寬鬆)
- 設計文件 v4.3: 更新方向二,移除錯誤假設
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:51:12 +08:00 |
|
OG T
|
5ad403b287
|
fix(p0): v4.3 — 實測確認 Ollama CPU-only 不可用,DIAGNOSE 統一走 NIM
實測依據 (2026-04-05):
- Ollama llama3.2:3b CPU-only: 238s 回 {"ok":true},生產不可用
- Nemotron NIM: 2.2s~27.3s,avg 10.6s,一直是主力(Phase 22 起)
- NIM 從未有隱私問題,Incident 資料一直送雲端 GPU
變更:
- ai_router.py: _local_fallback_chain 廢棄(空 list)
- ai_router.py: DIAGNOSE route/route_sync 改回 _full_fallback_chain
- config.py: 更新 timeout 說明反映實測結果
- test_p0_diagnose_routing.py: 更新 docstring
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 01:49:06 +08:00 |
|
OG T
|
8300879d02
|
chore: trigger CD deploy (warm-up + MinIO startup)
CD Pipeline / build-and-deploy (push) Failing after 24s
|
2026-04-05 01:00:31 +08:00 |
|
OG T
|
a81bf50537
|
feat(drift): ADR-057 adopt() Gitea PR API 實作
- DriftAdoptService: 透過 Gitea REST API 建立 branch + commit + PR
不在 API Pod 內執行 git(修復 C2 安全漏洞)
- adopt() 端點: 501 → 真實實作(呼叫 DriftAdoptService)
- config.py: 新增 GITEA_API_URL / GITEA_API_TOKEN / GITEA_REPO_OWNER / GITEA_REPO_NAME
- K8s secret awoooi-secrets 已注入 GITEA_API_TOKEN
- drift.py: 移除 trigger_drift_scan 中未使用的 interpreter 變數
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:39:29 +08:00 |
|
OG T
|
f4f454fd98
|
feat(api): 重開機後自動 warm-up Redis Working Memory from PostgreSQL
- main.py lifespan: 啟動時從 DB restore INVESTIGATING/MITIGATING incidents
- scripts/reboot-recovery: 188 + 110 自動化腳本 + systemd services
- scripts/reboot-recovery: aiops-network 自動建立 (ClawBot 依賴)
- docs/runbooks/REBOOT-RECOVERY-SOP.md: 完整改寫,含自動化腳本說明
Why: 重開機後 Redis 清空導致前端 incidents 顯示 0 筆(DB 完整保存)
統帥批准: 「所有數據必須被長久記錄下來」
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:39:20 +08:00 |
|
OG T
|
96d5e18924
|
fix(p0): 實測修正 — timeout 依 benchmark 調整,_local_fallback_chain 移除雲端 Nemotron
- config.py: NEMOTRON_DIAGNOSE_TIMEOUT_SECONDS=60s (NIM 實測 11-45s + 15s buffer)
- config.py: OLLAMA_DIAGNOSE_TIMEOUT_SECONDS=200s (Ollama 實測 ~173s + 27s buffer)
- ollama.py: 新增 per-task timeout (diagnose/force_local 用 200s)
- ai_router.py: _local_fallback_chain 移除 Nemotron (NIM=雲端,不可進 local chain)
- ai_router.py: v4.2 — Option C 分情境路由正式確立
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:29:09 +08:00 |
|
OG T
|
4912c7f307
|
fix(phase25): 首席架構師 Review R2 修正 (I1/I2/I3/I4/C3/M1)
I1: auto_repair_service — 失敗分支 anti_pattern task 補齊 _pending_tasks GC 防護
C3: drift_remediator — _kubectl_apply() 實作 resource_key 範圍過濾(修復虛設參數 bug)
M1: drift_remediator — _git_push() 標記 DISABLED,防止誤啟用
I2: drift.py — Telegram 通知移除失效的 adopt() 端點連結
I3: drift/page.tsx — handleScan POST body namespace→namespaces(對齊後端 DriftScanRequest)
I4: drift/page.tsx — 移除硬編碼英文字串,改用 t('loading')/t('highCount')/t('mediumCount')
i18n: zh-TW.json + en.json 補齊 drift.loading key
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:22:38 +08:00 |
|
OG T
|
4bc4757fdc
|
test(phase25): Phase 25 P1/P2 source code inspection tests (36 tests)
- test_phase25_auto_harvesting.py: 18 tests for NemotronRunbookGenerator,
AntiPattern gate, fire-and-forget pattern, symptoms_hash
- test_phase25_drift_detection.py: 18 tests for DriftDetector, NemotronDriftInterpreter
(read-only), DriftRemediator, local fallback chain for DIAGNOSE
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-05 00:14:50 +08:00 |
|
OG T
|
688146ef9c
|
test(ai-router): test_fallback_list >= 2 改 >= 1
CD Pipeline / build-and-deploy (push) Has been cancelled
DIAGNOSE local chain 選 Nemotron 後 fallback 只剩 Ollama 一個
>= 2 斷言過嚴,與 test_query_routes_to_ollama 同樣修正
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 18:05:25 +08:00 |
|
OG T
|
428ed5f8cd
|
test(ai-router): 修正 test_query_routes_to_ollama 斷言
CD Pipeline / build-and-deploy (push) Failing after 41s
Phase 25 P0 後 DIAGNOSE 走 _local_fallback_chain [NEMOTRON, OLLAMA]
選 NEMOTRON 為 primary,fallback 只剩 OLLAMA 一個,
>= 2 斷言過嚴,改為 >= 1。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 18:02:43 +08:00 |
|
OG T
|
a562db4048
|
fix(phase25): 首席架構師 Review C1/C2/I1/I3 修正
CD Pipeline / build-and-deploy (push) Failing after 57s
C1: NemotronProvider.privacy_level cloud→local
NIM 部署在 192.168.0.188 內網,非官方雲端 API
可納入 DIAGNOSE _local_fallback_chain 隱私邊界
C2: adopt() 端點暫停,返回 501
API Pod 執行 git add -A 有安全風險
ADR-057 起草後改用 Gitea PR API 實作
I1: timeout log 修正,記錄實際套用的 timeout 值
原本永遠記錄 NEMOTRON_TIMEOUT_SECONDS=45
現在記錄依 task_type 選擇的正確值
I3: route_sync() 補 DIAGNOSE 隱私邊界
async route() 已有 _local_fallback_chain
sync 版本遺漏,此次補齊
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 18:00:05 +08:00 |
|
OG T
|
c4eafd2a5b
|
fix(ai-router): fallback_models 排除 selected_model 避免重複
CD Pipeline / build-and-deploy (push) Successful in 6m58s
DIAGNOSE intent 路由至 Nemotron 後,fallback_chain 中的 OPENCLAW_NEMO
也使用相同 model string,導致 fallback_models 包含已選模型。
修正: 過濾掉與 selected_model 相同的 model string。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 17:43:44 +08:00 |
|
OG T
|
8056be5847
|
feat(ai-router): DIAGNOSE intent override 升級至 Nemotron (P0)
|
2026-04-04 17:41:45 +08:00 |
|
OG T
|
671974dedb
|
test(ai-router): TestLocalFallbackChain — require_local 隱私邊界驗證 (P0)
CD Pipeline / build-and-deploy (push) Failing after 43s
新增兩個測試:cloud provider 被跳過 + 全失敗回傳 local_providers_unavailable。
實作邏輯已存在於 AIRouterExecutor.execute()(2026-04-04 ogt Phase 25 P0)。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 17:32:32 +08:00 |
|
OG T
|
ffd679f5d3
|
feat(nemotron): per-task timeout,DIAGNOSE 使用獨立 timeout 設定 (P0)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 16:58:23 +08:00 |
|
OG T
|
3455044457
|
feat(phase25): Nemotron 主動防禦三方向 P0+P1+P2 完整實作
CD Pipeline / build-and-deploy (push) Failing after 38s
Type Sync Check / check-type-sync (push) Failing after 35s
P0 - DIAGNOSE Privacy-First Routing:
- ai_router.py: _local_fallback_chain [NEMOTRON→OLLAMA→REJECT]
- DIAGNOSE 意圖 override 改為 NEMOTRON (原 OLLAMA)
- DIAGNOSE fallback 使用 local-only 鏈,不觸碰雲端
- 全部失敗時 REJECT + Telegram 通知
- config.py: NEMOTRON_DIAGNOSE_TIMEOUT_SECONDS=30, OLLAMA_DIAGNOSE_TIMEOUT_SECONDS=60
- nemotron.py: 根據 context[task_type] 選擇 timeout
P1 - Knowledge Auto-Harvesting:
- models/knowledge.py: EntryType.AUTO_RUNBOOK + ANTI_PATTERN + symptoms_hash
- EntryStatus.PUBLISHED (ANTI_PATTERN 直接發布,無需審核)
- models/playbook.py: SymptomPattern.compute_hash() (16字元確定性 hash)
- services/runbook_generator.py: NemotronRunbookGenerator (v1.1)
- generate_runbook() → AUTO_RUNBOOK (DRAFT) + Telegram 審核 card
- generate_anti_pattern() → ANTI_PATTERN (PUBLISHED) + Telegram 通知
- 使用 nvidia.chat() (正確介面),Nemotron 超時時 Minimal fallback
- knowledge_service.py: check_anti_pattern(symptoms_hash, days=7)
- db/models.py: symptoms_hash VARCHAR(16) + ix_knowledge_symptoms_hash
- repositories/knowledge_repository.py: create() 支援 symptoms_hash + status
- auto_repair_service.py: anti_pattern_gate 在 decide() + runbook hook 在 execute()
- migrations/phase8_symptoms_hash.sql: ALTER TABLE + partial index + PUBLISHED constraint
P2 - Config Drift Detection:
- models/drift.py: DriftItem/DriftReport/DriftLevel/DriftIntent/DriftStatus
- services/drift_detector.py: GitStateReader + K8sStateReader + DriftDetector
- services/drift_analyzer.py: 白名單過濾 + DriftLevel 分級
- services/drift_interpreter.py: NemotronDriftInterpreter(意圖分析,不生成修復指令)
- services/drift_remediator.py: rollback(kubectl apply) + adopt(git push gitea)
- api/v1/drift.py: POST /scan, GET /reports, POST /rollback, POST /adopt
- migrations/phase9_drift_reports.sql: drift_reports 表
- k8s/drift-cronjob.yaml: 每小時自動掃描 CronJob
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:35:05 +08:00 |
|
OG T
|
b6e12f74f4
|
test(phase22): Phase 22.4 Nemotron 協作測試 18/18 PASSED
CD Pipeline / build-and-deploy (push) Successful in 7m12s
- 修正 file path: apps/api/src/ → src/ (從 apps/api/ 目錄執行)
- 擴大 snippet size: 800→1500 chars (docstring 過長導致 flag check 超出範圍)
- 擴大 _call_nemotron_tools snippet: 2000→5000 chars (timeout 在函數後段)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:16:28 +08:00 |
|
OG T
|
df3ef9006c
|
fix(auto-repair): 首席架構師 Review — 4 Critical/Important 修復
CD Pipeline / build-and-deploy (push) Successful in 7m2s
Critical #1: KM write task 移出 try/except
- _trigger_learning 的 KM 寫入原在 try 內,learning 失敗時不寫 KM
- 移至 except 後確保成功/失敗都寫入
- 移除冗餘 import asyncio(已在頂層 import)
- Minor: approval.incident_id or None 防空字串
Important #2: migration 加 PRIMARY KEY
- playbook_id 從 UNIQUE 升為 PRIMARY KEY
- prod DB 已執行 ALTER TABLE ADD PRIMARY KEY
Important #3: s.sequence→s.step_number, s.description→s.command
- embed_playbook() 使用不存在的欄位名,RAG 向量索引靜默失敗
- RepairStep 正確欄位: step_number, command
Important #1: PlaybookService._get_rag_service 不再 Service 層快取
- 改為每次呼叫工廠 get_playbook_rag_service()
- 避免舊實例繞過工廠的 is_closed 重建邏輯
冷啟動修復 (首席架構師建議B+C):
- _trigger_playbook_extraction 執行成功後自動設定
execution_success=True, effectiveness_score=4, status=RESOLVED
- skip 路徑 logger.debug → logger.info 提升可觀測性
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 12:02:03 +08:00 |
|
OG T
|
f6567751a9
|
test(knowledge): pgvector 語意搜尋整合測試 (5 tests)
CD Pipeline / build-and-deploy (push) Has been cancelled
- test_save_embedding: CAST AS vector 語法驗證
- test_semantic_search_returns_results: cosine similarity 查詢
- test_semantic_search_threshold_filters: 正交向量被 threshold 過濾
- test_semantic_search_archived_excluded: archived 不出現
- test_list_unembedded_entries: 未 embed 條目列舉
全部 5/5 PASSED (awoooi_dev PostgreSQL + pgvector)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:55:09 +08:00 |
|
OG T
|
72d7536ead
|
feat(auto-repair): 完整自動修復閉環 + KM 沉澱串接
CD Pipeline / build-and-deploy (push) Has been cancelled
1. DB Migration: playbooks 資料表 (phase7_playbooks_table.sql)
- 這是自動修復無法啟動的根本原因 — table 從未建立
- 5 個索引: status/tags/alert_names/source_incidents/created_at
- 已在 prod DB 執行
2. playbook_service: 萃取後自動沉澱 KM
- extract_from_incident() 完成後 fire-and-forget _write_to_km()
- 內容含症狀模式、修復步驟、信心度、來源 Incident
3. approval_execution: 執行結果沉澱 KM
- _trigger_learning() 後 fire-and-forget _write_execution_result_to_km()
- 成功/失敗記錄都寫入,category=execution_result
完整閉環:
告警 → AI分析 → 查Playbook → 決策 → 執行 → 結果寫KM
↓
Incident解決 → KM(knowledge_extractor)
→ Playbook萃取 → KM
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:54:15 +08:00 |
|
OG T
|
429d81d29b
|
fix(knowledge): I2+I3 首席架構師 Important 修復 — 依賴注入 + exception 細分
CD Pipeline / build-and-deploy (push) Has been cancelled
I2: KnowledgeService 移至 DecisionManager.__init__ 注入
_query_kb_context_inner 使用 self._knowledge_svc,移除函數內 import 耦合
I3: _query_kb_context exception 細分
- asyncio.TimeoutError → warning (預期降級)
- ConnectionError/OSError → warning (Ollama 連線問題,預期降級)
- Exception → error (非預期,提升監控可見性)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:51:43 +08:00 |
|
OG T
|
f846000c8c
|
fix(knowledge): C1 首席架構師必修 — _query_kb_context 5秒 hard timeout
CD Pipeline / build-and-deploy (push) Has been cancelled
C1 修復 (首席架構師 Review 74/100 → 條件通過):
- 抽出 _query_kb_context_inner 含實際查詢邏輯
- _query_kb_context 用 asyncio.wait_for(timeout=5.0) 包裝
- Ollama hang/慢響應最多消耗 5s,保護 30s 決策 SLA
- timeout 時 logger.warning("kb_rag_timeout") 靜默降級
同步移除 LLM prompt 中的 emoji (## 📚 → ## Knowledge Base)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:48:57 +08:00 |
|
OG T
|
860dc1d892
|
feat(knowledge): KB Phase 2 — OpenClaw RAG 整合
CD Pipeline / build-and-deploy (push) Has been cancelled
_dual_engine_analyze 新增 _query_kb_context():
- Incident 分析前語意搜尋相關 KB 條目 (top-3, threshold=0.4)
- 將 KB context 注入 expert_context.diagnosis_context 傳給 LLM
- 失敗時靜默降級,不影響主分析流程
- dual_engine_llm_win log 新增 kb_rag 欄位,可觀測 RAG 命中率
架構: _query_kb_context 透過 get_knowledge_service() 呼叫 Service 層
符合 leWOOOgo 積木化 — decision_manager 不直接存取 DB/pgvector
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:46:47 +08:00 |
|
OG T
|
d0f09705e5
|
fix(auto-repair): 修復三個阻礙自動修復的根本原因
CD Pipeline / build-and-deploy (push) Has been cancelled
1. playbook_rag: Ollama embedding http_client 滾動重啟後 is_closed
- 新增 _get_http_client() 偵測 is_closed 自動重建
- singleton get_playbook_rag_service() 加 is_closed 重建判斷
2. telegram: 加入 ai_model 欄位顯示底層判斷模型
- TelegramMessage.ai_model 欄位
- format() / format_with_nemotron() 顯示 "Nemotron (nemotron-70b)"
- openclaw proposal_dict 加入 model 欄位
- decision_manager / send_approval_card 串接
3. DB: 清除 9 筆 3/26 殭屍 PENDING (mock_fallback CRITICAL 測試記錄)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:46:25 +08:00 |
|
OG T
|
12bc94796a
|
fix(knowledge): asyncpg 不支援 :param::type,改用 CAST(:param AS vector)
CD Pipeline / build-and-deploy (push) Has been cancelled
asyncpg 使用 $1 位置參數,:emb::vector 語法導致 PostgresSyntaxError。
save_embedding 和 semantic_search 均改用 CAST(:emb AS vector) 語法。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:43:59 +08:00 |
|
OG T
|
cddc4cb1fc
|
fix(knowledge): 首席架構師 Review 修復 C1+C2+I1+I2 (71→~88/100)
CD Pipeline / build-and-deploy (push) Successful in 7m16s
C1: IKnowledgeRepository Protocol 補齊 save_embedding + semantic_search +
list_unembedded_entries,恢復 Interface 先行保護層
C2: embed_all_entries Service 層 raw SQL 移至 Repository.list_unembedded_entries()
Service 改透過 Protocol 呼叫,符合 leWOOOgo 積木化原則
I1: asyncio.create_task 加入 _pending_tasks set 持有引用,防 GC 回收與
Shutdown 時 Task 遺失;task done 後自動 discard
I2: OllamaEmbeddingService 從每次 new 改為 KnowledgeService.__init__ 注入,
單一實例重用
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:22:38 +08:00 |
|
OG T
|
8960bba7fe
|
feat(knowledge): pgvector RAG — 語意搜尋 + 背景 Embedding 管線
CD Pipeline / build-and-deploy (push) Has been cancelled
- repository: save_embedding (raw SQL pgvector cast) + semantic_search (cosine <=>)
- service: create_entry 背景 embed + semantic_search + embed_all_entries 批次補 embed
- router: GET /semantic-search (q/limit/threshold) + POST /embed-all 管理端點
向量模型: nomic-embed-text (Ollama 192.168.0.188, 768 dims)
索引: ivfflat cosine (knowledge_entries.embedding vector(768))
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:17:24 +08:00 |
|
OG T
|
200c382ca4
|
feat(metrics): sparklines 串接真實數據 + TOOL_LINKS 移至 API (2026-04-04 ogt)
CD Pipeline / build-and-deploy (push) Successful in 7m6s
前端 page.tsx:
- 今日事件 sparkline: 過去 6 小時每小時事件數 (從 incidents 計算)
- MTTR sparkline: 各已解決 incident 修復時間序列 (從 incidents 計算)
- 無數據時不顯示 sparkline (undefined 渲染 nothing)
- 移除硬碼 TOOL_LINKS,改讀 API 回傳的 tool.url
後端 monitoring.py:
- 每個 probe 函數回傳 dict 加入 "url" 欄位
- 前端工具連結由後端集中管理,解決多環境問題
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 11:09:04 +08:00 |
|
OG T
|
5e836bde24
|
test(integration): 新增真實 DB 整合測試 — knowledge_repository + API E2E (2026-04-04 ogt)
CD Pipeline / build-and-deploy (push) Successful in 7m18s
- tests/integration/conftest.py: 連接 awoooi_dev PostgreSQL,每個測試後 rollback
- tests/integration/test_knowledge_repository.py: 23 個真實 DB 測試
- create/get_by_id/list/update/delete(軟刪除)/search/categories/view_count
- tests/integration/test_incident_api.py: 7 個 HTTPS 端點測試
- health check + knowledge API smoke test
- 遵循禁止 Mock 鐵律 (feedback_no_mock_testing.md)
- 本地驗證: 30/30 PASSED
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-04 02:35:38 +08:00 |
|
OG T
|
9e78d5222a
|
feat(group-chat): 方案B slash commands — /status /incidents /cost /pods /help (2026-04-03 ogt)
CD Pipeline / build-and-deploy (push) Successful in 7m5s
E2E Health Check / e2e-health (push) Successful in 17s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 20:02:27 +08:00 |
|
OG T
|
e833065043
|
feat(group-chat): Reply Bot 訊息時只有被Reply的Bot回應 (2026-04-03 ogt)
CD Pipeline / build-and-deploy (push) Successful in 7m0s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:48:10 +08:00 |
|
OG T
|
8d09b18477
|
fix(group-chat): 移除雙AI互相評論 — 單獨@只有該AI回覆,雙AI路徑不再互評
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:44:49 +08:00 |
|
OG T
|
79a770ffe5
|
feat(group): 移除告警自動 AI 分析 — 老闆指示
CD Pipeline / build-and-deploy (push) Successful in 7m3s
告警發到群組只顯示卡片,不自動觸發 OpenClaw/NemoClaw 分析
老闆和 AI 可手動在群組討論告警內容
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:36:40 +08:00 |
|
OG T
|
b62d7d3eb0
|
feat(chat): OpenClaw 改用 Gemini 2.0 Flash-Lite (最便宜)
CD Pipeline / build-and-deploy (push) Has been cancelled
Input $0.075/1M, Output $0.30/1M (比 Flash 便宜 25%)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:35:13 +08:00 |
|
OG T
|
6cd4280168
|
feat(chat): NemoClaw Claude API 加 token+費用統計
CD Pipeline / build-and-deploy (push) Has been cancelled
Claude Haiku 4.5: Input $0.80/1M, Output $4.00/1M
每次回覆顯示: token 數 | 本次費用 | 本月累計
Redis key: claude_cost:YYYY-MM,TTL 40 天
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:29:22 +08:00 |
|
OG T
|
781a6dac3e
|
feat(chat): NemoClaw→Claude Haiku API + 告警只由 OpenClaw 分析
CD Pipeline / build-and-deploy (push) Successful in 7m20s
老闆指示 (2026-04-03):
1. NemoClaw 改接 Claude API (claude-haiku-4-5),快速中文對話
2. 群組告警分析只觸發 OpenClaw,NemoClaw 不分析告警
3. OpenClaw/NemoClaw 雙向自然語言對話維持
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:19:56 +08:00 |
|
OG T
|
10ad2a67c7
|
fix(chat): gemini-2.0-flash 修正 + 全形小O支援 + NemoClaw 回 NIM
CD Pipeline / build-and-deploy (push) Has been cancelled
1. Gemini 模型名稱: gemini-1.5-flash → gemini-2.0-flash (404修復)
2. 費用計算: 2.0 Flash 定價 Input $0.10/1M, Output $0.40/1M
3. 全形/半形統一: unicodedata.normalize NFKC,支援「小O」全形輸入
4. NemoClaw: Ollama 188 負載高超時,暫回 NIM nemotron-mini-4b
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:17:08 +08:00 |
|
OG T
|
08b02280f8
|
feat(chat): Gemini 月費用上限 $10 USD + Redis 累計追蹤
CD Pipeline / build-and-deploy (push) Successful in 6m55s
- 每次呼叫前檢查當月累計費用,超過 $10 USD 拒絕呼叫
- Redis key: gemini_cost:YYYY-MM,TTL 40 天
- 每次回覆顯示: token 數 | 本次費用 | 本月累計
- 超限時回傳警告訊息告知老闆
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 19:01:21 +08:00 |
|
OG T
|
2828cd897a
|
feat(chat): OpenClaw→Gemini Flash + NemoClaw→Ollama llama3.2:3b
CD Pipeline / build-and-deploy (push) Has been cancelled
老闆指示 (2026-04-03):
- OpenClaw: Gemini 1.5 Flash API,每次回覆附 token+費用統計
- NemoClaw: Ollama llama3.2:3b,本地快速回應 (3-8s)
- 費用控管: Gemini 月上限 $10 USD
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:59:28 +08:00 |
|
OG T
|
fbf122fa1f
|
fix(chat): OpenClaw 改用 NIM llama-3.1-8b 對話 + NemoClaw timeout 120s + 老闆稱謂
CD Pipeline / build-and-deploy (push) Successful in 7m9s
1. _call_openclaw: 改用 NIM meta/llama-3.1-8b-instruct
舊的 analyze/incident 是告警 API,回覆是告警格式,不適合對話
2. _call_nemotron: 移除 Ollama fallback,回到純 NIM
3. NEMOTRON_TIMEOUT_SECONDS: 55 → 120 (ConfigMap 已更新)
4. 修正「統帥」→「老闆」
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:41:15 +08:00 |
|
OG T
|
2da8da5a25
|
fix(chat): OpenClaw 改用 Ollama qwen2.5 做對話 + NemoClaw 加 Ollama fallback
CD Pipeline / build-and-deploy (push) Successful in 6m51s
問題: _call_openclaw 用 analyze/incident API → 回覆是告警格式,不是自然語言
修法:
1. OpenClaw chat → Ollama qwen2.5:7b-instruct (本地,快速,無格式污染)
2. NemoClaw → NIM 優先,超時 fallback 到 Ollama llama3.2:3b
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:30:31 +08:00 |
|
OG T
|
d1436157b7
|
fix(polling): httpx client timeout 改為分開設定,read=50s > getUpdates 40s
CD Pipeline / build-and-deploy (push) Has been cancelled
根因: httpx.AsyncClient(timeout=30.0) 的 read timeout 30s
< getUpdates 的 long polling timeout 40s
導致每次 getUpdates 都被 client 打斷 → polling loop 無法正常收訊息
修法: httpx.Timeout(connect=10s, read=50s) 讓 long polling 正常等待
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:29:22 +08:00 |
|
OG T
|
dfc1e19c07
|
fix(group): 互相評論補充也加 reply_to_message_id 引用原訊息
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:24:51 +08:00 |
|
OG T
|
09241f102e
|
fix(group): 群組訊息移到 security interceptor 前 — 修復 whitelist 擋掉所有群組訊息
CD Pipeline / build-and-deploy (push) Successful in 7m10s
根因: intercept_telegram() 的 whitelist 是字串,user_id 是 int
型別不匹配 → exception → telegram_chat_unauthorized → 群組訊息全被丟棄
修法: SRE 群組訊息優先路由,不走個人 whitelist
(群組成員由 Telegram 群組管理員控制,安全邊界已存在)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:17:22 +08:00 |
|
OG T
|
203855a56e
|
debug(group): 加 group_routing_check log 診斷 chat_id 不匹配
CD Pipeline / build-and-deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:12:07 +08:00 |
|
OG T
|
63929a5e87
|
feat(group): 別名 小O→OpenClaw 小賀→NemoClaw + NemoClaw 強制繁中
CD Pipeline / build-and-deploy (push) Successful in 7m6s
1. telegram_gateway.py: _handle_group_message 加入別名路由
- 小O / 小o → 只有 OpenClaw 回應
- 小賀 / 小贺 → 只有 NemoClaw 回應
- clean_text 同步移除別名 token
2. chat_manager.py: NEMOCLAW_PERSONA 加強繁體中文強制指令
- 明確「禁止使用英文或其他語言」防止 Nemotron 自動英文回應
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-04-03 18:00:51 +08:00 |
|