12 Commits

Author SHA1 Message Date
OoO
12c8c7e94d V10.538 對齊 ai_calls ollama_other provider
All checks were successful
CD Pipeline / deploy (push) Successful in 1m6s
2026-06-01 02:37:12 +08:00
OoO
c329d96dff 限制 111 fallback context 大小
All checks were successful
CD Pipeline / deploy (push) Successful in 1m10s
2026-05-21 12:44:33 +08:00
OoO
00a808518e 將 111 Ollama fallback 收斂到輕量模型 2026-05-21 12:39:23 +08:00
OoO
2635b22ebc 修正缺貨清單手機表頭溢出
All checks were successful
CD Pipeline / deploy (push) Successful in 56s
2026-05-13 20:16:30 +08:00
OoO
1aeb4a4b8e 移除 AI logger 未用 stack 推斷
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-13 11:54:16 +08:00
OoO
f44c429a56 補強 AI logger best-effort 診斷
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-13 11:02:38 +08:00
OoO
5625032a8d 記錄 AI caller registry 匯入失敗
All checks were successful
CD Pipeline / deploy (push) Successful in 57s
2026-05-13 10:04:45 +08:00
OoO
002e498648 feat(p20+): COST_TABLE 確認 4 新 Ollama 模型(GCP Primary+Secondary 已拉)
All checks were successful
CD Pipeline / deploy (push) Successful in 2m52s
Operation Ollama-First v5.0 / Phase 20+ 完整啟動

Primary + Secondary 兩台 GCP 完整對稱(10 模型 / ~67GB 各):
   bge-m3:latest         1.2GB  Embedding
   hermes3:latest        4.7GB  Hermes 戰術
   qwen2.5-coder:7b      4.7GB  AiderHeal 既有
   qwen2.5-coder:32b    19.0GB   AiderHeal 32B 升級
   qwen2.5:7b-instruct   4.7GB  Q&A 預設
   qwen3:14b             9.3GB  Q&A / Nemotron 升級
   deepseek-r1:14b       9.0GB   推理鏈備援
   minicpm-v:latest      5.5GB  PPT vision 主
   llava:latest          4.7GB   Vision 備援
   gemma3:4b             3.3GB   輕量任務

ai_call_logger COST_TABLE 確認 4 新模型 + 2 重命名(minicpm-v / llava)
- 解 logger 「unknown model cost」誤報
- 預期啟用:
  - qwen2.5-coder:32b → AiderHeal 大型重構(call site 將擴展)
  - deepseek-r1:14b → EA HITL 推理(取代部分 Gemini Pro)
  - llava:latest → minicpm-v 失敗備援
  - gemma3:4b → sales_copy < 100 字輕量任務

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:48:22 +08:00
OoO
98063059c2 feat(p14-18): PPT vision + DeepSeek 直連 + caller_registry + Hermes 強化 + postmortem
All checks were successful
CD Pipeline / deploy (push) Successful in 2m50s
Operation Ollama-First v5.0 / Phase 14-18 全套(statesman 批准全部)

Phase 14 — services/ppt_vision_service.py (新檔, 200+ 行)
- minicpm-v:latest(GCP Primary 已拉 5.5GB,代替 qwen2-vl 不存在)
- check_image(image_path) → VisionResult.issues_found 視覺異常清單
- 走 resolve_ollama_host 三主機 retry + mark_unhealthy
- 繁中強制 system prompt + 結構化解析 ⚠️ marker
- feature flag PPT_VISION_ENABLED 預設 OFF

Phase 15 — services/deepseek_service.py (新檔, 170+ 行)
- DeepSeek API 直連 (api.deepseek.com/v1),OpenAI-compatible
- 取代部分 OpenRouter 路徑(直連便宜 ~30-50% + 延遲低)
- deepseek-chat ($0.014/$0.28) / deepseek-reasoner ($0.14/$2.19)
- feature flag DEEPSEEK_DIRECT_ENABLED 預設 OFF
- DeepSeekResponse 含 input_tokens/output_tokens/duration_ms

Phase 16 — services/llm_caller_registry.py (新檔, 130+ 行)
- CALLER_REGISTRY frozenset 集中管理 35+ 個 caller 名(ADR-028 白名單)
- assert_known_caller(strict=False) 整合到 ai_call_logger __init__
- 不在 registry → log warning(不 raise,保留擴展彈性)
- list_callers_by_service() 分組除錯
- 解 critic-A11 第 3 輪 L4 修補(命名分散三層)

Phase 17 — _is_low_quality_response 4 條新規則(A2 警訊深化)
- 規則 5:純英文回應(中文字元 < 30%)
- 規則 6:thinking-mode 漏洞(<think>...</think> 洩漏)
- 規則 7:重複迴圈偵測(前 50 字出現 ≥ 3 次)
- 規則 8:佔位符未填充({{var}} / [TODO] / <待填>)

Phase 18 — docs/operation_ollama_first_v5_postmortem.md (新檔)
- 戰役完整時間軸(Day 1-2)
- 3 大決策替代分析
- 4 個 critical hotfix 教訓
- Owen 三護欄落地對照
- KPI 達成度(Wave 1 提前 4 天 / Wave 2 提前 10 天)
- 統帥手動清單 + 7 條未來戰役教訓

Phase 13 補強(合併本 commit):
- ai_call_logger COST_TABLE 補 7 個新模型(qwen3:14b / qwen2.5:7b-instruct
  / qwen2.5-coder:32b / qwen2-vl:7b / deepseek-r1:14b / gemma3:4b / minicpm-v)

regression: 214 unit tests 全綠(4:02 跑完),2 skipped

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:19:13 +08:00
OoO
942193db2a feat(p13): OllamaResponse token 補欄 + COST_TABLE 補新模型 + retry 鏈 unit test
All checks were successful
CD Pipeline / deploy (push) Successful in 2m41s
Operation Ollama-First v5.0 / Phase 13 補強

(A) services/ollama_service.py — OllamaResponse 加 input_tokens/output_tokens
- A4 Phase 1 已知 limitation 修補:openclaw_bot_main token=0 假數據誤導日報
- generate() 解 prompt_eval_count + eval_count 寫 OllamaResponse
- 影響:ai_call_logger 收到正確 token 數,token 日報 Ollama 占比準確

(B) services/ai_call_logger.py — COST_TABLE 補 GCP 已拉/候選模型
- qwen2.5:7b-instruct (Phase 3 A7 OpenClaw Q&A 預設)
- qwen3:14b (Phase 3 A9 Nemotron + A7 升級候選)
- qwen2.5-coder:32b (Phase 8 候選)
- qwen2-vl:7b (Phase 13+ PPT vision 候選)
- deepseek-r1:14b / gemma3:4b (推理增強 / 輕量)
- 全部 cost=0(Ollama 自架)
- 解 logger.warning「unknown model cost」誤報

(J) tests/test_ollama_retry_chain.py (10 unit tests) — 驗 hotfix e862a90/6572d52
- T1 self.host @property lazy resolve
- T2 explicit host 凍結不 retry
- T3 generate 第一台 timeout → 第二台成功(核心 retry 鏈)
- T4 三主機都失敗 → success=False
- T5 cache 卡同主機 → break 不無限迴圈
- T6 Phase 13 token 解析驗證
- T7-T9 generate_embedding 同類驗證
- T10 mark_unhealthy 清 resolve cache

regression: 全戰役 14 test 檔仍 zero regression

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:07:33 +08:00
OoO
943de8466c feat(p7): Anthropic SDK + Claude Opus 4.7 接 Code Review (feature flag OFF)
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
Operation Ollama-First v5.0 / Phase 7 Frontier 升級

services/anthropic_service.py (新檔, 226 行)
- AnthropicService 包裝 + ClaudeResponse dataclass
- Ephemeral prompt cache 5 分鐘 TTL(重複 system_prompt 省 90% 成本)
- usage 解析 input/output/cache_creation/cache_read 四欄位
- ANTHROPIC_API_KEY 未設或 SDK 缺失時 is_available()=False 靜默退化

code_review_pipeline_service.py — _openclaw_assess 加 L1 Claude 分支
- CODE_REVIEW_USE_CLAUDE flag (預設 OFF,等 ANTHROPIC_API_KEY 設定後翻 ON)
- 路由:Claude Opus 4.7 (Arena code Elo 1548) → Gemini → ElephantAlpha 三層
- request_id 串鏈不變

ai_call_logger.py COST_TABLE 補 3 個 Claude 模型:
- claude-opus-4-7:    $15/$75 per M tokens (程式碼 #1)
- claude-sonnet-4-6:  $3/$15  per M tokens (agentic 平衡)
- claude-haiku-4-5:   $0.8/$4 per M tokens (輕量快速)

requirements.txt: 加 anthropic>=0.40.0
.env.example: 加 ANTHROPIC_API_KEY / CODE_REVIEW_USE_CLAUDE / CLAUDE_MODEL

52 unit tests 全綠(22 logger + 18 anthropic + 5 routing + 7 security)

啟用步驟(待統帥手動):
  1. .env 加 ANTHROPIC_API_KEY=sk-ant-...
  2. CODE_REVIEW_USE_CLAUDE=true + restart momo-app
  3. 觀察 ai_calls.cache_read_tokens > 0 確認 cache 生效

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:31:30 +08:00
OoO
bb891f1a6e feat(observability): ai_call_logger + 23:55 Telegram token 日報
services/ai_call_logger.py(300 行)— 統一 LLM 遙測層
- context manager log_ai_call() / decorator logged_ai_call()
- async fire-and-forget 寫 ai_calls,DB 失敗永不影響主流程
- kill-switch:連續 10 次失敗自動降級為 logger.info
- env AI_CALL_LOGGING_ENABLED=false 一鍵關閉
- COST_TABLE 集中 13 個模型計費(gemini/claude/nim/ollama)
- PII 保護:meta 只存 prompt_hash[:12],不存原文
- 22 unit tests 全綠

services/token_report_service.py(580 行)— 6 段落每日 23:55 日報
- Section 1-6: 總覽 / 供應商分布 / TOP10 caller / 成本預算 / 趨勢 / 告警建議
- 7 條告警規則 + Hermes 規則引擎智能建議
- HTML escape + 4096 字元雙保險
- Telegram 失敗 fallback 訊息
- ai_insights 寫入 PII safe(無 chat_id/username 落地)
- 30 unit tests 全綠

A11 critic 護欄:H6 chat_id PII fix(services/openclaw_bot_routes 4 處 → SHA1[:8])

Operation Ollama-First v5.0 / Phase 1 A4+A5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:04:58 +08:00