6 Commits

Author SHA1 Message Date
OoO
7090f08dba V10.418 skip 111 in embedding consistency checks 2026-05-24 15:03:10 +08:00
OoO
353e565e52 V10.417 protect embedding fallback routing
All checks were successful
CD Pipeline / deploy (push) Successful in 1m4s
2026-05-24 14:53:43 +08:00
OoO
36d0e5d5f3 Flag RAG hits as saved LLM calls
All checks were successful
CD Pipeline / deploy (push) Successful in 56s
2026-05-13 09:21:50 +08:00
OoO
14c5349b69 Fill in the ORM models for the AI observability tables and the embedding signature
All checks were successful
CD Pipeline / deploy (push) Successful in 56s
2026-05-12 23:13:20 +08:00
OoO
97c446303c feat(p11.0): BGE-M3 cross-host consistency check + weekly Sunday 04:30 cron
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
Operation Ollama-First v5.0 / Phase 11.0 wrap-up (ADR-033 guardrail #3 fully landed)

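services/rag_service.py adds:
- verify_embedding_consistency(): BGE-M3 embedding consistency check across three hosts.
  Embeds the test string "momo電商競品分析測試向量一致性檢查" ("momo e-commerce competitor
  analysis test vector consistency check") on each of GCP Primary / Secondary / 111
  and computes the pairwise cosine distances.
  max_diff > 1e-4 is treated as inconsistent (model version drift) → logger.error.
- _cosine_distance(): pure Python, no numpy dependency
- fail-safe: returns ok=True even when fewer than 2 hosts are reachable (a temporary partial outage during the operation does not count as an error)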
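The check described above could be sketched roughly as follows. This is a minimal illustration, not the code in services/rag_service.py: the function names cosine_distance and check_consistency, the dict-of-vectors input, and the shape of the returned dict are assumptions made for the example. Only the 1e-4 tolerance, the pairwise comparison, and the "fewer than 2 reachable hosts still returns ok=True" rule come from the commit message.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors (pure Python)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def check_consistency(embeddings, tolerance=1e-4):
    """Compare embeddings of the same text from several hosts.

    `embeddings` maps a host label to its vector, or to None when the host
    was unreachable (hypothetical input shape for this sketch).
    """
    reachable = {h: v for h, v in embeddings.items() if v is not None}
    hosts = sorted(reachable)
    if len(hosts) < 2:
        # Fail-safe: nothing to compare, so do not report an error.
        return {"ok": True, "max_diff": None, "hosts": hosts}
    max_diff = max(
        cosine_distance(reachable[a], reachable[b])
        for a, b in combinations(hosts, 2)
    )
    return {"ok": max_diff <= tolerance, "max_diff": max_diff, "hosts": hosts}
```

A caller would log an error when the returned "ok" is False, which is how the commit describes handling model version drift.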

run_scheduler.py adds:
- run_embed_consistency_check task wrapper
- schedule.every().sunday.at("04:30").do(...): once a week is enough
  (no need to verify on every startup; running too often wastes Ollama calls on all three hosts)

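ADR-033 guardrail #3, complete version:
  signature lock (migration 026, embedding_signature column): existing
  code-side signature computation (rag_service.get_embedding_signature): existing
  signature comparison filter at RAG query time (rag_service._select_hits): existing
  cross-host consistency check cron: new
  backfill of the existing 14k+ rows: pending, run enqueue_missing_insight_embeddings() manually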

regression: all 47 unit tests pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 09:31:31 +08:00
OoO
c7d6db31f2 feat(p11): RAG self-learning + Promotion Gate 4-stage guardrail (feature flag OFF)
Some checks are pending
CD Pipeline / deploy (push) Has started running
Operation Ollama-First v5.0 / Phase 11 / RAG self-learning loop

services/rag_service.py (532 lines)
- RAGService.query(): bge-m3 embed + cosine 0.85 threshold + top_k=5
- get_embedding_signature(): v5.0 guardrail #3 consistency check (SHA1[:12])
- fire-and-forget rag_query_log INSERT (does not block the main flow)
- feedback(): writes the Telegram 👍/👎 back to feedback_score
- RAG_ENABLED defaults to OFF (pre-operation behavior unchanged)
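As an illustration of how a truncated SHA1 signature and the 0.85 / top_k=5 selection might fit together, here is a small sketch. What the real get_embedding_signature() hashes is not stated in the commit; hashing "model:dimension" is an assumption, as are the function names embedding_signature and select_hits and the (similarity, signature, text) tuple layout.

```python
import hashlib


def embedding_signature(model: str, dim: int) -> str:
    """First 12 hex chars of a SHA1 over the model name and dimension.

    The hashed fields are an assumption for this sketch.
    """
    raw = f"{model}:{dim}".encode("utf-8")
    return hashlib.sha1(raw).hexdigest()[:12]


def select_hits(scored, current_sig, threshold=0.85, top_k=5):
    """Keep the best hits that share the current embedding signature.

    `scored` is a list of (similarity, signature, text) tuples
    (hypothetical layout). Hits embedded under a different signature are
    dropped, since their similarity scores are not comparable.
    """
    kept = [h for h in scored if h[1] == current_sig and h[0] >= threshold]
    kept.sort(key=lambda h: h[0], reverse=True)
    return kept[:top_k]
```

Filtering on the signature before ranking means a model swap silently yields a RAG miss rather than a misleading hit.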

services/learning_pipeline.py (750 lines)
- Distiller: pure Hermes rule engine, zero LLM cost
  Quality rules: MCP >200 chars 0.8 / LLM JSON ok 0.9 / TextRank 0.6 / 👍 1.0 / 👎 0.0
- PromotionGate: Owen v5.0 guardrail #1, a hard rule
  Stage 1: quality_score >= 0.7
  Stage 2: no hallucination detected (rule engine, zero LLM)
  Stage 3: similarity to existing insights < 0.95 (Stage 3 is enabled once episodes are embedded)
  Stage 4: weight >= 0.8 must go through Telegram 👍/👎
- expire_stale_reviews(): after 24h without a response, automatically downgrades to weight=0.5
- hash_human_approver: SHA1[:8] of the Telegram username, for PII protection
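The four stages listed above can be read as a short decision chain. The sketch below is only one way to express it: the Episode dataclass, its field names, and the returned status strings other than awaiting_review are hypothetical, and the real PromotionGate may order or record things differently. The thresholds (0.7, 0.95, 0.8) are the ones given in the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    quality_score: float
    hallucination_flag: bool          # result of a rule-based check (assumed)
    max_similarity: Optional[float]   # None while the episode has no embedding
    weight: float


def evaluate(ep: Episode) -> str:
    """Run the four gate stages in order and return a status string."""
    # Stage 1: minimum quality.
    if ep.quality_score < 0.7:
        return "rejected:stage1"
    # Stage 2: rule-based hallucination check, no LLM involved.
    if ep.hallucination_flag:
        return "rejected:stage2"
    # Stage 3: dedup against existing insights; skipped without an embedding.
    if ep.max_similarity is not None and ep.max_similarity >= 0.95:
        return "rejected:stage3"
    # Stage 4: high-weight episodes need a human thumbs up or down.
    if ep.weight >= 0.8:
        return "awaiting_review"
    return "promoted"
```

Leaving max_similarity as None mirrors the note that Stage 3 only takes effect once episodes are embedded.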

services/hermes_analyst_service.py: adds a RAG-first analyze()
- RAG hit → return synthesize (no LLM spend)
- RAG miss → existing LLM path + enqueue learning_episodes
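A RAG-first entry point of this kind might look like the sketch below. The three injected callables (rag_lookup, llm_call, enqueue_episode) and the returned dict are placeholders invented for the example; the actual analyze() signature is not shown in the commit.

```python
def analyze(question, rag_lookup, llm_call, enqueue_episode):
    """Answer from RAG when possible, otherwise fall back to the LLM.

    rag_lookup(question) returns an answer string or None on a miss;
    llm_call(question) returns an answer string;
    enqueue_episode(question, answer) queues the pair for later learning.
    All three are hypothetical stand-ins for the real services.
    """
    hit = rag_lookup(question)
    if hit is not None:
        # RAG hit: return the stored answer without spending an LLM call.
        return {"answer": hit, "source": "rag", "llm_called": False}
    # RAG miss: use the existing LLM path and record the result.
    answer = llm_call(question)
    enqueue_episode(question, answer)
    return {"answer": answer, "source": "llm", "llm_called": True}
```

Passing the dependencies in as callables keeps the branch logic testable without a live model or database.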

services/openclaw_strategist_service.py: Q&A entry point wired to RAG-first
- weekly/monthly/annual reports left untouched (narrative reports have a low RAG hit probability)

services/telegram_templates.py
- rag_feedback_keyboard(): 👍/👎 inline keyboard
- promotion_review_keyboard(): Stage 4 manual review buttons

routes/openclaw_bot_routes.py: 3 callback handlers
- rag_fb:{id}:{score} → rag_service.feedback()
- pg_ok:{episode_id} → PromotionGate.promote()
- pg_no:{episode_id} → PromotionGate.reject()
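The three callback formats above are simple colon-separated strings, so dispatching them could be sketched like this. The parse_callback name, the action labels, and the choice to keep ids as strings while converting the score to an integer are assumptions for illustration only.

```python
def parse_callback(data: str):
    """Split Telegram callback data into an (action, ...) tuple.

    Returns None for unknown prefixes or malformed payloads.
    """
    parts = data.split(":")
    prefix, args = parts[0], parts[1:]
    if prefix == "rag_fb" and len(args) == 2:
        log_id, score = args
        try:
            return ("feedback", log_id, int(score))
        except ValueError:
            # Score was not an integer; treat the payload as malformed.
            return None
    if prefix == "pg_ok" and len(args) == 1:
        return ("promote", args[0])
    if prefix == "pg_no" and len(args) == 1:
        return ("reject", args[0])
    return None
```

A route handler would then map "feedback" to rag_service.feedback() and "promote" / "reject" to the corresponding PromotionGate methods.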

70 unit tests pass + full-campaign suite of 196 tests with zero regressions (finished in 4:17)

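Remaining limitations (to be addressed in Phase 12+):
1. learning_episodes.embedding write path (Stage 3 dedup skipped for now)
2. PromotionGate worker cron not yet scheduled
3. Telegram awaiting_review push notification not yet wired up (callback handlers already in place)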

Conditions for gradual rollout (suggested after 1 week):
- ANTHROPIC_API_KEY set + RAG_ENABLED=true + conservative threshold=0.90
- share of feedback_score >= 4 above 70% → lower threshold to 0.85
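The rollout rule above amounts to a small calculation over collected feedback scores. The helper below is a sketch under assumptions: the name next_threshold, the list-of-integers input, and the behavior with no feedback (stay conservative) are not from the commit. Only the 0.90 and 0.85 thresholds, the score cutoff of 4, and the 70% ratio are.

```python
def next_threshold(scores, current=0.90, relaxed=0.85, min_ratio=0.70):
    """Pick the similarity threshold from recent feedback scores.

    Returns `relaxed` when more than `min_ratio` of the scores are >= 4,
    otherwise keeps `current`. With no feedback yet, stays at `current`.
    """
    if not scores:
        return current
    good = sum(1 for s in scores if s >= 4)
    ratio = good / len(scores)
    return relaxed if ratio > min_ratio else current
```

For example, scores of [5, 4, 4, 2] give a ratio of 0.75, which would relax the threshold to 0.85.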

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:56:12 +08:00