Commit Graph

139 Commits

Author SHA1 Message Date
OoO
89c400d53e 補上 OpenClaw best-effort 區塊紀錄
All checks were successful
CD Pipeline / deploy (push) Successful in 57s
2026-05-13 10:59:50 +08:00
OoO
7e928509a8 記錄 Agent Actions 動態入口驗證 2026-05-13 10:57:24 +08:00
OoO
f9d3da5c16 記錄 AutoHeal DB guardrail 驗證 2026-05-13 10:39:51 +08:00
OoO
5285abe5b0 記錄 DB migration 覆蓋守門 2026-05-13 10:39:23 +08:00
OoO
4256a04508 記錄 Telegram 與 MCP 缺口驗證 2026-05-13 10:38:51 +08:00
OoO
adfcccffc1 補齊盤點修補 commit 清單
All checks were successful
CD Pipeline / deploy (push) Successful in 59s
2026-05-13 10:33:01 +08:00
OoO
ae79cdd9d6 記錄依賴盤點驗證結果
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-13 10:31:00 +08:00
OoO
47c59fdd15 更新自動匯入修補記憶
All checks were successful
CD Pipeline / deploy (push) Successful in 55s
2026-05-13 10:29:11 +08:00
OoO
4e6e9bfe5d 綁定自動匯入日期查詢參數 2026-05-13 10:28:48 +08:00
OoO
e29529f2a9 校正 Observability 修補記憶 hash 2026-05-13 10:18:16 +08:00
OoO
3cb091f598 記錄 Observability fail-safe 區塊失敗
All checks were successful
CD Pipeline / deploy (push) Successful in 59s
2026-05-13 10:18:05 +08:00
OoO
0bc6f18732 更新 Claude 盤點修補記憶 2026-05-13 10:06:14 +08:00
OoO
d30b40a694 更新 Claude 盤點整改記憶 2026-05-13 09:41:21 +08:00
OoO
83645eaadf 記錄 Claude 盤點驗證結果 2026-05-13 09:29:48 +08:00
OoO
605250619c Frontend V3 responsive production update
All checks were successful
CD Pipeline / deploy (push) Successful in 1m3s
2026-05-12 18:27:29 +08:00
OoO
30a173cf69 統一全站暖色視覺與市場情報骨架
All checks were successful
CD Pipeline / deploy (push) Successful in 58s
2026-05-06 20:24:46 +08:00
OoO
153e4c9734 fix(observability): revert unrelated quick review commit files
All checks were successful
CD Pipeline / deploy (push) Successful in 58s
2026-05-06 19:50:52 +08:00
OoO
308efdce25 chore(observability): clarify quick review completion copy
All checks were successful
CD Pipeline / deploy (push) Successful in 1m4s
2026-05-06 19:49:28 +08:00
OoO
dc7fe371bd test(observability): add deploy gate self-test
All checks were successful
CD Pipeline / deploy (push) Successful in 1m0s
2026-05-06 13:44:20 +08:00
OoO
a6100a3d01 ci(observability): centralize deploy gate detection
All checks were successful
CD Pipeline / deploy (push) Successful in 3m2s
2026-05-05 23:47:34 +08:00
OoO
8cb82d4cd5 ci(observability): include QA entrypoints in deploy gate
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:43:34 +08:00
OoO
215bd9b73c ci(observability): verify CSS mirror instead of mutating runner
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:40:45 +08:00
OoO
4380fa641c ci(observability): gate frontend deploys with QA suite
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:39:00 +08:00
OoO
3db8f5c5b2 chore(observability): polish QA entrypoint docs
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:37:00 +08:00
OoO
7225e81c08 chore(observability): pass QA target args through quick review
All checks were successful
CD Pipeline / deploy (push) Successful in 2m7s
2026-05-05 23:32:13 +08:00
OoO
7ce74e32fe docs(memory): record observability UI QA guardrails 2026-05-05 23:28:06 +08:00
OoO
65eea5eb9a chore(observability): add noninteractive QA quick review flags
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:25:55 +08:00
OoO
ce7dd6068c docs(deploy): require observability QA for frontend changes 2026-05-05 23:24:33 +08:00
OoO
be1d1aec03 test(observability): include health in smoke suite
All checks were successful
CD Pipeline / deploy (push) Successful in 4m4s
2026-05-05 23:20:45 +08:00
OoO
cdcbcf1d80 chore(observability): centralize QA page contract
All checks were successful
CD Pipeline / deploy (push) Successful in 1m33s
2026-05-05 22:19:25 +08:00
OoO
346e9672a6 chore(observability): add CSS mirror sync helper
All checks were successful
CD Pipeline / deploy (push) Successful in 1m33s
2026-05-05 22:16:41 +08:00
OoO
15f7c8660d fix(observability): serve CSS from Flask static path
All checks were successful
CD Pipeline / deploy (push) Successful in 1m34s
2026-05-05 22:14:47 +08:00
OoO
6d015c5b6b test(observability): assert design system markers
All checks were successful
CD Pipeline / deploy (push) Successful in 2m24s
2026-05-05 22:08:44 +08:00
OoO
422137efa8 test(observability): validate sidebar route coverage
All checks were successful
CD Pipeline / deploy (push) Successful in 1m41s
2026-05-05 21:46:28 +08:00
OoO
e7d567c6be test(observability): assert page content markers
Some checks failed
CD Pipeline / deploy (push) Failing after 4m55s
2026-05-05 15:53:39 +08:00
OoO
8643ed12ad test(observability): validate nav active page contract
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:48:39 +08:00
OoO
3fca720fa1 test(observability): guard sidebar navigation design
Some checks failed
CD Pipeline / deploy (push) Failing after 2m11s
2026-05-05 15:41:39 +08:00
OoO
6a0d5c138d test(observability): add one-shot QA suite
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:39:55 +08:00
OoO
b963dcf209 test(observability): add production page smoke check
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:35:47 +08:00
OoO
62276f8b0c chore(observability): wire UI guard into quick review
Some checks failed
CD Pipeline / deploy (push) Failing after 1m57s
2026-05-05 15:31:04 +08:00
OoO
07c9e200d0 test(observability): add UI regression guard
Some checks failed
CD Pipeline / deploy (push) Failing after 1m39s
2026-05-05 15:04:21 +08:00
OoO
fa3e0884ad docs(observability): 補齊 UI 治理規範
Some checks failed
CD Pipeline / deploy (push) Failing after 1m38s
2026-05-05 14:59:45 +08:00
OoO
054685826a feat(observability): 重塑 AI 觀測台戰情室 UI
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 13:17:42 +08:00
OoO
390c32b05d feat(p21): Caller × Context 動態 Model Router + ADR-034
All checks were successful
CD Pipeline / deploy (push) Successful in 2m45s
Operation Ollama-First v5.0 / Phase 21 — 動態路由治理

services/llm_model_router.py (160+ 行)
- 純規則引擎,零 LLM 成本(Python lambda predicate)
- 6 caller × 12 條路由規則:
  • sales_copy: 短文 < 100 字 → gemma3:4b / 長文 → llama3.1:8b
  • hermes_analyst: gap > 20% 或銷量 < -50% → qwen3:14b / 預設 hermes3
  • aider_heal: diff > 200 行 → qwen2.5-coder:32b / 預設 7b
  • openclaw_qa: query > 200 字或 multi_turn → qwen3:14b / 預設 qwen2.5:7b-instruct
  • ppt_vision: minicpm 不健康 → llava / 預設 minicpm-v
  • ea_engine: require_chain_of_thought → deepseek-r1:14b / 預設 Gemini
- feature flag MODEL_ROUTER_ENABLED 預設 OFF(向下相容)
- 失敗安全:predicate 例外 skip 到下一條

tests/test_llm_model_router.py (18 tests 全綠)
- T1 flag OFF 不路由
- T2 sales_copy 短/長文路由
- T3 hermes 簡單/複雜 SKU
- T4 aider_heal 簡單/重構
- T5 ppt_vision 主備援
- T6 ea_engine CoT 路由
- T7 predicate 例外容錯
- T8 utility 函數

ADR-034 — Caller × Context 動態 Model Router
- 6 caller 路由規則對應表
- 5 段否決方案(LLM-based / hardcode / 配置檔 / 統一升級)
- Phase 21.2-21.6 戰略性遷移計畫
- V1-V3 驗收 SQL(caller 整合後 model 分布觀察)

關聯:Primary + Secondary 兩台 GCP 已備齊 10 模型(67GB 對稱)支援所有
路由規則;caller 整合可分階段進行(Phase 21.2-21.5)。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:54:12 +08:00
OoO
c13dc22639 feat(p20)+docs: cost auto-throttle + LLM 模型完整評估
All checks were successful
CD Pipeline / deploy (push) Successful in 2m44s
Operation Ollama-First v5.0 / Phase 20 + LLM 模型治理

services/cost_throttle_service.py (新檔, 200+ 行)
- evaluate_throttle_status() 每小時 cron 跑
- 查 ai_call_budgets monthly × 累計 spent → 月底線性外推
- 推估 > 預算 110% → 標 throttled(hysteresis:降到 95% 才解除)
- _push_throttle_alerts: 狀態變化推 Telegram
- is_provider_throttled(provider) public API(給 anthropic/gemini caller 啟動 check)
- COST_THROTTLE_ENABLED 預設 OFF(避免戰時誤節流)

run_scheduler.py 加 2 cron + task wrapper
- 每 1 小時:cost_throttle_evaluate
- 每日 00:05:cost_throttle_reset_if_new_month

docs/llm_model_full_evaluation_20260504.md (260+ 行)
- 場景 × 模型對應矩陣(4 大層次)
  戰術層 / 戰略層 / 多模態 / 雲端 API
- 本次啟動的追加 4 模型(qwen2.5-coder:32b / deepseek-r1:14b /
  llava / gemma3:4b)— Primary + Secondary 並行拉
- Phase 21 路由優化建議(context size + complexity 動態選 model)
- Phase 22 多供應商編排 + cost throttle 整合
- 儲存 / RAM / 延遲評估
- 模型治理 SOP(新增 / 替換 / 淘汰)
- COST_TABLE 對齊(含 deepseek 直連價格)

啟用前置(待統帥):
1. Primary + Secondary 4 模型拉完(背景進行中)
2. .env: COST_THROTTLE_ENABLED=true(觀察 1 週後)
3. ANTHROPIC_API_KEY 設後 Code Review 自動切 Claude Opus 4.7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:36:56 +08:00
OoO
98063059c2 feat(p14-18): PPT vision + DeepSeek 直連 + caller_registry + Hermes 強化 + postmortem
All checks were successful
CD Pipeline / deploy (push) Successful in 2m50s
Operation Ollama-First v5.0 / Phase 14-18 全套(statesman 批准全部)

Phase 14 — services/ppt_vision_service.py (新檔, 200+ 行)
- minicpm-v:latest(GCP Primary 已拉 5.5GB,代替 qwen2-vl 不存在)
- check_image(image_path) → VisionResult.issues_found 視覺異常清單
- 走 resolve_ollama_host 三主機 retry + mark_unhealthy
- 繁中強制 system prompt + 結構化解析 ⚠️ marker
- feature flag PPT_VISION_ENABLED 預設 OFF

Phase 15 — services/deepseek_service.py (新檔, 170+ 行)
- DeepSeek API 直連 (api.deepseek.com/v1),OpenAI-compatible
- 取代部分 OpenRouter 路徑(直連便宜 ~30-50% + 延遲低)
- deepseek-chat ($0.014/$0.28) / deepseek-reasoner ($0.14/$2.19)
- feature flag DEEPSEEK_DIRECT_ENABLED 預設 OFF
- DeepSeekResponse 含 input_tokens/output_tokens/duration_ms

Phase 16 — services/llm_caller_registry.py (新檔, 130+ 行)
- CALLER_REGISTRY frozenset 集中管理 35+ 個 caller 名(ADR-028 白名單)
- assert_known_caller(strict=False) 整合到 ai_call_logger __init__
- 不在 registry → log warning(不 raise,保留擴展彈性)
- list_callers_by_service() 分組除錯
- 解 critic-A11 第 3 輪 L4 修補(命名分散三層)

Phase 17 — _is_low_quality_response 4 條新規則(A2 警訊深化)
- 規則 5:純英文回應(中文字元 < 30%)
- 規則 6:thinking-mode 漏洞(<think>...</think> 洩漏)
- 規則 7:重複迴圈偵測(前 50 字出現 ≥ 3 次)
- 規則 8:佔位符未填充({{var}} / [TODO] / <待填>)

Phase 18 — docs/operation_ollama_first_v5_postmortem.md (新檔)
- 戰役完整時間軸(Day 1-2)
- 3 大決策替代分析
- 4 個 critical hotfix 教訓
- Owen 三護欄落地對照
- KPI 達成度(Wave 1 提前 4 天 / Wave 2 提前 10 天)
- 統帥手動清單 + 7 條未來戰役教訓

Phase 13 補強(合併本 commit):
- ai_call_logger COST_TABLE 補 7 個新模型(qwen3:14b / qwen2.5:7b-instruct
  / qwen2.5-coder:32b / qwen2-vl:7b / deepseek-r1:14b / gemma3:4b / minicpm-v)

regression: 214 unit tests 全綠(4:02 跑完),2 skipped

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:19:13 +08:00
OoO
4e82acc0f5 feat(p10)+docs(adr): MCP 自建 Stack docker-compose + ADR-031
Operation Ollama-First v5.0 / Phase 10 + Phase 12 收尾

docker-compose.mcp.yml — 4+3 容器 MCP stack
- postgres-mcp (port 3001): Claude 直連 momo_pro DB read-only RBAC
- mcp-omnisearch (3003): Tavily 主 + Exa 備(取代 Gemini Grounding)
  避開 Brave(2026-02 取消免費 tier)
- firecrawl-self (3002): 自建爬蟲,SPA 反爬蟲
- filesystem-mcp (3004): 跨主機檔案 read-only

護欄 #2 落地(Owen v5.0 鐵律 / ADR-033):
  firecrawl-self mem_limit:2g + cpus:1.5
  PLAYWRIGHT_BROWSER_POOL_MAX=3
  chrome-reaper sidecar 每小時清 Chrome zombies

安全設計:
- 全部 127.0.0.1 暴露(不外網)
- read-only volume mount(filesystem 只能讀)
- postgres-mcp RBAC mcp_readonly role 限 SELECT 6 熱表
- API key 全走 env var 不寫死

ADR-031 — MCP 自建 Stack 治理決策
- 取代 Gemini Grounding 唯一通路(多供應商策略)
- 預期 70%+ grounding 流量走免費 Tavily
- 188 主機資源 +4-5GB RAM 可控
- Migration Plan:6 步驟(含 Tavily/Exa key 申請 + mcp_readonly role 預建)

啟用前置(待統帥):
1. .env 加 TAVILY_API_KEY / EXA_API_KEY / MCP_POSTGRES_PASSWORD / FIRECRAWL_AUTH_KEY
2. momo-db 建 mcp_readonly role + GRANT SELECT
3. ssh wooo@110 → ssh ollama@188 → docker compose -f docker-compose.mcp.yml up -d

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 09:02:07 +08:00
OoO
c29ce83653 docs(adr): ADR-032 RAG 自主學習迴圈 + ADR-033 三護欄
Operation Ollama-First v5.0 / Phase 12 Wave 2 收尾

ADR-032 — RAG 自主學習迴圈
- 雙表分離:rag_query_log (audit) / learning_episodes (蒸餾池) / ai_insights (知識庫)
- Distiller 規則引擎(純 Hermes 零 LLM 成本)
- PromotionGate 4 階段晉升閘
- Telegram 反饋環(rag_feedback / promotion_review keyboard)
- feature flag RAG_ENABLED 預設 OFF
- V1-V4 驗收 SQL(命中率 / 晉升通過率 / 反饋分布 / embedding 一致性)

ADR-033 — RAG 三護欄(Owen v5.0 鐵律)
- 護欄 #1 Promotion Gate:強制反饋門檻,weight>=0.8 必經人工驗收
- 護欄 #2 Firecrawl 資源:Docker mem_limit:2g + chrome-reaper sidecar + 1.8GB 告警
- 護欄 #3 BGE-M3 一致性:embedding_signature SHA1[:12] + 啟動跨主機驗證
- 五案否決理由完整(包含「不要反饋按鈕」「不限資源」「:latest 接受漂移」)

Migration Plan 對照:
   migration 026/028 schema + service 已落地
   Phase 12+ 補:embedding 寫入 / worker cron / Telegram 推播 / Firecrawl 部署 / signature 回填

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 00:01:19 +08:00
OoO
d3d90121cf docs(adr): ADR-030 Frontier 多供應商策略 — Anthropic + Google + OpenRouter
Operation Ollama-First v5.0 / Phase 12 / Phase 7 落地後追認

Phase 7 引入 Anthropic Claude(Opus 4.7 接 Code Review)後,
戰役有 2 家 Frontier 供應商,需明確治理準則:

決策矩陣(與 ADR-028 鎖定 7 場景對齊):
- 場景 #5 Code Review : Claude Opus 4.7 (Arena Elo 1548)
  → Gemini 2.5 Flash → ElephantAlpha 49B (3 層 fallback)
- 其他 6 場景維持 Gemini 主鏈

Prompt cache 戰術:
- Anthropic 5min ephemeral:Code Review 命中率預估 80%+,省 ~90% 成本
- Google Gemini:隱式 server-side cache,不可預測

預估月成本:~$32 USD
- Claude $10 + Gemini $8 + NIM $5×2 + OpenRouter $3 + Ollama $0.02

新增供應商 SOP:
1. service wrapper 加 feature flag + is_available() 檢查
2. budget 種子 + ai_calls.provider 白名單
3. unit test (fallback 鏈 + cache hit/miss)
4. 獨立 ADR

對齊:
- migration 024(claude in provider 白名單)
- migration 025(claude $10/月 budget 種子)
- ai_call_logger COST_TABLE(claude-opus/sonnet/haiku 三模型)
- services/anthropic_service.py(Phase 7 落地)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:42:36 +08:00
OoO
2f20d8d7ba db(p11): rag_query_log + learning_episodes — RAG 自主學習迴圈基礎
All checks were successful
CD Pipeline / deploy (push) Successful in 3m30s
Operation Ollama-First v5.0 / Phase 11 RAG + 自主學習

migrations/027 — rag_query_log(每次 RAG 查詢的 audit log)
- query_text 4KB CHECK + 90 天保留
- VECTOR(1024) bge-m3 embedding (與 ai_insights 一致簽名)
- ivfflat lists=100 索引
- saved_call 欄位追蹤「成功攔截 LLM 呼叫」次數
- feedback_score 1-5(NULL=未反饋)
- 6 條 CHECK 含 chk_rag_saved_consistent

migrations/028 — learning_episodes(蒸餾池 → ai_insights 前哨)
- 8 狀態機:pending/approved/awaiting_review/rejected_*4/expired
- weight 0-1(>=0.8 觸發 PromotionGate Stage 4 人工驗收)
- 9 條 CHECK 含 chk_le_approved_consistent / chk_le_review_consistent
- partial index idx_le_status WHERE in (pending, awaiting_review)
- distilled_text 16KB 上限

docs/phase11_db_design — 設計文檔
- 6 大決策(兩表分離 / ivfflat / partial index / 軟連結 / 90天保留 / 應用層白名單)
- 6 大風險評估(R1 PII / R2 蒸餾失誤 / R3 ivfflat 退化 / R4 dangling FK / R5/R6 trade-off)
- Phase 11 上線後驗收 SQL(EXPLAIN ANALYZE)

PromotionGate 4 階段(v5.0 護欄 #1, ADR-033):
  Stage 1: quality_score >= 0.7
  Stage 2: 無幻覺檢測(規則引擎,零 LLM)
  Stage 3: 與既有 insight 相似度 < 0.95
  Stage 4: weight >= 0.8 必經 Telegram 👍/👎(24h 無回應 → expired)

A4 fullstack-engineer 同時在寫 services/rag_service.py + learning_pipeline.py,
service 完成後一起部署啟用。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:39:47 +08:00