Commit Graph

35 Commits

Author SHA1 Message Date
OoO
3db8f5c5b2 chore(observability): polish QA entrypoint docs
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:37:00 +08:00
OoO
7225e81c08 chore(observability): pass QA target args through quick review
All checks were successful
CD Pipeline / deploy (push) Successful in 2m7s
2026-05-05 23:32:13 +08:00
OoO
65eea5eb9a chore(observability): add noninteractive QA quick review flags
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 23:25:55 +08:00
OoO
be1d1aec03 test(observability): include health in smoke suite
All checks were successful
CD Pipeline / deploy (push) Successful in 4m4s
2026-05-05 23:20:45 +08:00
OoO
cdcbcf1d80 chore(observability): centralize QA page contract
All checks were successful
CD Pipeline / deploy (push) Successful in 1m33s
2026-05-05 22:19:25 +08:00
OoO
346e9672a6 chore(observability): add CSS mirror sync helper
All checks were successful
CD Pipeline / deploy (push) Successful in 1m33s
2026-05-05 22:16:41 +08:00
OoO
15f7c8660d fix(observability): serve CSS from Flask static path
All checks were successful
CD Pipeline / deploy (push) Successful in 1m34s
2026-05-05 22:14:47 +08:00
OoO
6d015c5b6b test(observability): assert design system markers
All checks were successful
CD Pipeline / deploy (push) Successful in 2m24s
2026-05-05 22:08:44 +08:00
OoO
b21b40cae2 fix(observability): soften frontend error copy
All checks were successful
CD Pipeline / deploy (push) Successful in 1m2s
2026-05-05 21:58:49 +08:00
OoO
d93ad659ba fix(observability): polish topbar alert indicator
All checks were successful
CD Pipeline / deploy (push) Successful in 1m33s
2026-05-05 21:52:45 +08:00
OoO
422137efa8 test(observability): validate sidebar route coverage
All checks were successful
CD Pipeline / deploy (push) Successful in 1m41s
2026-05-05 21:46:28 +08:00
OoO
e7d567c6be test(observability): assert page content markers
Some checks failed
CD Pipeline / deploy (push) Failing after 4m55s
2026-05-05 15:53:39 +08:00
OoO
8643ed12ad test(observability): validate nav active page contract
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:48:39 +08:00
OoO
3fca720fa1 test(observability): guard sidebar navigation design
Some checks failed
CD Pipeline / deploy (push) Failing after 2m11s
2026-05-05 15:41:39 +08:00
OoO
6a0d5c138d test(observability): add one-shot QA suite
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:39:55 +08:00
OoO
b963dcf209 test(observability): add production page smoke check
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-05-05 15:35:47 +08:00
OoO
62276f8b0c chore(observability): wire UI guard into quick review
Some checks failed
CD Pipeline / deploy (push) Failing after 1m57s
2026-05-05 15:31:04 +08:00
OoO
07c9e200d0 test(observability): add UI regression guard
Some checks failed
CD Pipeline / deploy (push) Failing after 1m39s
2026-05-05 15:04:21 +08:00
OoO
2bb2e16442 feat(p56): deploy_doctor 擴充 — Observability + CD Pipeline 兩階段檢查
5 階段 → 7 階段:

[3/7] Ollama 主機(從 3 → 5 機)
  + 192.168.0.110:11435 (P53 K8s Nginx Proxy GCP-A)
  + 192.168.0.110:11436 (P53 K8s Nginx Proxy GCP-B)

[6/7] Observability 11 endpoint (新)
  全 prod smoke:mo.wooo.work/observability/* + api/health_indicator
  SPA shell fingerprint 偵測(size=7480 / etag e167a58a... = FAIL)
  302/308/401/403 (auth redirect) 視為 OK = login_required 正常工作
  PROD_BASE_URL env 可覆寫測 staging

[7/7] CD Pipeline (新)
  Gitea API 撈最近 3 個 run,狀態映射 OK/WARN/FAIL
  110 不可達 → 自動 WARN(不阻 deploy doctor exit code)

DB migrations 表清單 + 029 ollama_host_history / 030 ppt_audit_history_db。

本機跑實證:11 endpoint 全綠,Gitea 110 down 正確 WARN。
2026-05-05 12:27:51 +08:00
OoO
f2fbe5f929 feat(p30): admin nav 互聯 + deploy doctor v5.0 腳本
All checks were successful
CD Pipeline / deploy (push) Successful in 2m33s
(1) 6 個 admin 頁底部導覽全互聯(之前缺 Phase 29 兩頁的反向連結)
   - ai_calls / promotion_review / quality_trend / host_health
     全部加 |Budget|PPT Audit| 連結
   - 統帥從任一頁都可一鍵跳到其他 5 頁

(2) scripts/deploy_doctor_v5.py — 統帥手動待辦自助檢查
   5 階段檢查:env vars / DB migrations / Ollama 三主機 /
                LibreOffice / MCP servers
   - 14 個 v5.0 env vars(含 criticality 分級 FAIL/WARN/INFO)
   - 5 張 v5.0 必備 table(ai_calls/mcp_calls/ai_call_budgets/
     rag_query_log/learning_episodes)
   - ai_call_budgets seed ≥8 筆檢查
   - 三主機 /api/tags HTTP probe + healthy 數判定
   - 退出碼:0=全綠 1=WARN 2=FAIL(可進 CI)
   - SSH 188 / 本機都能跑:python3 scripts/deploy_doctor_v5.py

統帥之後想知道「v5.0 還有啥沒部署」直接跑 doctor 看清單,
不用再口頭追問哪些 env vars / 哪幾張 migration。
2026-05-04 13:48:06 +08:00
OoO
838267c293 feat(p1+p3): logger 接 13 caller + Q&A/Nemotron/日報 feature flag 灰度
Phase 1 A4 — 13 個呼叫點接 ai_call_logger(覆蓋率 11.8% → 預估 50%+)
- TOP-1 nemoton_dispatcher: nemotron_dispatch caller (NIM 配額追蹤)
- TOP-2 openclaw_strategist: 4 reports (daily/weekly/monthly/meta) + qa caller
- TOP-3 hermes_analyst: hermes_analyst + hermes_intent (順修 commit 00591c5 殘留 bug)
- TOP-4 code_review_pipeline: code_review_hermes/openclaw/elephant 三鏈 (request_id 串)
- TOP-5 openclaw_bot_routes: openclaw_bot_main/gemini/nim 三層 fallback

Phase 3 A7 — OpenClaw Q&A → qwen3:14b(feature flag OFF)
- OPENCLAW_QA_OLLAMA_FIRST 灰度開關
- 繁中強制 system prompt + Gemini fallback chain
- _is_low_quality_response 品質守門(簡體字檢測 + 拒答訊號 + 結構分數)
- 黃金集 A/B 對照測試框架(10 樣本去 PII)

Phase 3 A8 — OpenClaw 日報 → Hermes 模板(feature flag OFF)
- OPENCLAW_DAILY_HERMES_TEMPLATE 灰度開關
- _compute_daily_kpi 純 SQL + Hermes 規則引擎
- _compute_gemini_insight 精簡 200 字洞察 prompt
- templates/daily_report_v2.j2 + _SafeUndefined 缺欄位優雅降級
- scripts/compare_daily_report_versions.py 雙版本盲測

Phase 3 A9 — Nemotron NIM → qwen3:14b(feature flag OFF)
- NEMOTRON_OLLAMA_FIRST 灰度開關(A2 紅燈:deepseek-r1 假支援,改 qwen3)
- _call_qwen3_dispatch + 既有 NIM tool_calls 解析共用
- 保留 ADR-004「🟡 [降級模式]」Hermes 規則引擎兜底

H6 PII fix — chat_id 進 ai_calls.meta 改 SHA1[:8](4 處 Bot Q&A)

Code Review pipeline — N3 動態 provider tag(gcp/secondary/111)+ A4 logger 三鏈

37 unit tests 全綠(routing 15 + golden 5 + qwen3 8 + daily template 8 + nemotron 1)

Operation Ollama-First v5.0 / Phase 1 A4 + Phase 3 A7+A8+A9

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:05:38 +08:00
OoO
3b0b4b3d42 fix(ppt): address critic findings on commit 38967ce (HIGH-1, HIGH-2, Medium-4)
All checks were successful
CD Pipeline / deploy (push) Successful in 2m27s
Critic 審查 38967ce 找出 2 HIGH + 4 medium,本 commit 修最緊急三項:

HIGH-2: cleanup_expired_ppt_cache 預設 dry_run=False 危險
- 函式 default 改 dry_run=True(呼叫方需明確傳 False 才實刪)
- launchd 排程腳本改用 DRY_RUN 環境變數,預設 false→實刪,
  但測試時可 DRY_RUN=true 安全跑
- /cache cleanup Telegram 指令預設乾跑,需加 confirm 才實刪
- /cache cleanup days<1 強制乾跑(防呆)

HIGH-1: matplotlib 三個 helper 缺 try/finally → 渲染失敗洩漏 figure
- _mpl_horiz_bar_png / _mpl_line_chart_png / _mpl_pareto_chart_png
  全部用 try/except/finally 包住,例外時 plt.close() 仍執行
- 同時讓渲染失敗回 None(呼叫端自然 fallback 到原生 chart)

Medium-4: scripts/ppt_cleanup.sh PROJECT_DIR 寫死
- 改用相對路徑解析(cd $SCRIPT_DIR/..),手動執行不再需要設環境變數
- VENV_PY 找不到時 fallback 到系統 python3(容器友善)
- 新增 DRY_RUN 環境變數開關(預設 false)

煙霧測試:
- syntax OK (services/ppt_generator.py + routes/openclaw_bot_routes.py)
- monthly v3.1 重生 10 頁 290KB
- cleanup_expired_ppt_cache dry_run default = True ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 17:23:02 +08:00
OoO
38967ceea3 feat(ppt): redesign all 6 reports to professional standard + cache versioning
All checks were successful
CD Pipeline / deploy (push) Successful in 13m18s
PPT 模板全面升版至市場專業標準(McKinsey / BCG 月報級別):

Monthly v3.1(10 頁)
- 暖紙感封面(去黑底)+ elevator pitch(亮點/警訊/動能徽章)
- KPI 卡含 △% vs 上月 + 紅綠燈、YoY 同期對比帶
- matplotlib 業績趨勢折線(本月+上月+日均線+高低點)
- 品類分析雙視圖:橫條 + 帕雷托累計(80% 主力線)
- TOP 50 商品(自動分頁)+ vs 上月 △ 排名變化 / 🆕 新進榜
- MCP 情報 4 卡片結構化、AI 行動結構化分區、附錄頁

Daily / Weekly / Strategy / Promo / Competitor v3.0
- 統一暖紙封面、AI 頁去黑底改暖紙 + 焦糖橘色條
- 日週改 matplotlib 折線(含日均線、高低點)
- 全部加附錄頁(資料來源 / 計算口徑 / 模板版本)

Cache 版本控制
- TEMPLATE_VERSIONS 字典自動注入 cache key (tpl_ver)
- 模板升版舊快取自動 miss → 重生
- 新增 _invalidate_ppt_cache() 與 cleanup_expired_ppt_cache() helper
- 新增 /cache status / flush / cleanup Telegram 指令

字型系統
- _set_run_fonts() 用 lxml 直寫 a:latin/a:ea,中英分軌
- 中文走 Microsoft JhengHei,數字/英文走 Consolas(點陣等寬)
- Dockerfile 加 fonts-noto-cjk + fonts-noto-cjk-extra

部署排程
- scripts/install_ppt_cleanup.sh 一鍵安裝 launchd(每日 03:15)
- scripts/ppt_cleanup.sh 清 7 天前過期 PPT 檔 + DB row

資料層
- routes monthly: query_monthly_summary LIMIT 10 → 50,補 orders 欄位
- routes monthly: 拉 prev_month / prev_year 比較資料供 KPI △ 與商品 △ 計算

煙霧測試 6/6 全綠:
- 日報 v3.0 (5 頁) / 週報 v3.0 (6 頁) / 月報 v3.1 (10 頁)
- 策略 v3.0 (6 頁) / 促銷 v3.0 (6 頁) / 競品 v3.0 (5 頁)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:26:27 +08:00
OoO
1a886d962b fix(telegram): dedupe webhook+polling updates via shared DB guard
All checks were successful
CD Pipeline / deploy (push) Successful in 8m50s
Webhook (Flask) and polling (momo-telegram-bot) consumed the same
Telegram update_id, causing /menu callbacks to fire twice. Add a
shared dedup module backed by telegram_update_dedup table (300s TTL,
60s cleanup) with in-memory fallback, wired into both paths.

Polling launcher now skips startup when webhook is configured to
prevent dual-consumption at the source.

38 tests across webhook, menu keyboards, telegram_api, dedup guard,
and trend bot service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 12:01:04 +08:00
OoO
75de76ac12 fix(momo): block EC404 auto-open with end-to-end URL guard
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
- normalize URLs at write time (scheduler crawlers, routes) to drop
  javascript:/EC404/placeholder i_code (momo_/manual_/pchome_)
- add global click+auxclick guard in base.html and ewoooc_base.html
  that intercepts blocked MOMO URLs and redirects to safe i_code URL
- per-page dashboards reuse the same isLikelyMomoIcode validation
- /api/track_momo_link records blocked events for diagnosis
- ship sanitize_momo_urls.py to clean existing polluted DB rows

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 12:00:34 +08:00
OoO
d88dcc8f75 fix(devops): 清理舊端口與危險 compose 操作
All checks were successful
CD Pipeline / deploy (push) Successful in 1m45s
2026-04-30 14:24:53 +08:00
OoO
db21e7e8e8 fix(devops): 移除 startup 腳本危險 compose 操作
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
2026-04-30 14:05:41 +08:00
ogt
a96306fba2 Fix Telegram bot natural language communication issue
- Install python-telegram-bot dependency
- Start Telegram bot service successfully
- Confirm correct group ID (MOMO PRO - small shrimp group)
- Bot now running with all commands and button interface functional
- Natural language processing restored with keyword matching

Fixes issue where Telegram group could not communicate using natural language.
2026-04-22 14:27:50 +08:00
ogt
03c345d46d fix: drift-scanner pods cleanup script and guide
Some checks failed
CD Pipeline / deploy (push) Failing after 50s
- add cleanup script for failed drift-scanner pods
- add comprehensive fix guide with prevention strategies
- resolve pod resource issues in K8s cluster
2026-04-22 11:14:48 +08:00
ogt
4cdf0793a4 fix(review): 修復 /review 機制的 7 個審查問題
必修:
- 路由表新增 .sh → critic、.yaml/.yml → tool-expert 兩條規則
- refactor-specialist 改為並行(不取代 critic),確保 vuln-verifier 觸發條件正確
- Phase B 觸發條件從 'critic 含 🔴' 改為 '任一主審 Agent 含 🔴'

選修:
- Stage 0 新增 >2000 行 diff 保護(降級為 --stat 摘要)
- Stage 2.5 移除 '立即' 矛盾描述,改為 'Phase A 全回報後逐一發送'
- tg_notify.sh: 新增 CHAT_IDS 解析後空值守衛
- tg_notify.sh: 改用 printf | --data-urlencode 'text@-' 支援多行訊息

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 01:39:39 +08:00
ogt
a45b61f326 feat(review): 新增 /review pre-commit code review slash command
- .claude/commands/review.md: 整合 12 Agent 的 pre-commit review 指令
  + 依 diff 類型路由:critic / db-expert / migration-engineer / tool-expert
  + Phase B 條件觸發 vuln-verifier(critic 發現 🔴 時)
  + ≥10 Python 檔案改派 refactor-specialist 主審
  + 最終判決:BLOCKED / CAUTION / APPROVED

- scripts/tg_notify.sh: Telegram 告警工具
  + 7 個流程節點全部發送告警(啟動/每個 Agent 完成/最終判決)
  + 支援 info/warn/error 三級別 + jq/bash 雙備案解析
  + token 未設定時 exit 0,不阻斷 review 流程

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 01:23:57 +08:00
ogt
0099543c05 fix(security): 全域健檢 — 40 項安全/Bug/品質修復
Some checks failed
CD Pipeline / deploy (push) Failing after 5m18s
🔴 Critical
- auto_heal_service: 補 import re + sqlalchemy.text + 修正 orchestrator 變數名
  + autoheal_playbook→playbooks 表名 + _alert_and_store cooldown 修復
- aider_heal_executor: shell injection 改 shell=False + list 參數
- docker-compose: DISABLE_LOGIN 改 env var + 移除密碼 fallback + POSTGRES_HOST 修正
- app.py: /api/backup /api/run_task 等 6 個管理 API 加 @login_required
- config.py + pg_sync + e2e_test: 移除 wooo_pg_2026 hardcoded 密碼 fallback
- pg_backup.sh: 移除 TELEGRAM_TOKEN= 中間變數,直接用 $TELEGRAM_BOT_TOKEN
- migration 014: trigger_pattern→match_pattern + 補 error_type NOT NULL 欄位

🟡 High
- telegram_bot_service: str(e) 改通用訊息 + session try/finally + 移除 pa:/pr: 舊 callback
- run_scheduler: ElephantAlpha thread 死亡監控 + 自動重啟 + Telegram 告警
  + agent_context 03:30 TTL 定時清理任務
- openclaw_learning_service: build_rag_context 兩路徑加 .limit(200)
- hooks: commit-quality + momo-prod-guard 空 catch 改 stderr+exit(1)
- scripts/code_review: auto_yes 預設改 false
- db_backup_service: PGPASSWORD 透過 env dict 傳遞

📦 Migrations
- 013_autoheal: 修正建表順序 playbooks→incidents(外鍵前向引用)
- 018_add_missing_indexes: heal_logs/incidents 外鍵索引 + cleanup_expired_agent_context()

🟢 Infrastructure
- requirements.txt: 加版本下界 Flask>=2.3 SQLAlchemy>=1.4 等
- cd.yaml: 新增 run_scheduler.py + run_telegram_bot.py 監聽路徑
- .gitignore: insert_playbook_local.py 加入忽略

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 01:12:23 +08:00
ogt
ba86f98514 feat: integrate Elephant Alpha ecosystem with full ADR-012/013 compliance
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
- Add ElephantService, AutonomousEngine, Orchestrator, DecisionRouter (EA 4-file stack)
- Fix 10 bugs: URL typo, SQL schema mismatches (price_records JOIN), enum mapping,
  metadata_json, NemoTron PriceThreat dispatch, async/await mismatch, broken imports
- Wire ADR-012 Agent Action Ladder: EventRouter L2 → EA first + AIOrch fallback;
  all decisions dual-write DB + triaged_alert Telegram; momo: callback prefix
- Wire ADR-013 AutoHeal: resource_optimization trigger → AutoHealService
- Add W3 guards: connection cache 300s TTL, $5/hr cost hard limit
- Add W4 persistence: routing decisions + agent performance snapshots → ai_insights
- Add Migration 015: confidence + created_by columns on ai_insights
- Fix run_scheduler.py broken imports (DecisionTracker service didn't exist)
- Fix verify_elephant_integration.py: check_status() → check_connection()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 04:28:26 +08:00
ogt
676c711e7a feat: AI 治理完備 V10.3 — 技術債清零 + DB 備份機制 + 備份 AI 監控
Some checks are pending
CD Pipeline / deploy (push) Waiting to run
技術債清零 (2026-04-19):
- migrations/010: ai_insights 補 decay_exempt/avg_quality/status/ai_model/feedback 欄位
- migrations/011: embedding_retry_queue 持久化表 (ADR-009)
- migrations/012: backup_log 備份記錄表
- services/openclaw_learning_service: 記憶體 Queue → DB retry queue,時間衰減 RAG
- services/nemoton_dispatcher_service: 三個 tool 強制雙寫 ai_insights (_sink_insight_to_km)
- services/import_service: Excel 前置欄位防禦(商品名稱類 + 業績金額類)
- services/ollama_service: generate_embedding 新增 EMBEDDING_HOST env,embedding 永遠走 192.168.0.111
- SYSTEM_VERSION: V9.4 → V10.3

DB 備份機制:
- scripts/pg_backup.sh: host-level pg_dump 備份腳本,cron 每日 02:00,保留 7 天,Telegram 通知
- services/db_backup_service.py: Python 備份 service,寫入 backup_log
- scheduler: run_db_backup_task (02:00) + run_backup_monitor_task (每 6h AI Agent 監控)
- Dockerfile: 加入 postgresql-client

文件:
- CLAUDE.md: 環境架構依 ADR-008 實地重寫,含完整 SSH/Docker 部署 SOP
- PROJECT_CONSTITUTION.md: 內容已整合入 CLAUDE.md,刪除重複檔案

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 02:03:45 +08:00
ogt
1b4f3a7bbe feat: EwoooC 初始化 — 完整專案推版至 Gitea
Some checks failed
CD Pipeline / deploy (push) Failing after 59s
- 建立 Gitea Actions CD pipeline (.gitea/workflows/cd.yaml)
- 部署模式: rsync Python 檔案至 188 → docker restart (volume mount)
- Dockerfile/requirements 變動時自動重建 Docker image
- 部署通知: Telegram (開始/成功/失敗)
- 健康檢查: https://mo.wooo.work/health (最多 5 次重試)
- 同步最新 CLAUDE.md / ADR-008 / memory (2026-04-19)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 01:21:13 +08:00