OoO d5a4e27344
Some checks failed
CD Pipeline / deploy (push) Has been cancelled
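feat(p42): scheduler auto-probes the three hosts every 15 minutes (history no longer depends on someone opening the page)
Problem:
Phase 38 added the host_health_probes table and wrote one row each time the observability page was opened, but
when nobody opened the page nothing was written, so the "24h uptime" in Telegram cmd:obs_health was always empty.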

Fix:
- run_scheduler.py::run_host_health_probe
  - Every 15 min, HTTP-probe /api/tags on the three hosts GCP-A/GCP-B/111
  - Write to host_health_probes (label/url/healthy/unhealthy_mark/
    models_count/response_ms/error_msg)
  - Fail-safe: HTTP/DB failures only log a warning
- run_scheduler.py::run_host_health_probe_cleanup
  - Daily at 03:00, DELETE rows older than 30d (keeps the table from bloating)
- Registered with schedule.every(15).minutes and schedule.every().day.at("03:00")
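
A minimal sketch of the two jobs described above, not the code in run_scheduler.py. Assumptions: the host URLs are placeholders, sqlite3 stands in for the project's real database, a probed_at timestamp column is added for the cleanup, unhealthy_mark is stored as the inverse of healthy (its real meaning is not stated in this commit), and /api/tags is assumed to return JSON with a "models" list. The fetch parameter and register_jobs helper are hypothetical names introduced for illustration.

```python
import json
import logging
import sqlite3
import time
import urllib.request

log = logging.getLogger("scheduler")

# Placeholder URLs (assumption): the real host addresses are not in the commit.
HOSTS = {
    "GCP-A": "http://gcp-a.example:11434",
    "GCP-B": "http://gcp-b.example:11434",
    "111": "http://host-111.example:11434",
}

# probed_at is an assumed column; the commit lists only the other seven.
SCHEMA = """CREATE TABLE IF NOT EXISTS host_health_probes (
    label TEXT, url TEXT, healthy INTEGER, unhealthy_mark INTEGER,
    models_count INTEGER, response_ms INTEGER, error_msg TEXT,
    probed_at REAL)"""


def fetch_tags(url, timeout=5):
    """GET <url>/api/tags and return the decoded JSON body."""
    with urllib.request.urlopen(url + "/api/tags", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def probe_host(label, url, fetch=fetch_tags):
    """Probe one host. Never raises: a failure is logged and recorded in the row."""
    start = time.monotonic()
    try:
        body = fetch(url)
        healthy, models, err = 1, len(body.get("models", [])), None
    except Exception as exc:
        log.warning("probe %s failed: %s", label, exc)
        healthy, models, err = 0, 0, str(exc)[:200]
    ms = int((time.monotonic() - start) * 1000)
    # unhealthy_mark = 1 - healthy is an assumption about the column's meaning.
    return (label, url, healthy, 1 - healthy, models, ms, err, time.time())


def run_host_health_probe(conn, hosts=HOSTS, fetch=fetch_tags):
    """Probe every host and insert one row each; DB errors only log a warning."""
    rows = [probe_host(label, url, fetch) for label, url in hosts.items()]
    try:
        conn.executemany(
            "INSERT INTO host_health_probes VALUES (?,?,?,?,?,?,?,?)", rows)
        conn.commit()
    except sqlite3.Error as exc:
        log.warning("host_health_probes insert failed: %s", exc)
    return rows


def run_host_health_probe_cleanup(conn, now=None, keep_days=30):
    """Delete rows older than keep_days; returns the number of rows deleted."""
    cutoff = (time.time() if now is None else now) - keep_days * 86400
    try:
        cur = conn.execute(
            "DELETE FROM host_health_probes WHERE probed_at < ?", (cutoff,))
        conn.commit()
        return cur.rowcount
    except sqlite3.Error as exc:
        log.warning("host_health_probes cleanup failed: %s", exc)
        return 0


def register_jobs(sched, conn):
    """Register both jobs; `sched` is the imported `schedule` module."""
    sched.every(15).minutes.do(run_host_health_probe, conn)
    sched.every().day.at("03:00").do(run_host_health_probe_cleanup, conn)
```

Passing fetch in as a parameter keeps the probe testable without network access, and catching exceptions inside each job means one unreachable host or a DB hiccup does not stop the scheduler loop.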

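Effect:
- The 24h trend card on Web /observability/host_health always has data (even when nobody opens the page)
- The three-host uptime rate in Telegram cmd:obs_health always has data
- History for the three hosts is kept in full for 30 days; older rows are cleaned up automatically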
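
As a rough illustration of how a 24h uptime rate could be derived from these probe rows: the sketch below is an assumption, since the commit does not show how cmd:obs_health computes it. The uptime_24h name and the (label, healthy, probed_at) row shape are hypothetical. With one probe every 15 minutes, each host contributes up to 96 samples per day.

```python
import time

# 24 h * 60 min / 15 min per probe = 96 samples per host per day.
EXPECTED_PROBES_PER_DAY = 24 * 60 // 15


def uptime_24h(rows, now=None, window_s=24 * 3600):
    """Return {label: percent healthy} over the last window_s seconds.

    rows: iterable of (label, healthy, probed_at) tuples, probed_at in
    epoch seconds. Hosts with no probes in the window are omitted.
    """
    now = time.time() if now is None else now
    totals = {}
    for label, healthy, probed_at in rows:
        if now - window_s <= probed_at <= now:
            ok, n = totals.get(label, (0, 0))
            totals[label] = (ok + (1 if healthy else 0), n + 1)
    return {label: round(100.0 * ok / n, 1) for label, (ok, n) in totals.items()}
```

A host that is missing from the result had no probes in the window, which is the "always empty" case this commit is meant to eliminate.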

This wraps up the Phase 38+39+40+41+42 observability console campaign (7 commits).

Deployment verification:
- mo.wooo.work/observability/host_health → HTTP 200 / 42716 bytes
  (Phase 38 was 39124 bytes; the extra 3.5 KB indicates the 24h trend/MCP/AIOps cards are live)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 19:24:07 +08:00
Description
EwoooC: product dashboard + sales performance reports + AI KM (Flask + pgvector, Docker Compose on 188)