OoO
ffeb28be95
docs: fill in .env.example with INITIAL_ADMIN_PASSWORD/BOT_API_TOKEN/SSH_JUMP_*
2026-04-28 14:59:19 +08:00
OoO
0b72e7040f
fix(post-3.5g): Dockerfile CMD restore gunicorn 4-workers (HIGH-5)
CD Pipeline / deploy (push) Successful in 9m13s
Restore the production startup command from 4349db2~1.
Problem:
- 4349db2 reverted to `CMD ["python", "app.py"]`, which runs production on the Flask dev server:
single process, no worker pool, debug logic left in; unsuitable for external exposure in both performance and security.
- EXPOSE 5000 does not match port 80, which docker-compose / k8s actually use
(reference_docker_topology.md confirms momo-pro-system is on port 80).
Fix:
- Change CMD back to: gunicorn --bind 0.0.0.0:80 --workers 4 --timeout 300
--access-logfile - --error-logfile - app:app
- EXPOSE 5000 → EXPOSE 80 (matches the port actually bound inside the container)
- requirements.txt already includes gunicorn>=20.1; the build needs no other changes
Verification:
- grep confirms CMD and EXPOSE are updated
- gunicorn is in requirements.txt (that line needs no change)
Critic finding: HIGH-5
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:40:22 +08:00
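The restored gunicorn flags can also be expressed as a gunicorn config file, which is plain Python. This is a minimal sketch, assuming a hypothetical `gunicorn.conf.py` placed next to `app.py`; the commit itself keeps the flags inline in the Dockerfile CMD.

```python
# gunicorn.conf.py (hypothetical file; the commit passes these as CLI flags)
bind = "0.0.0.0:80"   # --bind 0.0.0.0:80, matches EXPOSE 80
workers = 4           # --workers 4
timeout = 300         # --timeout 300 (seconds)
accesslog = "-"       # --access-logfile - (stdout)
errorlog = "-"        # --error-logfile - (stderr)
wsgi_app = "app:app"  # module:callable, supported as a setting since gunicorn 20.1
```

With such a file in the working directory, running `gunicorn` with no arguments would pick it up by default.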
OoO
d276853e54
fix(post-3.5g): restore _is_authorized fail-closed for callback + message (CRIT-2 + HIGH-3)
Restore _is_authorized() from 4349db2~1 and reapply it to the callback and message handlers.
Problem:
- CRIT-2 (callback fail-open): the original check only blocked a mismatched group/supergroup,
so anyone in a private chat could trigger callback commands (menu/await/cmd buttons).
- HIGH-3 (message short-circuit failure): `if ALLOWED_USERS and _uid not in ALLOWED_USERS`
when the OPENCLAW_ALLOWED_USERS environment variable is unset → ALLOWED_USERS is an empty set →
`if False and ...` skips the whole block → every private message passes.
Fix (three fail-closed checks):
1. Restore `_is_authorized(chat_type, chat_id, user_id)` below the import section at the top:
- group/supergroup: chat_id must equal ALLOWED_GROUP
- private: user_id must be in ALLOWED_USERS (empty set → deny all)
- channel / unknown / missing fields → deny
2. Replace the callback handler check with `if not _is_authorized(chat_type, chat_id, cq_from_id)`
and take user_id from cq.get('from') (previously it was not read at all).
3. Replace the message handler check with the unified check; unauthorized requests get 403 and silence (no Telegram reply, to avoid reconnaissance).
Verification:
- AST parse OK
- Simulated tests: 999999 private message → False; 111 (whitelisted) private message → True;
wrong group → False; channel → False; None → False
- grep result: two `_is_authorized` calls remain (callback 5195, message 5255);
the old `ALLOWED_USERS and _uid not in ALLOWED_USERS` is removed (only a comment describing the history remains).
Critic findings: CRIT-2 + HIGH-3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:40:22 +08:00
OoO
b49b704e82
fix(post-3.5g): restore generate_embedding for KM dual-write (CRIT-1)
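Restore OllamaService.generate_embedding from 4349db2~1, bringing back the mistakenly deleted method.
Problem:
- services/openclaw_learning_service.py:67 still calls ollama_service.generate_embedding(...)
- The 4349db2 overhaul deleted this method, so every time NemoTron writes learning data it raises
AttributeError: 'OllamaService' object has no attribute 'generate_embedding'
- The pgvector KM therefore stopped receiving writes entirely, violating the ADR-007 dual-write rule
Fix:
- Paste the method back at the end of OllamaService (line 508)
- Align with the current config: os is already imported at the top of the file, so the duplicate import inside the method is removed
- Embeddings go through EMBEDDING_HOST (the Hermes host; no auth needed on the internal network)
- Model defaults to bge-m3:latest (aligned with ADR-003)
Verification:
- AST parse OK
- grep 'def generate_embedding' finds the definition
Critic finding: CRIT-1
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>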
2026-04-28 14:40:22 +08:00
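For orientation, an embedding call of the kind this commit restores might look like the sketch below. It assumes the host speaks Ollama's `/api/embeddings` endpoint and that `EMBEDDING_HOST` holds a full base URL; the helper `build_embedding_request`, the localhost default, and the 30-second timeout are illustrative and not taken from the repository.

```python
import json
import os
import urllib.request
from typing import List, Tuple

# Assumed defaults; the commit only names EMBEDDING_HOST and bge-m3:latest.
EMBEDDING_HOST = os.environ.get("EMBEDDING_HOST", "http://localhost:11434")
EMBEDDING_MODEL = "bge-m3:latest"


def build_embedding_request(text: str, host: str = EMBEDDING_HOST,
                            model: str = EMBEDDING_MODEL) -> Tuple[str, bytes]:
    """Return the URL and JSON body for Ollama's /api/embeddings endpoint."""
    url = host.rstrip("/") + "/api/embeddings"
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return url, body


def generate_embedding(text: str, timeout: float = 30.0) -> List[float]:
    """POST the text to the embedding host and return the vector."""
    url, body = build_embedding_request(text)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["embedding"]
```

Splitting request construction from the network call keeps the payload checkable without a running Ollama host.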
OoO
d1fc71c6a3
fix: add strategy 3 and strategy 4 to support Vue.js template pages
CD Pipeline / deploy (push) Successful in 1m19s
2026-04-28 14:26:41 +08:00
OoO
5d0a9606d6
config: fill in LPN codes and enable the three promo event crawlers (O7ylWdZJHj8)
CD Pipeline / deploy (push) Successful in 1m14s
2026-04-28 14:10:31 +08:00
OoO
af260c4a01
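feat: add support for three promo event crawlers (Mother's Day, 520 Valentine's Day, Labor Day)
CD Pipeline / deploy (push) Successful in 1m12s
- Add generic promo event crawler function run_promo_event_task()
- Update crawler_config_loader.py with three event configs
- Update run_scheduler.py to register promo event crawlers dynamically
- Add API endpoint /api/run_promo_event_task
- Add three frontend dashboard routes (/edm/mothers_day, /edm/valentine_520, /edm/labor_day)
- Update the tab list on all dashboards
- Add config file services/data/crawler_config.json
- Add usage guide docs/guides/promo_event_crawler_guide.md
- Update the retry allowlist in agent_actions.py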
2026-04-28 13:57:44 +08:00
OoO
227b114101
fix(ci): use docker compose restart instead of hardcoded container names in sync mode
CD Pipeline / deploy (push) Successful in 1m13s
2026-04-28 13:36:23 +08:00
OoO
1d49c66159
fix(ci): use --no-cache for docker build to bypass cache snapshot corruption
CD Pipeline / deploy (push) Failing after 57s
2026-04-28 13:15:38 +08:00
OoO
0906c4be60
fix: mount routes directory for telegram-bot and scheduler
CD Pipeline / deploy (push) Failing after 1m49s
2026-04-28 12:44:10 +08:00
OoO
1ecec162dd
fix: increase Ollama health check timeout to prevent false offline status
CD Pipeline / deploy (push) Successful in 1m18s
2026-04-28 12:35:58 +08:00
OoO
7bb97ed252
fix: remove hardcoded Telegram Bot token to resolve AiderHeal security warning
CD Pipeline / deploy (push) Successful in 1m21s
2026-04-28 12:34:29 +08:00
OoO
7125ba09d3
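fix(post-3.5e): align the three store_conversation call sites in openclaw_answer with the new signature
CD Pipeline / deploy (push) Successful in 1m16s
Following the previous commit ("store_conversation signature changed to 6 parameters"), the earlier remote fix in
b766edf ("shrink callers to 3 args + use chat_id") has two problems:
1. The openclaw_answer(question) function scope has no chat_id variable at all,
so the original args=(chat_id, ...) raises NameError at runtime,
which the except inside the thread swallows; the bug still occurs (just in the opposite direction).
2. b766edf missed L4166 (the direct Gemini path), leaving the three call sites inconsistent.
This commit changes L4113 / L4214 back to 6 positional args:
(user_id=0, chat_id=0, question, response, source, used_sources)
matching the new signature (user_id, chat_id, user_message, bot_response, source='', used_sources=None)
All metadata (source / used_sources / chat_id) is preserved in ai_insights.metadata_json.
Out of scope (not handled for now):
- The hardcoded user_id / chat_id of 0 is not fixed (pending a follow-up refactor in which openclaw_answer accepts a chat_id parameter)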
2026-04-28 12:29:48 +08:00
OoO
d67d309ada
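fix(post-3.5e): align store_conversation signature with its callers (E4 P1 bug)
Problem: the 3 call sites pass 6 positional args, but the definition accepts only 3;
the TypeError is swallowed by the except inside the thread, so persisting OpenClaw Q&A conversations fails silently,
violating the ADR-007 persistence rule (AI learning data must be dual-written to DB + KM).
Fix (option A, preserve metadata):
- Signature: (user_id, user_message, bot_response)
→ (user_id, chat_id, user_message, bot_response, source='', used_sources=None)
- chat_id / source / used_sources all go into metadata for future analysis
- Call sites need no changes (they already pass 6 args, matching the new signature)
Verification: AST inspection confirms all 3 call sites match the new signature.
Out of scope (not handled for now):
- Callers hardcode user_id=0 and chat_id=0; left for the next round
- The internal store_insight dual-write logic is untouched
Anchor: services/openclaw_learning_service.py:330
Call sites: routes/openclaw_bot_routes.py:4113, 4166, 4214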
2026-04-28 12:29:48 +08:00
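The failure mode in this commit, a signature mismatch hidden by a broad `except` inside a worker thread, can be reproduced in a few lines. The sketch below is illustrative: `run_in_thread`, the `errors` list, and the returned dict shape are hypothetical stand-ins, and only the two signatures come from the commit message.

```python
import threading
from typing import Any, Callable, List, Tuple

errors: List[str] = []


def store_conversation_old(user_id, user_message, bot_response):
    """Old 3-parameter signature."""


def store_conversation(user_id, chat_id, user_message, bot_response,
                       source="", used_sources=None):
    """New signature; the extra fields are folded into a metadata dict."""
    return {"user_id": user_id, "message": user_message,
            "response": bot_response,
            "metadata": {"chat_id": chat_id, "source": source,
                         "used_sources": used_sources or []}}


def run_in_thread(func: Callable[..., Any], args: Tuple[Any, ...]) -> None:
    """Mimics a worker whose broad except hides signature mismatches."""
    def worker() -> None:
        try:
            func(*args)
        except Exception as exc:  # the swallowed TypeError lands here
            errors.append(type(exc).__name__)
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()


six_args = (0, 0, "question", "answer", "gemini", ["doc-1"])
run_in_thread(store_conversation_old, six_args)  # TypeError, nothing stored
run_in_thread(store_conversation, six_args)      # succeeds
print(errors)  # ['TypeError']
```

Logging the exception inside the worker, rather than discarding it, would have surfaced the mismatch immediately.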
OoO
433e37d241
fix: remove strict 30s timeout for Ollama chat
CD Pipeline / deploy (push) Has been cancelled
2026-04-28 12:28:57 +08:00
OoO
b766edfde2
fix: store_conversation signature, MCP model, and AI fallback message
CD Pipeline / deploy (push) Successful in 1m18s
2026-04-28 12:26:49 +08:00
OoO
8331c15d1b
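fix(post-3.5c): add HERMES_URL + DISABLE_LOGIN to .env.example
CD Pipeline / deploy (push) Successful in 1m19s
P1-19:
- Add a "[required]" annotation to the existing LOGIN_PASSWORD/SECRET_KEY
- Add DISABLE_LOGIN (used at auth.py:13 but missing from .env.example)
- Add a Hermes block: HERMES_URL, HERMES_TIMEOUT, EMBEDDING_HOST (as a comment)
- Unify the format: prefix each entry with a "[required] / [default X]" tag
Note: the existing ELEPHANT_ALPHA_HERMES_URL in the Elephant Alpha block is Elephant-specific;
it is a different variable from the newly added HERMES_URL (used by Hermes Module 2), so both are kept separately.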
2026-04-28 12:15:59 +08:00
OoO
dff19ee835
fix(post-3.5c): clean up the model display string in ai_routes hermes_stats
Item #9:
- routes/ai_routes.py:1640 hardcodes hermes_stats['model'] as 'hermes3:latest'
- Changed to 'Hermes 3' for readability; semantics preserved (a model identifier for possible future downstream readers)
Investigation confirms this key is dead code:
- _build_footprint_block / _build_footprint_json (nemoton_dispatcher_service.py:276,303)
read only duration_sec and tokens, not the model key
- hermes_analyst_service.py:419 builds its own _last_stats and does not include a model key either
- The change does not affect footprint display or DB writes
2026-04-28 12:15:59 +08:00
OoO
67509a4e42
fix(post-3.5c): lower the Hermes fallback logger level from error → warning
Item #5:
- services/hermes_analyst_service.py:122: falling back to the rule engine is an expected fallback
path (not an error); switch to logger.warning, consistent with :175 in the same file
Investigation scope (grepped "logger.error" + "降級|hermes|fallback"):
- services/nemoton_dispatcher_service.py:486 NIM content parsing failure → real error, keep error
- services/nemoton_dispatcher_service.py:564 single fallback dispatch failure → real error, keep error
- routes/openclaw_bot_routes.py:4168 has no logger.error; the candidate location has no such pattern (checked, nothing abnormal)
2026-04-28 12:15:59 +08:00
OoO
8b51d2d94f
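fix(post-3.5c): add EMBEDDING_HOST constant to config.py (C-2 partially done)
Aftershock C-2, partially complete:
- config.py adds the EMBEDDING_HOST constant (env: EMBEDDING_HOST → fallback HERMES_URL)
- The original plan also fixed the hardcoded fallback at services/ollama_service.py:515,520,
but origin/main 4349db2 (feat: AiderHeal) had already removed the entire
generate_embedding() method; the rebase conflict was resolved by adopting origin's decision (--ours),
without reintroducing the deleted method
- The leftover-IP fix disappears automatically with the method's deletion; the EMBEDDING_HOST constant stays in config
for centralized use if the embedding path is restored in the future
The ADR-008 centralization principle remains intact: all remaining hardcoded IPs now read from config
(services/nemoton_dispatcher_service.py:287 was handled in the previous commit).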
2026-04-28 12:15:59 +08:00
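The "env: EMBEDDING_HOST → fallback HERMES_URL" chain could be written roughly as below. This is a sketch under assumptions: the helper `resolve_hosts` is hypothetical, and the `http://` scheme on the default is a guess, since the log only gives 192.168.0.111:11434.

```python
from typing import Mapping, Tuple

# Default host from the commit log; the scheme prefix is an assumption.
DEFAULT_HERMES_URL = "http://192.168.0.111:11434"


def resolve_hosts(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (HERMES_URL, EMBEDDING_HOST) with the fallback chain applied."""
    hermes_url = env.get("HERMES_URL") or DEFAULT_HERMES_URL
    # `or` also covers the case where EMBEDDING_HOST is set but empty.
    embedding_host = env.get("EMBEDDING_HOST") or hermes_url
    return hermes_url, embedding_host


print(resolve_hosts({}))
print(resolve_hosts({"HERMES_URL": "http://hermes:11434"}))
print(resolve_hosts({"EMBEDDING_HOST": "http://embed:11434"}))
```

In a real `config.py` the mapping would be `os.environ`, evaluated once at import time.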
OoO
b954cc37cf
fix(post-3.5c): centralize the leftover IP in nemoton dispatcher
Aftershock C-1:
- services/nemoton_dispatcher_service.py:287 env fallback hardcodes
192.168.0.111, violating the ADR-008 centralization principle
- Now read centrally from config.HERMES_URL
Out-of-scope finding (not part of this fix):
- line 286 still hardcodes "qwen2.5:7b-instruct", but the actual model is hermes3:latest
(inconsistent with hermes_analyst_service.py:30; to be handled in a follow-up PR)
2026-04-28 12:15:59 +08:00
OoO
60a7917634
fix(post-3.5c): correct the misleading model name in the hermes_analyst_service docstring
Aftershock B:
- services/hermes_analyst_service.py:7 comment says qwen2.5:7b-instruct
but line 30 actually sets HERMES_MODEL = "hermes3:latest"
- Also correct the host description to "HERMES_URL (default 192.168.0.111:11434)"
2026-04-28 12:15:59 +08:00
OoO
5340475570
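fix(post-3.5c): unify the Hermes timeout double standard + add keep_alive
Aftershock A: the real cause of yesterday's Hermes timeout (core of the incident):
- services/hermes_analyst_service.py:158 hardcodes timeout=10, while :406 uses
HERMES_TIMEOUT=120, a double standard; the payload also lacked keep_alive, so after being evicted by another model
a 30+s cold start was guaranteed to hit the timeout
- HERMES_TIMEOUT is promoted from a file-local constant to centralized management in config.py (ADR-008)
- Both payloads (intent/batch) now carry keep_alive=24h (ADR-012)
- The intent path timeout changes from 10s to HERMES_TIMEOUT; with keep_alive keeping the model resident,
measured latency is still < 10s, so it will not run up to the 120s cap
Files:
- config.py: add HERMES_TIMEOUT constant
- services/hermes_analyst_service.py: remove the file-local HERMES_TIMEOUT, add
HERMES_KEEP_ALIVE, add keep_alive to the payloads, unify the line 158 timeout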
2026-04-28 12:15:59 +08:00
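A payload carrying `keep_alive` alongside a single shared timeout might be built as sketched below. It assumes the service posts to Ollama's `/api/generate`; the helper `build_hermes_payload` and the `stream` flag are illustrative choices, while the constant names and values follow the commit message.

```python
import os
from typing import Any, Dict

# Values named in the commit; reading the timeout from the environment is an assumption.
HERMES_TIMEOUT = int(os.environ.get("HERMES_TIMEOUT", "120"))  # seconds
HERMES_KEEP_ALIVE = "24h"
HERMES_MODEL = "hermes3:latest"


def build_hermes_payload(prompt: str) -> Dict[str, Any]:
    """Request body for an Ollama generate call.

    keep_alive asks Ollama to keep the model loaded after the request,
    which avoids the cold start described in the commit.
    """
    return {
        "model": HERMES_MODEL,
        "prompt": prompt,
        "stream": False,
        "keep_alive": HERMES_KEEP_ALIVE,
    }


payload = build_hermes_payload("classify intent: sales last week")
print(payload["keep_alive"])  # 24h
```

Both the intent and batch paths would then pass `timeout=HERMES_TIMEOUT` to the HTTP client instead of a literal.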
OoO
32ac92b8f0
fix: _ssh_exec signature in ElephantAlpha
CD Pipeline / deploy (push) Has been cancelled
2026-04-28 12:15:42 +08:00
OoO
3dd73dce03
fix: missing sqlalchemy text import and _ssh_exec in ElephantAlpha
CD Pipeline / deploy (push) Successful in 1m20s
2026-04-28 12:13:44 +08:00
OoO
bc7113bc86
fix: ElephantAlpha crash, AiderHeal Ollama host, MCP integration for Hermes/NemoTron, and MCP hallucination
CD Pipeline / deploy (push) Successful in 1m18s
2026-04-28 12:11:33 +08:00
OoO
30fc7609df
fix: change the default Ollama model to llama3.1:8b, which the 111 host already has
CD Pipeline / deploy (push) Successful in 1m17s
2026-04-28 12:00:57 +08:00
OoO
4349db2015
feat: AiderHeal supports ssh and sets Ollama as the preferred AI engine
CD Pipeline / deploy (push) Successful in 8m40s
2026-04-28 11:41:12 +08:00
OoO
213216b495
fix: improve the Telegram Bot natural conversation experience, remove the forced menu, and connect the AI engine
CD Pipeline / deploy (push) Successful in 1m18s
2026-04-28 11:33:02 +08:00
OoO
6924c8ea8a
fix(ci): wrong container name in rebuild guard, momo-postgres → momo-db
CD Pipeline / deploy (push) Successful in 1m16s
2026-04-28 10:42:24 +08:00
OoO
b63af671f0
fix: add utils/ volume mount to scheduler + telegram-bot; logger_manager fixed across all containers
CD Pipeline / deploy (push) Failing after 1m1s
2026-04-28 10:36:49 +08:00
OoO
7a0f4ef387
fix: add utils/ volume mount to momo-app; root-cause fix for the logger_manager import failure
CD Pipeline / deploy (push) Failing after 1m7s
2026-04-28 10:34:15 +08:00
ogt
a97fe8cb3a
fix: url_for('dashboard') → url_for('index'); wrong endpoint name caused a 500 on login
CD Pipeline / deploy (push) Failing after 3m5s
2026-04-27 21:30:33 +08:00
ogt
4a648ea6bf
refactor: fix reverse dependencies — logger_manager→utils, dashboard_service extraction
- Move SystemLogger implementation to utils/logger_manager.py (pure utility, no deps)
- services/logger_manager.py becomes a backward-compat re-export shim
- database/manager.py and database/vendor_manager.py now import from utils layer
- Extract get_dashboard_stats() to services/dashboard_service.py
- services/task_runner.py no longer imports from routes layer
- routes/dashboard_routes.py get_dashboard_stats() delegates to service layer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:28:23 +08:00
ogt
b9fe98f591
refactor: centralize config — HERMES_URL, SSH params, validate_critical_config()
- config.py: add HERMES_URL (default 192.168.0.111:11434), SSH jump params, validate_critical_config()
- services/hermes_analyst_service.py: remove hardcoded HERMES_URL, import from config
- app.py: call validate_critical_config() on startup, log warnings for optional missing vars
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:27:47 +08:00
ogt
e611702bb9
refactor: unify 4 isolated SQLAlchemy Base instances to database.models.Base
- database/import_models.py: remove ext.declarative.declarative_base; use from database.models import Base instead
- database/notification_models.py: same as above
- database/ppt_reports.py: remove orm.declarative_base; use the shared Base
- database/vendor_models.py: same as above
- database/manager.py: add noqa imports for the 4 models so that Base.metadata manages all tables
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:27:20 +08:00
ogt
b0fbd063c8
fix: pchome_routes.py, replace permission_required with role_required (auth.py has no such function)
CD Pipeline / deploy (push) Successful in 1m16s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:20:52 +08:00
ogt
3414d5bedd
fix(p1): resolve 014 migration conflict, remove orphan file, add healthchecks
P1-14: rename migrations/014_code_fix_playbook.sql → 020_code_fix_playbook.sql
to resolve duplicate 014 numbering with 014_telegram_users.sql
P1-22: git rm telegram_ai_integration.py (root orphan) + remove its volume
mount from docker-compose.yml telegram-bot service; services/ copy remains
P1-23: add healthcheck to momo-scheduler and momo-telegram-bot containers;
change VERSION:-latest to VERSION:-stable to prevent unvetted Watchtower pushes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:15:40 +08:00
ogt
237d3af76f
fix: Phase 2 P0 fully cleared, 14 security and functional fixes completed
CD Pipeline / deploy (push) Failing after 2m59s
P0-06: google_drive_service.py: replace pickle.load() with a JSON token (eliminates RCE risk)
P0-07: bot_api_routes.py:30: remove the hardcoded BOT_API_TOKEN default clawdbot_momo_2026
P0-08: auto_import_index.html: showAlert switches from innerHTML to createTextNode (XSS fix)
P0-09: abc_analysis_detail.html + dashboard.html + daily_sales.html: Jinja2 | e escaping
P0-10: openclaw_bot_routes.py:2634: vendor PPT adds the missing return ppt_path (vendor reports restored)
P0-11: telegram_bot_service.py:177-214: add try/except to cmd_start/cmd_help
P0-12: app.py:689-712: register the 10 missing Blueprints (eliminates 404 routes)
P0-13: auto_heal_service.py: implement _write_heal_log(), completing the AIOps audit loop
P0-14: monitoring/prometheus.yml: uncomment alert_rules; add alert_rules.yml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:11:52 +08:00
ogt
f59b23f969
security: P0 fixes S1-S5, remove all hardcoded passwords and SQL injection vulnerabilities
S1: config.py: remove the hardcoded LOGIN_PASSWORD default 0936223270; fail fast instead
S2: config.py: remove the weak SECRET_KEY default; sys.exit(1) when unset or left at the default
S3: services/user_service.py: create_initial_admin now reads the INITIAL_ADMIN_PASSWORD env
S4: app.py: the import flow validates table_name against a regex whitelist and validates the date_list format
S5: database/manager.py: ALLOWED_SALES_TABLES frozenset whitelist; dates use parameterized queries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:34:15 +08:00
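Items S4 and S5 combine two defenses: identifiers are checked against a whitelist because they cannot be bound as parameters, and values are passed as bind parameters. The sketch below illustrates the pattern with sqlite3 so it runs standalone; the table name `daily_sales`, the regex, and the helper names are hypothetical, and with psycopg2 the placeholder would be `%s` rather than `?`.

```python
import re
import sqlite3
from datetime import datetime

ALLOWED_SALES_TABLES = frozenset({"daily_sales"})  # hypothetical table name
TABLE_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_table(table_name: str) -> str:
    """Accept only well-formed identifiers that are on the whitelist."""
    if (not TABLE_NAME_RE.fullmatch(table_name)
            or table_name not in ALLOWED_SALES_TABLES):
        raise ValueError(f"table not allowed: {table_name!r}")
    return table_name


def validate_date(value: str) -> str:
    """Raise ValueError unless the value parses as YYYY-MM-DD."""
    datetime.strptime(value, "%Y-%m-%d")
    return value


def fetch_sales(conn: sqlite3.Connection, table_name: str, day: str):
    table = validate_table(table_name)  # identifiers cannot be bound
    sql = f"SELECT amount FROM {table} WHERE sale_date = ?"
    return conn.execute(sql, (validate_date(day),)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, amount INTEGER)")
conn.execute("INSERT INTO daily_sales VALUES (?, ?)", ("2026-04-27", 100))
rows = fetch_sales(conn, "daily_sales", "2026-04-27")
print(rows)  # [(100,)]
```

An injected name such as `daily_sales; DROP TABLE x` fails the identifier check before any SQL is built.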
ogt
b3a7909b2b
fix: add try/except guards to all unprotected Telegram handler functions
CD Pipeline / deploy (push) Successful in 1m29s
- Replace 2 silent `except Exception: pass` with logger.warning in handle_callback
- Wrap _handle_await_callback, _handle_main_menu_callback with top-level try/except (query.answer on error)
- Wrap _handle_complex_ai_response, _handle_simple_ai_response, _enhanced_keyword_matching, _process_await_input with top-level try/except (update.message.reply_text on error)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 19:47:49 +08:00
ogt
b4d208d34a
fix: replace raise with warning in nemotron/hermes + fix hardcoded host in footprint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 19:46:04 +08:00
ogt
ac56139e74
fix: translate _get_query_suggestions to zh-TW + add missing promo_range await prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 19:45:52 +08:00
ogt
c8ceec1f5f
fix: expand rule engine keywords to catch brand/strategy/investment queries
CD Pipeline / deploy (push) Successful in 1m53s
'品牌','廠商','加碼','投資','策略','建議','市場','機會','成長',
'預測','比較','推薦','最佳' now trigger complex routing → Gemini
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 20:23:12 +08:00
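A keyword-based rule engine of the kind this commit extends can be as small as a substring scan. The sketch below is an assumption about the shape of that check; the function name `route_query` and the "complex"/"simple" labels are hypothetical, while the keyword list is copied from the commit message.

```python
from typing import Tuple

# Keywords listed in the commit message.
COMPLEX_KEYWORDS: Tuple[str, ...] = (
    "品牌", "廠商", "加碼", "投資", "策略", "建議", "市場", "機會", "成長",
    "預測", "比較", "推薦", "最佳",
)


def route_query(text: str) -> str:
    """Return 'complex' (routed to Gemini) if any keyword appears, else 'simple'."""
    return "complex" if any(k in text for k in COMPLEX_KEYWORDS) else "simple"


print(route_query("推薦哪個品牌"))  # complex
print(route_query("hello"))         # simple
```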
ogt
388260666e
perf: reduce Hermes timeout 25s→10s — Gemini handles main response
CD Pipeline / deploy (push) Successful in 1m16s
Hermes on 111 GPU takes 17s+ due to concurrent load.
Intent classification is just a routing hint; Gemini/NVIDIA NIM does
the actual heavy analysis. 10s timeout → quick rule engine fallback → faster UX.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 11:09:24 +08:00
ogt
9d0e083504
fix: increase Hermes timeout 20s→25s (measured 17s from container to 111)
CD Pipeline / deploy (push) Successful in 1m22s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 11:06:15 +08:00
ogt
05f2064346
fix: correct Gemini model name + use accessible NVIDIA NIM model
CD Pipeline / deploy (push) Successful in 1m17s
gemini-2.5-flash-preview-05-20 → gemini-2.5-flash (correct API name)
nvidia/llama-3.1-nemotron-ultra-253b-v1 → meta/llama-3.3-70b-instruct
(nemotron-ultra requires premium account, llama-3.3-70b confirmed accessible)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 11:03:11 +08:00
ogt
c299abba5d
fix: restore Hermes to 111+hermes3 + add NVIDIA NIM auto-fallback for OpenClaw
CD Pipeline / deploy (push) Successful in 3m0s
Hermes was wrongly redirected to 188 (CPU-only, 60s+ timeout).
111 has hermes3:latest with GPU acceleration (~10s response).
OpenClaw now auto-detects:
1. Gemini (primary, when GEMINI_API_KEY set)
2. NVIDIA NIM nemotron-ultra (auto-fallback, NVIDIA_API_KEY already set)
3. Friendly error only when both are unavailable
This implements the user-requested auto-failover pattern: always try
primary first, silently fall back, restore automatically when primary recovers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 10:31:00 +08:00
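The auto-failover pattern described in this commit can be sketched as an ordered list of providers tried on every request. Everything below is illustrative: `answer_with_failover`, the stub providers, and the fallback message are hypothetical, and the real code calls the Gemini and NVIDIA NIM clients.

```python
from typing import Callable, List, Optional, Tuple

Provider = Tuple[str, Callable[[str], str]]


def answer_with_failover(question: str,
                         providers: List[Provider]) -> Tuple[Optional[str], str]:
    """Try providers in order; return (provider_name, answer)."""
    for name, call in providers:
        try:
            return name, call(question)
        except Exception:
            continue  # silent fallback to the next provider
    return None, "Sorry, the AI service is temporarily unavailable."


def gemini_stub(question: str) -> str:
    raise RuntimeError("primary down")  # simulate an outage


def nim_stub(question: str) -> str:
    return f"NIM answer to: {question}"


name, answer = answer_with_failover(
    "hi", [("gemini", gemini_stub), ("nvidia_nim", nim_stub)])
print(name, answer)  # nvidia_nim NIM answer to: hi
```

Because the primary is attempted first on every call, it takes over again as soon as it recovers, with no state to reset.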
ogt
e9e0ddf54f
fix: json.dumps dict before psycopg2 insert + remove fatal raise in save_context
CD Pipeline / deploy (push) Successful in 1m22s
save_context/_save_action_plan passed raw Python dicts as SQL bind params,
causing psycopg2.ProgrammingError that propagated via raise and crashed the
entire AI pipeline, forcing every natural language message to keyword fallback.
Also increase Hermes intent timeout 15s→30s for qwen2.5 cold-start latency.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 10:12:20 +08:00
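The serialization fix amounts to converting dict values to JSON text before they reach the driver, since psycopg2 cannot adapt a plain dict by default. A minimal sketch, with `to_bind_params` as a hypothetical helper name:

```python
import json
from typing import Any, Tuple


def to_bind_params(values: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Serialize dict values to JSON text; leave other values untouched."""
    return tuple(
        json.dumps(v, ensure_ascii=False) if isinstance(v, dict) else v
        for v in values
    )


params = to_bind_params((42, {"intent": "sales", "top_n": 5}))
print(params)  # (42, '{"intent": "sales", "top_n": 5}')
```

psycopg2 also provides `psycopg2.extras.Json` as a wrapper for the same purpose, which may be preferable when the target column is `json` or `jsonb`.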
ogt
e4ad2432fd
fix: remove bogus SSHJumpExecutor re-export that broke telegram AI import chain
CD Pipeline / deploy (push) Successful in 1m43s
SSHJumpExecutor class never existed in auto_heal_service.py.
The dead import caused ImportError blocking telegram_ai_integration
from loading, which broke all natural language conversation in the bot.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 09:47:31 +08:00