docs(adr): ADR-013 add post-deployment notes (pitfall list + SSH setup + measured results)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -67,3 +67,38 @@ Exception → Incident(DB) → PlayBook match → Auto-Heal execution → HealLog(
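- P1/P2-level issues of the DB_UNREACHABLE / DNS_FAIL classes can be auto-healed within 30 seconds
- All repair knowledge is automatically accumulated in the RAG KM, improving the quality of future AI judgments
- Covered tasks: `run_auto_import_task` / `run_icaim_analysis_task` / `run_weekly_strategy_task`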
---
## Post-deployment notes (measured 2026-04-19)
### Pitfalls fixed (5 hotfix commits in total)
| Problem | Cause | Fix |
|---|---|---|
| `UndefinedTable: playbooks` | Table creation order `[Incident, Playbook, HealLog]` conflicts with the FK | Changed to `[Playbook, Incident, HealLog]` |
| `DetachedInstanceError` on HealLog/Incident | After `commit()`, expire_on_commit expires the attributes and the following lazy-load fails | `session.refresh(obj); session.expunge(obj)` |
| `TypeError: NoneType:.0f` | The fallback HealLog was missing `duration_ms` | Added `duration_ms=duration_ms` in the except branch |
| compose=True double-call bug | `_execute_playbook` called compose first, then immediately overwrote it with docker restart | Removed the use_compose branch |
| `No authentication methods available` | paramiko could not find the SSH key | Copied the key to `/app/config/autoheal_id_ed25519` (rw mount); no container rebuild needed |
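
The `duration_ms` row can be illustrated with a minimal sketch. `HealLog` here is a hypothetical dataclass standing in for the real model, and `execute_with_log` and `summarize` are illustrative names that do not come from the codebase. The point is only that the fallback record built in the `except` branch carries `duration_ms` too, so a later `:.0f` format never receives `None`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HealLog:
    """Hypothetical stand-in for the real HealLog model."""
    result: str
    duration_ms: Optional[float] = None
    error: Optional[str] = None


def execute_with_log(action: Callable[[], object]) -> HealLog:
    """Run a heal action and always return a log with duration_ms set."""
    started = time.monotonic()
    try:
        action()
        duration_ms = (time.monotonic() - started) * 1000
        return HealLog(result="success", duration_ms=duration_ms)
    except Exception as exc:
        duration_ms = (time.monotonic() - started) * 1000
        # The fix from the table: the fallback log also carries duration_ms.
        return HealLog(result="failed", duration_ms=duration_ms, error=str(exc))


def summarize(log: HealLog) -> str:
    # f"{None:.0f}" raises TypeError, so duration_ms must not be None here.
    return f"result={log.result} duration={log.duration_ms:.0f}ms"
```

Calling `summarize(execute_with_log(some_action))` then works for both the success and the failure path.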
### SSH authentication chain setup
```
momo-scheduler → id_ed25519 → 110 (wooo) → tunnel → 188 (ollama) → docker restart
```
The `id_ed25519.pub` key for `ollama@188` has been added to `authorized_keys` (line 11).
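
The deployment itself goes through paramiko. As a rough way to reason about the same hop sequence, the sketch below builds the equivalent OpenSSH command line instead. It assumes that `ssh -J` (ProxyJump) is an acceptable stand-in for the tunnel step. `build_restart_command` and both `*.example` host names are hypothetical placeholders, not the real addresses of 110 and 188.

```python
import shlex
from typing import List


def build_restart_command(
    key_path: str,
    jump_host: str,
    target_host: str,
    container: str,
) -> List[str]:
    """Build an ssh argv that hops through jump_host to restart a container."""
    return [
        "ssh",
        "-i", key_path,         # identity offered to the destination host
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        "-J", jump_host,        # ProxyJump through the intermediate host
        target_host,
        "docker", "restart", container,
    ]


argv = build_restart_command(
    key_path="/app/config/autoheal_id_ed25519",  # path from the pitfall table
    jump_host="jump.example",                    # placeholder for 110 (wooo)
    target_host="ollama@target.example",         # placeholder for 188
    container="momo-db",
)
print(shlex.join(argv))
```

Note that options given on the `ssh` command line generally apply to the destination host only; settings for the jump host would come from `~/.ssh/config`.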
### Measured results
```
result=success duration=3110ms   # DNS_FAIL → docker restart momo-db succeeded
```
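After `momo-db` is restarted, heal_log rows cannot be written because the DB is briefly unavailable (id=7~9 were lost). This is a known design boundary; Telegram notifications are still pushed normally.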
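
The behaviour described above can be sketched as follows: the heal log write is attempted, a failure is tolerated, and the notification goes out either way. `record_and_notify`, `write_heal_log` and `notify` are hypothetical names used for illustration only; the real EventRouter interface is not shown in this ADR.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("autoheal")


def record_and_notify(
    entry: Dict[str, object],
    write_heal_log: Callable[[Dict[str, object]], None],
    notify: Callable[[str], None],
) -> bool:
    """Try to persist the heal log, then notify regardless of the outcome.

    Returns True if the log row was written, False if it was lost.
    """
    persisted = True
    try:
        write_heal_log(entry)
    except Exception as exc:  # e.g. the DB is still restarting
        persisted = False
        logger.warning("heal_log not persisted: %s", exc)
    # The notification does not depend on the DB write having succeeded.
    notify(f"result={entry['result']} duration={entry['duration_ms']}ms")
    return persisted
```

Keeping the notification outside the `try` block is what makes the lost rows an acceptable boundary: the operator still sees the outcome even when the DB write fails.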
### New DB table
- `migrations/014_telegram_users.sql`: push recipients for EventRouter (replaces the values hard-coded in .env)
- Seed: `telegram_id=-1003940688311, is_admin=true`
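
As a rough illustration of reading recipients from a table rather than from .env, the sketch below uses an in-memory SQLite database. The schema is an assumption: only the `telegram_id` and `is_admin` columns are named in this ADR, and the real migration may define more columns and target a different database engine. `load_recipients` is a hypothetical helper.

```python
import sqlite3

# Assumed schema: only telegram_id and is_admin appear in the ADR.
SCHEMA = """
CREATE TABLE telegram_users (
    telegram_id INTEGER PRIMARY KEY,
    is_admin    BOOLEAN NOT NULL DEFAULT 0
);
"""


def load_recipients(conn: sqlite3.Connection, admins_only: bool = False) -> list:
    """Return the telegram_id values that should receive pushes."""
    sql = "SELECT telegram_id FROM telegram_users"
    if admins_only:
        sql += " WHERE is_admin = 1"
    return [row[0] for row in conn.execute(sql + " ORDER BY telegram_id")]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Seed row taken from the ADR.
conn.execute(
    "INSERT INTO telegram_users (telegram_id, is_admin) VALUES (?, ?)",
    (-1003940688311, True),
)
recipients = load_recipients(conn, admins_only=True)
```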