# ADR-009: Embedding Retry Queue Persistence (DB-backed)

- **Status**: Accepted
- **Date**: 2026-04-19
- **Decision Maker**: 統帥
- **Author**: Claude

## Context

ADR-007 (AI learning dual-write specification) requires `store_insight()` to write both the `ai_insights` DB table and its `embedding` column.
The initial implementation (2026-04-18) used an in-memory Python `queue.Queue()` with a daemon thread worker:

```python
import threading
from queue import Queue

_embedding_queue = Queue()  # in-memory only; contents vanish when the process exits
threading.Thread(target=_embedding_worker, daemon=True).start()
```

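**Gaps**:
- A Python process restart (K8s Pod rolling restart, gunicorn reload, OOM kill) → the entire queue is lost
- A brief disconnection of the Ollama host 192.168.0.111 → that insight never gets an embedding
- Violates the spirit of ADR-007's "dual-write must land" rule: the DB row is written but the KM entry is not → RAG can never find that insight
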
## Decision

**The embedding queue becomes DB-persisted** (the `embedding_retry_queue` table), and the worker switches to polling with batch processing.

| Aspect | In-memory Queue (old) | DB Queue (new) |
|---|---|---|
| Persistence | ❌ Lost on restart | ✅ Persisted |
| Retry | ❌ None | ✅ `attempts` column, capped at 5 |
| Observability | ❌ Cannot be queried | ✅ Plain SQL |
| Cross-process | ❌ One copy per Pod | ✅ One queue shared by all Pods |
| Throughput | ✅ Fastest | ✅ Batches of 10 rows per minute are sufficient |

## Schema (Migration 011)

```sql
CREATE TABLE embedding_retry_queue (
    id SERIAL PRIMARY KEY,
    target_table VARCHAR(50) NOT NULL,       -- ai_insights / conversations
    target_id INTEGER NOT NULL,
    text_content TEXT NOT NULL,
    model VARCHAR(50) DEFAULT 'bge-m3:latest',
    status VARCHAR(20) DEFAULT 'pending',    -- pending / processing / done / failed
    attempts INTEGER DEFAULT 0,
    last_error TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    processed_at TIMESTAMP
);
```

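A minimal sketch of how a producer might enqueue a row under this schema. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the production table is presumably PostgreSQL (it uses `SERIAL`), and the helper name `enqueue_embedding` is hypothetical. Because the schema declares no default for `created_at`, the sketch sets it explicitly so that the worker's `ORDER BY created_at` has a value to sort on.

```python
import sqlite3
from datetime import datetime, timezone

# SQLite stand-in for the Migration 011 table (assumption: production uses PostgreSQL).
DDL = """
CREATE TABLE embedding_retry_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    target_table TEXT NOT NULL,
    target_id INTEGER NOT NULL,
    text_content TEXT NOT NULL,
    model TEXT DEFAULT 'bge-m3:latest',
    status TEXT DEFAULT 'pending',
    attempts INTEGER DEFAULT 0,
    last_error TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    processed_at TIMESTAMP
)
"""

def enqueue_embedding(conn, target_table, target_id, text_content):
    """Hypothetical helper: queue one embedding job and return its queue id."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO embedding_retry_queue "
        "(target_table, target_id, text_content, created_at, updated_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (target_table, target_id, text_content, now, now),
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
job_id = enqueue_embedding(conn, "ai_insights", 42, "example insight text")
row = conn.execute(
    "SELECT status, attempts, model FROM embedding_retry_queue WHERE id = ?",
    (job_id,),
).fetchone()
```

The new row relies on the column defaults for `status`, `attempts`, and `model`, so the producer only supplies the target reference, the text, and the timestamps.
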
## Worker Flow

```
Worker Loop (every 60 seconds):
  SELECT * FROM embedding_retry_queue
  WHERE status='pending' AND attempts < 5
  ORDER BY created_at LIMIT 10
    ↓
  For each row:
    UPDATE status='processing'
    Call Ollama bge-m3 to generate the embedding
    ↓
    Success → UPDATE target SET embedding = vec;
              UPDATE retry_queue SET status='done', processed_at=NOW()
    Failure → UPDATE retry_queue SET attempts=attempts+1, last_error=...
              If attempts ≥ 5 → status='failed' (manual handling)
```

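The worker loop can be sketched as a single batch pass. This is an illustration under stated assumptions, not the project's worker: `sqlite3` stands in for the real database, `embed` is an injected callable standing in for the Ollama bge-m3 call, `process_batch` and `fake_embed` are hypothetical names, and the queue table is trimmed to the columns the pass touches. The flow does not state which status a failed row returns to; the sketch assumes it goes back to `pending` so the next poll can retry it. Writing the vector to the target table is left as a comment.

```python
import sqlite3

MAX_ATTEMPTS = 5   # retry cap from the ADR
BATCH_SIZE = 10    # rows per polling pass from the ADR

def process_batch(conn, embed):
    """Run one polling pass; return the number of rows embedded successfully."""
    rows = conn.execute(
        "SELECT id, text_content FROM embedding_retry_queue "
        "WHERE status = 'pending' AND attempts < ? "
        "ORDER BY created_at LIMIT ?",
        (MAX_ATTEMPTS, BATCH_SIZE),
    ).fetchall()
    done = 0
    for job_id, text in rows:
        conn.execute(
            "UPDATE embedding_retry_queue SET status = 'processing' WHERE id = ?",
            (job_id,),
        )
        try:
            vector = embed(text)  # stand-in for the Ollama bge-m3 request
            # The real worker would write `vector` to the target row here.
            conn.execute(
                "UPDATE embedding_retry_queue SET status = 'done', "
                "processed_at = CURRENT_TIMESTAMP WHERE id = ?",
                (job_id,),
            )
            done += 1
        except Exception as exc:
            # Assumption: a failed row returns to 'pending' until the cap is hit.
            conn.execute(
                "UPDATE embedding_retry_queue SET attempts = attempts + 1, "
                "last_error = ?, status = CASE WHEN attempts + 1 >= ? "
                "THEN 'failed' ELSE 'pending' END WHERE id = ?",
                (str(exc), MAX_ATTEMPTS, job_id),
            )
        conn.commit()
    return done

# Demo with a trimmed queue table and a fake embedder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE embedding_retry_queue (id INTEGER PRIMARY KEY, text_content TEXT, "
    "status TEXT DEFAULT 'pending', attempts INTEGER DEFAULT 0, last_error TEXT, "
    "created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, processed_at TIMESTAMP)"
)
conn.executemany(
    "INSERT INTO embedding_retry_queue (text_content) VALUES (?)",
    [("good text",), ("bad text",)],
)

def fake_embed(text):
    """Fake embedder: fails on texts starting with 'bad' to simulate an outage."""
    if text.startswith("bad"):
        raise ConnectionError("ollama unreachable")
    return [0.0, 1.0]

first_pass = process_batch(conn, fake_embed)
for _ in range(4):
    process_batch(conn, fake_embed)
statuses = conn.execute(
    "SELECT status, attempts FROM embedding_retry_queue ORDER BY id"
).fetchall()
```

In the demo, the row that always fails is retried once per pass and ends up `failed` after its fifth attempt, while the healthy row is marked `done` on the first pass.
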
## Consequences

### Positive
- Embedding tasks survive Pod restarts
- Accumulated failures become visible (a Grafana panel can monitor the count of `status='failed'` rows)
- Multiple Pods share one queue without processing a row twice (the `processing` status acts as a mutex)

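For the `processing` status to act as a mutex, the claim step has to be atomic: a plain `SELECT` followed by a separate `UPDATE` would let two Pods pick up the same row. One common way to get that, sketched here with `sqlite3` as a stand-in and a hypothetical `claim_job` helper, is a conditional `UPDATE` that only succeeds while the row is still `pending`, followed by a check of the affected-row count. This is an assumption about how the exclusion could be enforced, not a description of the project's worker.

```python
import sqlite3

def claim_job(conn, job_id):
    """Try to claim one queue row; return True only for the caller that wins."""
    cur = conn.execute(
        "UPDATE embedding_retry_queue SET status = 'processing' "
        "WHERE id = ? AND status = 'pending'",
        (job_id,),
    )
    conn.commit()
    # rowcount is 1 if this call flipped pending -> processing, else 0.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE embedding_retry_queue "
    "(id INTEGER PRIMARY KEY, status TEXT DEFAULT 'pending')"
)
conn.execute("INSERT INTO embedding_retry_queue DEFAULT VALUES")
first_claim = claim_job(conn, 1)    # wins the row
second_claim = claim_job(conn, 1)   # the row is no longer pending
```

On PostgreSQL, `SELECT ... FOR UPDATE SKIP LOCKED` is another widely used option for claiming a whole batch at once.
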
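### Negative
- Latency grows from milliseconds to as much as 60 seconds (the polling interval)
  - Mitigation: AI insights do not need real-time embeddings; a delay of under 1 minute before RAG can query them is acceptable
- The DB gains one more table (a small operational cost)
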
## Related ADRs

- ADR-007 (dual-write specification): this ADR is the persistence implementation of its Step 4
- ADR-003 (local embedding): the backend the worker calls