C-1 Velero: confirmed running (daily-awoooi-prod schedule, 13d, MinIO Available)
C-2 Host rsync backup:
scripts/ops/backup-from-110.sh: host 188 backs up host 110 via rsync daily at 01:00
- Harbor registry data (highest priority)
- Gitea repos
- bitan-pharmacy.git (if present)
- On success, writes /var/run/backup-110.last_success for Prometheus monitoring
- On failure, sends a Telegram alert
ops/monitoring/alerts-unified.yml: adds the HostBackupFailed alert rule
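The freshness check that the success marker makes possible can be sketched as follows. This is a minimal sketch under stated assumptions, not the contents of backup-from-110.sh: it assumes the marker file holds a Unix timestamp in epoch seconds, and the function names and the 26-hour threshold are hypothetical.

```shell
# Hypothetical sketch: assumes the marker file contains epoch seconds.
STAMP_FILE="${STAMP_FILE:-/var/run/backup-110.last_success}"
MAX_AGE_SECONDS="${MAX_AGE_SECONDS:-93600}"  # 26 h: one daily run plus slack (assumed)

# Record a successful run as the current time in epoch seconds.
mark_success() {
  date +%s > "$STAMP_FILE"
}

# Print the age of the last recorded success in seconds.
# Fails if the marker is missing, unreadable, or not a plain number.
backup_age_seconds() {
  local stamp now
  [ -r "$STAMP_FILE" ] || return 1
  stamp=$(cat "$STAMP_FILE")
  case "$stamp" in
    ''|*[!0-9]*) return 1 ;;
  esac
  now=$(date +%s)
  echo $(( now - stamp ))
}

# Succeed only if a success was recorded within MAX_AGE_SECONDS.
backup_is_fresh() {
  local age
  age=$(backup_age_seconds) || return 1
  [ "$age" -le "$MAX_AGE_SECONDS" ]
}
```

A monitoring job could call `backup_is_fresh` and alert when it fails. A missing marker and a stale marker are treated the same way, since both mean no recent successful backup.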
C-3 DR SOP documents:
docs/runbooks/disaster-recovery/DR-K8s-awoooi.md (< 15 minutes)
docs/runbooks/disaster-recovery/DR-Nginx.md (< 5 minutes)
docs/runbooks/disaster-recovery/DR-Harbor.md (< 30 minutes)
docs/runbooks/disaster-recovery/DR-Bitan.md (< 5 minutes)
docs/runbooks/disaster-recovery/DR-Stock.md (< 5 minutes)
Backup script deployment (must be run manually):
scp scripts/ops/backup-from-110.sh ollama@192.168.0.188:~/bin/backup-from-110.sh
ssh ollama@192.168.0.188 "chmod +x ~/bin/backup-from-110.sh && mkdir -p /backup/110/{harbor,gitea}"
ssh ollama@192.168.0.188 "(crontab -l 2>/dev/null | grep -v backup-from-110.sh; echo '0 1 * * * /home/ollama/bin/backup-from-110.sh') | crontab -"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
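DR-Bitan: bitan-pharmacy container crash recovery SOP
Target time: < 5 minutes
Trigger scenarios: the bitan-pharmacy container has stopped or crashed, or did not start automatically after a Docker daemon restart
Tools: docker compose, Ansible
Last updated: 2026-04-11 (Claude Sonnet 4.6 Asia/Taipei)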
Quick recovery
ssh wooo@192.168.0.110 "cd /home/wooo/apps/bitan-pharmacy && docker compose up -d"
# Verify (within 30 seconds)
curl -s -o /dev/null -w '%{http_code}' https://bitan.wooo.work
# Expected: 200
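The verification step can be turned into a polling loop so that a container that is still starting is not reported as failed on the first request. This is a minimal sketch; the function name `wait_for_http_200`, the 2-second interval, and the 5-second per-request timeout are assumptions, not part of the runbook.

```shell
# Poll a URL until it returns HTTP 200 or the deadline passes.
# Usage: wait_for_http_200 URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_http_200() {
  local url=$1 timeout=${2:-30} interval=${3:-2}
  local deadline code
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # curl prints 000 as the status code when the connection fails.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || true
    if [ "$code" = "200" ]; then
      echo "OK: $url returned 200"
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "FAIL: $url last returned ${code:-no response}" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

For this runbook the call would be `wait_for_http_200 https://bitan.wooo.work 30`. A non-zero exit status is the cue to move on to the diagnostic steps.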
Diagnostic steps (if quick recovery fails)
# Check container status
ssh wooo@192.168.0.110 "docker ps -a | grep bitan"
# View recent logs
ssh wooo@192.168.0.110 "docker logs bitan-pharmacy --tail 50"
# Common problems:
# 1. Port 3003 already in use → find the owning process: ss -tlnp | grep 3003
# 2. Insufficient disk space → df -h
# 3. Corrupted image → docker compose build && docker compose up -d
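The first two checks in the list above can be scripted so that they give a yes/no answer instead of output to read by eye. This is a sketch meant to run on the 110 host itself; the function names and the 90% threshold in the usage note are hypothetical.

```shell
# Print the used-space percentage (without the % sign) for the
# filesystem that holds PATH, using the portable df -P layout.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeed if the filesystem holding PATH is at or above THRESHOLD percent.
disk_above_threshold() {
  local pct
  pct=$(disk_usage_percent "$1")
  [ -n "$pct" ] && [ "$pct" -ge "$2" ]
}

# Succeed if some process is listening on TCP PORT.
# Compares the port exactly, so 3003 does not match 30030.
port_is_listening() {
  ss -tln | awk -v port="$1" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == port) found = 1 }
    END { exit !found }'
}
```

For example, `disk_above_threshold / 90 && echo "disk nearly full"` flags a full root filesystem. Running `port_is_listening 3003` while the container is stopped shows whether another process holds the port.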
Confirm state with Ansible
# Run on the MacBook
ansible-playbook -i infra/ansible/inventory/hosts.yml \
infra/ansible/playbooks/110-devops.yml \
--tags bitan