awoooi/docs/runbooks/ssh-mcp-setup.md
2026-05-01 17:18:32 +08:00

# SSH MCP Provider Setup Runbook
> Created: 2026-04-11 (Claude Sonnet 4.6)
> Related: ADR-070 MCP Phase 2a
---
## Architecture
The SSH MCP Provider (`ssh_provider.py`) lets the API Pod run diagnostic and remediation commands on the host machines over SSH.
```
K8s Pod (awoooi-api)
↓ asyncssh
188 (ollama@192.168.0.188): Docker/Nginx/PM2 operations
110 (wooo@192.168.0.110): Harbor/Gitea/Sentry operations
111 (ollama@192.168.0.111): Ollama GPU operations
```
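A minimal sketch of what a pod-side call over asyncssh might look like, assuming the key and known_hosts paths used in this runbook. `connect_options` and `run_remote` are hypothetical names for illustration and are not taken from `ssh_provider.py`.

```python
KEY_PATH = "/run/secrets/ssh_mcp_key"      # private key mounted from the Secret
KNOWN_HOSTS = "/etc/ssh-mcp/known_hosts"   # trusted host fingerprints


def connect_options(host: str, user: str) -> dict:
    """Build keyword arguments for asyncssh.connect() (hypothetical helper)."""
    return {
        "host": host,
        "username": user,
        "client_keys": [KEY_PATH],
        "known_hosts": KNOWN_HOSTS,
    }


async def run_remote(host: str, user: str, command: str) -> tuple:
    """Run one command on a host and return (exit_status, stdout)."""
    # Third-party dependency, imported lazily so connect_options() works without it.
    import asyncssh

    async with asyncssh.connect(**connect_options(host, user)) as conn:
        # check=False: a non-zero exit status is returned instead of raised.
        result = await conn.run(command, check=False)
        return result.exit_status, result.stdout
```

Passing `known_hosts` explicitly means the connection is rejected when the host key is not in the mounted file, which is the behavior the security design below relies on.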
---
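## Security design
### Four layers of guards
1. **SSH key authentication**: an ed25519 private key mounted at `/run/secrets/ssh_mcp_key` (mode 0400)
2. **known_hosts verification**: `/etc/ssh-mcp/known_hosts`; only known host fingerprints are trusted
3. **Parameter whitelist**: every tool parameter is validated by `_validate_param()`
4. **Command whitelist**: only predefined tools are allowed; arbitrary command execution is not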
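The command whitelist can be pictured as a fixed table of command templates. The sketch below is an assumption about the general shape, not the provider's actual code: the tool names `docker_logs` and `docker_restart`, the `ALLOWED_TOOLS` table, and `build_command` are all hypothetical.

```python
import shlex

# Hypothetical tool table: each tool maps to a fixed argv template.
ALLOWED_TOOLS = {
    "docker_logs": ("docker", "logs", "--tail", "{tail}", "{container_name}"),
    "docker_restart": ("docker", "restart", "{container_name}"),
}


def build_command(tool: str, params: dict) -> str:
    """Render a whitelisted tool into a shell-safe command string."""
    template = ALLOWED_TOOLS.get(tool)
    if template is None:
        raise PermissionError(f"tool not whitelisted: {tool}")
    # Fill in parameters, then quote every argument so that a value
    # can never be interpreted as shell syntax on the remote host.
    argv = [part.format(**params) for part in template]
    return " ".join(shlex.quote(arg) for arg in argv)
```

Quoting is a second line of defense here; in this design the parameters would already have passed validation before a command is rendered.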
### Parameter validation rules
| Parameter type | Rule |
|---------|------|
| container_name | `^[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}$` |
| service | Same as container_name |
| compose_dir | Must start with `/opt/` or `/srv/`; `..` is forbidden |
| domain | FQDN format |
| tail/lines | int, 1-5000 |
| port | int, 1-65535 |
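The rules in the table could be expressed roughly as follows. This is a sketch under assumptions: the real `_validate_param()` is not shown in this runbook, so the function name `validate_param`, its signature, and in particular the FQDN pattern are illustrative guesses.

```python
import re

# Pattern taken from the table above.
_NAME_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}$")
# Assumed FQDN pattern: dot-separated labels, alphabetic TLD, at most 253 chars.
_FQDN_RE = re.compile(
    r"^(?=.{1,253}$)"
    r"(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+"
    r"[a-zA-Z]{2,63}$"
)


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_param(kind: str, value) -> bool:
    """Return True only if value satisfies the rule for its parameter type."""
    if kind in ("container_name", "service"):
        return isinstance(value, str) and _NAME_RE.fullmatch(value) is not None
    if kind == "compose_dir":
        return (
            isinstance(value, str)
            and value.startswith(("/opt/", "/srv/"))
            and ".." not in value
        )
    if kind == "domain":
        return isinstance(value, str) and _FQDN_RE.fullmatch(value) is not None
    if kind in ("tail", "lines"):
        return _is_int(value) and 1 <= value <= 5000
    if kind == "port":
        return _is_int(value) and 1 <= value <= 65535
    return False  # unknown parameter types are rejected by default
```

For example, `validate_param("compose_dir", "/opt/../etc")` is rejected because of the `..` segment, and any unlisted parameter type fails closed.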
---
## Setup steps
### 1. Generate the SSH key pair
```bash
ssh-keygen -t ed25519 -f /tmp/ssh-mcp-key -N "" -C "awoooi-mcp@k3s"
```
### 2. Add the public key to the target hosts
```bash
ssh-copy-id -i /tmp/ssh-mcp-key.pub ollama@192.168.0.188
ssh-copy-id -i /tmp/ssh-mcp-key.pub wooo@192.168.0.110
ssh-copy-id -i /tmp/ssh-mcp-key.pub wooo@192.168.0.120
ssh-copy-id -i /tmp/ssh-mcp-key.pub wooo@192.168.0.121
```
### 3. Generate known_hosts
```bash
ssh-keyscan 192.168.0.110 192.168.0.120 192.168.0.121 192.168.0.188 > /tmp/ssh-mcp-known_hosts
```
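Because an empty or incomplete known_hosts file only shows up later as a runtime warning, it can help to check the scan result before creating the Secret. The sketch below assumes unhashed `ssh-keyscan` output (no `-H` flag), one `host keytype key` entry per line; `missing_hosts` is a hypothetical helper.

```python
def missing_hosts(known_hosts_text: str, expected: list) -> list:
    """Return the expected hosts that have no entry in the known_hosts text."""
    seen = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and ssh-keyscan banner comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "host keytype key" entry
        # The first field may list several names separated by commas.
        seen.update(fields[0].split(","))
    return [host for host in expected if host not in seen]
```

A non-empty result means at least one host did not answer the scan, so the file should not be packaged yet.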
### 4. Create the K8s Secret
```bash
kubectl create secret generic ssh-mcp-key \
  --from-file=ssh_mcp_key=/tmp/ssh-mcp-key \
  --from-literal=known_hosts="$(cat /tmp/ssh-mcp-known_hosts)" \
  -n awoooi-prod
# When updating an existing Secret, use a merge patch; a JSON patch "add" can fail when the key's state has drifted
kubectl patch secret ssh-mcp-key -n awoooi-prod --type=merge \
  -p "{\"data\":{\"known_hosts\":\"$(base64 -w 0 /tmp/ssh-mcp-known_hosts)\"}}"
# Clean up temporary files
rm /tmp/ssh-mcp-key /tmp/ssh-mcp-key.pub /tmp/ssh-mcp-known_hosts
```
### 5. ConfigMap settings (already configured)
```yaml
SSH_MCP_ENABLED: "true"
SSH_MCP_KNOWN_HOSTS_FILE: "/etc/ssh-mcp/known_hosts"
SSH_MCP_HOST_USERS: "192.168.0.188=ollama"
```
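A sketch of how a value like `SSH_MCP_HOST_USERS` might be turned into a host-to-user mapping. The runbook shows only a single `host=user` pair, so the comma separator for multiple entries is an assumption, and `parse_host_users` is a hypothetical name.

```python
def parse_host_users(raw: str) -> dict:
    """Parse "host=user" pairs into a dict (comma separator is assumed)."""
    mapping = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        host, sep, user = item.partition("=")
        host, user = host.strip(), user.strip()
        if not sep or not host or not user:
            raise ValueError(f"expected host=user, got {item!r}")
        mapping[host] = user
    return mapping
```

Failing loudly on a malformed entry keeps a typo in the ConfigMap from silently dropping a host.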
### 6. Deployment volume mounts (already configured)
```yaml
volumeMounts:
  - name: ssh-mcp-key
    mountPath: /run/secrets/ssh_mcp_key
    subPath: ssh_mcp_key
    readOnly: true
  - name: ssh-mcp-key
    mountPath: /etc/ssh-mcp/known_hosts
    subPath: known_hosts
    readOnly: true
volumes:
  - name: ssh-mcp-key
    secret:
      secretName: ssh-mcp-key
      defaultMode: 0400
      optional: true
```
---
## Verification
```bash
# Confirm the private key is mounted
kubectl exec -n awoooi-prod deploy/awoooi-api -- ls -la /run/secrets/ssh_mcp_key
# Expected: -r--r----- ... 411 ...
# Confirm known_hosts is mounted
kubectl exec -n awoooi-prod deploy/awoooi-api -- ls -la /etc/ssh-mcp/known_hosts
kubectl exec -n awoooi-prod deploy/awoooi-api -- wc -c /etc/ssh-mcp/known_hosts
# Confirm the provider is enabled
kubectl logs -n awoooi-prod deploy/awoooi-api | grep '"name": "ssh_host"'
# Expected: "enabled": true
# Test the SSH connection (from the local machine; needs the key file, so run it before the cleanup in step 4)
ssh -i /tmp/ssh-mcp-key ollama@192.168.0.188 "echo OK"
```
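The mount checks above could also be automated from inside the pod. The following is a sketch, not part of the provider: `check_mounted_file` is a hypothetical helper, and the permission rule (no group write, no access for others) is an assumption chosen to accept both the 0400 default mode and the `-r--r-----` listing shown above.

```python
import os
import stat


def check_mounted_file(path: str) -> list:
    """Return a list of problems found with a mounted secret file."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return [f"{path}: missing"]
    problems = []
    if st.st_size == 0:
        problems.append(f"{path}: empty")
    mode = stat.S_IMODE(st.st_mode)
    # 0o027 covers group write plus any permission for others.
    if mode & 0o027:
        problems.append(f"{path}: permissions too open ({mode:04o})")
    return problems
```

An empty list for both `/run/secrets/ssh_mcp_key` and `/etc/ssh-mcp/known_hosts` would correspond to the expected output of the `ls` and `wc` commands.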
---
## Key rotation
```bash
# 1. Generate a new key
ssh-keygen -t ed25519 -f /tmp/ssh-mcp-key-new -N "" -C "awoooi-mcp@k3s-$(date +%Y%m%d)"
# 2. Dual-write (old and new keys coexist to avoid a service interruption)
ssh-copy-id -i /tmp/ssh-mcp-key-new.pub ollama@192.168.0.188
ssh-copy-id -i /tmp/ssh-mcp-key-new.pub wooo@192.168.0.110
# 3. Update the Secret (the namespace must be set here, otherwise apply targets the current context's namespace)
kubectl create secret generic ssh-mcp-key \
  --from-file=ssh_mcp_key=/tmp/ssh-mcp-key-new \
  -n awoooi-prod \
  --dry-run=client -o yaml | kubectl apply -f -
# 4. Rollout restart
kubectl rollout restart deploy/awoooi-api -n awoooi-prod
# 5. After confirming, remove the old key (delete it from authorized_keys on each host)
```
---
## Troubleshooting
| Symptom | Cause | Fix |
|------|------|------|
| `ssh_host` provider enabled=false | SSH_MCP_ENABLED is not set | Check the ConfigMap |
| known_hosts WARNING | SSH_MCP_KNOWN_HOSTS_FILE points to an empty file | Confirm the Secret has a known_hosts key; if it is mounted via subPath, a rollout restart of API/worker is required after patching |
| Connection refused | Public key not added to authorized_keys | Redo step 2 |
| Host key verification failed | known_hosts is outdated | Redo steps 3 and 4 |