Files
awoooi/ops
Your Name 3156ff1c69 feat(aiops): add ssh_docker_prune to auto-repair flywheel for disk-full alerts
Adds Group B SSH MCP tool ssh_docker_prune (image+volume+builder prune
with ≥75% disk usage gate) and routes "docker prune" actions through it.
Flips HostDiskUsageHigh from auto_repair=false to true with mcp_provider
routing labels so the flywheel can self-heal next disk-full event without
hitting the emergency_channel Telegram path.

Trigger: 2026-05-01 → 05-02 Telegram alert storm (peak 53/hr) caused by
empty ssh-mcp-key/known_hosts secret rejecting all SSH and forcing every
disk-full alert through "Host key is not trusted → escalate" loop.
known_hosts patched live; this commit closes the playbook gap so the
next occurrence resolves without manual intervention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 12:31:37 +08:00
..