3 Commits

Author SHA1 Message Date
OoO
0875dd8fda 補強 5.5 自癒安全回看
All checks were successful
CD Pipeline / deploy (push) Successful in 1m11s
2026-04-29 22:48:24 +08:00
ogt
0099543c05 fix(security): 全域健檢 — 40 項安全/Bug/品質修復
Some checks failed
CD Pipeline / deploy (push) Failing after 5m18s
🔴 Critical
- auto_heal_service: 補 import re + sqlalchemy.text + 修正 orchestrator 變數名
  + autoheal_playbook→playbooks 表名 + _alert_and_store cooldown 修復
- aider_heal_executor: shell injection 改 shell=False + list 參數
- docker-compose: DISABLE_LOGIN 改 env var + 移除密碼 fallback + POSTGRES_HOST 修正
- app.py: /api/backup /api/run_task 等 6 個管理 API 加 @login_required
- config.py + pg_sync + e2e_test: 移除 wooo_pg_2026 hardcoded 密碼 fallback
- pg_backup.sh: 移除 TELEGRAM_TOKEN= 中間變數,直接用 $TELEGRAM_BOT_TOKEN
- migration 014: trigger_pattern→match_pattern + 補 error_type NOT NULL 欄位

🟡 High
- telegram_bot_service: str(e) 改通用訊息 + session try/finally + 移除 pa:/pr: 舊 callback
- run_scheduler: ElephantAlpha thread 死亡監控 + 自動重啟 + Telegram 告警
  + agent_context 03:30 TTL 定時清理任務
- openclaw_learning_service: build_rag_context 兩路徑加 .limit(200)
- hooks: commit-quality + momo-prod-guard 空 catch 改 stderr+exit(1)
- scripts/code_review: auto_yes 預設改 false
- db_backup_service: PGPASSWORD 透過 env dict 傳遞

📦 Migrations
- 013_autoheal: 修正建表順序 playbooks→incidents(外鍵前向引用)
- 018_add_missing_indexes: heal_logs/incidents 外鍵索引 + cleanup_expired_agent_context()

🟢 Infrastructure
- requirements.txt: 加版本下界 Flask>=2.3 SQLAlchemy>=1.4 等
- cd.yaml: 新增 run_scheduler.py + run_telegram_bot.py 監聽路徑
- .gitignore: insert_playbook_local.py 加入忽略

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 01:12:23 +08:00
ogt
77d3a1da48 feat(ai-ops): ADR-013 AIOps 自動修復閉環完整實作
Some checks failed
CD Pipeline / deploy (push) Failing after 3m24s
架構(Exception → Incident → PlayBook → Heal → KM → Telegram):

新增元件:
- database/autoheal_models.py: Incident/Playbook/HealLog 三張表 + 7 條種子 PlayBook
- migrations/013_autoheal.sql: 建表 DDL + 種子資料(冪等 INSERT)
- services/auto_heal_service.py: 核心引擎 7 步閉環
  - _classify_error: 8 類錯誤自動分類 (DNS_FAIL/DB_UNREACHABLE/OOM/...)
  - _match_playbook: error_type + keyword + 冷卻 + max_retries 保護
  - _execute_playbook: DOCKER_RESTART/SSH_CMD/ALERT_ONLY/WAIT_RETRY
  - _sink_to_km: 修復知識寫入 ai_insights (auto_heal_playbook)
  - SSH 白名單:僅允許 docker restart / compose restart / docker start

修改元件:
- database/manager.py: _init_autoheal_tables() 啟動時建表+種子 PlayBook
- scheduler.py: 3 個核心任務植入 handle_exception
  (run_auto_import_task / run_icaim_analysis_task / run_weekly_strategy_task)
- requirements.txt: paramiko(SSH 跳板;不可用時降級 subprocess+CLI ssh)

安全設計: CMD 白名單 + cooldown + max_retries escalation + DB 冪等 migration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 16:03:49 +08:00