Commit Graph

5 Commits

Author SHA1 Message Date
Your Name
45dbe07188 fix(flywheel): repair six automation flywheel capabilities (ADR-092 B3)
Some checks failed
run-migration / migrate (push) Failing after 22s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 53s
Type Sync Check / check-type-sync (push) Successful in 2m54s
CD Pipeline / build-and-deploy (push) Has been cancelled
Ansible Lint / lint (push) Has been cancelled
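[Root-cause chain fixed]
MCP Provider bugs → PreDecisionInvestigator fails → Agent Debate has no context
→ LLM timeout → description="待分析" ("pending analysis") → blocked by the ADR-091 hard gate → tg_sent never set
→ W-2 Watchdog falsely reports a "silent failure"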

[Six fixes]
1. MCP Provider: three bug fixes
   - ssh_provider: asyncssh.run() → conn.run()
   - prometheus_provider: KeyError 'query' → tolerant .get() lookup
   - k8s_provider: empty pod_name → return an error dict early
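The Prometheus and Kubernetes fixes above follow the same defensive pattern: validate the input and return an error dict rather than raising. A minimal sketch, assuming hypothetical handler names (`run_prometheus_query`, `get_pod_logs`) and a hypothetical `success`/`error` result shape, none of which are taken from the repository:

```python
from typing import Any


def run_prometheus_query(params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler: tolerate a missing 'query' key instead of raising KeyError."""
    query = params.get("query")
    if not query:
        return {"success": False, "error": "missing required parameter: query"}
    # A real provider would call Prometheus here; this sketch only echoes the query.
    return {"success": True, "query": query}


def get_pod_logs(params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler: return an error dict early when pod_name is empty."""
    pod_name = (params.get("pod_name") or "").strip()
    if not pod_name:
        return {"success": False, "error": "pod_name must not be empty"}
    # A real provider would query the Kubernetes API here.
    return {"success": True, "pod_name": pod_name}
```

Returning a dict keeps the failure visible to the caller as data, so a downstream investigator can continue with partial context instead of aborting.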

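2. Agent Debate / decision quality
   - decision_manager: timeout fallback text changed to an explicit description (gets past the ADR-091 hard gate)
   - intent_classifier: on LLM timeout, fall back to keyword classification (instead of returning None)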
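The intent_classifier fallback can be sketched as a timeout wrapper around the LLM call. The keyword table, intent labels, function names, and the 5-second default below are illustrative assumptions; the commit does not list the real ones:

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical keyword table; the real intent labels are not given in the commit.
KEYWORD_INTENTS = {
    "restart": "REMEDIATE",
    "status": "QUERY",
    "why": "DIAGNOSE",
}


def classify_by_keyword(text: str) -> str:
    """Return the intent of the first matching keyword, or UNKNOWN."""
    lowered = text.lower()
    for keyword, intent in KEYWORD_INTENTS.items():
        if keyword in lowered:
            return intent
    return "UNKNOWN"


async def classify_intent(
    text: str,
    llm_classify: Callable[[str], Awaitable[str]],
    timeout_s: float = 5.0,
) -> str:
    """Try the LLM first; on timeout, degrade to keyword matching (never None)."""
    try:
        return await asyncio.wait_for(llm_classify(text), timeout=timeout_s)
    except asyncio.TimeoutError:
        return classify_by_keyword(text)
```

The point of the design is that the caller always receives a string label, so a slow LLM degrades classification quality rather than breaking the pipeline.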

3. Watchdog false-positive fix (ADR-092 B3)
   - W-2: tg_sent Redis TTL → telegram_message_id IS NULL (the DB as source of truth)
   - W-5 (new): suggested_action IN (empty / 待分析 / NO_ACTION) + tg_id IS NULL
   - approval_timeout_resolver: 60min → 15min, batch 50 → 200

4. Config Drift automation
   - drift_adopt_service: auto_adopt_if_safe() with a six-condition safety gate
   - drift.py: the background task tries auto-adoption first, then sends the manual Telegram card

5. Playbook flywheel stability
   - playbook_seed_service: fix idempotency (deprecated is not treated as missing)
   - playbook_evolver: load only DRAFT+APPROVED (not all 294 records)

6. Observability
   - alert_rule_engine: structured auto_rule logs + Redis counters (pipeline)
   - auto_approve: Redis counters for reject reasons
   - heartbeat_report_service: new "⚙️ Automation stats (today)" section

[Manual step required]
psql $DATABASE_URL -f apps/api/migrations/cleanup_duplicate_deprecated_playbooks.sql

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 10:55:50 +08:00
Your Name
de2d34d4cd fix(playbook): wire up the full C1-C4 flow - evolver guard + seeder revival + immediate creation on rule generation + watchdog W-4
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
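C1: playbook_evolver - add a YAML_RULE guard for playbooks whose source is yaml_rule,
    so the evolver no longer archives APPROVED playbooks created by the seeder; this protects the auto-remediation chain

C2: playbook_seed_service - the idempotency SQL excludes DEPRECATED records,
    so yaml_rule playbooks archived by the evolver can be revived on restart

C3: alert_rule_engine - call seed_playbooks_from_rules() as soon as an AI-generated rule is created successfully,
    so the matching APPROVED Playbook exists without waiting for the next restart

C4: ai_slo_watchdog_job - add W-4, an alert when the APPROVED playbook count is 0,
    raising TYPE-8M as soon as the chain breaks; total checks go from 3 to 4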

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 20:18:11 +08:00
OG T
01fb531c02 fix(Phase 3): Evolver force=True bypass flag + remove unused imports
- run_evolver(force=True): the admin manual endpoint can bypass the feature flag
- Remove the unused typing.Any import
- Remove the redundant calculate_jaccard_similarity import in _merge_similar
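The force bypass can be pictured as a guard that checks the feature flag only when `force` is false. This is a simplified synchronous sketch: the flag name AIOPS_P3_EVOLVER_ENABLED comes from this log, while reading it from the environment and the returned dict shape are assumptions.

```python
import os


def evolver_enabled() -> bool:
    # Assumption: the flag is read from an environment variable as "true"/"false".
    return os.environ.get("AIOPS_P3_EVOLVER_ENABLED", "False").lower() == "true"


def run_evolver(force: bool = False) -> dict:
    """Skip the run when the flag is off, unless the caller forces it."""
    if not force and not evolver_enabled():
        return {"ran": False, "reason": "feature flag disabled"}
    # The real evolver would archive or merge playbooks here.
    return {"ran": True, "forced": force}
```

With this shape, the scheduled loop would call `run_evolver()` and respect the flag, while an admin endpoint would call `run_evolver(force=True)`.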

ADR-083 Phase 3 - 2026-04-15 ogt + Claude Sonnet 4.6 (APAC)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 21:09:01 +08:00
OG T
4718c7667c feat(Phase 3): Evolver loop scheduling + manual trigger endpoint - merge drill gate complete
Some checks failed
CD Pipeline / build-and-deploy (push) Has been cancelled
- playbook_evolver.py: add run_evolver_loop() (infinite loop, 24h interval)
- main.py: start run_evolver_loop via asyncio.create_task
- api/v1/learning.py: POST /api/v1/learning/evolver/run (drill endpoint for Phase 3 exit #6)
- MASTER §8: backfill the record for 66c4eda AgentSession + the full Evolver exit-criteria checklist from this change
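A background loop of the kind described for run_evolver_loop() might look like the sketch below. The `run_once` parameter, the error handling, and the demo `main()` are assumptions for illustration; only the function name, the 24h interval, and the use of `asyncio.create_task` come from the commit message.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def run_evolver_loop(run_once, interval_s: float = 24 * 60 * 60) -> None:
    """Call run_once() forever, sleeping interval_s between passes."""
    while True:
        try:
            await run_once()
        except Exception:
            # One failed pass should not end the loop; cancellation still propagates.
            logger.exception("evolver pass failed")
        await asyncio.sleep(interval_s)


async def main() -> int:
    """Demo: start the loop as a task, let it run briefly, then cancel it."""
    calls = 0

    async def fake_pass() -> None:
        nonlocal calls
        calls += 1

    task = asyncio.create_task(run_evolver_loop(fake_pass, interval_s=0.01))
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return calls
```

Keeping a reference to the task, as `main()` does, matters in practice: the event loop holds only weak references to tasks, and the reference is also what lets shutdown code cancel the loop cleanly.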

ADR-083 Phase 3 - 2026-04-15 ogt + Claude Sonnet 4.6 (APAC)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 21:07:56 +08:00
OG T
7da64eaad2 feat(Phase 3): learning loop rebuild - three root-cause fixes + 2x EWMA + Evolver Agent
Some checks failed
CD Pipeline / build-and-deploy (push) Failing after 19m7s
Type Sync Check / check-type-sync (push) Failing after 1m18s
ADR-083 Phase 3 learning loop rebuild:

**Three root-cause fixes**
- approval_execution.py: fire-and-forget create_task → await asyncio.wait_for(timeout=30) × 2
  (success path L265 + failure path L353; a timeout records the learning_trigger_timeout metric and does not crash the main flow)
- models/approval.py: add the matched_playbook_id field to ApprovalRequestBase
- decision_manager.py: _auto_execute fills in matched_playbook_id when it creates the ApprovalRequest
- learning_service.py: two-path lookup for _matched_pb_id (matched_playbook_id + metadata fallback)
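The first fix in the list above, replacing a fire-and-forget task with a bounded await, can be sketched as follows. The helper name `trigger_learning_safely` and the in-memory counter standing in for the learning_trigger_timeout metric are assumptions; the 30-second timeout is from the commit message.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable

# Stand-in for a real metrics client; only the metric name comes from the commit.
metrics: Counter = Counter()


async def trigger_learning_safely(
    trigger: Callable[[], Awaitable[None]],
    timeout_s: float = 30.0,
) -> bool:
    """Await the learning trigger with a deadline; a timeout is counted, not raised."""
    try:
        await asyncio.wait_for(trigger(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        metrics["learning_trigger_timeout"] += 1
        return False
```

Compared with a bare `asyncio.create_task(...)`, the awaited version guarantees the trigger has either finished or been cancelled before the caller moves on, at the cost of up to `timeout_s` of added latency.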

**2x EWMA negative reinforcement**
- models/playbook.py: add trust_score: float = 0.3 (dynamic EWMA trust field)
- repositories/playbook_repository.py: update_stats applies EWMA
  Success: trust = 0.9 × old + 0.1 × 1.0
  Failure: trust = 0.8 × old + 0.2 × 0.0 (decays 2x as fast)
  trust < 0.1 → log a warning and wait for the Evolver to archive it
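The two update rules above translate directly into code. The function name `update_trust` is hypothetical; the coefficients, the 0.3 default, and the 0.1 threshold are the ones stated in the commit message.

```python
def update_trust(old: float, success: bool) -> float:
    """EWMA trust update: failures carry twice the weight of successes."""
    if success:
        return 0.9 * old + 0.1 * 1.0
    return 0.8 * old + 0.2 * 0.0


# Worked example: consecutive failures starting from the default trust_score.
trust = 0.3
history = []
for _ in range(5):
    trust = update_trust(trust, success=False)
    history.append(trust)
```

Because each failure multiplies trust by 0.8, a playbook at the default 0.3 sits at about 0.123 after four straight failures and falls to about 0.098 after the fifth, which is the first point below the 0.1 threshold. A single success from 0.3 raises trust only to 0.37.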

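**Evolver Agent (new)**
- services/playbook_evolver.py: three functions, all static rules
  1. Low-trust archiving: trust < 0.1 → DEPRECATED
  2. Dormancy archiving: unused for 30d AND trust < 0.5 → DEPRECATED
  3. Similarity merge: symptom Jaccard > 0.9 → keep the higher-trust playbook, archive the lower-trust one
  AIOPS_P3_EVOLVER_ENABLED=False, off by default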
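The three static rules can be sketched as pure functions over a small record type. The `Playbook` dataclass, its field names, and the tie-breaking choice (archive the second playbook on equal trust) are assumptions for illustration; the thresholds come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Playbook:
    # Hypothetical shape; the real model lives in models/playbook.py.
    name: str
    trust: float
    last_used: datetime
    symptoms: frozenset


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_deprecate(pb: Playbook, now: datetime) -> bool:
    """Rule 1 (low trust) or rule 2 (dormant for over 30 days with trust < 0.5)."""
    if pb.trust < 0.1:
        return True
    return now - pb.last_used > timedelta(days=30) and pb.trust < 0.5


def pick_to_archive(a: Playbook, b: Playbook) -> Optional[Playbook]:
    """Rule 3: for near-duplicate symptoms, return the lower-trust playbook."""
    if jaccard(a.symptoms, b.symptoms) > 0.9:
        return a if a.trust < b.trust else b
    return None
```

A threshold of 0.9 is strict: two playbooks with ten shared symptoms and one extra on either side score 10/11 ≈ 0.909 and merge, while nine shared out of ten total scores exactly 0.9 and does not.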

**Docs**
- ADR-083 learning loop rebuild
- MASTER §8 Phase 3 completion record

AIOPS_P3_ENABLED=False (default); the skeleton is in place, awaiting the commander's approval to enable

Co-Authored-By: Claude Sonnet 4.6 (APAC) <noreply@anthropic.com>
2026-04-15 14:01:37 +08:00