From e2742ce9f3eede7bc211dd522ecc5152c119d780 Mon Sep 17 00:00:00 2001
From: Your Name
Date: Tue, 21 Apr 2026 21:58:48 +0800
Subject: [PATCH] docs: BUTTON_DATA_INVALID root-cause fix + Gitea Code Review fix log
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

LOGBOOK + ADR-092 Appendix C: 2026-04-21 fix log
E2E verification: telegram_approval_card_sent message_id=25045 (SignOzDown) ✓

Co-Authored-By: Claude Sonnet 4.6
---
 docs/LOGBOOK.md | 63 +++++++++++++++++++
 ...ADR-092-ai-decision-llm-expansion-layer.md | 30 +++++++++
 2 files changed, 93 insertions(+)

diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index 338a95c4..6079e6d8 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -6,6 +6,69 @@
 
 ---
 
+## 📍 2026-04-21 afternoon: BUTTON_DATA_INVALID root-cause fix + Gitea Code Review fix
+
+### Problems
+1. **Telegram BUTTON_DATA_INVALID (HTTP 400)**: the button nonce for the `devops_tool` category exceeded Telegram's 64-byte limit (`host_restart_service` nonce = 77B)
+2. **Gitea Code Review "AI 分析失敗" (AI analysis failed)**: the OpenClaw `/api/v1/analyze/code-review` endpoint was never implemented (404)
+3. **Push review `'dict' object has no attribute 'issues'`**: `local_code_review_service.review_push()` returns a dict, but the caller treated it as a Pydantic model
+
+### Root causes & fixes
+| Problem | Root cause | Fix |
+|------|------|------|
+| BUTTON_DATA_INVALID | UUID 36 chars + action name (20) + ts + rand = 77B > 64 | base64url-encode the UUID bytes: 36→22 chars, `host_restart_service` = 63B |
+| Code review 404 | OpenClaw only has `/analyze/incident` and `/analyze/error` | `_call_openclaw_code_review` now uses `local_code_review_service.review_pr()` |
+| push review AttributeError | review_push() returns a dict; the caller accesses the attribute `analysis.issues` | `_call_openclaw_push_review` adds a dict→CodeReviewResult conversion |
+
+### E2E verification
+- `host_restart_service` nonce = 63B ✓, all actions ≤ 64B ✓
+- round-trip UUID decode = True ✓
+- `telegram_approval_card_sent` message_id=25045 (SignOzDown devops_tool) ✓
+
+### Commits
+- `bd73548` BUTTON_DATA_INVALID root-cause fix (nonce over 64B)
+- `caeb7a9` base64url UUID compression (permanent fix)
+- `acab1cd` Gitea code review switched to the local service
+- `8fd31ec` (deployed) pipeline 1009 succeeded
+
+### Side findings
+- `KM_CONVERTED` is missing from the `alert_event_type` PG enum (pre-existing, non-blocking)
+- SLO watchdog reports 18 PENDING items with no TG confirmation (historical records that accumulated during the BUTTON_DATA_INVALID period)
+
+---
+
+## 📍 2026-04-21 early morning: aider-watch v2 complete (ADR-091, full-scope E2E verification)
+
+### What was completed
+- **aider CLI install**: aider v0.86.2, OpenRouter Elephant Alpha ($0 free), OAuth authentication
+- **aider-watch v2**: Mac client → awoooi flywheel, full closed loop
+  - Server: AiderBatchIn / aider_events table / Redis stream / AiderEventProcessor worker
+  - Client: aiderw wrapper / buffer fallback / launchd 5min flush
+  - AI Router: feedback_from_aider_events COALESCE SQL (the session_end model takes priority)
+
+### E2E verification all passed (3 tests)
+- C1: webhook → Redis → PG ✅ (2 rows written)
+- C2: network down → buffer → flush → PG ✅ (1 row after buffer drain)
+- C3: model_stats_since COALESCE → `{'openrouter/elephant-alpha': 1.0}` ✅
+
+### Pitfalls hit during the fix (found by full-scope comparison)
+| Pitfall | Problem | Fix |
+|----|------|------|
+| stdlib logging | logger.info("...", count=N) → KeyError | → structlog.get_logger |
+| worker pool | get_worker_redis() was not initialized in lifespan → RuntimeError, silent crash | → added init_worker_redis_pool() to start() |
+| model=unknown | model is unknown when session_start is emitted; the SQL only read session_start | → session_end fills in model+cwd; SQL COALESCE |
+| false-positive incident | an alert was created whenever error_count>=1 (including normal output such as "no error") | → create an incident only when exit_code!=0 |
+| dead code | get_aider_event_repository() leaked resources | → removed |
+
+### Git commits (11+ in total, mostly feat/fix)
+Last commit: `9e9bd86 fix(aider-watch): code-review fixes (4 issues)`
+
+### Next steps (queued in Backlog)
+- `USE_AIDER_FEEDBACK=True` gradual rollout (after 7 days, if the elephant-alpha success_rate is stable)
+- Restore model on `session_start` (either wait for banner parsing to finish before emitting, or switch to a patch event)
+
+---
+
 ## 📍 2026-04-20 morning: P0.1 + P0.2 + P0.3, three Drift/Target fixes
 
 ### Decisions after the Commander's three-question RCA
diff --git a/docs/adr/ADR-092-ai-decision-llm-expansion-layer.md b/docs/adr/ADR-092-ai-decision-llm-expansion-layer.md
index 40ceb2bc..8bf3de92 100644
--- a/docs/adr/ADR-092-ai-decision-llm-expansion-layer.md
+++ b/docs/adr/ADR-092-ai-decision-llm-expansion-layer.md
@@ -172,3 +172,33 @@ Grade: mature(90+) / in_progress(70-90) / starter(50-70) / initial(<50)
 | C4 | watchdog does not detect a broken chain | W-4 missing | `_count_approved_playbooks()`; if 0 → TYPE-8M | de2d34d |
 
 **Architectural iron rule**: `PlaybookSource.YAML_RULE` playbooks are the "infrastructure" of the auto-remediation chain; the evolver's trust-based retirement logic must not touch these playbooks.
+
+---
+
+## Appendix C: 2026-04-21, BUTTON_DATA_INVALID root-cause fix + Gitea Code Review fix
+
+**Trigger**: every Telegram alert card in the `devops_tool` category failed to send (HTTP 400 BUTTON_DATA_INVALID), and Gitea PR Code Review showed "AI 分析失敗" (AI analysis failed).
+
+### Root-cause chain
+
+| Symptom | Break point | Root cause |
+|------|------|------|
+| Telegram 400 BUTTON_DATA_INVALID | `generate_callback_nonce` | UUID(36) + action(20) + ts(10) + rand(8) + colons = 77B > Telegram's 64B limit |
+| Gitea PR "AI 分析失敗" | `_call_openclaw_code_review` | OpenClaw only has `/analyze/incident` and `/analyze/error`; `/analyze/code-review` was never implemented (404) |
+| Push review AttributeError | `_call_openclaw_push_review` | `local_code_review_service.review_push()` returns a dict, and the caller does attribute access on it (`analysis.issues`) |
+
+### Fixes
+
+1. **nonce compression** `security_interceptor.py`: `generate_callback_nonce` base64url-encodes the UUID bytes (36→22 chars); `parse_callback_data` performs the matching decode; `host_restart_service` nonce = 63B
+2. **code review switched to local** `gitea_webhook_service.py`: `_call_openclaw_code_review` now uses `local_code_review_service.review_pr()` (Ollama + Gemini fallback)
+3. **push review dict→model** `gitea_webhook_service.py`: `_call_openclaw_push_review` adds a dict→`CodeReviewResult` conversion
+
+### E2E verification (2026-04-21 21:57 Taipei)
+- `host_restart_service` nonce = 63B ✓, all 7 actions ≤ 64B ✓
+- UUID round-trip decode = True ✓
+- `telegram_approval_card_sent` message_id=25045 (SignOzDown devops_tool) ✓ no BUTTON_DATA_INVALID
+
+### Commits
+- `acab1cd` fix(gitea): code review switched to local service + push review dict→CodeReviewResult
+- `bd73548` fix(telegram): BUTTON_DATA_INVALID root-cause fix for nonce over 64B
+- `8fd31ec` fix(telegram): nonce UUID base64url compression (permanent fix)