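docs(aider-watch v2): close 4 big-picture blind spots

On 2026-04-20 the Commander reminded us to "keep the full picture in mind on every update". A second review before execution found 4 blind spots the plan did not handle; this commit closes them:

Blind spot 1: Mac reachability from outside the LAN
- spec §8 + §8b: add three options (Tailscale / nginx / VPN), pick one
- plan Task B5: the install.sh preamble reminds the user to pick a configuration

Blind spot 2: incident flooding (multiple errors in one session)
- spec §8: add a coalesce strategy (60s window per session_id)
- plan Task A5: add coalesce logic to the service's create_incident_for_event
- add 2 test cases covering same-session reuse and different-session separation

Blind spot 3: first-rollout risk of AI Router feedback
- spec §8: add a USE_AIDER_FEEDBACK flag, default false; enable after a 7-day canary
- plan Task A8: wrap the route() hook in an `if settings.USE_AIDER_FEEDBACK` block
- plan Task A9: add USE_AIDER_FEEDBACK: bool = False to config

Blind spot 4: obtaining the AWOOOI_PG_PW secret
- spec §8c: add the kubectl get secret → env → shred flow
- plan Task A0 Step 1: spell out reading the K8s Secret and destroying the file immediately

This follows the big-picture thinking discipline in feedback_ai_autonomous_direction.md.
Execution strategy: fully subagent-driven (approved by the Commander).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>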
@@ -56,35 +56,50 @@

# Phase A: Server-side (10 tasks)

## Task A0: Generate HMAC secret + local dev env
## Task A0: Prerequisites: fetch PG password + generate HMAC secret + local env

**Files:**
- Create: `/tmp/aider-watch-dev.env` (local machine only; not committed to git)
- Create: `/tmp/aider_webhook_secret.txt` (shred after install)

- [ ] **Step 1: Generate HMAC secret (32 bytes hex)**
- [ ] **Step 1: Fetch the PG password from the K8s Secret** (🆕 blind spot 4: make the secret source explicit)

```bash
# Requires kubectl access to the awoooi cluster (~/.kube/config pointing at the correct context)
kubectl config current-context  # expect awoooi-* or k3s-188
kubectl get secret awoooi-secrets -n awoooi \
  -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d > /tmp/pg_pw.txt
export AWOOOI_PG_PW="$(cat /tmp/pg_pw.txt)"
shred -u /tmp/pg_pw.txt  # destroy the file immediately
test -n "$AWOOOI_PG_PW" && echo "PG PW loaded (${#AWOOOI_PG_PW} chars)" || exit 1
```

If `kubectl` is unavailable or the context is wrong, the Commander must supply `AWOOOI_PG_PW` manually: no auto-generation, no fallback.

- [ ] **Step 2: Generate HMAC secret (32 bytes hex)**

```bash
openssl rand -hex 32 > /tmp/aider_webhook_secret.txt
cat /tmp/aider_webhook_secret.txt  # record it; it goes into the K8s secret later
chmod 600 /tmp/aider_webhook_secret.txt
cat /tmp/aider_webhook_secret.txt  # write it down! It goes into the K8s secret + the Mac's ~/.aider-watch.env later
```
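
To show how the secret generated above might be used, here is a minimal sketch of signing and verifying a webhook body. The plan does not state the digest algorithm, the header that carries the signature, or whether the hex string is decoded before use; this sketch assumes HMAC-SHA256, a hex digest, and the hex string used directly as key bytes. `sign_body` and `verify_signature` are hypothetical names.

```python
import hashlib
import hmac


def sign_body(secret_hex: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret_hex.encode("ascii"), body, hashlib.sha256).hexdigest()


def verify_signature(secret_hex: str, body: bytes, signature: str) -> bool:
    """Compare in constant time so response timing does not leak the digest."""
    expected = sign_body(secret_hex, body)
    return hmac.compare_digest(expected, signature)
```

The client would sign the exact bytes it sends, and the server would verify against the exact bytes it received, before any JSON parsing or re-serialization.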

- [ ] **Step 2: Configure local dev env**
- [ ] **Step 3: Configure dev env on this machine**

```bash
# These env vars are for local development/testing only; production uses the K8s Secret
export AIDER_WEBHOOK_SECRET="$(cat /tmp/aider_webhook_secret.txt)"
export AIDER_EVENTS_STREAM_KEY="signals:aider:events"
export AIDER_PATTERN_EXTRACT_INTERVAL_HOURS="24"
export USE_AIDER_FEEDBACK="false"  # 🆕 blind spot 3: off by default; enable after validation
```

- [ ] **Step 3: Verify Redis + PG are reachable (existing awoooi services)**
- [ ] **Step 4: Verify Redis + PG are reachable**

```bash
redis-cli -h 192.168.0.120 -p 6380 ping  # expect PONG
PGPASSWORD="$AWOOOI_PG_PW" psql -h 192.168.0.188 -U awoooi_rw -d awoooi -c 'SELECT 1'
```

Expected: Both commands succeed. If not, stop and resolve before proceeding.
Expected: both commands succeed. If either fails → STOP and resolve it before continuing.

---

@@ -503,12 +518,14 @@ git commit -m "feat(repo): AiderEventRepository CRUD + model_stats + pattern can

---

## Task A5: Service: classify + incident + pattern
## Task A5: Service: classify + incident coalesce + pattern

**Files:**
- Create: `apps/api/src/services/aider_event_service.py`
- Create: `apps/api/tests/test_aider_event_service.py`

**🆕 Blind spot 2 (incident flooding)**: for the same `session_id` + `aider_activity`, create only 1 incident within a 60s window; the 2nd and later errors are merged into that incident's `signals` array. The Redis key `aider_incident_coalesce:{session_id}` (TTL=60s) stores the `incident_id`.

- [ ] **Step 1: Write tests (mock incident_service, since it is the service's unit boundary)**

```python
@@ -559,6 +576,56 @@ def test_session_start_no_incident():

def test_error_creates_incident():
    assert should_create_incident(_ev("error")) is True


@pytest.mark.asyncio
async def test_coalesce_reuses_existing_incident():
    """Blind spot 2: only 1 incident per session within 60s."""
    from apps.api.src.services.aider_event_service import create_incident_for_event

    # Fake incident_svc + append_signal
    append_calls = []
    class FakeIncSvc:
        async def create_incident(self, **kw): return "inc-first"
        async def append_signal(self, inc_id, sig): append_calls.append((inc_id, sig))

    # Fake redis: the first get returns None and set stores; the second get returns "inc-first"
    store = {}
    class FakeRedis:
        async def get(self, k): return store.get(k)
        async def set(self, k, v, ex=None): store[k] = v

    svc = FakeIncSvc(); redis = FakeRedis()
    e1 = _ev("error"); e2 = _ev("error")
    id1 = await create_incident_for_event(e1, svc, redis_client=redis)
    id2 = await create_incident_for_event(e2, svc, redis_client=redis)
    assert id1 == id2 == "inc-first"
    assert len(append_calls) == 1  # the second event only appends; no new incident
    assert append_calls[0][1]["event_type"] == "error"


@pytest.mark.asyncio
async def test_coalesce_different_session_id_builds_new():
    from apps.api.src.services.aider_event_service import create_incident_for_event
    seen = []
    class FakeIncSvc:
        async def create_incident(self, **kw):
            seen.append(kw); return f"inc-{len(seen)}"
        async def append_signal(self, *_): pass
    store = {}
    class FakeRedis:
        async def get(self, k): return store.get(k)
        async def set(self, k, v, ex=None): store[k] = v

    ea = AiderEventIn(ts=datetime.now(TAIPEI), session_id="sA",
                      host="m", type="error", payload={"cwd":"/x","model":"m1"})
    eb = AiderEventIn(ts=datetime.now(TAIPEI), session_id="sB",
                      host="m", type="error", payload={"cwd":"/x","model":"m1"})
    id_a = await create_incident_for_event(ea, FakeIncSvc(), redis_client=FakeRedis())
    id_b = await create_incident_for_event(eb, FakeIncSvc(), redis_client=FakeRedis())
    # Different sessions each create their own incident
    assert id_a == "inc-1"
    assert id_b == "inc-2"  # `seen` is shared by both FakeIncSvc instances, so the second create yields inc-2
```
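
The fakes in the tests above ignore the `ex` argument, so the 60s expiry itself is never exercised. As a minimal sketch (not part of the plan), a fake that honors TTL with an injectable clock could look like this. `TtlFakeRedis` and its `clock` parameter are hypothetical; it covers only the `get` / `set(..., ex=...)` calls the service is shown to make.

```python
import asyncio
import time


class TtlFakeRedis:
    """In-memory stand-in for Redis get/set that honors the `ex` TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so tests can move time forward
        self._store = {}      # key -> (value, expires_at or None)

    async def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._store[key]  # lazily expire on read
            return None
        return value

    async def set(self, key, value, ex=None):
        expires_at = None if ex is None else self._clock() + ex
        self._store[key] = (value, expires_at)


# Usage: advance a manual clock instead of sleeping for 60 seconds.
now = [0.0]
fake = TtlFakeRedis(clock=lambda: now[0])
asyncio.run(fake.set("aider_incident_coalesce:s1", "inc-first", ex=60))
now[0] = 61.0
expired = asyncio.run(fake.get("aider_incident_coalesce:s1"))  # None after the window
```

Passing such a fake as `redis_client` would let a third test assert that an error arriving after the window creates a new incident rather than appending.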

- [ ] **Step 2: Run the tests and confirm they fail**
@@ -611,15 +678,46 @@ class IncidentServiceLike(Protocol):
                        signals: dict[str, Any]) -> str: ...


COALESCE_TTL_SEC = 60
COALESCE_KEY_PREFIX = "aider_incident_coalesce:"


async def create_incident_for_event(ev: AiderEventIn,
                                    incident_svc: IncidentServiceLike) -> str | None:
    """If the event should trigger an incident, create it and return the incident_id; otherwise None."""
                                    incident_svc: IncidentServiceLike,
                                    redis_client=None) -> str | None:
    """If the event should trigger an incident, create it and return the incident_id; otherwise None.
    🆕 Blind spot 2: only 1 incident per session_id within 60s; later events are merged into signals."""
    sev = classify_severity(ev)
    if not sev:
        return None
    p = redact(ev.payload)
    repo = p.get("cwd") or "<unknown>"
    model = p.get("model") or "<unknown>"

    # Coalesce check
    existing_id: str | None = None
    if redis_client is not None:
        key = f"{COALESCE_KEY_PREFIX}{ev.session_id}"
        try:
            raw = await redis_client.get(key)
            if raw:
                existing_id = raw.decode() if isinstance(raw, bytes) else raw
        except Exception:
            pass  # a Redis failure must not block; fall back to creating a new incident

    if existing_id:
        # This session already has an incident → merge into signals instead of creating a new one
        if hasattr(incident_svc, "append_signal"):
            try:
                await incident_svc.append_signal(existing_id, {
                    "ts": ev.ts.isoformat(),
                    "event_type": ev.type,
                    "payload_excerpt": p,
                })
            except Exception:
                pass
        return existing_id

    title_map = {
        "error": f"[aider] error in {repo}",
        "silent_timeout": f"[aider] silent timeout in {repo}",
@@ -638,6 +736,13 @@ async def create_incident_for_event(ev: AiderEventIn,
            "payload_excerpt": p,
        },
    )
    # Write the Redis coalesce key
    if redis_client is not None and inc_id:
        try:
            await redis_client.set(f"{COALESCE_KEY_PREFIX}{ev.session_id}",
                                   inc_id, ex=COALESCE_TTL_SEC)
        except Exception:
            pass
    return inc_id


@@ -1017,28 +1122,36 @@ grep -n "def route\|class AIRouter\|AIProvider" apps/api/src/services/ai_router.
    return out
```

- [ ] **Step 3: Add the feedback effect inside AIRouter.route() (soft boost)**
- [ ] **Step 3: Add the feedback effect inside AIRouter.route() (🆕 blind spot 3: flag-guarded)**

Find the main decision body of `def route(...)` or `async def route(...)`, and add the following after the provider priority is computed and before the return:

```python
# Phase 24 feedback hook: aider real-world success rate × original weight
try:
    feedback = await self.feedback_from_aider_events(days=7)
    # feedback = {"elephant-alpha": 0.85, "gemini-pro": 0.92}
    # If a provider name matches, multiply by success_rate (floor 0.1)
    if feedback:
        for p in providers:  # assumes providers is list[ProviderCandidate]
            key = p.name.lower()
            rate = next((v for k, v in feedback.items()
                         if key in k.lower()), None)
            if rate is not None:
                p.weight *= max(rate, 0.1)
except Exception:
    logger.debug("ai_router: feedback hook skipped (non-critical)")
# 🆕 Blind spot 3: the USE_AIDER_FEEDBACK flag guards the first rollout
from apps.api.src.core.config import settings
if getattr(settings, "USE_AIDER_FEEDBACK", False):
    try:
        feedback = await self.feedback_from_aider_events(days=7)
        # feedback = {"elephant-alpha": 0.85, "gemini-pro": 0.92}
        if feedback:
            for p in providers:  # assumes providers is list[ProviderCandidate]
                key = p.name.lower()
                rate = next((v for k, v in feedback.items()
                             if key in k.lower()), None)
                if rate is not None:
                    p.weight *= max(rate, 0.1)  # floor at 10%
            logger.info("ai_router: aider feedback applied (%d models)", len(feedback))
    except Exception:
        logger.debug("ai_router: feedback hook skipped (non-critical)")
else:
    logger.debug("ai_router: feedback disabled via USE_AIDER_FEEDBACK=false")
```
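
**Note**: the actual names `providers` and `p.weight` must be adjusted to match the current route() code; if route() uses a different internal data structure, follow its conventions. Do not make large changes to the route() logic; add only this try block.
**Note**:
1. The actual names `providers` and `p.weight` must be adjusted to match the current route() code
2. **Flag off by default**: after going live, run on 7 days of real aider data and check that the `ai_router` decisions are reasonable (SigNoz trace difference < 5%) before turning the flag on
3. With the flag off, the hook does not affect existing route decisions at all: a zero-risk rollout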
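
Since route() may be hard to test directly, the weighting rule from the hook can be pulled into a pure function and checked in isolation. This is a sketch under assumptions: `ProviderCandidate` here is a hypothetical stand-in with only `name` and `weight`, and `apply_feedback` is an illustrative name, not an existing method of `AIRouter`.

```python
from dataclasses import dataclass

MIN_RATE = 0.1  # same 10% floor as in the hook


@dataclass
class ProviderCandidate:  # hypothetical stand-in for the real candidate type
    name: str
    weight: float


def apply_feedback(providers, feedback, enabled):
    """Scale weights in place by aider success rate; return how many were scaled."""
    if not enabled or not feedback:
        return 0  # flag off or no data: weights stay untouched
    applied = 0
    for p in providers:
        key = p.name.lower()
        # Substring match, first hit wins: "gemini" matches "gemini-pro".
        rate = next((v for k, v in feedback.items() if key in k.lower()), None)
        if rate is not None:
            p.weight *= max(rate, MIN_RATE)
            applied += 1
    return applied
```

One consequence of the substring match worth checking during the canary: a short provider name can match several model keys, and only the first one in dict order is used.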

- [ ] **Step 4: Write a minimal test (if route is hard to test, testing the feedback method is enough)**

@@ -1088,7 +1201,7 @@ git commit -m "feat(ai_router): feedback_from_aider_events hook (Phase 24 ADR-05
- Modify: `apps/api/src/core/config.py`
- Modify: `apps/api/src/main.py`

- [ ] **Step 1: Add 3 settings to config.py**
- [ ] **Step 1: Add 4 settings to config.py (🆕 blind spot 3 adds USE_AIDER_FEEDBACK)**

Find `class Settings(BaseSettings)` or its equivalent and add:

@@ -1097,6 +1210,7 @@ git commit -m "feat(ai_router): feedback_from_aider_events hook (Phase 24 ADR-05
    AIDER_WEBHOOK_SECRET: str = ""
    AIDER_EVENTS_STREAM_KEY: str = "signals:aider:events"
    AIDER_PATTERN_EXTRACT_INTERVAL_HOURS: float = 24.0
    USE_AIDER_FEEDBACK: bool = False  # 🆕 off by default; turn on manually after a 7-day canary
```

- [ ] **Step 2: Register the aider job in the main.py lifespan**
@@ -1720,6 +1834,18 @@ git commit -m "feat(client): aiderw main wrapper + CLI (doctor/flush)"
</dict></plist>
```

**🆕 Blind spot 1 (Mac reachability from outside the LAN): choose one of three configurations**:

Before installing, the Commander must decide where `AIDER_API_URL` points:

| Option | Example URL | Prerequisites |
|------|---------|---------|
| **a.** awoooi nginx, public TLS | `https://awoooi.example/api/v1/aider/events` | Add a location block to the awoooi nginx with proxy_pass to the K3s svc on 32334; enable the cert |
| **b.** Tailscale (if already installed) | `http://<tailscale-100-ip>:32334/api/v1/aider/events` | tailscale up; confirm the awoooi 188 host is also on the Tailnet |
| **c.** VPN (WireGuard) | `http://192.168.0.120:32334/api/v1/aider/events` | WireGuard peer config |

Before running install.sh, write the chosen URL into `~/.aider-watch.env`.
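
A quick sanity check of the chosen URL before install can catch typos early. The sketch below assumes `~/.aider-watch.env` holds plain `KEY=value` lines (optionally prefixed with `export`), which the plan does not specify; `load_env_text` and `check_api_url` are hypothetical helper names.

```python
from urllib.parse import urlparse

EVENTS_PATH = "/api/v1/aider/events"  # path used by all three options in the table


def load_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines; skip blanks and # comments (assumed file format)."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def check_api_url(url: str) -> list[str]:
    """Return a list of problems; an empty list means the URL looks usable."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parsed.hostname:
        problems.append("missing host")
    if not parsed.path.endswith(EVENTS_PATH):
        problems.append(f"path should end with {EVENTS_PATH}")
    return problems
```

This only validates the shape of the URL; it does not test that the endpoint is reachable from the Mac.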

- [ ] **Step 2: Write install.sh**

```bash

@@ -291,14 +291,38 @@ def feedback_from_aider_events(repo: str, task_kind: str) -> dict[str, float]:
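| Failure scenario | Handling | Outcome |
|---------|------|------|
| **AWOOOI API unreachable** | The Mac client writes `~/aider-watch/buffer/*.jsonl`; the launchd flush job retries every 5min | No data loss |
| **Mac leaves the LAN** (🆕 blind spot 1) | `AIDER_API_URL` goes through the awoooi nginx public endpoint (`https://awoooi.example/api/v1/aider/events`) or VPN/Tailscale; the buffer is the fallback | Still works away from home or while traveling |
| **HMAC verification fails** | The API returns 401 and writes to `audit_log`; the Mac client keeps the buffer and waits for manual intervention | Prevents forgery |
| **Redis stream full** | Auto-trim at a 10k-entry cap; overflow writes an `audit_log` warning | The API is not blocked |
| **aider_event_processor dies** | The existing main.py lifespan supervisor restarts it; the consume position is recovered from the Redis stream consumer group | Self-healing |
| **incident_service fails to create an incident** | Degrade: write only to the aider_events table and create no incident; no TG push (acceptable, since the event is still stored) | No data loss |
| **Incident flooding** (🆕 blind spot 2): consecutive errors in one session create multiple incidents | Add a **coalesce mechanism** at the service layer: for the same `session_id` + `aider_activity`, create only 1 incident within a 60s window and merge later errors into that incident's signals array | TG is not drowned in aider noise |
| **First-rollout risk of AI Router feedback** (🆕 blind spot 3) | Add a `USE_AIDER_FEEDBACK` flag, default `false`; switch to `true` only after canary validation (run on 7 days of real aider data and compare the `ai_router.route()` decision differences) | Keeps aider data from dragging down other route decisions |
| **Secret leak** | The Mac client redacts before writing to the buffer; the server redacts again (defense in depth) | Two layers of defense |
| **wrapper crashes** | aiderw wraps everything in an outer try/except; atexit emits the missing session_end | aider keeps running |
| **aider itself crashes** | The wrapper receives SIGCHLD → sends an error event + session_end with exit≠0 | The failure is clearly recorded |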
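
The first row's buffer-and-retry behavior can be sketched as follows. This is an illustration under assumptions, not the client's actual code: the file name `pending.jsonl`, the helper names, and the `send(event) -> bool` callback are all hypothetical.

```python
import json
from pathlib import Path


def buffer_event(buffer_dir: Path, event: dict) -> Path:
    """Append one event as a JSON line; called when the API is unreachable."""
    buffer_dir.mkdir(parents=True, exist_ok=True)
    path = buffer_dir / "pending.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, ensure_ascii=False) + "\n")
    return path


def flush_buffer(buffer_dir: Path, send) -> int:
    """Retry every buffered event; keep the ones that still fail. Returns sent count."""
    sent = 0
    for path in sorted(buffer_dir.glob("*.jsonl")):
        remaining = []
        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                ok = bool(send(json.loads(line)))
            except Exception:
                ok = False  # network errors leave the event in the buffer
            if ok:
                sent += 1
            else:
                remaining.append(line)
        if remaining:
            path.write_text("\n".join(remaining) + "\n", encoding="utf-8")
        else:
            path.unlink()  # nothing left to retry in this file
    return sent
```

A crash between a successful send and the rewrite would resend events on the next flush, so the server side would need to tolerate duplicates; a sturdier version would also write to a temporary file and rename it so a crash mid-write cannot truncate the buffer.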

**🆕 §8b Mac client network configuration options** (expands blind spot 1)

| Option | Example `AIDER_API_URL` value | Pros and cons |
|------|------------------------|------|
| **Tailscale** (if the Commander already has it installed) | `http://<tailscale-ip>:32334/...` | Zero config, encrypted; requires staying logged in to Tailscale |
| **awoooi nginx, public** | `https://<domain>/api/v1/aider/events` | Publicly reachable; requires adding a reverse proxy route in nginx and enabling TLS |
| **VPN (WireGuard)** | `http://192.168.0.120:32334/...` | Strong privacy; requires the VPN to stay connected |

**Default: awoooi nginx, public.** awoooi already runs nginx:443 on the 188 host, so adding one location block is enough; the Mac needs no extra software.

**🆕 §8c How to obtain `AWOOOI_PG_PW`** (blind spot 4)

A0 prerequisite step:
```bash
# Fetch the PG password from the K8s secret (one-time, on the deployment side)
kubectl get secret awoooi-secrets -n awoooi -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d > /tmp/pg_pw.txt
export AWOOOI_PG_PW="$(cat /tmp/pg_pw.txt)"
shred -u /tmp/pg_pw.txt  # leave no trace
```
This step complies with L4 of the zero-trust three-layer defense in `feedback_secrets_leak_incidents_2026-04-18.md` (kept in the local shell env, not persisted to a file).

---

## §9 Test strategy
