AWOOOI CD
|
84a661beaf
|
chore(cd): deploy 6b93c8f [skip ci]
|
2026-05-05 16:11:35 +08:00 |
|
Your Name
|
6b93c8f454
|
fix(chat): route OpenClaw chat through Ollama lane
CD Pipeline / tests (push) Successful in 5m26s
Code Review / ai-code-review (push) Successful in 25s
CD Pipeline / build-and-deploy (push) Successful in 8m11s
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 15:57:26 +08:00 |
|
AWOOOI CD
|
3a17a860a0
|
chore(cd): deploy 1cc9de5 [skip ci]
|
2026-05-05 15:41:33 +08:00 |
|
Your Name
|
6ec5c06bad
|
docs(ops): record docker limit cleanup
|
2026-05-05 15:39:46 +08:00 |
|
Your Name
|
44d8322c4d
|
docs(ops): record live runner guardrail fix
|
2026-05-05 15:34:00 +08:00 |
|
Your Name
|
819734f655
|
docs(ops): record runner guardrail follow-up
|
2026-05-05 15:28:31 +08:00 |
|
Your Name
|
1cc9de5722
|
fix(ops): point runner guardrail alerts to host script
CD Pipeline / tests (push) Successful in 5m31s
Code Review / ai-code-review (push) Successful in 30s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 37s
CD Pipeline / build-and-deploy (push) Successful in 7m45s
CD Pipeline / post-deploy-checks (push) Successful in 5m4s
|
2026-05-05 15:25:37 +08:00 |
|
Your Name
|
96c1ba20da
|
fix(ci): cap host-runner helper containers
Code Review / ai-code-review (push) Successful in 27s
|
2026-05-05 15:09:44 +08:00 |
|
Your Name
|
855a39ad95
|
docs(ops): record docker limit alert deploy
|
2026-05-05 15:06:47 +08:00 |
|
Your Name
|
209da7ba33
|
chore(ops): deploy docker limit alert image
CD Pipeline / tests (push) Successful in 5m24s
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-05-05 15:05:23 +08:00 |
|
Your Name
|
d08d1e4951
|
fix(ops): alert on missing docker resource limits
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Successful in 23s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 38s
|
2026-05-05 15:01:31 +08:00 |
|
Your Name
|
e24c8ea051
|
fix(ci): align B5 schema with tenant isolation
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 15:00:07 +08:00 |
|
Your Name
|
72d66e4ae6
|
fix(ops): align stale job cleanup thresholds
Code Review / ai-code-review (push) Successful in 28s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 36s
|
2026-05-05 14:54:17 +08:00 |
|
Your Name
|
5e625f777d
|
fix(ops): add stale gitea job cleanup guard
Code Review / ai-code-review (push) Has been cancelled
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Has been cancelled
|
2026-05-05 14:50:47 +08:00 |
|
Your Name
|
df72c77880
|
chore(ops): deploy stale gitea job alert image
CD Pipeline / tests (push) Successful in 5m29s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 14:43:53 +08:00 |
|
Your Name
|
7d45f0cb58
|
fix(ops): alert on stale gitea actions jobs
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Has been cancelled
|
2026-05-05 14:42:09 +08:00 |
|
Your Name
|
fc1a6196df
|
fix(code-review): keep Gemini fallback opt-in
CD Pipeline / tests (push) Successful in 2m2s
Code Review / ai-code-review (push) Successful in 27s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 14:38:44 +08:00 |
|
Your Name
|
3b73cc7f94
|
fix(ci): avoid cd on workflow-only changes
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 14:37:31 +08:00 |
|
Your Name
|
96b860dc2c
|
docs(ops): record ci stale-run guard
|
2026-05-05 14:35:24 +08:00 |
|
Your Name
|
2e128f90db
|
fix(ci): skip stale code review runs
Code Review / ai-code-review (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 14:35:09 +08:00 |
|
Your Name
|
228768ff68
|
docs(ops): record host baseline follow-up
|
2026-05-05 14:31:59 +08:00 |
|
Your Name
|
ab0f0a8a62
|
chore(ops): deploy runner classification image
CD Pipeline / tests (push) Successful in 2m35s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Successful in 26s
|
2026-05-05 14:29:55 +08:00 |
|
Your Name
|
0e14935351
|
fix(ops): classify systemd runner alerts as host resources
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 14:28:18 +08:00 |
|
Your Name
|
a5192d4e03
|
chore(ops): deploy runner alert routing image
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 14:21:17 +08:00 |
|
Your Name
|
34d1c76be9
|
fix(ops): route systemd runner baseline alerts
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 14:19:58 +08:00 |
|
Your Name
|
2b93975d37
|
chore(ops): deploy systemd runner baseline image
CD Pipeline / tests (push) Successful in 2m6s
Code Review / ai-code-review (push) Successful in 26s
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-05-05 14:12:30 +08:00 |
|
Your Name
|
fe618960a8
|
fix(ops): monitor systemd runners in host baseline
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 39s
|
2026-05-05 14:08:43 +08:00 |
|
Your Name
|
8e22110030
|
fix(governance): keep trust drift watchdog on governance agent
CD Pipeline / tests (push) Successful in 2m51s
Code Review / ai-code-review (push) Successful in 24s
CD Pipeline / build-and-deploy (push) Has started running
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 14:00:13 +08:00 |
|
Your Name
|
2ff0ef3bb6
|
fix(openclaw): route legacy ollama through failover endpoints
CD Pipeline / tests (push) Failing after 1m49s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 24s
|
2026-05-05 13:55:52 +08:00 |
|
Your Name
|
bb1995f349
|
fix(awooop): use naive utc for run lease timestamps
CD Pipeline / tests (push) Failing after 1m48s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 13:53:07 +08:00 |
|
Your Name
|
e8e6748f70
|
fix(ops): add docker host resource baseline guardrails
CD Pipeline / tests (push) Failing after 1m50s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 25s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 38s
|
2026-05-05 13:45:09 +08:00 |
|
Your Name
|
a57e3d3d75
|
test(consensus): expect redis namespace dual write
CD Pipeline / tests (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 13:41:41 +08:00 |
|
Your Name
|
b00a7b050a
|
test(ollama): align inference connect errors with degraded health
CD Pipeline / tests (push) Failing after 2m26s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 28s
|
2026-05-05 13:34:19 +08:00 |
|
Your Name
|
506744ba3a
|
test(ollama): keep slow gcp primary on ollama
CD Pipeline / tests (push) Failing after 2m21s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 26s
|
2026-05-05 13:29:27 +08:00 |
|
Your Name
|
869646459c
|
fix(ollama): treat legacy primary as ollama
CD Pipeline / tests (push) Failing after 1m48s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 28s
|
2026-05-05 13:25:27 +08:00 |
|
Your Name
|
33d4326cce
|
test(ollama): align slow recovery with gcp routing policy
CD Pipeline / tests (push) Failing after 1m51s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 33s
|
2026-05-05 13:21:16 +08:00 |
|
Your Name
|
b3d412f9eb
|
fix(cd): restore gitea workflow yaml parsing
CD Pipeline / tests (push) Failing after 2m20s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 31s
|
2026-05-05 13:17:15 +08:00 |
|
Your Name
|
f78b1b0690
|
fix(ollama): honor provider endpoint selection
Code Review / ai-code-review (push) Successful in 37s
|
2026-05-05 13:14:46 +08:00 |
|
Your Name
|
0ebd0d8a92
|
fix(deploy): emergency deploy of API 2e17325c: governance skip cooldown + watchdog B4
Code Review / ai-code-review (push) Successful in 54s
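CI cancel-in-progress prevented CD from running, so kustomization.yaml was updated manually.
Fixes included:
- governance_dispatcher skip-path cooldown (eliminates the repeated processing every 30s)
- watchdog B4 three-layer fix A2/A3/W6 (eliminates duplicate META SYSTEM alerts)
- Operator Console leWOOOgo modularization fix (e22b8e7)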
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-05 12:09:29 +08:00 |
|
Your Name
|
2e17325c3f
|
fix(ollama): update failover_manager URL comments to reflect the ADR-110 nginx proxy topology
Code Review / ai-code-review (push) Successful in 43s
The comments on url_primary/secondary/tertiary still showed the old pre-ADR-110 IPs;
updated them to the nginx proxy format 110:11435→GCP-A / 11436→GCP-B / 11437→Local111.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-05 11:03:36 +08:00 |
|
Your Name
|
e22b8e7ab2
|
feat(awooop): Operator Console API + frontend (leWOOOgo modularization fix)
Code Review / ai-code-review (push) Successful in 42s
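Backend:
- Add platform_operator_service.py (service layer that centralizes DB access)
- Remove Depends(get_db) from the router layer; routers now call service functions
- The three routers tenants/contracts/operator_runs now conform to the leWOOOgo convention
- __init__.py wires up the four platform routers
Frontend:
- Build out apps/web/src/app/[locale]/awooop/ in full (7 pages)
- layout.tsx: four-tab navigation (tenants/contracts/runs/approvals)
- Use @/i18n/routing (Link/usePathname/useRouter) throughout to avoid i18n path problems
- approvals page: 10s auto-refresh, timeout countdown, red highlight for urgent items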
ADR-106/107/112/114/115/116
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-05 11:00:20 +08:00 |
|
Your Name
|
aa4ccec429
|
fix(watchdog): ADR-092 B4: three-layer fix to eliminate duplicate META SYSTEM alerts + hardened Ollama routing
Code Review / ai-code-review (push) Successful in 7m16s
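Root causes (full-scope debugger investigation):
1. Prod was still running old code (the fixes after ec013f66 were never deployed → alert strings still used the old format)
2. With replicas=2 the grace period was not shared between Pods → violation_codes diverged → different SHA256 → dedup failed
3. A new Pod ran _check_once() immediately on startup → an extra wave of alerts on each rollout
4. W6 violation_codes contained the dynamic low_count → small count changes bypassed dedup
Fixes (A2/A3/W6/C1/C2):
- A2: add a 90s leading sleep to run_ai_slo_watchdog_loop so a rollout does not trigger a check immediately
- A3: make _grace_active() Redis cluster-shared (watchdog:cluster_grace, ex=1800s, nx=True)
to remove grace-period inconsistency between Pods; falls back to process-local monotonic time if Redis fails
- W6: remove the dynamic low_count from violation_codes and use the stable "W6:trust_drift"
- C1: ollama_auto_recovery.py recovered_host now uses a dynamic label (GCP-A/B/Local determined from the URL port)
- C2: ConfigMap OLLAMA_FALLBACK_URL now goes through the 110:11437 nginx proxy, unifying the three-tier failover architecture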
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
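The C1 change above derives a host label from the URL port. A minimal sketch of that idea follows, assuming the port mapping stated elsewhere in this log (11435→GCP-A, 11436→GCP-B, 11437→Local). The names `PORT_LABELS` and `host_label` and the `"unknown"` fallback are hypothetical and are not taken from ollama_auto_recovery.py.

```python
from urllib.parse import urlparse

# Port-to-label mapping as described in the commit messages (nginx proxy on 110).
PORT_LABELS = {11435: "GCP-A", 11436: "GCP-B", 11437: "Local"}


def host_label(url: str) -> str:
    """Return a display label for an Ollama endpoint based on its URL port."""
    port = urlparse(url).port  # None when the URL carries no explicit port
    # The "unknown" fallback is an assumption; the real code may behave differently.
    return PORT_LABELS.get(port, "unknown")
```

Keying on the port rather than the host keeps the label correct even though all three endpoints share the same proxy address.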
|
2026-05-05 10:31:53 +08:00 |
|
Your Name
|
3f853accf2
|
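fix(alerter): fix Ollama recovery alert dedup: per-host key + 1h TTL
Root causes:
1. dedup_key was fixed at "alert:recovery", so each 10min health flap on GCP-A triggered a resend
2. Under three-tier failover, recoveries of different hosts shared one key and contaminated each other
Fix:
- dedup key changed to "alert:recovery:{safe_host}", so each host dedups independently
- RECOVERY_DEDUP_TTL_SEC = 3600 (1h), so continuous GCP flapping is reported only once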
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
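A rough sketch of the per-host key described above. Only the key format and the TTL constant come from the commit message. The function name `recovery_dedup_key` and the sanitization rule for `safe_host` are assumptions made for illustration.

```python
import re

RECOVERY_DEDUP_TTL_SEC = 3600  # 1h, as stated in the commit message


def recovery_dedup_key(host: str) -> str:
    """Build a dedup key scoped to one host so recoveries do not share state."""
    # Assumed sanitization: anything outside [A-Za-z0-9_.-] becomes "_".
    safe_host = re.sub(r"[^A-Za-z0-9_.-]", "_", host)
    return f"alert:recovery:{safe_host}"
```

With this shape, a flapping GCP-A endpoint and a recovering local host write to different keys, so one cannot suppress or re-trigger the other.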
|
2026-05-05 01:22:01 +08:00 |
|
Your Name
|
d934242846
|
feat(infra): ADR-110 complete Local Fallback + password-based SSH recovery tool
Ansible Lint / lint (push) Has been cancelled
|
2026-05-05 00:49:14 +08:00 |
|
Your Name
|
10e665a540
|
fix(watchdog): fix duplicate META SYSTEM alerts: stable dedup via violation_codes
Code Review / ai-code-review (push) Successful in 1m3s
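Root cause: the violations strings contain dynamic floats (mean_trust/low_ratio); every small change → different SHA256 → dedup fails
Fix: add a violation_codes list (stable W-code format) and compute dedup from violation_codes only
violations still carries the dynamic values (for display), so Telegram notifications show the full information as before
W-6 Trust Drift dedup key: W6:trust_drift:low_count={N} (no floats)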
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
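The split between display strings and stable codes can be sketched as below. This is an illustration only: the `Violation` type, the `dedup_key` name, the sorting of codes, and the `"|"` separator are assumptions, not the watchdog's actual implementation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    code: str     # stable W-code, used for dedup
    message: str  # human-readable text with dynamic values, used for display


def dedup_key(violations: list[Violation]) -> str:
    """Hash only the stable codes so float jitter in messages cannot change the key."""
    codes = sorted(v.code for v in violations)  # sorting is an assumption, for order independence
    return hashlib.sha256("|".join(codes).encode()).hexdigest()[:12]


# Two checks whose floats differ slightly still produce the same key.
first = [Violation("W6:trust_drift:low_count=3", "mean_trust=0.412 low_ratio=0.30")]
second = [Violation("W6:trust_drift:low_count=3", "mean_trust=0.409 low_ratio=0.31")]
same_key = dedup_key(first) == dedup_key(second)
```

Note that a code embedding `low_count` still changes when the count changes, which is the gap the later ADR-092 B4 commit closes by using the fixed code "W6:trust_drift".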
|
2026-05-05 00:06:38 +08:00 |
|
Your Name
|
40badc42cf
|
fix(ollama): restore GCP-first routing (official ADR-110 routing)
Code Review / ai-code-review (push) Successful in 54s
E2E Health Check / e2e-health (push) Successful in 2m59s
With the nginx proxy now in place, restore the original design:
GCP-A (110:11435 → 34.143.170.20:11434) → primary
GCP-B (110:11436 → 34.21.145.224:11434) → secondary
111 (192.168.0.111:11434) → last-resort fallback
OLLAMA_URL=http://192.168.0.110:11435
OLLAMA_SECONDARY_URL=http://192.168.0.110:11436
OLLAMA_FALLBACK_URL=http://192.168.0.111:11434
Hot-updated with kubectl set env; the image tag is unchanged.
Both GCP Ollama instances return 200 OK (10 models each).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-04 23:37:42 +08:00 |
|
Your Name
|
ec013f662d
|
fix(watchdog): fix duplicate Trust Drift alerts + set up GCP Ollama nginx proxy
Code Review / ai-code-review (push) Successful in 45s
Ansible Lint / lint (push) Has been cancelled
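- ai_slo_watchdog_job: switch to the pure-statistics trust_drift_detector lib
to avoid triggering Telegram twice alongside governance_agent's hourly self-check
- infra/ansible: set up the nginx proxy on 110 that forwards to GCP-A/B
port 11435 -> 34.143.170.20:11434 (GCP-A)
port 11436 -> 34.21.145.224:11434 (GCP-B)
- docs/runbooks: DEPLOY-GCP-OLLAMA-PROXY.md, a complete deployment guide
- ops/nginx: manual deployment script to run directly on 110
Prerequisite for enabling ADR-110 three-tier failover: deploy the proxy first, then change the ConfigMap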
|
2026-05-04 23:12:35 +08:00 |
|
Your Name
|
a1b61289f5
|
fix(governance): fix infinite loop on the skip path + root cause of low MCP scores
Code Review / ai-code-review (push) Successful in 59s
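Root cause 1: GovernanceDispatcher recorded no state after a skip decision
- Events stayed resolved=False forever → re-fetched every 30s → each round called the LLM + Prometheus
- 4437 stale events piled up, causing governance_fusion_complete to be logged every 20s
Fix:
1. Redis 90min cooldown key (governance:skip:{event_id}) to prevent repeated LLM calls
2. Stale skip events older than 2h are automatically marked resolved=True
3. Directly bulk-resolved the 4437 stale events + pre-seeded 105 cooldown keys
Root cause 2: MCP score hard floor of 0.2
- SLI recording rules were not yet active in Prometheus → result_list=[] → success_count=0
- Formula 0.2 + 0.7*0 = 0.2, so the fusion confidence was always < the 0.65 threshold
Fix:
- An empty result (no_data) ≠ an MCP failure; it now gets a neutral contribution of 0.5
- New formula: weighted = success_count + 0.5 * no_data_count; score = 0.2 + 0.7*(weighted/total)
- When MCP has no data at all: 0.2 + 0.7*0.5 = 0.55 (instead of 0.2)
Also corrected the outdated GCP-A fallback URL comment in _score_llm (it actually goes through settings already)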
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
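The new MCP scoring formula above can be written out as a small function. This is a sketch under stated assumptions: the name `mcp_score`, its signature, and the handling of a non-positive `total` are hypothetical. Only the arithmetic comes from the commit message.

```python
def mcp_score(success_count: int, no_data_count: int, total: int) -> float:
    """score = 0.2 + 0.7 * (weighted / total), where no_data counts as half a success."""
    if total <= 0:
        # Assumption: the commit message does not say what happens with zero queries.
        raise ValueError("total must be positive")
    weighted = success_count + 0.5 * no_data_count
    return 0.2 + 0.7 * (weighted / total)
```

When every query returns no data, `weighted / total` is 0.5 and the score is about 0.55, as the commit message states. When every query fails outright, the score stays at the 0.2 floor.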
|
2026-05-04 20:00:54 +08:00 |
|
Your Name
|
45f6f17558
|
fix(watchdog): non-deterministic dedup hash bug: switch to hashlib.sha256 + atomic setnx
Code Review / ai-code-review (push) Successful in 56s
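Root cause: Python's built-in hash() is affected by PYTHONHASHSEED, so its value differs after every process restart.
Each kubectl rollout restart → the new pod computes a different dedup_hash → bypasses the 1h TTL → alert flood.
Symptom: after 4-5 consecutive rollouts, META SYSTEM fired one alert per minute (screenshots at 19:39/40/41/42).
Fix:
1. hash() → hashlib.sha256(content.encode()).hexdigest()[:12] (deterministic across pods/restarts)
2. redis.exists+setex → redis.set(nx=True) atomic setnx (prevents concurrent duplicate sends from multiple replicas)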
2026-05-04 ogt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
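The two fixes above combine into a claim-once pattern, sketched here. `FakeRedis` is a hypothetical in-memory stand-in so the example runs without a server; it mimics only the return values of redis-py's `set(..., nx=True, ex=...)`. The `watchdog:dedup:` key prefix and the `should_send` helper are likewise assumptions.

```python
import hashlib


def dedup_hash(content: str) -> str:
    """Deterministic 12-hex-char digest, stable across processes and restarts."""
    return hashlib.sha256(content.encode()).hexdigest()[:12]


class FakeRedis:
    """In-memory stand-in mimicking the return values of redis-py's set()."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._data:
            return None  # redis-py returns None when NX blocks the write
        self._data[key] = value  # the TTL (ex) is ignored in this stand-in
        return True


def should_send(client, content: str, ttl_sec: int = 3600) -> bool:
    """True only for the first caller that claims the dedup key."""
    key = f"watchdog:dedup:{dedup_hash(content)}"  # hypothetical key prefix
    # A single SET with NX checks and claims in one step, unlike exists() followed by setex().
    return bool(client.set(key, "1", nx=True, ex=ttl_sec))
```

Against a real Redis, the same `should_send` call would work with a `redis.Redis` client, since `set` accepts `nx` and `ex` keyword arguments there as well.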
|
2026-05-04 19:47:42 +08:00 |
|
Your Name
|
00bc3b0cc9
|
docs(awooop): add ADR-106/107 cross-reference links to 12-agent-game-rules.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-04 19:33:48 +08:00 |
|