AWOOOI CD
|
56b4d8165b
|
chore(cd): deploy c696b99 [skip ci]
|
2026-05-06 13:10:34 +08:00 |
|
OG T
|
c696b99ccf
|
fix(awooop): authenticate approval decisions
Code Review / ai-code-review (push) Successful in 11s
CD Pipeline / tests (push) Successful in 1m3s
CD Pipeline / build-and-deploy (push) Successful in 3m28s
CD Pipeline / post-deploy-checks (push) Successful in 1m25s
|
2026-05-06 13:05:51 +08:00 |
|
AWOOOI CD
|
072cc23a42
|
chore(cd): deploy 682c0b9 [skip ci]
|
2026-05-06 12:51:20 +08:00 |
|
AWOOOI CD
|
96ad3a18ee
|
chore(cd): deploy 9ef9633 [skip ci]
|
2026-05-06 12:42:30 +08:00 |
|
Your Name
|
9ef9633aff
|
fix(alerts): bypass proxy timeout for GCP Ollama
|
2026-05-06 08:55:14 +08:00 |
|
AWOOOI CD
|
df5e6c6626
|
chore(cd): deploy d2aebdd [skip ci]
|
2026-05-06 07:33:25 +08:00 |
|
Your Name
|
09256be62c
|
fix(rag): use bge embeddings on GCP Ollama lane
Code Review / ai-code-review (push) Successful in 11s
CD Pipeline / tests (push) Successful in 1m22s
CD Pipeline / build-and-deploy (push) Failing after 2h14m5s
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-06 05:49:37 +08:00 |
|
AWOOOI CD
|
a4fece11cc
|
chore(cd): deploy c2c0b1e [skip ci]
|
2026-05-06 05:32:51 +08:00 |
|
Your Name
|
c2c0b1ec82
|
fix(alerts): let GCP Ollama finish before cloud fallback
Code Review / ai-code-review (push) Successful in 10s
CD Pipeline / tests (push) Successful in 1m9s
CD Pipeline / build-and-deploy (push) Successful in 4m21s
CD Pipeline / post-deploy-checks (push) Successful in 1m16s
|
2026-05-06 05:27:55 +08:00 |
|
AWOOOI CD
|
1d0e80c091
|
chore(cd): deploy 3b64d66 [skip ci]
|
2026-05-06 03:38:45 +08:00 |
|
AWOOOI CD
|
eced8617d3
|
chore(cd): deploy a2c4b3d [skip ci]
|
2026-05-06 00:53:15 +08:00 |
|
AWOOOI CD
|
cb9551fb00
|
chore(cd): deploy 5ed396e [skip ci]
|
2026-05-06 00:24:17 +08:00 |
|
AWOOOI CD
|
87ce02f34d
|
chore(cd): deploy 2aa31c2 [skip ci]
|
2026-05-06 00:10:42 +08:00 |
|
Your Name
|
2aa31c205a
|
fix(ai): require 111 before alert cloud fallback
CD Pipeline / tests (push) Successful in 54s
Code Review / ai-code-review (push) Successful in 10s
CD Pipeline / build-and-deploy (push) Successful in 3m21s
CD Pipeline / post-deploy-checks (push) Successful in 2m2s
|
2026-05-06 00:05:51 +08:00 |
|
AWOOOI CD
|
25b1923d2e
|
chore(cd): deploy e208798 [skip ci]
|
2026-05-05 23:44:08 +08:00 |
|
AWOOOI CD
|
1ba36697ca
|
chore(cd): deploy 405b8b8 [skip ci]
|
2026-05-05 23:34:17 +08:00 |
|
Your Name
|
405b8b8ef9
|
fix(ops): bring drift scanner under gitops
CD Pipeline / tests (push) Successful in 59s
Code Review / ai-code-review (push) Successful in 11s
CD Pipeline / build-and-deploy (push) Successful in 8m52s
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 23:20:12 +08:00 |
|
Your Name
|
1cc215ec30
|
fix(ops): keep Ollama health checks on alert fast model
CD Pipeline / tests (push) Successful in 52s
Code Review / ai-code-review (push) Successful in 9s
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-05-05 23:16:21 +08:00 |
|
AWOOOI CD
|
83daeb3f87
|
chore(cd): deploy c4854bb [skip ci]
|
2026-05-05 23:10:29 +08:00 |
|
AWOOOI CD
|
7baa316224
|
chore(cd): deploy e8f2792 [skip ci]
|
2026-05-05 22:48:02 +08:00 |
|
Your Name
|
bf847ad045
|
fix(ai): stabilize GCP Ollama alert lane
Code Review / ai-code-review (push) Successful in 10s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
|
2026-05-05 22:20:27 +08:00 |
|
Your Name
|
a4e9a04982
|
fix(ops): harden cold-start schedule recovery
Code Review / ai-code-review (push) Successful in 10s
run-migration / migrate (push) Successful in 7s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
|
2026-05-05 22:17:10 +08:00 |
|
AWOOOI CD
|
72a1d33f9d
|
chore(cd): deploy bec8212 [skip ci]
|
2026-05-05 21:59:52 +08:00 |
|
Your Name
|
333c8a9cfd
|
fix(cd): target k3s control plane for deploy
CD Pipeline / tests (push) Failing after 1s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 10s
|
2026-05-05 21:21:00 +08:00 |
|
Your Name
|
1baeb7ee61
|
chore(cd): deploy ee5e3bc [skip ci]
|
2026-05-05 21:09:09 +08:00 |
|
AWOOOI CD
|
7b0a4bce98
|
chore(cd): deploy 2221fd3 [skip ci]
|
2026-05-05 16:26:09 +08:00 |
|
AWOOOI CD
|
84a661beaf
|
chore(cd): deploy 6b93c8f [skip ci]
|
2026-05-05 16:11:35 +08:00 |
|
AWOOOI CD
|
3a17a860a0
|
chore(cd): deploy 1cc9de5 [skip ci]
|
2026-05-05 15:41:33 +08:00 |
|
Your Name
|
209da7ba33
|
chore(ops): deploy docker limit alert image
CD Pipeline / tests (push) Successful in 5m24s
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-05-05 15:05:23 +08:00 |
|
Your Name
|
df72c77880
|
chore(ops): deploy stale gitea job alert image
CD Pipeline / tests (push) Successful in 5m29s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
|
2026-05-05 14:43:53 +08:00 |
|
Your Name
|
ab0f0a8a62
|
chore(ops): deploy runner classification image
CD Pipeline / tests (push) Successful in 2m35s
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
Code Review / ai-code-review (push) Successful in 26s
|
2026-05-05 14:29:55 +08:00 |
|
Your Name
|
a5192d4e03
|
chore(ops): deploy runner alert routing image
CD Pipeline / build-and-deploy (push) Has been cancelled
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / tests (push) Has been cancelled
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 14:21:17 +08:00 |
|
Your Name
|
2b93975d37
|
chore(ops): deploy systemd runner baseline image
CD Pipeline / tests (push) Successful in 2m6s
Code Review / ai-code-review (push) Successful in 26s
CD Pipeline / post-deploy-checks (push) Has been cancelled
CD Pipeline / build-and-deploy (push) Has been cancelled
|
2026-05-05 14:12:30 +08:00 |
|
Your Name
|
bb1995f349
|
fix(awooop): use naive utc for run lease timestamps
CD Pipeline / tests (push) Failing after 1m48s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Has been cancelled
|
2026-05-05 13:53:07 +08:00 |
|
Your Name
|
e8e6748f70
|
fix(ops): add docker host resource baseline guardrails
CD Pipeline / tests (push) Failing after 1m50s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 25s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 38s
|
2026-05-05 13:45:09 +08:00 |
|
Your Name
|
0ebd0d8a92
|
fix(deploy): emergency deploy of API 2e17325c: governance skip cooldown + watchdog B4
Code Review / ai-code-review (push) Successful in 54s
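CI cancel-in-progress kept CD from running, so kustomization.yaml was updated manually.
Included fixes:
- cooldown on the governance_dispatcher skip path (eliminates the repeated processing every 30s)
- watchdog B4 three-layer fix A2/A3/W6 (eliminates duplicate META SYSTEM alerts)
- Operator Console leWOOOgo modularization fix (e22b8e7)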
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-05 12:09:29 +08:00 |
|
Your Name
|
aa4ccec429
|
fix(watchdog): ADR-092 B4: three-layer fix to eliminate duplicate META SYSTEM alerts + hardened Ollama routing
Code Review / ai-code-review (push) Successful in 7m16s
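Root causes (full debugger investigation):
1. Prod is still running old code (fixes after ec013f66 were never deployed → alert strings still use the old format)
2. With replicas=2 the grace period is not shared between Pods → violation_codes diverge → different SHA256 → dedup fails
3. A new Pod runs _check_once() immediately on startup → an extra wave of alerts during rollout
4. W6 violation_codes include the dynamic low_count → small changes in the count bypass dedup
Fixes (A2/A3/W6/C1/C2):
- A2: add a 90s leading sleep to run_ai_slo_watchdog_loop so a rollout does not trigger checks immediately
- A3: make _grace_active() Redis cluster-shared (watchdog:cluster_grace, ex=1800s, nx=True);
removes grace-period inconsistency between Pods; falls back to a process-local monotonic clock when Redis fails
- W6: drop the dynamic low_count from violation_codes in favor of the stable "W6:trust_drift"
- C1: ollama_auto_recovery.py recovered_host becomes a dynamic label (GCP-A/B/Local inferred from the URL port)
- C2: ConfigMap OLLAMA_FALLBACK_URL now goes through the 110:11437 nginx proxy, unifying the three-tier failover architecture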
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
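Root cause 4 and the W6 fix in this commit can be illustrated with a minimal sketch. It assumes the dedup key is a SHA-256 over the sorted violation codes; the helper name `dedup_fingerprint` and the `low_count=N` suffix format are hypothetical and not taken from the repository.

```python
import hashlib


def dedup_fingerprint(violation_codes: list[str]) -> str:
    """Hypothetical dedup key: SHA-256 over the sorted, joined violation codes."""
    joined = "|".join(sorted(violation_codes))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Assumed pre-fix shape: the code string embeds a dynamic count, so two
# consecutive checks of the same underlying problem hash differently.
before_a = dedup_fingerprint(["W6:trust_drift:low_count=3"])
before_b = dedup_fingerprint(["W6:trust_drift:low_count=4"])

# Post-fix shape from the commit message: a stable code with no count.
after_a = dedup_fingerprint(["W6:trust_drift"])
after_b = dedup_fingerprint(["W6:trust_drift"])

print(before_a == before_b)  # False: dedup is bypassed
print(after_a == after_b)    # True: the repeated alert is suppressed
```

Sorting the codes before hashing also keeps the fingerprint independent of the order in which checks report violations.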
|
2026-05-05 10:31:53 +08:00 |
|
Your Name
|
40badc42cf
|
fix(ollama): restore GCP-first routing (the official ADR-110 routing)
Code Review / ai-code-review (push) Successful in 54s
E2E Health Check / e2e-health (push) Successful in 2m59s
With the nginx proxy in place, restore the original design:
GCP-A (110:11435 → 34.143.170.20:11434) → primary
GCP-B (110:11436 → 34.21.145.224:11434) → secondary
111 (192.168.0.111:11434) → last-resort fallback
OLLAMA_URL=http://192.168.0.110:11435
OLLAMA_SECONDARY_URL=http://192.168.0.110:11436
OLLAMA_FALLBACK_URL=http://192.168.0.111:11434
Hot-updated with kubectl set env; the image tag is unchanged.
Both GCP Ollama hosts return 200 OK (10 models each).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-04 23:37:42 +08:00 |
|
Your Name
|
8629ac709b
|
feat(awooop): complete Phase 1-8 implementation: AwoooP Agent Platform six-plane architecture
run-migration / migrate (push) Failing after 59s
Code Review / ai-code-review (push) Successful in 1m8s
Type Sync Check / check-type-sync (push) Successful in 2m27s
## Phase 1-3: Control Plane + Contract System
- awooop_phase1_control_plane_2026-05-04.sql: 12 core tables + RLS
- awooop_phase1_batch1_rls_2026-05-04.sql: FORCE RLS + GRANT on all tables
- packages/awooop-contracts/: JSON Schema for the six contracts + golden fixtures
- src/models/awooop_contracts.py: Pydantic v2 contract models (extra=forbid)
- src/repositories/contract_repository.py: contract lifecycle (draft→published→active)
- src/services/contract_service.py: HMAC publish sig + Redis multi-sig activate
- src/services/schema_validator.py: LLM output validator (retry×3, E-SCHEMA-001)
## Phase 2: Tenant Isolation
- awooop_phase2_budget_ledger_2026-05-04.sql: budget_ledger + RLS
- src/services/budget_service.py: Token Budget Hard Kill, three lines of defense
- src/core/context.py: PROJECT_ID ContextVar (inherited automatically by 31 background loops)
- src/db/base.py + models.py: project_id column + RLS set_config injection
- src/hermes/nl_gateway.py: project_id Redis key prefix (Phase A dual write)
- src/services/anomaly_counter.py: per-project rework (Phase A fallback)
## Phase 4: Platform Shell in Shadow Mode
- awooop_phase4_run_state_2026-05-04.sql: run_state + step_journal + idempotency
- src/services/run_state_machine.py: 8-state FSM + SKIP LOCKED + stale reaper
- src/services/platform_runtime.py: UUID v7 + W3C trace_id + shadow_execute
- src/services/audit_sink.py: PII/secret redaction 9 patterns
- src/api/v1/platform/runs.py: POST/GET /v1/platform/runs (Router→Service architecture)
- src/workers/platform_worker.py: SKIP LOCKED worker + heartbeat + reaper loop
- src/main.py: platform router + lifespan worker start/stop
## Phase 5: MCP Gateway Five Gates
- awooop_phase5_mcp_gateway_2026-05-04.sql: 4 tables + RLS
- src/plugins/mcp/gateway.py: McpGateway (Gate 1~5, E-MCP-GATE-001~009)
- src/plugins/mcp/redaction_middleware.py: two-layer redaction + 16K truncation
- src/plugins/mcp/registry.py: __provider name mangling (ADR-116)
- src/plugins/mcp/credential_resolver.py: k8s secret ref resolution
- tests/test_mcp_credential_isolation.py: 10 regression tests (guard against secret-leak recurrence)
## Phase 6-8: EwoooC + Channel Hub + Approval Token
- awooop_phase6_ewoooc_onboarding_2026-05-04.sql: ewoooc tenant + 4 read-only MCP tools
- awooop_phase7_channel_hub_2026-05-04.sql: conversation_event + outbound_message
- src/services/provider_proxy.py: ProviderProxy + PlatformEnvelope (ADR-115)
- src/services/channel_hub.py: Telegram inbound mirror + Progressive Feedback (30s)
- src/services/awooop_approval_token.py: HS256 + jti NX replay protection + suggest mode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
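The "HS256 + jti NX replay protection" item for awooop_approval_token.py can be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not the repository's code: `sign`, `verify_once`, and `SeenJti` are hypothetical names, and `SeenJti` is an in-memory stand-in for a Redis `SET key value NX EX ttl` call. The claim names `jti` and `exp` follow the JWT convention.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"


class SeenJti:
    """In-memory stand-in for Redis SET NX EX (hypothetical helper)."""

    def __init__(self) -> None:
        self._expiry: dict[str, float] = {}

    def set_if_absent(self, jti: str, ttl: float, now: float) -> bool:
        expires_at = self._expiry.get(jti)
        if expires_at is not None and expires_at > now:
            return False  # key already present: this token was used before
        self._expiry[jti] = now + ttl
        return True


def verify_once(token: str, secret: bytes, seen: SeenJti, now: float) -> Optional[dict]:
    """Return the claims if the token is valid, unexpired, and not replayed."""
    try:
        header, body, sig = token.split(".")
        expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _unb64(sig)):
            return None
        claims = json.loads(_unb64(body))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    exp, jti = claims.get("exp"), claims.get("jti")
    if not isinstance(exp, (int, float)) or not isinstance(jti, str) or exp <= now:
        return None
    # Record the jti only until the token itself expires.
    if not seen.set_if_absent(jti, ttl=exp - now, now=now):
        return None
    return claims


secret = b"demo-secret"  # placeholder value for the sketch
token = sign({"jti": "a1", "exp": 1100, "decision": "approve"}, secret)
seen = SeenJti()
first = verify_once(token, secret, seen, now=1000)   # accepted
second = verify_once(token, secret, seen, now=1001)  # None: replay rejected
```

Tying the TTL of the stored jti to the token's remaining lifetime keeps the replay set from growing without bound.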
|
2026-05-04 19:31:53 +08:00 |
|
Your Name
|
0a90dab1e9
|
fix(ollama): ADR-110 correction: promote 111 to primary, failover log now uses dynamic URL labels
Code Review / ai-code-review (push) Successful in 56s
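Root cause: K8s pods → GCP-A/B:11434 = connection refused (no route to the external network),
but the ConfigMap set GCP-A as OLLAMA_URL (primary), so 111 was only reached at the very end of the failover chain.
ConfigMap (04-configmap.yaml):
- OLLAMA_URL: GCP-A → 192.168.0.111 (a primary reachable from the K8s internal network)
- OLLAMA_SECONDARY_URL: GCP-B → 34.143.170.20 (GCP-A, kept so it can be restored once the nginx proxy is in place)
- OLLAMA_FALLBACK_URL: 111 → 34.21.145.224 (GCP-B, kept so it can be restored once the nginx proxy is in place)
- Long-term goal: set up an nginx proxy on 110 that forwards to GCP, then point the ConfigMap at 110:11435/11436
health.py (check_ollama):
- now polls three tiers in order (primary → secondary → tertiary)
- primary up → "up"; fallback up → "degraded"; all down → "down"
- no longer checks only the OLLAMA_URL host, so the result reflects actual routing availability
ollama_failover_manager.py (_decide_route / select_provider):
- variables renamed to url_primary/secondary/tertiary (the old gcp_a/gcp_b/local names had drifted from the actual URLs)
- routing_reason uses a dynamic IP label instead of hard-coded "GCP-A"/"GCP-B"/"Local"
- _write_failover_audit failed_host likewise uses the actual URL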
2026-05-04 ogt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
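The three-tier status rule described for health.py can be sketched as below. This is a minimal illustration, not the actual `check_ollama`: the function name `classify_ollama_health`, the injected `is_up` probe, and the example hostnames are all assumptions made so the snippet runs without a network.

```python
from typing import Callable, Sequence


def classify_ollama_health(urls: Sequence[str], is_up: Callable[[str], bool]) -> str:
    """Return "up", "degraded", or "down" for an ordered list of endpoints.

    urls[0] is the primary; every later entry is a fallback tier.
    """
    for tier, url in enumerate(urls):
        if is_up(url):
            # Only a healthy primary counts as fully "up".
            return "up" if tier == 0 else "degraded"
    return "down"


# Hypothetical endpoints standing in for primary → secondary → tertiary.
TIERS = [
    "http://primary.example:11434",
    "http://secondary.example:11434",
    "http://tertiary.example:11434",
]

alive = {TIERS[2]}  # pretend only the tertiary host answers
status = classify_ollama_health(TIERS, lambda url: url in alive)
print(status)  # degraded
```

Passing the probe in as a callable keeps the classification rule testable on its own; a real probe would issue an HTTP request with a timeout.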
|
2026-05-04 19:17:07 +08:00 |
|
AWOOOI CD
|
035fe20e4d
|
chore(cd): deploy 0068440 [skip ci]
|
2026-05-03 23:45:12 +08:00 |
|
Your Name
|
b1ef05fa8c
|
feat(ollama): ADR-110 GCP three-tier failover architecture (GCP-A → GCP-B → Local → Gemini)
Code Review / ai-code-review (push) Successful in 50s
CD Pipeline / tests (push) Failing after 1m14s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
## Change summary
- Primary: http://34.143.170.20:11434 (GCP-A SSD, 9x load speed + 2x inference)
- Secondary: http://34.21.145.224:11434 (GCP-B SSD)
- Fallback: http://192.168.0.111:11434 (M1 Pro Local HDD, last line of defense)
- Retire ADR-105 (the "111 only" iron rule) and create ADR-110
## Core changes
- config.py: add OLLAMA_SECONDARY_URL; validator gains a GCP IP allowlist (34.143.170.20, 34.21.145.224)
- ollama_failover_manager.py: three-tier Ollama decision matrix; health-checks all three hosts in parallel; health_111 → health_gcp_a
- ollama_health_monitor.py: host label extraction generalized (supports GCP public IPs)
- failover_alerter.py: failed/recovered host shown dynamically instead of the hard-coded "Ollama 111 (GPU)"
- ollama_auto_recovery.py: notify_recovery changed to ollama_gcp_a; recovered_host is dynamic
- k8s/awoooi-prod: configmap + deployment + network-policy updated together (egress adds the GCP /32 addresses)
- Service layer: 10 service files with hard-coded 192.168.0.111 now read settings.OLLAMA_URL
- Tests: URL constants updated; new three-tier failover scenarios and GCP IP allowlist validation tests
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
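One way the config.py allowlist validator could behave is sketched below. The commit only states that the two GCP IPs were allowlisted; the rule "private addresses pass, public addresses must be on the allowlist" and the name `validate_ollama_url` are assumptions for illustration.

```python
import ipaddress
from urllib.parse import urlparse

# The two public IPs named in the commit message.
GCP_IP_ALLOWLIST = frozenset({"34.143.170.20", "34.21.145.224"})


def validate_ollama_url(url: str) -> str:
    """Accept private-network hosts and allowlisted GCP IPs; reject the rest."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    # ip_address raises ValueError for anything that is not a literal IP,
    # so DNS names are rejected in this sketch.
    address = ipaddress.ip_address(host)
    if address.is_private or host in GCP_IP_ALLOWLIST:
        return url
    raise ValueError(f"host {host} is not private and not on the GCP allowlist")


for candidate in ("http://192.168.0.111:11434", "http://34.143.170.20:11434"):
    validate_ollama_url(candidate)  # both pass
```

In a Pydantic settings class the same check would typically live in a field validator, so that a misconfigured URL fails at startup rather than at request time.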
|
2026-05-03 22:49:23 +08:00 |
|
Your Name
|
577250a678
|
fix(governance): fix W-3/W-4 guards against alert silencing + add Prometheus missing-data alert
Code Review / ai-code-review (push) Successful in 52s
CD Pipeline / tests (push) Failing after 2m21s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 1m6s
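[Reprimand from the commander: violation of the feedback_full_chain_first_then_fix.md iron rule]
The previous commit f1362fcc swallowed alerts with skip conditions, which silences them rather than fixing anything:
- W-3: total_exec<10 always skips → no alert even if Redis stays empty forever
- W-4: playbooks total==0 always skips → no alert even if the table is wiped
- Prometheus NaN sentinel combined with the existing < 0.1 rule leaves no path that can alert
The commander's reprimand: "you made the alerts disappear again", "this has happened several times already". This commit restores alert visibility.
[Fix: 30-minute startup grace period + new data-pipeline-broken alerts once it expires]
- ai_slo_watchdog_job.py adds a module-level _PROCESS_START and a _grace_active() guard:
- W-3a: metric has data + rate<0.30 → existing alert "flywheel success rate too low"
- W-3b: rate=None and uptime>30min → new alert "no traffic in flywheel data pipeline"
- W-4a: playbooks total>0 + approved=0 → existing alert "auto-remediation chain broken"
- W-4b: playbooks total=0 and uptime>30min → new alert "Playbook table initialization failed"
- 3 Prometheus rule files (k8s/monitoring/flywheel-alerts.yaml,
ops/monitoring/alerts.yml, ops/monitoring/alerts-unified.yml) add
FlywheelExecutionRateMissing: absent() or NaN for 30 minutes → alert,
a second safeguard alongside watchdog W-3b
[Added to memory]
feedback_silencing_alerts_recurring_violation.md locks in the red-line rule:
"using skip in a fresh-deploy / init guard to swallow alerts = structural negligence; the guard must branch on a grace period +
raise the new data-pipeline-broken alert once it expires"
[Verification]
All 106 governance-related unit tests pass: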
test_trust_drift_watchdog / test_governance_agent / test_failover_alerter /
test_check_trust_drift_commit_outside_context_poc /
test_governance_remediation_dispatch / test_ai_governance_endpoints /
test_governance_dispatcher
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
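The W-3a/W-3b and W-4a/W-4b split described in this commit can be sketched as two small pure functions. The thresholds (rate below 0.30, 30-minute grace) come from the commit message; the function names `check_w3` and `check_w4`, the returned strings, and passing uptime as a parameter are assumptions made for illustration.

```python
from typing import Optional

GRACE_SECONDS = 30 * 60      # 30-minute startup grace period
LOW_RATE_THRESHOLD = 0.30    # W-3a threshold


def check_w3(rate: Optional[float], uptime_seconds: float) -> Optional[str]:
    """Flywheel check: alert on a low rate, or on missing data after the grace period."""
    if rate is not None:
        # W-3a: data exists, so judge the rate itself.
        return "W-3a: flywheel success rate too low" if rate < LOW_RATE_THRESHOLD else None
    # No data: stay quiet only while the process is still within the grace period.
    if uptime_seconds > GRACE_SECONDS:
        return "W-3b: no traffic in flywheel data pipeline"
    return None


def check_w4(total: int, approved: int, uptime_seconds: float) -> Optional[str]:
    """Playbook check: alert on zero approvals, or on an empty table after the grace period."""
    if total > 0:
        # W-4a: playbooks exist but none are approved.
        return "W-4a: auto-remediation chain broken" if approved == 0 else None
    if uptime_seconds > GRACE_SECONDS:
        return "W-4b: Playbook table initialization failed"
    return None


print(check_w3(None, uptime_seconds=60))      # None: still in grace
print(check_w3(None, uptime_seconds=3600))    # W-3b alert
```

The point of the split is that "no data" never maps to a permanent skip: once the grace period ends it becomes its own alert.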
|
2026-05-03 12:39:46 +08:00 |
|
Your Name
|
dedb12085b
|
chore(governance,watchdog): enrich alerts and enable prometheus multiproc
CD Pipeline / tests (push) Failing after 1m22s
CD Pipeline / build-and-deploy (push) Has been skipped
CD Pipeline / post-deploy-checks (push) Has been skipped
Code Review / ai-code-review (push) Successful in 43s
Deploy Alert Rules / Deploy Prometheus Alert Rules (push) Successful in 57s
|
2026-05-02 23:44:12 +08:00 |
|
AWOOOI CD
|
68e182381f
|
chore(cd): deploy da772a1 [skip ci]
|
2026-05-02 17:58:22 +08:00 |
|
AWOOOI CD
|
697e13b23a
|
chore(cd): deploy 297afb6 [skip ci]
|
2026-05-02 17:28:56 +08:00 |
|
AWOOOI CD
|
a6409c39e2
|
chore(cd): deploy b3a0f0d [skip ci]
|
2026-05-02 16:49:00 +08:00 |
|
AWOOOI CD
|
329849a559
|
chore(cd): deploy 7795f02 [skip ci]
|
2026-05-01 20:53:02 +08:00 |
|
AWOOOI CD
|
b72eac0712
|
chore(cd): deploy 433f7b0 [skip ci]
|
2026-05-01 17:08:42 +08:00 |
|
Your Name
|
433f7b068e
|
fix(aiops): close ssh and telegram remediation gaps
CD Pipeline / tests (push) Successful in 2m7s
Code Review / ai-code-review (push) Successful in 42s
CD Pipeline / build-and-deploy (push) Successful in 13m14s
CD Pipeline / post-deploy-checks (push) Successful in 4m29s
|
2026-05-01 16:53:02 +08:00 |
|