Your Name
017dba8b00
docs(argocd): codify health persistence config [skip ci]
2026-06-04 09:33:45 +08:00
OG T
6dc03c9a55
fix(argocd)+feat(flywheel): Phase 1 完成 — ArgoCD image 斷路修復 + 冷啟動腳本
...
CD Pipeline / build-and-deploy (push) Has been cancelled
1. k8s/argocd/awoooi-prod-app.yaml:
移除 Deployment image ignoreDifferences
- 原設計造成 CD 更新 kustomization.yaml 後 ArgoCD 不更新 image
- 修復後 GitOps 閉環恢復正常
2. scripts/cold_start_playbooks.py:
ADR-073 Phase 1 Step 8 — 生成 15 個基礎 Playbook (K8s/Docker/DB/Infra)
執行結果: Playbooks 0 → 15
3. scripts/batch_vectorize_km.py:
ADR-073 Phase 1 Step 9 — 批次向量化 KM
執行結果: 711/713 embedding IS NOT NULL
Phase 1 全部完成,飛輪已解封:
- Pod 運行 105998d(含 8be87b0 所有修復)
- debounce 30min + alertname NULL 修復 + _collect_mcp_context 啟用
- 15 Playbooks + 711 KM 向量化
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-12 14:20:52 +08:00
OG T
7f4ec717ef
feat(gitops): Sprint B-2/B-3 — ArgoCD Application + CD GitOps 模式
...
B-2: k8s/argocd/awoooi-prod-app.yaml
- ArgoCD Application awoooi-prod 建立(已 apply 到 K8s)
- automated sync: prune + selfHeal
- ignoreDifferences: Deployment image + Secret data
- 全部 17 個 K8s 資源已確認 Synced
B-3: .gitea/workflows/cd.yaml — Deploy step 重寫
- 舊: kubectl set image(與 ArgoCD selfHeal 衝突)
- 新: kustomize edit set image → git commit [skip ci] → push → ArgoCD sync
- 新增等待 ArgoCD Synced + Healthy(最多 120s)
- 需建立 Gitea Secret: GITEA_CD_TOKEN(見 ADR-069)
docs/adr/ADR-069-infra-gitops-sprint-b.md
- 決策記錄:循環觸發防護 + ignoreDifferences 設計
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-11 02:57:42 +08:00
OG T
dc7daf5d81
docs(monitoring): 更新 ArgoCD Metrics 端點文檔
...
- ArgoCD Server Pod 運行在 mon1 (192.168.0.121)
- 更新 Prometheus target 為 192.168.0.121:30883
- 標記配置已部署並驗證
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com >
2026-03-28 23:59:46 +08:00
OG T
e75e578547
feat(monitoring): P1/P2 改進 - ArgoCD Metrics + TLS 證書告警
...
## P1: ArgoCD Metrics
- 新增 ArgoCD Metrics NodePort (30882, 30883)
- 更新 NetworkPolicy 允許 Prometheus (188) 抓取
- 提供 Prometheus scrape config 範本
## P1: NetworkPolicy AI API
- 文檔標註 K8s NetworkPolicy 不支援 FQDN 限制
- 維持現有配置避免 AI 功能中斷
## P2: TLS 證書告警
- 新增 TLSCertExpiringIn30Days (30天預警)
- 新增 TLSCertExpiringIn7Days (7天緊急)
- 新增 TLSCertExpired (已過期)
- 新增 TLSProbeFailure (探測失敗)
## P2: Multi-Sig E2E 測試
- 標記為條件式執行 (API 不可用時自動跳過)
- 避免 CI/CD 因無法連接生產 API 而失敗
首席架構師審查: 2026-03-29 (台北時間)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com >
2026-03-28 23:48:57 +08:00