diff --git a/.agents/skills/04-awoooi-devops-commander.md b/.agents/skills/04-awoooi-devops-commander.md
index 99920bdc..7f2c9894 100644
--- a/.agents/skills/04-awoooi-devops-commander.md
+++ b/.agents/skills/04-awoooi-devops-commander.md
@@ -179,6 +179,78 @@ metadata:
 - `📝` Purpose
 - `⚠️` Caveats
 
+### 🔴🔴 NetworkPolicy DNS Rules (2026-03-26)
+
+> **Lesson learned the hard way**: wrong labels in the DNS rule caused 2 days without alerts!
+
+```yaml
+# ❌ Wrong: uses labels that do not exist
+- ports:
+    - port: 53
+  to:
+    - podSelector:
+        matchLabels:
+          environment: prod  # CoreDNS does not have this label!
+          k8s-app: kube-dns
+          system: awoooi     # CoreDNS does not have this label!
+
+# ✅ Correct: use a namespace selector
+- ports:
+    - port: 53
+      protocol: UDP
+    - port: 53
+      protocol: TCP
+  to:
+    - namespaceSelector:
+        matchLabels:
+          kubernetes.io/metadata.name: kube-system
+      podSelector:
+        matchLabels:
+          k8s-app: kube-dns
+```
+
+### 🔴🔴 CoreDNS Upstream DNS Configuration
+
+> **Lesson learned the hard way**: 127.0.0.53 (systemd-resolved) cannot be used inside a container
+
+```yaml
+# ❌ Wrong: uses /etc/resolv.conf (points to 127.0.0.53)
+forward . /etc/resolv.conf
+
+# ✅ Correct: use real DNS servers
+forward . 8.8.8.8 1.1.1.1
+```
+
+---
+
+## 🔴🔴🔴 Alert Chain E2E Validation (ADR-025)
+
+> **2026-03-26**: a wrong URL path caused 2 days without alerts (`webhook` vs `webhooks`)
+
+### Post-deployment Smoke Test (mandatory)
+
+```bash
+# Must run after every deployment
+curl -s -X POST "$API_URL/api/v1/webhooks/alertmanager" \
+  -H 'Content-Type: application/json' \
+  -d '{"receiver":"smoke-test","status":"firing","alerts":[...]}' \
+  | jq -e '.success == true' || exit 1
+```
+
+### URL Path Convention
+
+| Correct | Wrong |
+|-----|------|
+| `/api/v1/webhooks/alertmanager` | `/api/v1/webhook/alertmanager` |
+| Plural form `webhooks` | Singular form `webhook` |
+
+### Alertmanager ConfigMap Change Procedure
+
+1. Extract the webhook URL
+2. Test URL reachability with curl
+3. Must receive 200 or 422 (malformed payload, but the endpoint exists)
+4. Validation fails → **block apply**
+
 ---
 
 ## Turborepo Cache Hardening Protocol
diff --git a/.agents/skills/05-awoooi-sre-qa.md b/.agents/skills/05-awoooi-sre-qa.md
index dd420b14..7f032817 100644
--- a/.agents/skills/05-awoooi-sre-qa.md
+++ b/.agents/skills/05-awoooi-sre-qa.md
@@ -10,7 +10,7 @@
 
 | Field | Value |
 |------|-----|
-| **Version** | v1.4 |
+| **Version** | v1.5 |
 | **Created** | 2026-03-20 (Taipei) |
 | **Created by** | Claude Code |
 | **Last modified** | 2026-03-26 03:30 (Taipei) |
@@ -25,6 +25,7 @@
 | v1.2 | 2026-03-25 | Claude Code | Added the document info block |
 | v1.3 | 2026-03-26 | Claude Code | **Phase 15 observability tests** |
 | v1.4 | 2026-03-26 | Claude Code | **Runner zombie process diagnosis procedure** |
+| v1.5 | 2026-03-26 | Claude Code | **LLM testing strategy** (Chief Architect review P3) |
 
 ---
 
@@ -114,6 +115,68 @@ cd apps/web && node scripts/verify-frontend.js
 
 ---
 
+## 🔴🔴🔴 Alert Chain E2E Validation (2026-03-26 ADR-025)
+
+> **Lesson learned the hard way**: a wrong URL path (`webhook` vs `webhooks`) plus a wrong DNS rule caused 2 days without Telegram alerts
+
+### Post-deployment Smoke Test (mandatory)
+
+**Must run after any API or Alertmanager deployment:**
+
+```bash
+#!/bin/bash
+# scripts/smoke-test-alert-chain.sh
+
+API_URL="${1:-http://192.168.0.120:32334}"
+
+echo "🔔 Testing alert chain..."
+
+RESPONSE=$(curl -s -X POST "$API_URL/api/v1/webhooks/alertmanager" \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "receiver": "smoke-test",
+    "status": "firing",
+    "alerts": [{
+      "status": "firing",
+      "labels": {"alertname": "SmokeTest", "severity": "info"},
+      "annotations": {"summary": "Smoke test from CI"}
+    }]
+  }')
+
+# Verify success
+if echo "$RESPONSE" | jq -e '.success == true' > /dev/null 2>&1; then
+  echo "✅ Alert chain smoke test passed"
+  echo "📬 Approval ID: $(echo "$RESPONSE" | jq -r '.approval_id')"
+  exit 0
+else
+  echo "❌ Alert chain smoke test FAILED"
+  echo "$RESPONSE"
+  exit 1
+fi
+```
+
+### Checks
+
+| Item | How to verify | Action on failure |
+|------|---------|---------|
+| Webhook endpoint reachable | curl receives 200 | Roll back the deployment |
+| Telegram notification delivered | Check message_id | Check DNS/Token |
+| Approval created | approval_id is present | Check the DB connection |
+
+### DNS Connectivity Check
+
+```bash
+# Test from inside the Pod
+kubectl exec -n awoooi-prod deployment/awoooi-api -- \
+  python -c "import socket; print(socket.gethostbyname('api.telegram.org'))"
+```
+
+**Failure causes**:
+- CoreDNS upstream DNS misconfigured (`127.0.0.53`)
+- NetworkPolicy DNS rule labels do not match
+
+---
+
 ## Playwright Automation Conventions
 
 ### Test Script Structure
@@ -532,6 +595,86 @@ ls -la ~/actions-runner-awoooi*/_work/_temp/
 
 ---
 
+## 🧠 LLM Testing Strategy (2026-03-26 Chief Architect Review)
+
+> **Background**: LLM tests are inherently non-deterministic and need special handling to keep CI stable
+> **ADR**: ADR-018-llm-testing-strategy.md (Deferred - Option A adopted)
+
+### Deterministic Parameters (required)
+
+```python
+# ✅ Every LLM test must use deterministic parameters
+response = await client.post(
+    f"{OLLAMA_URL}/api/chat",
+    json={
+        "model": model,
+        "messages": messages,
+        "stream": False,
+        "options": {
+            "temperature": 0.0,  # 🔴 deterministic output
+            "seed": 42,          # 🔴 reproducibility
+        },
+    },
+)
+```
+
+### CI Tiering Strategy
+
+| Tier | Workflow | Run time | Tests included |
+|------|----------|----------|----------|
+| **Fast CI** | `ci.yaml` | ~3 min | Lint, Unit, Integration |
+| **Nightly LLM** | `nightly-llm.yaml` | ~45 min | Prompt Validation, Model Regression |
+| **Daily E2E** | `daily-e2e-health.yaml` | ~5 min | Health Check, K8s validation |
+
+### Ollama CPU Mode Notes
+
+> **192.168.0.188**: CPU-only inference (no GPU), roughly 0.45 tok/s
+
+```python
+# CPU mode needs a sufficiently long timeout
+TIMEOUT = 300  # seconds (CPU inference takes ~222-666 s)
+
+async with httpx.AsyncClient(timeout=TIMEOUT) as client:
+    response = await client.post(...)
+```
+
+### Running Tests by Category
+
+```bash
+# Fast tests (every CI run)
+pytest apps/api/tests/ -k "not llm and not model" -v
+
+# LLM tests (nightly)
+pytest apps/api/tests/test_model_regression.py -v
+pytest apps/api/tests/test_prompt_validation.py -v
+```
+
+### Traditional Chinese Output Validation
+
+```python
+# The system prompt must insist on Traditional Chinese
+# (the prompt text itself stays in Chinese because it is sent to the model)
+AWOOOI_SYSTEM_PROMPT = """
+...
+- 【重要】必須使用台灣繁體中文回應 (Traditional Chinese Taiwan)
+- 禁止使用簡體中文字符 (如:与→與、说→說、这→這)
+...
+"""
+
+# Example validator: rejects responses containing common Simplified characters
+def validate_traditional_chinese(response: str) -> bool:
+    simplified_chars = ["与", "说", "这", "为", "时"]
+    return not any(c in response for c in simplified_chars)
+```
+
+### References
+
+- `src/core/prompts.py`: centralized System Prompt (ADR-019)
+- `tests/test_model_regression.py`: model regression tests
+- `tests/test_prompt_validation.py`: prompt quality tests
+- `.github/workflows/nightly-llm.yaml`: Nightly LLM Workflow
+
+---
+
 ## Reference Documents
 
 - `apps/web/playwright.config.ts`: Playwright configuration
@@ -542,3 +685,7 @@ ls -la ~/actions-runner-awoooi*/_work/_temp/
 - `src/core/telemetry.py`: **Phase 15.2 Trace Context**
 - `memory/project_phase15_langfuse.md`: **📊 Phase 15 full record**
 - `memory/feedback_runner_zombie_process.md`: **🚨 Runner zombie process fix**
+- `docs/adr/ADR-018-llm-testing-strategy.md`: **🧠 LLM testing strategy (Deferred)**
+- `docs/adr/ADR-019-system-prompt-management.md`: **📝 Centralized System Prompt management**
+- `.github/workflows/nightly-llm.yaml`: **🌙 Nightly LLM tests**
+- `.github/workflows/daily-e2e-health.yaml`: **🏥 Daily E2E health check**
diff --git a/CLAUDE.md b/CLAUDE.md
index 6a02df53..c9c51441 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -95,6 +95,7 @@
 - `*telegram*` → Telegram Token section
 - `apps/web/**` → i18n section
 - Incident/Approval flow → confirm the Telegram + DB chain
+- **Alertmanager/NetworkPolicy** → ADR-025 Alert Chain E2E Validation 🔴🔴
 
 ---
 
@@ -120,6 +121,7 @@
 | **API paths** | `feedback_api_path_naming.md` 🔴 changes must be synced with the frontend |
 | **Deployment verification** | `feedback_deployment_verification.md` 🔴🔴 the Pod version must be verified |
 | **Deployment layer** | `feedback_deployment_layer_decision.md` 🔴🔴🔴 host/container/K3s must be evaluated |
+| **Alert chain** | `feedback_alertchain_e2e_validation.md` 🔴🔴🔴 Alertmanager→API→Telegram |
 
 ---
 
@@ -207,7 +209,9 @@ The pre-commit hook automatically checks for and blocks Router-layer violations
 | DevOps | `.agents/skills/04-awoooi-devops-commander.md` |
 | Testing | `.agents/skills/05-awoooi-sre-qa.md` |
 | Git | `.agents/skills/06-awoooi-monorepo-master.md` |
-| **Tool integration** | `.agents/skills/07-tool-integration-expert.md` 🆕 |
+| Tool integration | `.agents/skills/07-tool-integration-expert.md` |
+| Model routing | `.agents/skills/08-model-router-expert.md` |
+| **Strangler refactoring** | `.agents/skills/09-strangler-pattern-expert.md` 🆕 |
 
 ## Memory System
 
diff --git a/docs/adr/ADR-011-networkpolicy-governance.md b/docs/adr/ADR-011-networkpolicy-governance.md
index 19ae31b6..9f103cbc 100644
--- a/docs/adr/ADR-011-networkpolicy-governance.md
+++ b/docs/adr/ADR-011-networkpolicy-governance.md
@@ -265,7 +265,7 @@ k8s/awoooi-prod/
 
 ---
 
-## Appendix: Root Cause Analysis of Today's Incident
+## Appendix A: Root Cause Analysis of the 2026-03-23 Incident
 
 ```
 2026-03-23 Y button execution timed out
@@ -286,3 +286,54 @@ k8s/awoooi-prod/
 2. NetworkPolicy must allow the complete routing path
 3. Changes should be validated with dry-run first
 ```
+
+---
+
+## Appendix B: Root Cause Analysis of the 2026-03-26 DNS Rule Incident
+
+```
+2026-03-26 Two days without Telegram alerts
+
+Root cause 1: wrong Alertmanager URL path
+  Configured: /api/v1/webhook/alertmanager (singular)
+  Actual:     /api/v1/webhooks/alertmanager (plural)
+  Result:     404 Not Found
+
+Root cause 2: wrong labels in the NetworkPolicy DNS rule
+  Configured: podSelector requires environment=prod, system=awoooi
+  Actual:     CoreDNS only has k8s-app=kube-dns
+  Result:     Pods cannot connect to CoreDNS
+
+Root cause 3: wrong CoreDNS upstream DNS configuration
+  Configured: forward . /etc/resolv.conf → 127.0.0.53 (systemd-resolved)
+  Actual:     127.0.0.53 cannot be used inside a container
+  Result:     external DNS resolution fails
+
+Fixes:
+  1. Correct the URL path in the Alertmanager ConfigMap
+  2. Use the correct namespaceSelector in the NetworkPolicy
+  3. Switch CoreDNS to 8.8.8.8 1.1.1.1
+
+Lessons:
+  1. URL paths must be verified by E2E tests (ADR-025)
+  2. NetworkPolicy DNS rules must use a namespace selector
+  3. CoreDNS must not depend on the host's systemd-resolved
+```
+
+### DNS Rule Best Practice
+
+```yaml
+# ✅ Correct way to write the DNS rule
+- ports:
+    - port: 53
+      protocol: UDP
+    - port: 53
+      protocol: TCP
+  to:
+    - namespaceSelector:
+        matchLabels:
+          kubernetes.io/metadata.name: kube-system
+      podSelector:
+        matchLabels:
+          k8s-app: kube-dns
+```
diff --git a/docs/adr/ADR-025-alert-chain-e2e-validation.md b/docs/adr/ADR-025-alert-chain-e2e-validation.md
new file mode 100644
index 00000000..3989c84e
--- /dev/null
+++ b/docs/adr/ADR-025-alert-chain-e2e-validation.md
@@ -0,0 +1,196 @@
+# ADR-025: Alert Chain E2E Validation Architecture
+
+**Status**: Approved
+**Date**: 2026-03-26
+**Decision maker**: Commander (統帥)
+**Trigger**: a wrong URL path plus a wrong NetworkPolicy DNS rule caused 2 days without alerts
+
+## Problem Statement
+
+```
+Incident timeline (2026-03-26):
+├── Alertmanager configured with /api/v1/webhook/alertmanager (singular)
+├── Actual API path is /api/v1/webhooks/alertmanager (plural)
+├── Result: 404 Not Found, every alert lost
+├── Meanwhile: the NetworkPolicy DNS rule used the wrong labels
+├── CoreDNS could not resolve external DNS (it used 127.0.0.53)
+└── Outcome: 2 days with no Telegram alerts at all
+```
+
+**Root causes**:
+1. No E2E test verified the Alertmanager → API → Telegram chain
+2. No post-deployment Smoke Test confirmed the endpoint was reachable
+3. The NetworkPolicy DNS rule labels did not match CoreDNS
+4. The CoreDNS upstream DNS configuration depended on systemd-resolved (unusable inside a container)
+
+---
+
+## Decision: Four-Layer Validation Architecture
+
+```
+┌──────────────────────────────────────────────────────────────────────┐
+│               Alert Chain E2E Validation Architecture                │
+├──────────────────────────────────────────────────────────────────────┤
+│                                                                      │
+│  Layer 1: Post-deployment Smoke Test (mandatory)                     │
+│  ═══════════════════════════════════════════════                     │
+│  ┌────────────────────────────────────────────────────────────────┐  │
+│  │ Runs automatically after every deployment:                     │  │
+│  │ 1. curl POST /api/v1/webhooks/alertmanager (test alert)        │  │
+│  │ 2. Verify the response has success=true                        │  │
+│  │ 3. Verify a Telegram message_id exists                         │  │
+│  │ 4. On failure → roll back the deployment                       │  │
+│  └────────────────────────────────────────────────────────────────┘  │
+│                                                                      │
+│  Layer 2: DNS Connectivity Check                                     │
+│  ═══════════════════════════════                                     │
+│  ┌────────────────────────────────────────────────────────────────┐  │
+│  │ The health probe must include:                                 │  │
+│  │ - Internal DNS: kubernetes.default.svc.cluster.local           │  │
+│  │ - External DNS: api.telegram.org                               │  │
+│  │ Either check failing → Pod marked Not Ready                    │  │
+│  └────────────────────────────────────────────────────────────────┘  │
+│                                                                      │
+│  Layer 3: Alert Chain Heartbeat Monitoring                           │
+│  ═════════════════════════════════════════                           │
+│  ┌────────────────────────────────────────────────────────────────┐  │
+│  │ Prometheus rules:                                              │  │
+│  │ - awoooi_alerts_received_total                                 │  │
+│  │ - awoooi_telegram_sent_total                                   │  │
+│  │ Zero for 1 consecutive hour → fire a CRITICAL alert            │  │
+│  └────────────────────────────────────────────────────────────────┘  │
+│                                                                      │
+│  Layer 4: ConfigMap Validation Hook                                  │
+│  ══════════════════════════════════                                  │
+│  ┌────────────────────────────────────────────────────────────────┐  │
+│  │ Before modifying the Alertmanager ConfigMap:                   │  │
+│  │ 1. Extract the webhook URL                                     │  │
+│  │ 2. Test URL reachability with curl                             │  │
+│  │ 3. Must receive 200 or 422 (bad payload but endpoint exists)   │  │
+│  │ 4. Validation fails → block apply                              │  │
+│  └────────────────────────────────────────────────────────────────┘  │
+│                                                                      │
+└──────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Implementation Details
+
+### 1. Post-deployment Smoke Test
+
+```yaml
+# Mandatory CI/CD step
+- name: Alert Chain Smoke Test
+  run: |
+    # Send a test alert
+    RESPONSE=$(curl -s -X POST "$API_URL/api/v1/webhooks/alertmanager" \
+      -H 'Content-Type: application/json' \
+      -d '{"receiver":"smoke-test","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"SmokeTest","severity":"info"},"annotations":{"summary":"CI Smoke Test"}}]}')
+
+    # Verify success
+    echo "$RESPONSE" | jq -e '.success == true' || exit 1
+    echo "Alert chain smoke test passed"
+```
+
+### 2. NetworkPolicy DNS Rules (correct form)
+
+```yaml
+# ❌ Wrong: uses labels that do not exist
+- ports:
+    - port: 53
+  to:
+    - podSelector:
+        matchLabels:
+          environment: prod  # CoreDNS does not have this label!
+          k8s-app: kube-dns
+          system: awoooi     # CoreDNS does not have this label!
+
+# ✅ Correct: use a namespace selector
+- ports:
+    - port: 53
+      protocol: UDP
+    - port: 53
+      protocol: TCP
+  to:
+    - namespaceSelector:
+        matchLabels:
+          kubernetes.io/metadata.name: kube-system
+      podSelector:
+        matchLabels:
+          k8s-app: kube-dns
+```
+
+### 3. CoreDNS Upstream DNS Configuration
+
+```yaml
+# ❌ Wrong: uses /etc/resolv.conf (points to 127.0.0.53)
+forward . /etc/resolv.conf
+
+# ✅ Correct: use real DNS servers
+forward . 8.8.8.8 1.1.1.1
+```
+
+### 4. Prometheus Alert Chain Monitoring Rules
+
+```yaml
+groups:
+  - name: alert-chain-health
+    rules:
+      - alert: AlertChainBroken
+        expr: increase(awoooi_alerts_received_total[1h]) == 0
+        for: 1h
+        labels:
+          severity: critical
+        annotations:
+          summary: "Alert chain broken! No alerts received in the past hour"
+
+      - alert: TelegramNotificationFailed
+        expr: increase(awoooi_telegram_sent_total[1h]) == 0 and increase(awoooi_alerts_received_total[1h]) > 0
+        for: 30m
+        labels:
+          severity: critical
+        annotations:
+          summary: "Telegram notification failed! Alerts were received but none were sent successfully"
+```
+
+---
+
+## URL Path Convention
+
+| Correct | Wrong |
+|-----|------|
+| `/api/v1/webhooks/alertmanager` | `/api/v1/webhook/alertmanager` |
+| Plural form `webhooks` | Singular form `webhook` |
+| `/api/v1/approvals` | `/api/v1/approval` |
+| `/api/v1/incidents` | `/api/v1/incident` |
+
+**Principle**: API routers consistently use plural names
+
+---
+
+## Acceptance Criteria
+
+| Item | Status |
+|------|------|
+| CI/CD includes the Alert Chain Smoke Test | ⬜ |
+| NetworkPolicy DNS rule uses the correct labels | ✅ |
+| CoreDNS uses real upstream DNS servers | ✅ |
+| Prometheus alert chain monitoring rules deployed | ⬜ |
+| Validation hook runs before ConfigMap changes | ⬜ |
+
+---
+
+## Lessons Learned
+
+> "The path was off by one s, hence the 404." An elementary mistake like this must never happen again.
+> It has to be caught by automated validation, not by human review.
+
+---
+
+## Related Documents
+
+- Memory: `feedback_alertchain_e2e_validation.md`
+- ADR-011: NetworkPolicy Change Governance Architecture
+- Skill 04: DevOps Commander
+- Skill 05: SRE QA
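
Reviewer note (not part of the patch): ADR-025 describes Layer 4, the ConfigMap validation hook, only as a four-step list, and the acceptance table still marks it as not done. The steps could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the regular expression, the sample host in the usage note, and the choice to probe with an empty `{}` payload are all hypothetical.

```python
import re
import urllib.error
import urllib.request

# Status codes that show the route exists: 200 (accepted) or 422
# (payload rejected by validation, but the endpoint is there).
ACCEPTABLE_STATUS = frozenset({200, 422})

# Hypothetical pattern: pulls `url:` values out of raw Alertmanager config text.
WEBHOOK_URL_RE = re.compile(r"^\s*-?\s*url:\s*['\"]?(https?://[^\s'\"]+)", re.MULTILINE)


def extract_webhook_urls(alertmanager_config: str) -> list[str]:
    """Step 1: return every webhook URL found in the config text."""
    return WEBHOOK_URL_RE.findall(alertmanager_config)


def endpoint_exists(status_code: int) -> bool:
    """Step 3: only 200 or 422 counts as 'endpoint exists'."""
    return status_code in ACCEPTABLE_STATUS


def probe(url: str, timeout: float = 5.0) -> int:
    """Step 2: POST an empty JSON object and return the HTTP status (0 if unreachable)."""
    request = urllib.request.Request(
        url,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses still carry a status code
    except OSError:
        return 0  # DNS failure, refused connection, timeout


def validate_config(alertmanager_config: str, probe_fn=probe) -> list[str]:
    """Step 4: return the URLs that failed; an empty list means apply may proceed."""
    return [
        url
        for url in extract_webhook_urls(alertmanager_config)
        if not endpoint_exists(probe_fn(url))
    ]
```

A wrapper script could call `validate_config` on the rendered ConfigMap and exit non-zero when the returned list is non-empty, which would block the subsequent `kubectl apply`. Passing a stub as `probe_fn` keeps the check testable without network access; for example, a stub that returns 404 for any URL containing `/webhook/` would flag the singular path that caused the incident.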