## Phase 0 (documentation layer, all Accepted)
- ADR-106/107: AwoooP platform architecture + storage strategy
- ADR-111~118: Bootstrap → RLS, seven core ADRs
- ADR-119~124: SAGA → Singleton Decomposition, six ADRs
- ADR-UI-01~04: four Operator Console UI ADRs

## Phase 1 (DB schema + migration)
- awooop_phase1_control_plane_2026-05-04.sql: 7 new tables + triggers + RLS
  - Step 1: three roles (platform_admin/migration with BYPASSRLS; awooop_app subject to RLS)
  - Step 13: GRANT least privileges to awooop_app (7 statements)
  - Step 14: RLS fail-closed; remove the __platform__ backdoor
- awooop_phase1_batch1_rls_2026-05-04.sql: three-step ADD COLUMN for the four high-traffic tables
- awooop_phase1_batch1_backfill.py: batched backfill script using SKIP LOCKED
- awooop_models.py: 7 SQLAlchemy 2.x models

## Critic fixes (4 Critical + 3 Major)
- C-1: ADD CONSTRAINT IF NOT EXISTS → DO block + pg_constraint lookup
- C-2: __mapper_args__ string list → primary_key=True on mapped_column
- C-3: __platform__ RLS backdoor → removed entirely; use a BYPASSRLS role instead
- C-4: awooop_app role was never created → Step 1 + 7 GRANT statements
- M-1: active_pointer_guard SECURITY DEFINER (cross-tenant protection under FORCE RLS)
- M-2: idempotency guard added around pg_partman create_parent
- M-3: immutability trigger now protects identity columns (project_id/family/contract_id)

## Task 1.2 fixes
- agent_loader.py: hard-coded Mac path → AGENTS_DIR environment variable
- Dockerfile: add the missing COPY .claude/agents/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AwoooP Partition & Retention Runbook
Created: 2026-05-04 (Taipei)
ADR basis: ADR-114 (channel_event_dedupe), ADR-119 (run_state)
Related: Phase 1 Task 1.4; Phase 4/7 (applied when run_state / mcp_gateway_audit are created)
Overview
| Table | Partition strategy | Retention | Created in |
|---|---|---|---|
| awooop_channel_event_dedupe | RANGE by created_at (daily) | 7 days | Phase 1 ✅ |
| awooop_run_state | RANGE by created_at (monthly) | 90 days hot + 1 year warm | Phase 4 |
| awooop_mcp_gateway_audit | RANGE by created_at (monthly) | 90 days hot + 1 year warm | Phase 5 |
| awooop_agent_audit_log | RANGE by created_at (monthly) | 90 days hot + 1 year warm | Phase 7 |
1. awooop_channel_event_dedupe (Phase 1, completed)
pg_partman maintenance (recommended)
-- Confirm that pg_partman is installed
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_partman';
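-- Initialize (only if the Phase 1 migration did not already do so).
-- Note: p_type := 'native' is the pg_partman 4.x form; pg_partman 5.x expects 'range'.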
SELECT partman.create_parent(
p_parent_table := 'public.awooop_channel_event_dedupe',
p_control := 'created_at',
p_type := 'native',
p_interval := '1 day',
p_premake := 4
);
UPDATE partman.part_config
SET retention = '7 days',
retention_keep_table = false
WHERE parent_table = 'public.awooop_channel_event_dedupe';
pg_partman scheduled maintenance (CronJob, daily at 00:00)
# Run from a K8s CronJob or pg_cron
psql "$DATABASE_URL" -c "SELECT partman.run_maintenance('public.awooop_channel_event_dedupe');"
Manual maintenance (without pg_partman)
-- List the existing partitions
SELECT inhrelid::regclass AS partition_name,
pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE inhparent = 'awooop_channel_event_dedupe'::regclass
ORDER BY bounds;
-- Create the upcoming partition (run once daily; as written, this creates only tomorrow's partition)
DO $$
DECLARE
d DATE := CURRENT_DATE + 1;
BEGIN
EXECUTE format(
'CREATE TABLE IF NOT EXISTS awooop_channel_event_dedupe_%s
PARTITION OF awooop_channel_event_dedupe
FOR VALUES FROM (%L) TO (%L)',
to_char(d, 'YYYYMMDD'),
d::TIMESTAMPTZ,
(d + INTERVAL '1 day')::TIMESTAMPTZ
);
END $$;
-- Drop the partition that has aged past 7 days (takes milliseconds, far cheaper than a row-by-row DELETE)
DO $$
DECLARE
old_date DATE := CURRENT_DATE - 8;
partition_name TEXT := 'awooop_channel_event_dedupe_' || to_char(old_date, 'YYYYMMDD');
BEGIN
EXECUTE format('DROP TABLE IF EXISTS %I', partition_name);
RAISE NOTICE 'Dropped partition: %', partition_name;
END $$;
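The two DO blocks above each handle exactly one day, so a missed run leaves a gap or a stale partition behind. As a rough sketch of how a scheduled job could catch up, the Python below generates the same DDL for several days ahead and lists every partition past retention. It assumes the `<parent>_YYYYMMDD` naming used above; the function names and the `days_ahead` parameter are hypothetical, and the statements would still need to be executed by whatever database client the job uses.

```python
from datetime import date, timedelta

PARENT = "awooop_channel_event_dedupe"
RETENTION_DAYS = 7  # matches the 7-day retention in the overview table


def partition_name(day: date) -> str:
    """Name a daily partition the way the runbook does: <parent>_YYYYMMDD."""
    return f"{PARENT}_{day:%Y%m%d}"


def create_statements(today: date, days_ahead: int = 1) -> list[str]:
    """Build CREATE TABLE ... PARTITION OF statements for the next days_ahead days."""
    stmts = []
    for offset in range(1, days_ahead + 1):
        start = today + timedelta(days=offset)
        end = start + timedelta(days=1)
        stmts.append(
            f"CREATE TABLE IF NOT EXISTS {partition_name(start)} "
            f"PARTITION OF {PARENT} "
            f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}')"
        )
    return stmts


def expired_partitions(existing: list[str], today: date) -> list[str]:
    """Return every existing partition whose date suffix is past retention.

    Mirrors the manual DROP above (CURRENT_DATE - 8 and older), but covers
    all stale partitions instead of a single day.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS)
    prefix = PARENT + "_"
    expired = []
    for name in existing:
        if not name.startswith(prefix):
            continue
        suffix = name[len(prefix):]
        if len(suffix) != 8 or not suffix.isdigit():
            continue
        try:
            day = date(int(suffix[:4]), int(suffix[4:6]), int(suffix[6:]))
        except ValueError:
            continue  # suffix is not a real calendar date; leave the table alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)
```

The list passed to `expired_partitions` could come from the pg_inherits query shown earlier. Because the CREATE statements use IF NOT EXISTS, rerunning the job on the same day is harmless.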
2. awooop_run_state (apply when created in Phase 4)
Not yet created. Use the template below when creating the awooop_run_state table in Phase 4.
Monthly partition creation template
CREATE TABLE awooop_run_state (
run_id UUID NOT NULL DEFAULT gen_random_uuid(),
project_id VARCHAR(64) NOT NULL,
-- ... other columns ...
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (run_id, created_at)
) PARTITION BY RANGE (created_at);
-- pg_partman initialization
SELECT partman.create_parent(
p_parent_table := 'public.awooop_run_state',
p_control := 'created_at',
p_type := 'native',
p_interval := '1 month',
p_premake := 3
);
UPDATE partman.part_config
SET retention = '90 days', -- hot tier: partitions older than 90 days leave the partition set
retention_keep_table = true, -- warm tier: keep the detached table instead of dropping it
retention_schema = 'warm_archive' -- detached partitions are moved to this schema (it must already exist)
WHERE parent_table = 'public.awooop_run_state';
Retention strategy
| Tier | Data age | Location | Cleanup |
|---|---|---|---|
| Hot | 0~90 days | public.awooop_run_state_* | Managed automatically by pg_partman |
| Warm | 91 days~1 year | warm_archive.awooop_run_state_* | Retained (detached, not dropped) |
| Cold | > 1 year | S3 / GCS export (optional) | Manual COPY TO, then DROP |
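To make the tier boundaries concrete, here is a minimal Python sketch that classifies a partition by age. It assumes age is measured from the partition's upper bound and that "1 year" means 365 days; both are assumptions of this sketch, not statements about how pg_partman evaluates retention. The function name `tier_for` is hypothetical.

```python
from datetime import date

HOT_DAYS = 90    # hot tier limit from the retention table
WARM_DAYS = 365  # "1 year" approximated as 365 days (assumption)


def tier_for(partition_end: date, today: date) -> str:
    """Classify a partition as hot, warm, or cold by the age of its upper bound."""
    age_days = (today - partition_end).days
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    return "cold"
```

A cold-tier export job could use this to select which `warm_archive` tables are due for the manual COPY TO and DROP step.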
3. awooop_mcp_gateway_audit + awooop_agent_audit_log (Phase 5/7)
Same strategy as awooop_run_state (monthly partitions, 90 days hot + 1 year warm).
When creating these tables, apply the same pg_partman template and replace the table name.
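Since only the table name changes, the template can be rendered rather than copied by hand. The sketch below is one way to do that in Python; it reuses the values from the awooop_run_state template in section 2 (including `p_type := 'native'`, which is the pg_partman 4.x form). The function name `render_partman_setup` and the identifier check are hypothetical additions.

```python
import re

# Same settings as the awooop_run_state template; only the table name varies.
TEMPLATE = """\
SELECT partman.create_parent(
    p_parent_table := 'public.{table}',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := '1 month',
    p_premake      := 3
);
UPDATE partman.part_config
SET retention            = '90 days',
    retention_keep_table = true,
    retention_schema     = 'warm_archive'
WHERE parent_table = 'public.{table}';
"""

# Accept only plain lowercase identifiers so the name is safe to interpolate.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def render_partman_setup(table: str) -> str:
    """Return the pg_partman setup SQL for a monthly-partitioned audit table."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return TEMPLATE.format(table=table)
```

For example, `render_partman_setup("awooop_mcp_gateway_audit")` yields the two statements to run right after the Phase 5 CREATE TABLE.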
Health checks
-- Check the size and row distribution across partitions
SELECT
child.relname AS partition,
pg_size_pretty(pg_total_relation_size(child.oid)) AS size,
child.reltuples::bigint AS approx_rows -- planner estimate, refreshed by ANALYZE/autovacuum
FROM pg_inherits
JOIN pg_class parent ON parent.oid = pg_inherits.inhparent
JOIN pg_class child ON child.oid = pg_inherits.inhrelid
WHERE parent.relname = 'awooop_channel_event_dedupe'
ORDER BY child.relname;
-- Check the oldest data (min(created_at) should be no more than about 7 days old)
SELECT min(created_at), max(created_at) FROM awooop_channel_event_dedupe;
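Alert rules (recommended for Prometheus)
# Abnormal partition count (expected around 7 ± 2; the rule fires outside 5..10).
# PromQL cannot run SQL, so this assumes an exporter publishes the pg_inherits
# count as a gauge; the metric name awooop_dedupe_partition_count is hypothetical.
- alert: AwoooPDedupPartitionCountAbnormal
  expr: awooop_dedupe_partition_count < 5 or awooop_dedupe_partition_count > 10
  labels:
    severity: warning
# pg_partman maintenance has not run for more than 25 hours
- alert: AwoooPPartmanMaintenanceStale
  expr: time() - awooop_partman_last_run_timestamp > 90000
  labels:
    severity: warning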
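Both alerts depend on gauges that something has to publish. The sketch below shows roughly what that could look like: it renders the two values in the Prometheus text exposition format and mirrors the alert thresholds so they can be checked locally. `awooop_partman_last_run_timestamp` comes from the rule above; the metric name `awooop_dedupe_partition_count` and all function names are hypothetical, and how the text is served (exporter, Pushgateway, textfile collector) is left open.

```python
def render_metrics(partition_count: int, last_run_epoch: float) -> str:
    """Render the two gauges in Prometheus text exposition format."""
    lines = [
        "# TYPE awooop_dedupe_partition_count gauge",
        f"awooop_dedupe_partition_count {partition_count}",
        "# TYPE awooop_partman_last_run_timestamp gauge",
        f"awooop_partman_last_run_timestamp {int(last_run_epoch)}",
    ]
    return "\n".join(lines) + "\n"


def partition_count_abnormal(count: int, low: int = 5, high: int = 10) -> bool:
    """Mirror the partition-count alert: true when count falls outside low..high."""
    return count < low or count > high


def maintenance_stale(now_epoch: float, last_run_epoch: float,
                      max_age_seconds: int = 90000) -> bool:
    """Mirror the staleness alert: true when the last run is over 25 hours old."""
    return now_epoch - last_run_epoch > max_age_seconds
```

The partition count would come from the pg_inherits query in the manual maintenance section, and the timestamp from whatever job wraps `partman.run_maintenance`.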