awoooi/docs/adr/ADR-107-awooop-control-plane-storage.md
Your Name 13e51802fe feat(awooop): all Phase 0 ADRs + Phase 1 control plane schema (including four critic fixes)
## Phase 0 (documentation layer, all Accepted)
- ADR-106/107: AwoooP platform architecture + storage strategy
- ADR-111~118: Bootstrap → RLS, seven core ADRs
- ADR-119~124: SAGA → Singleton Decomposition, six ADRs
- ADR-UI-01~04: four Operator Console UI ADRs

## Phase 1 (DB schema + migration)
- awooop_phase1_control_plane_2026-05-04.sql: 7 new tables + triggers + RLS
  - Step 1: three roles (platform_admin/migration with BYPASSRLS, awooop_app subject to RLS)
  - Step 13: GRANT least privilege to awooop_app (7 statements)
  - Step 14: RLS fail-closed, remove the __platform__ backdoor
- awooop_phase1_batch1_rls_2026-05-04.sql: three-step ADD COLUMN for the four high-traffic tables
- awooop_phase1_batch1_backfill.py: batched backfill script using SKIP LOCKED
- awooop_models.py: 7 SQLAlchemy 2.x models

## Critic fixes (4 Critical + 3 Major)
- C-1: ADD CONSTRAINT IF NOT EXISTS → DO block + pg_constraint lookup
- C-2: __mapper_args__ string list → primary_key=True on mapped_column
- C-3: __platform__ RLS backdoor → removed entirely, replaced by a BYPASSRLS role
- C-4: awooop_app role was never created → Step 1 + 7 GRANT statements
- M-1: active_pointer_guard SECURITY DEFINER (cross-tenant protection under FORCE RLS)
- M-2: add an idempotency guard to pg_partman create_parent
- M-3: immutability trigger now also protects identity columns (project_id/family/contract_id)

## Task 1.2 fixes
- agent_loader.py: hard-coded Mac path → AGENTS_DIR environment variable
- Dockerfile: add the missing COPY .claude/agents/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 13:37:11 +08:00


ADR-107: AwoooP Control Plane Storage Strategy

Status: Accepted
Date: 2026-05-01
Scope: AwoooP control plane storage, contract materialization, cache/watch, K8s projection

Context

ADR-106 defines AwoooP as the multi-tenant Agent Platform with six contracts:

  • Project / Tenant
  • Agent
  • MCP Gateway
  • Policy / Routing
  • Runtime / Run State
  • Communication / Channel Event

Those contracts need a physical source of truth. The key question is whether the AwoooP control plane should be implemented as Kubernetes CRDs first, or as PostgreSQL tables first.

AwoooP must serve more than Kubernetes:

  • AWOOOI on K3s is the first runtime host and tenant.
  • EwoooC / MOMO PRO currently runs Docker workloads.
  • Tsenyang, Bitan, LINE, Slack, Email, and API entrypoints may not all be Kubernetes-native.
  • Runtime state, audit, budget, approval, and conversation events are high-volume product control-plane data, not only deployment desired state.

ADR-105 already established the local pattern for MCP RAG: keep PostgreSQL + pgvector + Redis hot cache as the source of truth unless scale, latency, or tenant isolation requires a later split.

Decision

D1 - Use PostgreSQL as the Control Plane Source of Truth

AwoooP v1 uses PostgreSQL as the authoritative store for the six contract families and runtime governance state.

PostgreSQL owns:

  • contract drafts and immutable published revisions
  • active revision pointers
  • tenant/project records
  • agent registry and version metadata
  • policy and routing revisions
  • MCP tool registry, grants, and audit summaries
  • budget ledgers and hard-stop state
  • ACL subjects and project access
  • run state and approval state
  • channel event and outbound delivery state
  • audit and trace correlation records

Kubernetes etcd is not the source of truth for these records in v1.

D2 - Materialize Contracts as Versioned Revisions

Each published contract must produce an immutable revision.

Minimum revision fields:

  • contract_family
  • contract_id
  • revision_id
  • version
  • lifecycle_status
  • body_json
  • body_schema_version
  • body_hash
  • created_by
  • created_at
  • published_at
  • supersedes_revision_id

Runtime reads only published or active revisions. Mutable drafts are never used for runtime decisions.
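The revision fields above can be sketched as a frozen Python record. This is a minimal illustration, not the project's actual model: the ADR does not name the algorithm behind body_hash, so SHA-256 over canonical JSON is an assumption here, as are the lifecycle values "draft", "published", and "active" and the helper names `publish` and `runtime_readable`.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def canonical_json(body: dict) -> str:
    """Encode with sorted keys and no extra whitespace so equal bodies hash equally."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def compute_body_hash(body: dict) -> str:
    # Assumption: body_hash is SHA-256 over the canonical JSON encoding.
    return hashlib.sha256(canonical_json(body).encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class ContractRevision:
    contract_family: str
    contract_id: str
    revision_id: str
    version: int
    lifecycle_status: str  # assumed values: "draft", "published", "active"
    body_json: str         # stored as a canonical string so the record stays immutable
    body_schema_version: str
    body_hash: str
    created_by: str
    created_at: datetime
    published_at: Optional[datetime]
    supersedes_revision_id: Optional[str]


RUNTIME_READABLE = frozenset({"published", "active"})


def publish(family: str, contract_id: str, revision_id: str, version: int,
            body: dict, created_by: str, schema_version: str = "1",
            supersedes: Optional[str] = None) -> ContractRevision:
    """Build a published revision; the hash is derived from the body, never supplied."""
    now = datetime.now(timezone.utc)
    return ContractRevision(
        contract_family=family,
        contract_id=contract_id,
        revision_id=revision_id,
        version=version,
        lifecycle_status="published",
        body_json=canonical_json(body),
        body_schema_version=schema_version,
        body_hash=compute_body_hash(body),
        created_by=created_by,
        created_at=now,
        published_at=now,
        supersedes_revision_id=supersedes,
    )


def runtime_readable(rev: ContractRevision) -> bool:
    """Drafts are never eligible for runtime decisions."""
    return rev.lifecycle_status in RUNTIME_READABLE
```

Storing the body as a canonical string rather than a dict keeps the frozen record genuinely immutable and makes the hash independent of key order.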

D3 - Keep Artifact Bodies Out of Control Tables

Large or independently governed artifacts are stored by reference plus hash:

  • system prompts
  • JSON Schemas
  • eval suites
  • policy fixtures
  • replay fixtures

The database stores:

  • artifact ref
  • SHA-256 hash
  • artifact type
  • expected schema version

This makes historical runs replayable without stuffing large prompt bodies into every contract row.
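A minimal sketch of the ref-plus-hash idea: the control table keeps only a small reference record, and the loader refuses content whose SHA-256 no longer matches. The `ArtifactRef` class, the `fetch` callable, and the example ref string are hypothetical; the ADR does not prescribe a ref format or a storage client.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ArtifactRef:
    ref: str                      # hypothetical format, e.g. a Git path or object key
    sha256: str
    artifact_type: str
    expected_schema_version: str


class ArtifactIntegrityError(Exception):
    """Raised when fetched content does not match the recorded hash."""


def make_ref(ref: str, content: bytes, artifact_type: str,
             expected_schema_version: str) -> ArtifactRef:
    """Record the hash at publish time so later reads can be verified."""
    digest = hashlib.sha256(content).hexdigest()
    return ArtifactRef(ref, digest, artifact_type, expected_schema_version)


def load_verified(artifact: ArtifactRef, fetch: Callable[[str], bytes]) -> bytes:
    """Fetch the artifact body and verify it against the stored SHA-256."""
    content = fetch(artifact.ref)
    actual = hashlib.sha256(content).hexdigest()
    if actual != artifact.sha256:
        raise ArtifactIntegrityError(
            f"{artifact.ref}: expected {artifact.sha256}, got {actual}"
        )
    return content
```

A replay of a historical run would call `load_verified` with the ref recorded for that run, so a silently edited prompt fails loudly instead of changing the replayed behavior.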

D4 - Use Redis for Cache and Watch, Not Authority

Redis is allowed for:

  • effective policy cache
  • agent contract cache
  • project boundary cache
  • rate limit counters
  • short-lived session hot state
  • contract revision invalidation
  • worker coordination

Redis is not allowed to be the only source of truth for:

  • contract revisions
  • active revision pointers
  • budget hard-stop decisions
  • run terminal state
  • approval decisions
  • audit history

Any cached value used by runtime must carry the source revision_id and body_hash.
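One way to read this rule is that a cache entry is never a bare value: it always travels with the revision_id and body_hash it was derived from, and invalidation is keyed on the revision. The sketch below uses a plain dict as a stand-in for Redis and a callable as a stand-in for the PostgreSQL read; `RevisionAwareCache` and its method names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CachedContract:
    revision_id: str
    body_hash: str
    body_json: str


class RevisionAwareCache:
    def __init__(self, load_active: Callable[[str], CachedContract]) -> None:
        self._load_active = load_active                 # authoritative read (the database)
        self._entries: Dict[str, CachedContract] = {}   # stand-in for Redis

    def get(self, contract_id: str) -> CachedContract:
        """Return the cached entry, falling back to the authoritative store on a miss."""
        entry = self._entries.get(contract_id)
        if entry is None:
            entry = self._load_active(contract_id)
            self._entries[contract_id] = entry
        return entry

    def invalidate(self, contract_id: str, new_revision_id: str) -> None:
        """Drop the entry only if it belongs to a different revision than the new active one."""
        entry = self._entries.get(contract_id)
        if entry is not None and entry.revision_id != new_revision_id:
            del self._entries[contract_id]
```

Because every returned entry carries its revision_id and body_hash, the caller can record both alongside the decision it makes, which keeps cached reads auditable.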

D5 - Use Kubernetes CRDs Only as Runtime Projection

CRDs are not the primary control-plane database in v1.

Allowed future CRD projections:

  • AwoooPRuntime
  • AwoooPWorker
  • MCPServerBinding
  • ChannelIngress
  • TenantRuntimeBinding

These CRDs describe Kubernetes runtime wiring, deployment, service exposure, and MCP server binding. They do not own product contracts, budgets, run state, conversation events, or audit ledgers.

If an operator is later built, it should project from PostgreSQL into Kubernetes objects, not require Kubernetes CRDs to become the source of truth for all AwoooP state.
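The one-way projection could look roughly like the following. This is a sketch under stated assumptions: the API group `awooop.example/v1alpha1`, the annotation keys, the `spec` layout, and the row columns are all hypothetical, since this ADR introduces no CRDs and defines no schema for them. Only the kind name TenantRuntimeBinding comes from the list above.

```python
from typing import Any, Dict, Optional

API_VERSION = "awooop.example/v1alpha1"           # hypothetical API group and version
REVISION_ANNOTATION = "awooop.example/revision-id"  # hypothetical annotation keys
HASH_ANNOTATION = "awooop.example/body-hash"


def project_runtime_binding(row: Dict[str, Any]) -> Dict[str, Any]:
    """Derive a Kubernetes object from a database row; the row stays authoritative."""
    return {
        "apiVersion": API_VERSION,
        "kind": "TenantRuntimeBinding",
        "metadata": {
            "name": row["project_id"],
            "annotations": {
                REVISION_ANNOTATION: row["revision_id"],
                HASH_ANNOTATION: row["body_hash"],
            },
        },
        "spec": {"runtime": row["runtime"]},  # hypothetical spec shape
    }


def needs_reconcile(desired: Dict[str, Any],
                    observed: Optional[Dict[str, Any]]) -> bool:
    """True when the cluster object is missing or has drifted from the projection."""
    if observed is None:
        return True
    want = desired["metadata"]["annotations"]
    have = observed.get("metadata", {}).get("annotations", {})
    if any(have.get(key) != value for key, value in want.items()):
        return True
    return observed.get("spec") != desired["spec"]
```

Stamping the source revision onto the projected object lets a reconciler detect drift by comparing annotations, without treating the cluster object as a second source of truth.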

D6 - Define Read Paths by Stability

| Data | Source of truth | Runtime read path |
|---|---|---|
| Contract draft | PostgreSQL | Admin/API only |
| Published contract revision | PostgreSQL | DB read + Redis cache |
| Active revision pointer | PostgreSQL | DB read + Redis invalidation |
| Prompt/schema/eval artifact | Git or object storage | Ref + hash from DB |
| Effective policy | PostgreSQL-derived | Redis cache with revision IDs |
| Run state | PostgreSQL | Worker writes DB, optional Redis notification |
| Budget hard stop | PostgreSQL | DB transaction + Redis hot counter |
| Rate limit | PostgreSQL policy + Redis counter | Redis counter, DB policy |
| Audit history | PostgreSQL | DB append/query |
| K8s runtime wiring | CRD projection | Kubernetes watch |
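To illustrate the budget hard-stop read path, the sketch below lets a database transaction make the decision and treats the hot counter as a mirror only. It uses the standard-library sqlite3 module purely as a runnable stand-in for PostgreSQL and a dict as a stand-in for Redis; the `budget` table, its columns, and the function names are hypothetical.

```python
import sqlite3
from typing import Dict


def open_ledger() -> sqlite3.Connection:
    # isolation_level=None puts sqlite3 in autocommit mode so transactions are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE budget ("
        " project_id TEXT PRIMARY KEY,"
        " limit_cents INTEGER NOT NULL,"
        " spent_cents INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def try_spend(conn: sqlite3.Connection, hot_counter: Dict[str, int],
              project_id: str, cost_cents: int) -> bool:
    """Atomically reserve budget; the database, not the counter, decides."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute(
            "UPDATE budget SET spent_cents = spent_cents + ? "
            "WHERE project_id = ? AND spent_cents + ? <= limit_cents",
            (cost_cents, project_id, cost_cents),
        )
        allowed = cur.rowcount == 1  # no row updated: over budget or unknown project
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    if allowed:
        # Mirror the spend for cheap reads; this value is never used to authorize.
        hot_counter[project_id] = hot_counter.get(project_id, 0) + cost_cents
    return allowed
```

The conditional UPDATE makes the check and the increment a single statement, so two concurrent workers cannot both pass the limit. An unknown project updates no row and is therefore denied, which matches a fail-closed posture.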

Consequences

Benefits

  • AwoooP remains useful for Docker, Next.js, API-only, and future non-K8s tenants.
  • Product governance data can be joined with incidents, Playbooks, KM, budget, and audit records.
  • Contract changes are transactional and replayable.
  • Runtime can use Redis for speed without risking split-brain authority.
  • K8s CRDs remain available for the infrastructure layer without forcing all business control-plane data into etcd.

Costs

  • AwoooP needs explicit schema and migration discipline before runtime code.
  • Runtime workers must implement revision-aware caches.
  • K8s operator work, if needed later, must include a projection layer.

Risks

  • PostgreSQL schema can become a dumping ground if contract families are not modeled explicitly.
  • Redis cache can become stale if invalidation is not versioned by revision and hash.
  • Future CRDs can drift from DB state if projection reconciliation is not audited.

Mitigations

  • Every runtime decision records revision_id and body_hash.
  • Active revision changes must emit an invalidation event.
  • Contract publish is append-only; do not mutate published body_json.
  • CRD projection writes must be reconciled and auditable.
  • High-volume audit/event tables should be partitioned before production scale.
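Two of the mitigations above, append-only publish and an invalidation event on every active-pointer change, can be sketched together. This in-memory `RevisionStore` is only an illustration of the invariants; the class, its methods, and the event shape are assumptions, and a real implementation would enforce the same rules in the database.

```python
from typing import Callable, Dict, Optional, Tuple

Event = Dict[str, Optional[str]]


class RevisionStore:
    def __init__(self, emit: Callable[[Event], None]) -> None:
        self._revisions: Dict[Tuple[str, str], str] = {}  # (contract_id, revision_id) -> body_json
        self._active: Dict[str, str] = {}                 # contract_id -> active revision_id
        self._emit = emit                                 # invalidation event sink

    def publish(self, contract_id: str, revision_id: str, body_json: str) -> None:
        """Append a new revision; an existing revision can never be overwritten."""
        key = (contract_id, revision_id)
        if key in self._revisions:
            raise ValueError(f"revision {revision_id} is already published and immutable")
        self._revisions[key] = body_json

    def activate(self, contract_id: str, revision_id: str) -> None:
        """Move the active pointer and emit an invalidation event if it changed."""
        if (contract_id, revision_id) not in self._revisions:
            raise KeyError(f"unknown revision {revision_id} for {contract_id}")
        previous = self._active.get(contract_id)
        self._active[contract_id] = revision_id
        if previous != revision_id:
            self._emit({
                "contract_id": contract_id,
                "revision_id": revision_id,
                "previous_revision_id": previous,
            })

    def active_body(self, contract_id: str) -> str:
        return self._revisions[(contract_id, self._active[contract_id])]
```

Changing a contract therefore always means publishing a new revision and moving the pointer, so caches only need to react to pointer events.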

Non-Goals

  • This ADR does not create database migrations.
  • This ADR does not create AwoooP directories or runtime services.
  • This ADR does not introduce Kubernetes CRDs.
  • This ADR does not choose Temporal, Celery, Redis Streams, or another worker engine.
  • This ADR does not change paid provider behavior.

Acceptance Criteria

  • PostgreSQL is declared as the v1 AwoooP control-plane source of truth.
  • Redis is limited to cache/watch/counter responsibilities.
  • Artifacts are referenced by ref + hash rather than copied into every runtime record.
  • CRD-first is explicitly rejected for v1.
  • Future CRDs are limited to Kubernetes runtime projection.
  • ADR-106 and the task-routing docs point to this ADR.

References

  • docs/adr/ADR-105-mcp-agent-loop-governance.md
  • docs/adr/ADR-106-agent-platform-architecture.md
  • docs/LOGBOOK.md
  • docs/12-agent-game-rules.md