docs(rls): record projects canary apply

Your Name
2026-05-12 21:40:47 +08:00
parent 7f94bc5776
commit edd06485e0
2 changed files with 97 additions and 1 deletions


@@ -1,3 +1,51 @@
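## 2026-05-12 | RLS Canary Wave1.2 projects applied
**Background**: After Wave1.1 completed `awooop_mcp_tool_registry`, the remaining low-row-count candidate was `awooop_projects`. This table backs both tenant runtime checks and the Operator Console cross-tenant project list, so RLS cannot simply be hot-enabled with a single-tenant policy.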
**Code / DB path convergence**
- `platform_operator_service.list_tenants()` now reads `public.awooop_operator_list_projects()`, so the Operator Console goes through an explicit cross-tenant read helper.
- `budget_service._get_tenant_budget_limit(project_id)` now uses `get_db_context(project_id)`, so it no longer queries another tenant's budget under the default `awoooi` context.
- Added Wave1.2 apply / rollback SQL:
  - `scripts/ops/awooop-rls-canary-wave1-2-projects.sql`
  - `scripts/ops/awooop-rls-canary-wave1-2-projects-rollback.sql`
- Added runbook: `docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md`
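The switch to a per-tenant DB context can be illustrated with a small sketch. The real `get_db_context` is not shown in this commit, so everything below is hypothetical: the helper name `tenant_db_context`, the `RecordingCursor` stand-in, and the `budget_limit` / `project_id` column names are assumptions made for illustration. The only real API relied on is PostgreSQL's `set_config(name, value, is_local)`.

```python
from contextlib import contextmanager


class RecordingCursor:
    """Stand-in for a DB cursor; it only records what would be executed."""

    def __init__(self):
        self.executed = []

    def execute(self, sql, params=()):
        self.executed.append((sql, tuple(params)))


@contextmanager
def tenant_db_context(cursor, project_id):
    """Hypothetical helper: scope one transaction to a single tenant."""
    cursor.execute("BEGIN")
    # is_local=true keeps the setting from leaking past this transaction.
    cursor.execute("SELECT set_config('app.project_id', %s, true)", (project_id,))
    try:
        yield cursor
        cursor.execute("COMMIT")
    except Exception:
        cursor.execute("ROLLBACK")
        raise


def get_tenant_budget_limit(cursor, project_id):
    """Look up a budget under the caller's tenant, not a default one."""
    with tenant_db_context(cursor, project_id) as cur:
        # Column names here are assumed, not taken from the real schema.
        cur.execute(
            "SELECT budget_limit FROM awooop_projects WHERE project_id = %s",
            (project_id,),
        )
```

The point of the design is that the tenant id comes from the function argument rather than from a process-wide default, so a lookup for `ewoooc` is never evaluated under the `awoooi` context.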
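**Deployment-order red light and rollback**
- First created `awooop_operator_list_projects()` in production and confirmed the function returns `awoooi` / `ewoooc`.
- Commit `7d92f0ac` had already been pushed to Gitea main, but at the first RLS apply the live API image was still `ff30c61c...`.
- Symptom: `/api/v1/platform/tenants` returned only `awoooi`, meaning the old code was still reading `awooop_projects` directly and RLS was correctly filtering it.
- The rollback SQL was run immediately; after rollback, `/api/v1/platform/tenants` was back to `total=2`.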
**Production re-apply**
- Confirmed K8s had rolled out to `192.168.0.110:5000/awoooi/api:7d92f0acd705451d99b4413ab9748482e3675c00`, 2/2 ready.
- Pre-apply gate:
  - `/api/v1/platform/tenants` → 200, `total=2`
  - `/api/v1/health` → 200, `status=healthy`
  - `awooop_operator_list_projects()` → `awoooi` / `ewoooc`
- Re-ran the Wave1.2 SQL via the 188 postgres/operator socket path; result: `COMMIT`.
**Post-apply verification**
- `/api/v1/platform/tenants` → 200, `total=2`
- `/api/v1/health` → 200, `status=healthy`
- `scripts/ops/awooop-rls-preflight.sh --exact-counts`:
  - `PASS=7 WARN=1 BLOCKED=1`
  - `awooop_projects`: `rls=true force=true policies=4 fail_open=false`
  - Remaining blocker tables: `audit_logs`, `awooop_outbound_message`, `awooop_run_state`, `incidents`, `knowledge_entries`, `playbooks`
- Direct app-role behavior:
  - no `app.project_id` → `[]`
  - `app.project_id='awoooi'` → `['awoooi']`
  - `app.project_id='ewoooc'` → `['ewoooc']`
  - `awooop_operator_list_projects()` under `awoooi` context → `['awoooi', 'ewoooc']`
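The four observed behaviors can be summarized in a toy model. This is not the policy SQL (which is not reproduced in this entry) but a plain-Python sketch of the visibility rules the verification describes; the function names `direct_read` and `operator_list_projects` are made up for the illustration.

```python
# The two reviewed projects named in the verification output.
PROJECTS = ["awoooi", "ewoooc"]


def direct_read(app_project_id=None):
    """Model a direct table read by the app role with RLS forced.

    With no tenant context the policy matches nothing (no fail-open);
    with a context only the matching tenant row is visible.
    """
    if app_project_id is None:
        return []
    return [p for p in PROJECTS if p == app_project_id]


def operator_list_projects(app_project_id=None):
    """Model the cross-tenant helper: the caller's context does not filter it."""
    return list(PROJECTS)


print(direct_read())                     # []
print(direct_read("awoooi"))             # ['awoooi']
print(direct_read("ewoooc"))             # ['ewoooc']
print(operator_list_projects("awoooi"))  # ['awoooi', 'ewoooc']
```

Read this way, the first apply attempt failed for a simple reason: the old API code was calling the equivalent of `direct_read("awoooi")` where the Operator Console needed `operator_list_projects()`.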
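**Overall progress**
- Wave 0: MOMO PostgreSQL backup → AwoooP failure-notification wiring is complete.
- Wave 1: competing GitHub deploy disabled, RLS live verification, role bootstrap, API runtime access path, manual script gate, Wave1 empty-table canary, Wave1.1 MCP tool registry, and Wave1.2 projects canary are complete.
- Not yet done: token rotation (requires external rotation), a proper fix for 188 certbot, the remaining RLS waves, and the shutdown window for the 188 local Ollama.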
**Next steps**
- Pick the next RLS candidate from `awooop_outbound_message` / `awooop_run_state`, and do the query-path review and rollback rehearsal first. Do not hot-enable `incidents` / `knowledge_entries` / `playbooks` / `audit_logs` directly.
- Keep the `exact_counts_scope` WARN in place so that a tenant-visible count is not misread as a global count.
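## 2026-05-12 | RLS Canary Wave1.1 applied
**Background**: With the Wave1 empty-table canary complete, the next candidate is a low-row-count, non-empty table. Live preflight shows `awooop_projects=2 rows` and `awooop_mcp_tool_registry=4 rows`; this round inventories the read paths first and then decides the scope.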


@@ -4,7 +4,18 @@ This wave targets:
- `awooop_projects`
-Status: staged, apply pending.
+Status: applied in production on 2026-05-12.
Production state:
- API image gate: `192.168.0.110:5000/awoooi/api:7d92f0acd705451d99b4413ab9748482e3675c00`
- `awooop_projects`: `rls=true force=true policies=4`
- Operator Console tenants endpoint: `total=2`
- Direct table reads:
  - no `app.project_id`: `[]`
  - `app.project_id='awoooi'`: `['awoooi']`
  - `app.project_id='ewoooc'`: `['ewoooc']`
- `awooop_operator_list_projects()` still returns both reviewed projects.
## Safety Model
@@ -31,6 +42,17 @@ Operator Console project-list columns, and grants execute only to `awooop_app`.
## Apply
Before applying, verify the API deployment has the code that calls
`awooop_operator_list_projects()`:
```bash
ssh wooo@192.168.0.120 \
'sudo kubectl -n awoooi-prod get deploy awoooi-api -o wide'
```
The image tag must be `7d92f0ac` or later. If the deployment is older, do not
enable RLS; Operator Console will only see the current tenant row.
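Because commit hashes are not ordered, "or later" cannot be decided by comparing tags. One way to make the gate mechanical is to keep an explicit list of commits known to contain the helper call and check the deployed image tag against it. The sketch below assumes that approach; `image_tag`, `rls_apply_allowed`, and the approved-commit list are illustrative names, not part of the repository.

```python
def image_tag(image_ref):
    """Return the tag of a reference shaped like host:port/name:tag."""
    # Only the last path segment can carry the tag; the registry port is
    # in an earlier segment and must not be mistaken for it.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        raise ValueError(f"no tag in image reference: {image_ref!r}")
    return last_segment.split(":", 1)[1]


def rls_apply_allowed(image_ref, approved_commits):
    """True if the deployed tag starts with one of the approved commit prefixes."""
    tag = image_tag(image_ref)
    return any(tag.startswith(prefix) for prefix in approved_commits)


deployed = "192.168.0.110:5000/awoooi/api:7d92f0acd705451d99b4413ab9748482e3675c00"
print(rls_apply_allowed(deployed, ["7d92f0ac"]))  # True
```

Each newer commit that is deployed would have to be appended to the approved list. Digest-pinned references (`name@sha256:...`) are not handled by this sketch.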
```bash
psql "$DATABASE_URL" -v ON_ERROR_STOP=1 \
-f scripts/ops/awooop-rls-canary-wave1-2-projects.sql
@@ -53,6 +75,32 @@ Expected after apply:
- tenant budget lookup can read the matching tenant row.
- global RLS preflight remains blocked only by later-wave tables.
Production verification from 2026-05-12:
- `/api/v1/platform/tenants` from API pod: HTTP 200, `total=2`.
- `/api/v1/health` from API pod: HTTP 200, `status=healthy`.
- `scripts/ops/awooop-rls-preflight.sh --exact-counts`:
  - `PASS=7 WARN=1 BLOCKED=1`
  - `awooop_projects rls=true force=true policies=4`
  - remaining blocker tables: `audit_logs`, `awooop_outbound_message`,
    `awooop_run_state`, `incidents`, `knowledge_entries`, `playbooks`.
- Direct app-role behavior:
  - `projects_no_context=[]`
  - `projects_awoooi_context=['awoooi']`
  - `projects_ewoooc_context=['ewoooc']`
  - `operator_function_awoooi_context=['awoooi', 'ewoooc']`
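The preflight result above still reports `BLOCKED=1`, which is expected as long as the only blockers are later-wave tables. A small checker can make that distinction explicit. The sketch below assumes the `key=value` text shape shown in the bullets; the preflight script's full output format is not documented here, and `parse_kv` and `wave12_problems` are hypothetical names.

```python
# Later-wave tables that are allowed to remain blocked after Wave1.2.
EXPECTED_LATER_WAVE = {
    "audit_logs",
    "awooop_outbound_message",
    "awooop_run_state",
    "incidents",
    "knowledge_entries",
    "playbooks",
}


def parse_kv(text):
    """Turn 'a=1 b=2' into {'a': '1', 'b': '2'}, skipping bare words."""
    return dict(tok.split("=", 1) for tok in text.split() if "=" in tok)


def wave12_problems(table_line, remaining_blockers):
    """Return a list of problems; an empty list means the wave looks healthy."""
    state = parse_kv(table_line)
    problems = []
    if state.get("rls") != "true":
        problems.append("rls not enabled")
    if state.get("force") != "true":
        problems.append("force not enabled")
    unexpected = sorted(set(remaining_blockers) - EXPECTED_LATER_WAVE)
    if unexpected:
        problems.append("unexpected blockers: " + ", ".join(unexpected))
    return problems


print(wave12_problems(
    "awooop_projects rls=true force=true policies=4",
    ["audit_logs", "incidents", "playbooks"],
))  # []
```

If `awooop_projects` itself ever reappears in the blocker list, the checker reports it instead of treating `BLOCKED=1` as routine.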
## Apply / Rollback Note
An earlier production apply attempt was rolled back immediately because the
live API image was still `ff30c61c...`, before the Operator Console code path
had deployed. The symptom was `/api/v1/platform/tenants` returning only the
`awoooi` row. Rollback restored `total=2`.
After Gitea CD rolled out `7d92f0ac...`, the same SQL was re-applied and the
post-apply verification above passed. Keep this deployment-order gate for any
future changes to cross-tenant read helpers.
## Rollback
```bash