diff --git a/docs/LOGBOOK.md b/docs/LOGBOOK.md
index 6c3f2e7c..6cbbf626 100644
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -1,3 +1,51 @@
+## 2026-05-12 | RLS Canary Wave1.2 projects applied
+
+**Background**: After Wave1.1 completed `awooop_mcp_tool_registry`, the remaining low-row-count candidate was `awooop_projects`. This table backs both tenant runtime checks and the Operator Console cross-tenant project list, so it cannot simply be hot-enabled with a single tenant policy.
+
+**code / DB path convergence**:
+- `platform_operator_service.list_tenants()` now reads from `public.awooop_operator_list_projects()`, so Operator Console goes through an explicit cross-tenant read helper.
+- `budget_service._get_tenant_budget_limit(project_id)` now uses `get_db_context(project_id)`, so it no longer queries another tenant's budget under the default `awoooi` context.
+- Added Wave1.2 apply / rollback SQL:
+  - `scripts/ops/awooop-rls-canary-wave1-2-projects.sql`
+  - `scripts/ops/awooop-rls-canary-wave1-2-projects-rollback.sql`
+- Added runbook: `docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md`.
+
+**deployment-order red light and rollback**:
+- First created `awooop_operator_list_projects()` in production and confirmed the function returns `awoooi` / `ewoooc`.
+- Commit `7d92f0ac` had been pushed to Gitea main, but at the first RLS apply the live API image was still `ff30c61c...`.
+- Symptom: `/api/v1/platform/tenants` returned only `awoooi`, meaning the old code was still reading `awooop_projects` directly and was correctly filtered by RLS.
+- The rollback SQL was run immediately; after rollback, `/api/v1/platform/tenants` was back to `total=2`.
+
+**production re-apply**:
+- Confirmed K8s had rolled out to `192.168.0.110:5000/awoooi/api:7d92f0acd705451d99b4413ab9748482e3675c00`, 2/2 ready.
+- Pre-apply gate:
+  - `/api/v1/platform/tenants` → 200, `total=2`.
+  - `/api/v1/health` → 200, `status=healthy`.
+  - `awooop_operator_list_projects()` → `awoooi` / `ewoooc`.
+- Re-ran the Wave1.2 SQL via the 188 postgres/operator socket path; result: `COMMIT`.
+
+**Post-apply verification**:
+- `/api/v1/platform/tenants` → 200, `total=2`.
+- `/api/v1/health` → 200, `status=healthy`.
+- `scripts/ops/awooop-rls-preflight.sh --exact-counts`:
+  - `PASS=7 WARN=1 BLOCKED=1`.
+  - `awooop_projects` → `rls=true force=true policies=4 fail_open=false`.
+  - Remaining blocker tables: `audit_logs`, `awooop_outbound_message`, `awooop_run_state`, `incidents`, `knowledge_entries`, `playbooks`.
+- direct app-role behavior:
+  - no `app.project_id` → `[]`.
+  - `app.project_id='awoooi'` → `['awoooi']`.
+  - `app.project_id='ewoooc'` → `['ewoooc']`.
+  - `awooop_operator_list_projects()` under `awoooi` context → `['awoooi', 'ewoooc']`.
+
+**Overall progress**:
+- Wave 0: MOMO PostgreSQL backup → AwoooP failure-notification wiring complete.
+- Wave 1: competing GitHub deploy disabled, RLS live verification, role bootstrap, API runtime access path, manual script gate, Wave1 empty-table canary, Wave1.1 MCP tool registry, and Wave1.2 projects canary are complete.
+- Not yet done: token rotation (requires external rotation), permanent fix for 188 certbot, remaining RLS waves, 188 local Ollama shutdown window.
+
+**Next steps**:
+- Pick the next RLS candidate from `awooop_outbound_message` / `awooop_run_state`, and do the query-path review and rollback rehearsal first; do not hot-enable `incidents` / `knowledge_entries` / `playbooks` / `audit_logs` directly.
+- Keep the `exact_counts_scope` WARN in place so a tenant-visible count is not misread as a global count.
+
 ## 2026-05-12 | RLS Canary Wave1.1 applied
 
 **Background**: With the Wave1 empty-table canary complete, the next candidates are low-row-count non-empty tables. Live preflight showed `awooop_projects=2 rows` and `awooop_mcp_tool_registry=4 rows`; this round inventories the read paths first and then decides the scope.
diff --git a/docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md b/docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md
index 78a7a694..7db8a7f0 100644
--- a/docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md
+++ b/docs/runbooks/AWOOOP-RLS-CANARY-WAVE1-2.md
@@ -4,7 +4,18 @@ This wave targets:
 
 - `awooop_projects`
 
-Status: staged, apply pending.
+Status: applied in production on 2026-05-12.
+
+Production state:
+
+- API image gate: `192.168.0.110:5000/awoooi/api:7d92f0acd705451d99b4413ab9748482e3675c00`
+- `awooop_projects`: `rls=true force=true policies=4`
+- Operator Console tenants endpoint: `total=2`
+- Direct table reads:
+  - no `app.project_id`: `[]`
+  - `app.project_id='awoooi'`: `['awoooi']`
+  - `app.project_id='ewoooc'`: `['ewoooc']`
+- `awooop_operator_list_projects()` still returns both reviewed projects.
 
 ## Safety Model
 
@@ -31,6 +42,17 @@ Operator Console project-list columns, and grants execute only to `awooop_app`.
 
 ## Apply
 
+Before applying, verify the API deployment has the code that calls
+`awooop_operator_list_projects()`:
+
+```bash
+ssh wooo@192.168.0.120 \
+  'sudo kubectl -n awoooi-prod get deploy awoooi-api -o wide'
+```
+
+The image tag must be `7d92f0ac` or later. If the deployment is older, do not
+enable RLS; Operator Console will only see the current tenant row.
+
 ```bash
 psql "$DATABASE_URL" -v ON_ERROR_STOP=1 \
   -f scripts/ops/awooop-rls-canary-wave1-2-projects.sql
@@ -53,6 +75,32 @@ Expected after apply:
 - tenant budget lookup can read the matching tenant row.
 - global RLS preflight remains blocked only by later-wave tables.
 
+Production verification from 2026-05-12:
+
+- `/api/v1/platform/tenants` from API pod: HTTP 200, `total=2`.
+- `/api/v1/health` from API pod: HTTP 200, `status=healthy`.
+- `scripts/ops/awooop-rls-preflight.sh --exact-counts`:
+  - `PASS=7 WARN=1 BLOCKED=1`
+  - `awooop_projects rls=true force=true policies=4`
+  - remaining blocker tables: `audit_logs`, `awooop_outbound_message`,
+    `awooop_run_state`, `incidents`, `knowledge_entries`, `playbooks`.
+- Direct app-role behavior:
+  - `projects_no_context=[]`
+  - `projects_awoooi_context=['awoooi']`
+  - `projects_ewoooc_context=['ewoooc']`
+  - `operator_function_awoooi_context=['awoooi', 'ewoooc']`
+
+## Apply / Rollback Note
+
+An earlier production apply attempt was rolled back immediately because the
+live API image was still `ff30c61c...`, before the Operator Console code path
+had deployed. The symptom was `/api/v1/platform/tenants` returning only the
+`awoooi` row. Rollback restored `total=2`.
+
+After Gitea CD rolled out `7d92f0ac...`, the same SQL was re-applied and the
+post-apply verification above passed. Keep this deployment-order gate for any
+future changes to cross-tenant read helpers.
+
 ## Rollback
 
 ```bash