chore(ops): harden RLS role bootstrap gate

Your Name
2026-05-12 18:36:35 +08:00
parent 0bc1878778
commit f0255e0300
5 changed files with 180 additions and 5 deletions


@@ -1,3 +1,52 @@
## 2026-05-12 | 188 Ollama Gate Green and RLS Role Bootstrap Design
**Background**: Wave 1 still has two items that can be closed out: whether the local Ollama on 188 still has any direct caller, and how to safely add the missing RLS roles. The principle is unchanged: verify and prepare only; do not uninstall 188 Ollama directly, and do not hot-enable RLS in production.
**188 Ollama retirement gate**
- Ran `POST_SINCE='24 hours ago' HEALTH_SINCE='10 minutes ago' scripts/ops/ollama188-retirement-gate.sh`.
- Result: `failures=0 warnings=0`.
- PASS items:
  - The repo runtime no longer references `192.168.0.188:11434` / `ollama_188`.
  - `awoooi-prod` live env: GCP-A `34.143.170.20`, GCP-B `34.21.145.224`, local fallback `192.168.0.111`; none point to 188.
  - `awoooi-dev` live env: goes through the 110 proxy on `11435/11436/11437`; does not point to 188.
  - The live Prometheus config no longer has a 188 Ollama target.
  - On 188, `ollama.service` is active with `OLLAMA_HOST=127.0.0.1:11434`; LAN access to `192.168.0.188:11434` is refused.
  - No inference POSTs to `/api/generate`, `/api/chat`, or `/v1/chat/completions` in the past 24 hours.
  - No recent 121/dev health checks hitting 188 were observed.
- Assessment: Claude's report that "188 Local Ollama is still running" has been verified as a cleanup candidate, not a blocker caused by a current production caller. A Stop phase can be scheduled, but do not uninstall directly.
- Updated `docs/runbooks/OLLAMA-188-RETIREMENT-GATE.md` to record the green 24h gate on 2026-05-12.
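The gate result above is a single summary line, so a wrapper can decide "green" mechanically. The following is a minimal sketch, assuming the script prints counters in the `failures=N warnings=N` form shown above; the helper names `parse_gate_summary` and `gate_is_green` are hypothetical and are not part of the repo.

```python
import re


def parse_gate_summary(line: str) -> dict[str, int]:
    """Extract integer counters such as failures=0 from a summary line."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


def gate_is_green(line: str) -> bool:
    """Green only when both counters are present and equal to zero."""
    counts = parse_gate_summary(line)
    # A missing counter yields None, which is treated as "not green".
    return counts.get("failures") == 0 and counts.get("warnings") == 0


if __name__ == "__main__":
    print(gate_is_green("failures=0 warnings=0"))
```

Treating a missing counter as not green keeps the check fail-closed if the output format ever changes.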
**RLS role bootstrap hardening**
- `scripts/ops/awooop_rls_preflight.py` now also reports:
  - the current DB user's `rolcreaterole` / `rolcreatedb`.
  - whether the required roles exist, and whether the current user is a member of each.
  - an app role membership gate, so the app connection user is not mismatched once policies `FOR awooop_app` are applied.
  - the target table owner, for the later owner / FORCE RLS evaluation.
- Re-ran `scripts/ops/awooop-rls-preflight.sh --json`:
  - current user `awoooi` is not a superuser, not `CREATEROLE`, and not `BYPASSRLS`.
  - `awooop_app` / `awooop_platform_admin` / `awooop_migration` still do not exist.
  - New WARNs: role bootstrap requires postgres / a CREATEROLE operator; `awooop_app` is missing, so app membership cannot be evaluated.
  - Target tables are mostly owned by `awoooi`, so later policy / FORCE RLS work can go through the owner path, but CREATE ROLE cannot be done by the app DB user.
- Added `scripts/ops/awooop-rls-role-bootstrap.sql`:
  - **Not placed under `apps/api/migrations/`**, so Gitea auto-migration does not attempt CREATE ROLE / BYPASSRLS with the limited-privilege migrator.
  - Run manually by `postgres` or a CREATEROLE operator.
  - Creates `awooop_app`, `awooop_platform_admin`, and `awooop_migration` as NOLOGIN group roles.
  - Sets `BYPASSRLS` on `awooop_platform_admin` / `awooop_migration`.
  - `GRANT awooop_app TO awoooi`, so the current app connection user matches `FOR awooop_app` policies without rotating `DATABASE_URL` immediately.
  - If `awoooi_migrator` exists, grants it the `awooop_migration` group; no password is created and no K8s Secret is changed.
  - Dynamically grants `SELECT/INSERT/UPDATE/DELETE` on the target tables that already exist to `awooop_app`; does not enable any RLS policy.
  - Synced to 188 at `/home/ollama/awoooi-ops/awooop-rls-role-bootstrap.sql`; the file was only placed there, not executed.
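After the bootstrap script has been run, the intended end state of the three group roles can be checked against rows read from `pg_roles`. The sketch below is illustrative only: it assumes the rows have already been fetched as dicts using the `pg_roles` column names `rolname`, `rolcanlogin`, and `rolbypassrls`, and the names `EXPECTED_ROLES` and `role_attribute_problems` are hypothetical. That `awooop_app` stays without `BYPASSRLS` is inferred from the list above, which sets that attribute only on the other two roles.

```python
# Expected attributes per role, derived from the bootstrap description above.
EXPECTED_ROLES = {
    "awooop_app": {"rolcanlogin": False, "rolbypassrls": False},
    "awooop_platform_admin": {"rolcanlogin": False, "rolbypassrls": True},
    "awooop_migration": {"rolcanlogin": False, "rolbypassrls": True},
}


def role_attribute_problems(rows: list[dict]) -> list[str]:
    """Return human-readable mismatches; an empty list means the state matches."""
    by_name = {row["rolname"]: row for row in rows}
    problems = []
    for name, expected in EXPECTED_ROLES.items():
        actual = by_name.get(name)
        if actual is None:
            problems.append(f"{name}: missing")
            continue
        for attr, want in expected.items():
            if actual[attr] != want:
                problems.append(f"{name}: {attr}={actual[attr]}, expected {want}")
    return problems


if __name__ == "__main__":
    sample = [
        {"rolname": "awooop_app", "rolcanlogin": False, "rolbypassrls": False},
        {"rolname": "awooop_platform_admin", "rolcanlogin": False, "rolbypassrls": True},
    ]
    for problem in role_attribute_problems(sample):
        print(problem)
```

The function only compares data, so it can be exercised without a database connection and later fed from whatever query path the operator prefers.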
**Verification**
- `python3 -m py_compile scripts/ops/awooop_rls_preflight.py` → passed.
- `bash -n scripts/ops/awooop-rls-preflight.sh scripts/ops/188-registry-certbot-fix.sh scripts/ops/ollama188-retirement-gate.sh` → passed.
- 188 Ollama 24h gate → `failures=0 warnings=0`.
- RLS preflight live run → blocked/warn results are as expected; the DB was not modified.
**Next steps**
- After review, someone with postgres / CREATEROLE privileges runs `scripts/ops/awooop-rls-role-bootstrap.sql`, then re-runs `awooop-rls-preflight.sh --exact-counts`.
- 188 Ollama can enter the Stop candidate window; the service and models must still be kept, and it must not be uninstalled.
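## 2026-05-12 | RLS Preflight and 188 Registry Certbot Fix Pack
**Background**: Wave 1 confirmed that production RLS is a P0 but cannot be hot-enabled directly. The certbot for `registry.wooo.work` on 188 is also confirmed broken, but the `ollama` SSH account currently has no passwordless sudo. This round turns both red lights into a remediation pre-package that can be re-run, handed off, and approved.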


@@ -35,6 +35,8 @@ Exit code `2` means the gate is blocked and RLS must not be enabled yet.
- `PASS current_role_rls_enforced`: current DB user is `awoooi`, not superuser and not `BYPASSRLS`.
- `PASS project_context_set_config`: `set_config('app.project_id', 'awoooi', TRUE)` works in the API pod.
- `BLOCKED required_roles`: `awooop_app`, `awooop_platform_admin`, and `awooop_migration` do not exist.
- `WARN role_bootstrap_authority`: current API DB user `awoooi` is not `CREATEROLE`; role bootstrap requires `postgres` or a `CREATEROLE` operator.
- `WARN app_role_membership`: `awooop_app` is missing, so membership cannot be evaluated yet.
- `PASS project_id_columns`: every existing target table has `project_id`.
- `BLOCKED rls_enabled_forced_policy`: existing target tables are not yet RLS enabled, forced, or policied.
- `PASS fail_open_policies`: production DB currently has no fail-open policy expressions.
@@ -43,7 +45,7 @@ Exit code `2` means the gate is blocked and RLS must not be enabled yet.
Current blocker summary:
```text
-PASS=5 WARN=0 BLOCKED=2
+PASS=5 WARN=2 BLOCKED=2
```
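The summary line above is a tally of per-check statuses. A minimal sketch of that tally follows, assuming each check carries one of the three status strings; the function names are hypothetical. Exit code `2` for a blocked gate comes from this runbook, while returning `0` in every other case is an assumption of the sketch.

```python
from collections import Counter


def summarize(statuses: list[str]) -> str:
    """Render the PASS/WARN/BLOCKED tally in the runbook's summary format."""
    counts = Counter(statuses)
    return f"PASS={counts['PASS']} WARN={counts['WARN']} BLOCKED={counts['BLOCKED']}"


def exit_code(statuses: list[str]) -> int:
    """2 when any check is BLOCKED (per the runbook); 0 otherwise (assumed)."""
    return 2 if "BLOCKED" in statuses else 0


if __name__ == "__main__":
    statuses = ["PASS"] * 5 + ["WARN"] * 2 + ["BLOCKED"] * 2
    print(summarize(statuses))
    print(exit_code(statuses))
```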
Important exact counts from the same run:
@@ -52,11 +54,11 @@ Important exact counts from the same run:
| --- | ---: | ---: |
| `audit_logs` | 686 | 0 |
| `awooop_mcp_tool_registry` | 4 | 0 |
-| `awooop_outbound_message` | 228 | 0 |
+| `awooop_outbound_message` | 235 | 0 |
| `awooop_projects` | 2 | 0 |
-| `awooop_run_state` | 106 | 0 |
+| `awooop_run_state` | 113 | 0 |
| `incidents` | 1518 | 0 |
-| `knowledge_entries` | 2099 | 0 |
+| `knowledge_entries` | 2102 | 0 |
| `playbooks` | 220 | 0 |
## Remediation Order
@@ -67,6 +69,9 @@ Important exact counts from the same run:
   policies are enforced.
   - Do not create passworded LOGIN roles in a migration unless the K8s Secret
     rotation path is ready.
   - Use `scripts/ops/awooop-rls-role-bootstrap.sql` only after review, and run
     it manually as `postgres` or a `CREATEROLE` operator. It is intentionally
     outside `apps/api/migrations/` so Gitea auto-migration will not run it.
2. Verify all DB access paths use `get_db()` / `get_db_context()` or otherwise set
   `app.project_id` before queries.
3. Apply policies first in staging or a canary DB.


@@ -112,3 +112,4 @@ curl -sS --max-time 3 http://192.168.0.188:11434/api/tags || echo LAN_CLOSED
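- The 24-hour gate is not green yet: `192.168.0.88` can still be seen sending inference POSTs within the last 24 hours.
- 2026-05-06 15:14: applied a temporary closure. 188 listens only on `127.0.0.1:11434`; direct connections to `192.168.0.188:11434` from the local machine, 110, and K8s Pods are all refused.
- 2026-05-06 15:22: applied the permanent systemd fix. `ollama.service` is active, an override pins `OLLAMA_HOST=127.0.0.1:11434`, and there is no longer a systemd restart loop.
- 2026-05-12 18:35: re-ran the 24-hour gate. Repo runtime, `awoooi-prod`/`awoooi-dev` live env, Prometheus target, LAN exposure, 24-hour inference POSTs, and dev health check all PASS. Assessment: 188 Ollama is now a stop candidate, but it is still not uninstalled directly; per the staged decision, keep the service for now and wait for an explicit stop window.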


@@ -0,0 +1,86 @@
-- AwoooP RLS role bootstrap.
--
-- IMPORTANT:
-- - Do not put this file under apps/api/migrations; Gitea auto-migration should
--   not attempt CREATE ROLE / BYPASSRLS with the limited migrator account.
-- - Run manually as postgres or a CREATEROLE-capable operator after review.
-- - This script does not create passwords and does not change application
--   DATABASE_URL. It creates NOLOGIN group roles and grants awooop_app to the
--   current production app connection role (`awoooi`).
--
-- Suggested command on 188:
--   sudo -u postgres psql -d awoooi_prod -v ON_ERROR_STOP=1 \
--     -f /path/to/awooop-rls-role-bootstrap.sql
--
-- Post-check:
--   bash scripts/ops/awooop-rls-preflight.sh --exact-counts

BEGIN;

DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'awooop_app') THEN
        EXECUTE 'CREATE ROLE awooop_app NOLOGIN';
    END IF;

    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'awooop_platform_admin') THEN
        EXECUTE 'CREATE ROLE awooop_platform_admin NOLOGIN';
    END IF;

    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'awooop_migration') THEN
        EXECUTE 'CREATE ROLE awooop_migration NOLOGIN';
    END IF;
END $$;

-- Note: setting BYPASSRLS normally requires a superuser (PostgreSQL 16+ also
-- accepts a role that itself has BYPASSRLS); CREATEROLE alone is not enough
-- for these two statements.
ALTER ROLE awooop_platform_admin BYPASSRLS;
ALTER ROLE awooop_migration BYPASSRLS;

-- Current production API connects as `awoooi`. Until DATABASE_URL is split to a
-- dedicated LOGIN role, make that role a member of the RLS-constrained group so
-- policies written as `FOR ALL TO awooop_app` apply without secret rotation.
GRANT awooop_app TO awoooi;

-- Keep existing migration account usable without changing its password or
-- DATABASE_URL. The group role is NOLOGIN; operators may SET ROLE during manual
-- RLS migrations if needed.
DO $$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'awoooi_migrator') THEN
        EXECUTE 'GRANT awooop_migration TO awoooi_migrator';
    END IF;
END $$;

-- Minimum grants for existing target tables that already have project_id. RLS
-- policies remain a separate staged migration and are not enabled here.
GRANT USAGE ON SCHEMA public TO awooop_app;

DO $$
DECLARE
    table_name text;
    target_tables text[] := ARRAY[
        'incidents',
        'knowledge_entries',
        'playbooks',
        'audit_logs',
        'budget_ledger',
        'awooop_projects',
        'awooop_contract_revisions',
        'awooop_run_state',
        'awooop_mcp_tool_registry',
        'awooop_mcp_grants',
        'awooop_mcp_credential_refs',
        'awooop_mcp_gateway_audit',
        'awooop_conversation_event',
        'awooop_outbound_message'
    ];
BEGIN
    FOREACH table_name IN ARRAY target_tables LOOP
        -- Skip tables that do not exist yet in this database.
        IF to_regclass('public.' || table_name) IS NOT NULL THEN
            EXECUTE format('GRANT SELECT, INSERT, UPDATE, DELETE ON public.%I TO awooop_app', table_name);
        END IF;
    END LOOP;
END $$;

GRANT USAGE, SELECT, UPDATE ON ALL SEQUENCES IN SCHEMA public TO awooop_app;

COMMIT;
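
For review purposes, the per-table grant loop in the script above can be previewed offline. The following is a rough sketch, not part of the commit: it assumes the reviewer already has the set of tables that exist in the target database, and the helpers `quote_ident` and `grant_statements` are hypothetical. Unlike `format('%I')`, which quotes only when needed, this sketch always double-quotes identifiers, which is still valid SQL.

```python
def quote_ident(name: str) -> str:
    """Always double-quote an identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(target_tables, existing_tables, role="awooop_app"):
    """Build one GRANT per target table that exists, skipping the others."""
    return [
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON public.{quote_ident(table)} "
        f"TO {quote_ident(role)}"
        for table in target_tables
        if table in existing_tables
    ]


if __name__ == "__main__":
    # Illustrative input: only one of the two targets exists.
    for statement in grant_statements(["incidents", "budget_ledger"], {"incidents"}):
        print(statement)
```

Printing the statements rather than executing them keeps the preview read-only, in line with the "prepare only" principle of this change.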


@@ -85,6 +85,8 @@ async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
current_user AS current_user,
session_user AS session_user,
r.rolsuper AS current_user_superuser,
r.rolcreaterole AS current_user_createrole,
r.rolcreatedb AS current_user_createdb,
r.rolbypassrls AS current_user_bypassrls
FROM pg_roles r
WHERE r.rolname = current_user
@@ -125,8 +127,13 @@ async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
SELECT
rr.rolname,
r.rolsuper,
r.rolcreaterole,
r.rolbypassrls,
-r.oid IS NOT NULL AS exists
+r.oid IS NOT NULL AS exists,
CASE
WHEN r.oid IS NULL THEN FALSE
ELSE pg_has_role(current_user, rr.rolname, 'member')
END AS current_user_is_member
FROM required_roles rr
LEFT JOIN pg_roles r ON r.rolname = rr.rolname
ORDER BY rr.rolname
@@ -141,6 +148,29 @@ async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
    else:
        add(checks, "required_roles", "PASS", "all required RLS roles exist")

    if not role.get("current_user_superuser") and not role.get("current_user_createrole") and missing_roles:
        add(
            checks,
            "role_bootstrap_authority",
            "WARN",
            "current API DB user cannot create missing roles; bootstrap requires postgres/CREATEROLE",
        )
    elif missing_roles:
        add(checks, "role_bootstrap_authority", "PASS", "current DB user can create roles")

    app_role = next((row for row in roles if row["rolname"] == "awooop_app" and row["exists"]), None)
    if app_role is None:
        add(checks, "app_role_membership", "WARN", "awooop_app role missing; membership not evaluated")
    elif app_role["current_user_is_member"]:
        add(checks, "app_role_membership", "PASS", "current API DB user is member of awooop_app")
    else:
        add(
            checks,
            "app_role_membership",
            "BLOCKED",
            "current API DB user is not a member of awooop_app; policies FOR awooop_app would not apply",
        )

    table_rows = await rows(
        conn,
        """
@@ -153,6 +183,7 @@ async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
c.oid,
c.relrowsecurity,
c.relforcerowsecurity,
pg_get_userbyid(c.relowner) AS table_owner,
COALESCE(c.reltuples, 0)::bigint AS estimated_rows
FROM target t
LEFT JOIN pg_class c
@@ -191,6 +222,7 @@ async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
COALESCE(ps.policy_count, 0) AS policy_count,
COALESCE(ps.has_null_fail_open_policy, FALSE) AS has_null_fail_open_policy,
COALESCE(ps.has_empty_string_fail_open_policy, FALSE) AS has_empty_string_fail_open_policy,
r.table_owner,
r.estimated_rows
FROM rels r
LEFT JOIN project_columns pc ON pc.table_name = r.relname
@@ -273,6 +305,7 @@ def print_human(checks: list[Check], evidence: dict[str, Any]) -> None:
f"current_user={role.get('current_user')} "
f"session_user={role.get('session_user')} "
f"superuser={role.get('current_user_superuser')} "
f"createrole={role.get('current_user_createrole')} "
f"bypassrls={role.get('current_user_bypassrls')}"
)
@@ -287,6 +320,7 @@ def print_human(checks: list[Check], evidence: dict[str, Any]) -> None:
f"policies={row['policy_count']} "
f"fail_open_null={row['has_null_fail_open_policy']} "
f"fail_open_empty={row['has_empty_string_fail_open_policy']} "
f"owner={row['table_owner']} "
f"estimated_rows={row['estimated_rows']}"
)
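
The two gates added in this file can be exercised without a database by feeding them sample rows. The sketch below restates the decision logic from the hunk above as a pure function; `evaluate_role_gates` is a hypothetical name, and it assumes `missing_roles` in the real script is the list of required roles whose `exists` flag is false, which is defined outside the visible hunks.

```python
def evaluate_role_gates(role: dict, roles: list[dict]) -> list[tuple[str, str]]:
    """Return (check_name, status) pairs for the two role gates."""
    checks = []
    # Assumption: missing roles are the required roles with exists == False.
    missing_roles = [row["rolname"] for row in roles if not row["exists"]]
    can_create = bool(
        role.get("current_user_superuser") or role.get("current_user_createrole")
    )
    if missing_roles:
        status = "PASS" if can_create else "WARN"
        checks.append(("role_bootstrap_authority", status))

    app_role = next(
        (row for row in roles if row["rolname"] == "awooop_app" and row["exists"]),
        None,
    )
    if app_role is None:
        checks.append(("app_role_membership", "WARN"))
    elif app_role["current_user_is_member"]:
        checks.append(("app_role_membership", "PASS"))
    else:
        checks.append(("app_role_membership", "BLOCKED"))
    return checks


if __name__ == "__main__":
    # State before bootstrap: limited user, awooop_app not created yet.
    limited_user = {"current_user_superuser": False, "current_user_createrole": False}
    before = [{"rolname": "awooop_app", "exists": False, "current_user_is_member": False}]
    print(evaluate_role_gates(limited_user, before))
```

Once the role exists and the membership grant is in place, the same function reports only `app_role_membership` as `PASS`, since no authority check is emitted when nothing is missing.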