fix(governance): enforce P2-105 redaction guard
All checks were successful
Code Review / ai-code-review (push) Successful in 21s
CD Pipeline / tests (push) Successful in 1m50s
CD Pipeline / build-and-deploy (push) Successful in 6m31s
CD Pipeline / post-deploy-checks (push) Successful in 2m44s

Your Name
2026-06-13 02:02:16 +08:00
parent f71c2779a8
commit a9b95f99eb
3 changed files with 45 additions and 1 deletion


@@ -45,6 +45,7 @@ def load_latest_ai_agent_critic_reviewer_result_capture(
    _require_promotion_gates(payload, str(latest))
    _require_candidate_routes(payload, str(latest))
    _require_redaction_contract(payload, str(latest))
    _require_no_forbidden_display_terms(payload, str(latest))
    _require_rollup_consistency(payload, str(latest))
    return payload
@@ -262,6 +263,38 @@ def _require_redaction_contract(payload: dict[str, Any], label: str) -> None:
        raise ValueError(f"{label}: display redaction fields must remain false: {unsafe}")


def _require_no_forbidden_display_terms(payload: dict[str, Any], label: str) -> None:
    forbidden_terms = {
        "工作視窗",
        "對話內容",
        "批准!繼續",
        "In app browser",
        "My request for Codex",
        "work window transcript",
        "internal collaboration transcript",
    }
    hits: list[str] = []

    def walk(value: Any, path: str) -> None:
        if isinstance(value, dict):
            for key, nested in value.items():
                walk(nested, f"{path}.{key}" if path else str(key))
            return
        if isinstance(value, list):
            for index, nested in enumerate(value):
                walk(nested, f"{path}[{index}]")
            return
        if isinstance(value, str):
            matched = sorted(term for term in forbidden_terms if term in value)
            if matched:
                hits.append(f"{path}: {', '.join(matched)}")

    walk(payload, "")
    if hits:
        raise ValueError(f"{label}: forbidden display terms found: {hits}")


def _require_rollup_consistency(payload: dict[str, Any], label: str) -> None:
    rollups = payload.get("rollups") or {}
    truth = payload.get("score_truth") or {}


@@ -112,6 +112,16 @@ def test_rejects_candidate_route_write(tmp_path):
        load_latest_ai_agent_critic_reviewer_result_capture(tmp_path)


def test_rejects_forbidden_display_terms(tmp_path):
    data = load_latest_ai_agent_critic_reviewer_result_capture()
    bad = copy.deepcopy(data)
    bad["agent_scorecards"][0]["failure_if_missing"] = "不得顯示工作視窗對話內容"
    _write_snapshot(tmp_path, bad)
    with pytest.raises(ValueError, match="forbidden display terms"):
        load_latest_ai_agent_critic_reviewer_result_capture(tmp_path)


def test_rejects_rollup_mismatch(tmp_path):
    data = load_latest_ai_agent_critic_reviewer_result_capture()
    bad = copy.deepcopy(data)


@@ -49,6 +49,7 @@
**Completed (local)**
- Added the `ai_agent_critic_reviewer_result_capture_v1` schema, a committed snapshot, and a backend loader, fixing the contract at 5 Agent scorecards, 5 result capture contracts, 6 promotion gates, and 4 candidate routes.
- The loader gains a regression guard for the visible-copy red line: if the snapshot again contains `工作視窗`, `對話內容`, `批准!繼續`, `In app browser`, `My request for Codex`, or the old English internal-transcript terms, the API fails closed.
- Added the read-only `GET /api/v1/agents/agent-critic-reviewer-result-capture` API and tests; the API returns only scorecards, result capture contracts, promotion gates, candidate routes, and the redaction boundary.
- The governance page `/zh-TW/governance?tab=automation-inventory` gains a P2-105 section showing how OpenClaw Critic / Reviewer, Hermes redaction / operator report, and NemoTron failure verifier cross-check one another, take over, and block unsafe promotion.
- Updated `AI_AGENT_AUTOMATION_WORKLIST_2026-06-04.md`, `AI_AGENT_INTERACTION_LEARNING_PROOF_2026-06-11.md`, and MASTER §3.2 / §5, marking P2-105 as complete; the next step becomes the `P2-106` owner-approved result capture dry-run.
@@ -58,7 +59,7 @@
- `python -m json.tool` equivalent parsing of the P2-105 snapshot / schema / `zh-TW.json` / `en.json` passed.
- `cmp -s apps/web/messages/zh-TW.json apps/web/messages/en.json` passed; the two message files remain Traditional Chinese mirrors.
- `DATABASE_URL='postgresql+asyncpg://test:test@localhost/test' PYTHONPATH=apps/api python -m py_compile apps/api/src/services/ai_agent_critic_reviewer_result_capture.py apps/api/src/api/v1/agents.py` passed.
- `DATABASE_URL='postgresql+asyncpg://test:test@localhost/test' PYTHONPATH=apps/api python -m pytest -q apps/api/tests/test_ai_agent_critic_reviewer_result_capture.py apps/api/tests/test_ai_agent_critic_reviewer_result_capture_api.py`: `10 passed`
- `DATABASE_URL='postgresql+asyncpg://test:test@localhost/test' PYTHONPATH=apps/api python -m pytest -q apps/api/tests/test_ai_agent_critic_reviewer_result_capture.py apps/api/tests/test_ai_agent_critic_reviewer_result_capture_api.py`: `11 passed`
- `pnpm --filter @awoooi/web typecheck` passed.
- `pnpm --filter @awoooi/web exec next lint --file 'src/app/[locale]/governance/tabs/automation-inventory-tab.tsx' --file src/lib/api-client.ts`: `✔ No ESLint warnings or errors`
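
The forbidden-term guard added to the loader in this commit can be tried in isolation. The following is a minimal sketch, not the loader itself: `find_forbidden_terms`, `FORBIDDEN_TERMS`, and `sample` are hypothetical names introduced for illustration, the term set is a subset of the one in the commit, and the function returns the hits rather than raising `ValueError` so the path format is easy to inspect.

```python
from typing import Any

# Hypothetical subset of the terms listed in the commit; the full set lives in the loader.
FORBIDDEN_TERMS = frozenset({"工作視窗", "對話內容", "In app browser"})


def find_forbidden_terms(payload: Any, terms: frozenset[str] = FORBIDDEN_TERMS) -> list[str]:
    """Return one 'path: term, term' entry per string value that contains a term."""
    hits: list[str] = []

    def walk(value: Any, path: str) -> None:
        if isinstance(value, dict):
            # Only values are inspected; keys contribute to the path.
            for key, nested in value.items():
                walk(nested, f"{path}.{key}" if path else str(key))
        elif isinstance(value, list):
            for index, nested in enumerate(value):
                walk(nested, f"{path}[{index}]")
        elif isinstance(value, str):
            # Plain substring test, so matching is case-sensitive.
            matched = sorted(term for term in terms if term in value)
            if matched:
                hits.append(f"{path}: {', '.join(matched)}")

    walk(payload, "")
    return hits


sample = {
    "agent_scorecards": [{"failure_if_missing": "不得顯示工作視窗對話內容"}],
    "rollups": {"note": "in app browser"},  # lowercase variant, not matched
}
print(find_forbidden_terms(sample))
# → ['agent_scorecards[0].failure_if_missing: 對話內容, 工作視窗']
```

Because the check is a case-sensitive substring match on string values, a differently cased variant such as `in app browser` passes through in this sketch; whether that is acceptable depends on how the snapshot copy is produced.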