Attach decision envelopes to review queue

2026-05-24 23:03:11 +08:00
parent a4aa796114
commit 2ca3559df2
8 changed files with 194 additions and 7 deletions
--- a/TODO_NEXT_STEPS.txt
+++ b/TODO_NEXT_STEPS.txt
@@ -4,6 +4,7 @@
 ================================================================================

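 [Completed]
+   - V10.456 connects the PChome review queue to the `decision_envelope` contract: every candidate returned by `fetch_competitor_review_queue()` and `/api/pchome-review/queue` emits the same SKU, PChome candidate, match evidence, recommended_action, expected_impact, and HITL guardrails; Dashboard, Agent, Telegram, and PPT must no longer each rebuild their own price-comparison interpretation format. The same release bumps the review queue cache key to v3 so production does not keep serving the old payload.
   - V10.455 makes EventRouter send `decision_envelope` events through a direct evidence template: when NemoTron / price comparison has already produced the SKU, PChome candidate, match evidence, and HITL guardrails, the event no longer goes to L1/L2 AI for re-summarization, avoiding extra model calls and further divergence in alert text; the Telegram decision envelope also gains a "Subject" block showing the SKU, product, and PChome candidate. The same release adds `audit_competitor_match_attempt_rescore.py --retract-variant-accepted`, which appends a `true_low_confidence` retraction for the latest `rescore_accepted_current` batches that still carry `variant_selection_review`, without writing to the official price-gap table.
   - V10.454 adds a safety gate on feeder / rescore official writes: if the matcher only reaches `manual_review` / `identity_review` / `variant_selection_review` (for example, a MOMO pick-any-shade lipstick listing against a single PChome variant), the item may only enter `true_low_confidence` review and must not be written automatically to the official `competitor_prices` price gap via retryable replay, known identity refresh, or rescore accepted semantics.
   - V10.453 adds safe-recovery rules to the PChome matcher: a new brandless same-item anchor for Herbacin chamomile hand cream 20ml; a fix so the model number `EX8` is no longer misparsed as a pack count of `x8`; and a new veto for GONESH / solid fragrance gel when one side is a generic name and the other names a specific scent or No. variant, preventing near-threshold replay from writing products with different scents or pack counts as an official price gap.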
--- a/config.py
+++ b/config.py
@@ -325,7 +325,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
 # ==========================================
 # System version and paths
 # ==========================================
-SYSTEM_VERSION = "V10.455"
+SYSTEM_VERSION = "V10.456"
 LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
 public_url = PUBLIC_URL  # used for template display

--- a/docs/AI_INTELLIGENCE_MODULE_SOT.md
+++ b/docs/AI_INTELLIGENCE_MODULE_SOT.md
@@ -2,7 +2,7 @@

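 > **Last updated**: 2026-05-24 (Taipei time)
 > **Status**: 🟢 The four-AI-Agent automation loop is in place; the LLM routing red line is upgraded to an Ollama-first three-host cascade, with Gemini fallback disabled by default
-> **Applicable version**: V10.455
+> **Applicable version**: V10.456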

 ---

@@ -47,6 +47,7 @@
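 - EventRouter / Telegram HITL callbacks must prefer `decision_envelope.decision_id` as the event tracking ID; if upstream does not supply `event.id`, `triaged_alert()` still uses `decision_id` to build the `momo:eig:*` callback, so price-decision reviews do not fall back to `unknown`. Every `momo:eig:*` callback must be truncated in a UTF-8 byte-safe way so that `callback_data` stays within Telegram's 64-byte limit.
 - Agent suggestions related to competitor price comparison may only read existing evidence from `competitor_match_attempts` / review queue / `competitor_prices`; they must not write `competitor_prices` directly or override the protection rules in `_should_upsert_competitor_price()`.
 - Price/review events that already carry a `decision_envelope` must be rendered by EventRouter directly with the evidence template and no longer go to L1/L2 AI for re-summarization; the Telegram decision envelope must show the subject SKU, product name, PChome candidate, evidence, guardrails, and HITL actions, so that evidence-backed price alerts are not diluted by regenerated text and do not incur extra model cost.
+- The PChome review queue itself must also emit `decision_envelope`: every candidate from `fetch_competitor_review_queue()`, `fetch_competitor_review_queue_page()`, and `/api/pchome-review/queue` must carry the same `subject`, `evidence`, `recommended_action`, `expected_impact`, and `guardrails`, shared by Dashboard, Agent, Telegram, and PPT; no downstream consumer may write its own translation of comparison statuses or bypass the HITL guardrails.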

 ## 1. Four-AI-Agent Routing Architecture

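Reviewer note: the contract rule above could be checked by any downstream consumer before it renders a queue item. The following is a minimal sketch, assuming only the top-level field names that `_build_review_decision_envelope()` emits in this commit; `check_review_envelope` and `REQUIRED_KEYS` are hypothetical names and are not part of the repository.

```python
from typing import Any

# Top-level sections the shared contract is expected to carry (taken from this commit).
REQUIRED_KEYS = ("subject", "evidence", "recommended_action", "expected_impact", "guardrails")


def check_review_envelope(envelope: dict[str, Any]) -> list[str]:
    """Return contract violations; an empty list means the envelope is usable."""
    problems = [f"missing:{key}" for key in REQUIRED_KEYS if key not in envelope]
    guardrails = envelope.get("guardrails") or {}
    action = envelope.get("recommended_action") or {}
    # Review queue items must never auto-execute and must always require a human.
    if guardrails.get("can_auto_execute") is not False:
        problems.append("guardrails.can_auto_execute must be False")
    if action.get("requires_hitl") is not True:
        problems.append("recommended_action.requires_hitl must be True")
    return problems
```

A consumer that gets a non-empty list back would presumably refuse to render or act on the item rather than fall back to its own interpretation of the status.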
--- a/docs/memory/code_modularization_inventory_20260430.md
+++ b/docs/memory/code_modularization_inventory_20260430.md
@@ -52,6 +52,7 @@
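 - 2026-05-24 addendum: synced the line counts of `run_scheduler.py`, `services/ollama_service.py`, `services/nemoton_dispatcher_service.py`, and `services/telegram_templates.py` after the 111 fallback circuit breaker, NemoTron decision envelope, and Telegram template governance changes; this only updates the inventory and does not change modularization decisions.
 - 2026-05-24 addendum: synced the line count of `routes/dashboard_routes.py` after the PChome review page fast-count, lightweight render, and rescore-adoptable metric changes; this only updates the inventory and does not change dashboard behavior.
 - 2026-05-24 addendum: synced the line count of `services/marketplace_product_matcher.py` after the PChome rescore audit latest-status criteria and the unit-price multiplier fix; this only updates the inventory and does not change the split strategy.
+- 2026-05-24 addendum: synced the line count of `services/competitor_intel_repository.py` after merging the PChome review queue decision envelope; this only updates the inventory and does not change the split strategy.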

 ## Files at or above 800 lines

@@ -86,7 +87,7 @@
 | 953 | `routes/export_routes.py` | P2 Export flow | export command/router glue / file path / download orchestration |
 | 816 | `services/ppt_vision_service.py` | P2 PPT vision QA service | separate runtime state / queue status / model probe / audit execution |
 | 2149 | `services/competitor_price_feeder.py` | P2 competitor price feeder | crawler scheduling / price normalization / retryable candidate recovery / cache strategy |
-| 1327 | `services/competitor_intel_repository.py` | P2 competitor intel repository | review queue query / cache shaping / formatting helpers |
+| 1535 | `services/competitor_intel_repository.py` | P2 competitor intel repository | review queue query / cache shaping / formatting helpers |
 | 805 | `routes/bot_api_routes.py` | P2 Bot API Blueprint | route glue / bot action service |
 | 1319 | `routes/market_intel_review_report_routes.py` | P2 market intel review report Blueprint | review report route glue / export payload / phase handoff orchestration |
 | 917 | `routes/market_intel_routes.py` | P2 market intel Blueprint | page route / API route glue / MCP gate route registration helper |
--- a/docs/memory/current_execution_queue_20260524.md
+++ b/docs/memory/current_execution_queue_20260524.md
@@ -10,7 +10,8 @@
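 - Each release only recreates `momo-app`, `scheduler`, and `telegram-bot`; using `--remove-orphans` is forbidden, and `momo-db` must not be affected.
 - 2026-05-24 21:33 CST status: `main` was pushed to Gitea and deployed to 188; production `/health` reports `V10.451`. This round only recreated `momo-app`; `scheduler` and `telegram-bot` were not rebuilt but stayed healthy; `--remove-orphans` was not used and `momo-db` was not touched. Smoke passed: main pages return HTTP 200, all three app containers healthy, `/api/pchome-review/queue` works for read-only `recoverable_low_score` / `legacy_low_score` queries, and the 10-minute error log shows no Traceback / ERROR.
 - 2026-05-24 22:17 CST status: `main` was pushed to Gitea and deployed to 188; production `/health` reports `V10.453`. This round recreated `momo-app`, `scheduler`, and `telegram-bot`; `--remove-orphans` was not used and `momo-db` was not touched. Smoke passed: all three app containers healthy, Gemini hard disabled with no Gemini provider in 24 hours of `ai_calls`, Ollama order unchanged at GCP-A → GCP-B → 111, `/api/pchome-review/queue` queries succeeded for three statuses, and the rescore audit is read-only with `selection_mode=latest_sku_only`.
- 2026-05-24 22:52 CST status: `main` was pushed to Gitea and deployed to 188; production `/health` reports `V10.455`. This round recreated `momo-app`, `scheduler`, and `telegram-bot`; `--remove-orphans` was not used and `momo-db` was not touched. Smoke passed: all three app containers healthy, EventRouter sends `decision_envelope` events directly without entering the L1/L2 AI handler, the Telegram envelope shows the subject SKU and PChome candidate, Gemini hard disabled with no Gemini provider in 24 hours of `ai_calls`, Ollama order unchanged at GCP-A → GCP-B → 111, the `/api/pchome-review/queue?review_status=rescore_accepted` query succeeded, and the 3-minute error log shows no Traceback / ERROR / CRITICAL.
+- 2026-05-24 22:55 CST status: `main` was pushed to Gitea and deployed to 188; production `/health` reports `V10.455`. This round recreated `momo-app`, `scheduler`, and `telegram-bot`; `--remove-orphans` was not used and `momo-db` was not touched. Smoke passed: all three app containers healthy, EventRouter sends `decision_envelope` events directly without entering the L1/L2 AI handler, the Telegram envelope shows the subject SKU and PChome candidate, Gemini hard disabled with no Gemini provider in 24 hours of `ai_calls`, Ollama order unchanged at GCP-A → GCP-B → 111, the `/api/pchome-review/queue?review_status=rescore_accepted` query succeeded, and the 10-minute error log shows no Traceback / ERROR / CRITICAL. `--retract-variant-accepted` has been run; the latest `rescore_accepted_current` rows have 0 remaining `variant_selection_review` entries.
+- 2026-05-24 23:00 CST status: V10.456 adds `decision_envelope` to the PChome review queue; production `/health` and smoke results will be backfilled after deployment.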

 ## 1. MOMO / PChome core price-comparison accuracy

@@ -40,6 +41,7 @@

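 - `decision_envelope` is already wired into NemoTron price alerts and manual review; the next step is to have OpenClaw, ElephantAlpha, PPT QA, and the review queue share the same evidence contract.
 - From 2026-05-24 22:44 CST, EventRouter renders the evidence template directly for events that carry a `decision_envelope` and does not call the L1/L2 AI handler; NemoTron price alerts, manual review, and later Agents therefore share the same SKU / PChome / evidence / guardrails without regenerating a summary.
+- From 2026-05-24 23:00 CST, every candidate from `fetch_competitor_review_queue()`, `fetch_competitor_review_queue_page()`, and `/api/pchome-review/queue` also carries a `decision_envelope` containing the SKU/PChome subject, match evidence, the next manual step, the expected price gap, and guardrails that forbid automatically writing the official price gap; Dashboard, Agent, Telegram, and PPT share this contract going forward.
 - Alerts must no longer output a vague "expected benefit"; they must include data quality, evidence source, HITL boundary, and trace id.
 - Agent suggestions may only assist with ranking and analysis and must not bypass the matcher / feeder / review service to write official prices.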

--- a/docs/memory/history_logs.md
+++ b/docs/memory/history_logs.md
@@ -13,6 +13,7 @@
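 ## 📅 Detailed changelog (archive)

 ### 2026-05-24: PChome near-threshold identity recovery, round two
+- **V10.456 review queue decision envelope**: every PChome review candidate from `fetch_competitor_review_queue()`, `fetch_competitor_review_queue_page()`, and `/api/pchome-review/queue` emits a `decision_envelope` containing the subject SKU/PChome candidate, match evidence, recommended manual action, expected price gap, data quality, and the "must not automatically write the official price gap" guardrails; the review queue cache key is bumped to v3 so production does not keep serving the old payload.
 - **V10.455 EventRouter decision envelope direct send**: price/review events that already carry a `decision_envelope` skip L1/L2 AI re-summarization and are sent with the Telegram evidence template directly; the decision envelope gains a subject block showing the SKU, product name, and PChome candidate ID/name, so NemoTron price alerts that are already evidence-backed are not diluted by regenerated text and do not trigger extra model calls.
 - **V10.455 rescore variant retraction CLI**: `audit_competitor_match_attempt_rescore.py --retract-variant-accepted` finds SKUs whose latest status is still `rescore_accepted_current` with `variant_selection_review` and appends a `true_low_confidence` retraction row; it keeps the historical audit trail, deletes no data, and does not write the official price table.
 - **V10.454 production rescore into the manual review queue**: recomputed 745 `true_low_confidence` rows under the latest-sku-only criteria and first appended 2 manual review rows; after the V10.454 gate added the `variant_selection_review` exclusion, SKU `8884618` KATE long-wear lipstick (怪獸級持色唇膏; MOMO pick-any-shade vs a single PChome 水光 variant) was returned to latest `true_low_confidence`, leaving only SKU `10922465` Herbacin chamomile hand cream 20ml as `rescore_accepted_current`. This run only wrote manual review rows to `competitor_match_attempts`, did not write `competitor_prices` / `competitor_price_history`, and cleared the Dashboard and competitor intel caches.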
--- a/services/competitor_intel_repository.py
+++ b/services/competitor_intel_repository.py
@@ -295,6 +295,145 @@ def _build_unit_comparison_for_attempt(row: dict[str, Any]) -> Optional[dict[str
        return {"comparable": False, "reason": "build_error"}


+def _review_action_code(attempt_status: str) -> str:
+    if attempt_status == "rescore_accepted_current":
+        return "review_accept_identity"
+    if attempt_status in UNIT_COMPARABLE_STATUSES or attempt_status == "manual_unit_price_required":
+        return "unit_price_required"
+    if attempt_status in {"no_result", "refresh_no_result", "manual_needs_research"}:
+        return "needs_research"
+    if attempt_status in {"identity_veto", "manual_rejected"}:
+        return "verify_or_reject_identity"
+    if attempt_status in {"expired_match", "protected_existing_match"}:
+        return "refresh_or_compare_identity"
+    return "human_review"
+
+
+def _review_data_quality(attempt_status: str, item: dict[str, Any]) -> str:
+    if attempt_status in {"no_result", "refresh_no_result", "never_attempted"}:
+        return "missing"
+    if not item.get("candidate_pc_id") or not item.get("candidate_pc_name"):
+        return "missing"
+    # Coerce via _num() so a None price cannot raise TypeError on comparison.
+    if _num(item.get("candidate_pc_price")) <= 0 or _num(item.get("momo_price")) <= 0:
+        return "partial"
+    if attempt_status == "rescore_accepted_current":
+        return "complete"
+    return "partial"
+
+
+def _review_severity(attempt_status: str, item: dict[str, Any]) -> str:
+    momo_price = _num(item.get("momo_price"))
+    candidate_price = _num(item.get("candidate_pc_price"))
+    price_gap_pct = 0.0
+    if momo_price > 0 and candidate_price > 0:
+        price_gap_pct = (momo_price - candidate_price) / max(candidate_price, 1) * 100
+
+    if attempt_status == "rescore_accepted_current" and price_gap_pct >= 10:
+        return "P1"
+    if attempt_status == "rescore_accepted_current":
+        return "P2"
+    if attempt_status in UNIT_COMPARABLE_STATUSES:
+        return "P2"
+    if attempt_status in {"recoverable_low_score", "expired_match"}:
+        return "P3"
+    return "P4"
+
+
+def _build_review_decision_envelope(item: dict[str, Any]) -> dict[str, Any]:
+    """Build the shared evidence contract for an operator review queue item."""
+    attempt_status = str(item.get("attempt_status") or "")
+    momo_price = _num(item.get("momo_price"))
+    candidate_price = _num(item.get("candidate_pc_price"))
+    gap_amount = None
+    gap_pct = None
+    if momo_price > 0 and candidate_price > 0:
+        gap_amount = round(momo_price - candidate_price, 2)
+        gap_pct = round((momo_price - candidate_price) / max(candidate_price, 1) * 100, 1)
+
+    evidence: list[dict[str, Any]] = [
+        {
+            "type": "review_status",
+            "metric": "attempt_status",
+            "value": attempt_status,
+            "basis": item.get("status_label") or _attempt_status_label(attempt_status),
+        },
+        {
+            "type": "match",
+            "metric": "match_score",
+            "value": round(_num(item.get("best_match_score")), 3),
+            "basis": "/".join(
+                part
+                for part in (
+                    item.get("match_type") or "unknown",
+                    item.get("price_basis") or "unknown",
+                    item.get("alert_tier") or "unknown",
+                )
+                if part
+            ),
+            "confidence": round(_num(item.get("best_match_score")), 3) or None,
+        },
+    ]
+    if gap_pct is not None:
+        evidence.append({
+            "type": "price",
+            "metric": "candidate_gap_pct",
+            "value": f"{gap_pct:+.1f}%",
+            "basis": "MOMO latest price + PChome review candidate",
+        })
+    reason_text = item.get("diagnostic_reason_text")
+    if reason_text:
+        evidence.append({
+            "type": "diagnostic",
+            "metric": "reasons",
+            "value": reason_text,
+            "basis": "match_diagnostic_json.reasons",
+        })
+
+    return {
+        "decision_id": (
+            "review_queue:"
+            f"{item.get('sku') or 'unknown'}:"
+            f"{attempt_status or 'unknown'}:"
+            f"{item.get('candidate_pc_id') or 'no_candidate'}"
+        ),
+        "source_agent": "review_queue",
+        "decision_type": "pchome_match_review",
+        "severity": _review_severity(attempt_status, item),
+        "subject": {
+            "sku": str(item.get("sku") or ""),
+            "name": item.get("name") or "",
+            "event_type": "pchome_match_review",
+            "competitor_product_id": item.get("candidate_pc_id") or "",
+            "competitor_product_name": item.get("candidate_pc_name") or "",
+        },
+        "evidence": evidence,
+        "recommended_action": {
+            "action": _review_action_code(attempt_status),
+            "owner": "營運",
+            "requires_hitl": True,
+        },
+        "expected_impact": {
+            "gap_amount": gap_amount,
+            "candidate_gap_pct": gap_pct,
+            "risk_reduction": "medium" if attempt_status in {"rescore_accepted_current", "recoverable_low_score"} else "watch",
+        },
+        "confidence": round(_num(item.get("best_match_score")), 3),
+        "guardrails": {
+            "can_auto_execute": False,
+            "blocked_reason": "PChome 候選需人工覆核；不得自動寫入正式 competitor_prices",
+            "data_quality": _review_data_quality(attempt_status, item),
+            "attempt_status": attempt_status,
+            "match_type": item.get("match_type") or "",
+            "price_basis": item.get("price_basis") or "",
+            "alert_tier": item.get("alert_tier") or "",
+        },
+        "trace": {
+            "source": "competitor_match_attempts",
+            "attempted_at": item.get("attempted_at") or "",
+        },
+    }
+
+
 def _format_competitor_review_item(row: dict[str, Any]) -> dict[str, Any]:
    item = dict(row)
    unit_comparison = _build_unit_comparison_for_attempt(item)
@@ -310,7 +449,7 @@ def _format_competitor_review_item(row: dict[str, Any]) -> dict[str, Any]:
    alert_tier = diagnostic_payload.get("alert_tier") or _tag_suffix(tags, "alert_tier") or ""
    evidence_flags = diagnostic_payload.get("evidence_flags") or []
    diagnostic_reasons = _extract_match_diagnostic_reasons(match_diagnostic, diagnostic_payload)
-    return {
+    formatted = {
        "sku": str(item.get("sku") or ""),
        "name": item.get("name") or "",
        "category": item.get("category") or "",
@@ -336,6 +475,8 @@ def _format_competitor_review_item(row: dict[str, Any]) -> dict[str, Any]:
        "attempted_at": _date_label(item.get("attempted_at")),
        "unit_comparison": unit_comparison,
    }
+    formatted["decision_envelope"] = _build_review_decision_envelope(formatted)
+    return formatted


 def clear_competitor_intel_cache() -> None:
@@ -791,7 +932,7 @@ def fetch_competitor_review_queue(engine, limit: int = 12) -> list[dict]:
    """可行動的 PChome 比對覆核隊列，供 Dashboard / AI / PPT 共用。"""
    limit = max(1, min(int(limit or 12), 50))
    return _cached_payload(
-        f"review_queue:v2:limit={limit}:floor={PCHOME_MATCH_SCORE_FLOOR}",
+        f"review_queue:v3:limit={limit}:floor={PCHOME_MATCH_SCORE_FLOOR}",
        lambda: _fetch_competitor_review_queue_uncached(engine, limit=limit),
    )

@@ -814,7 +955,7 @@ def fetch_competitor_review_queue_page(
    if status_filter not in REVIEW_STATUS_FILTER_GROUPS:
        status_filter = ""
    cache_key = (
-        "review_queue_page:v2:"
+        "review_queue_page:v3:"
        f"page={page}:per={per_page}:q={search_query.lower()}:cat={category}:"
        f"status={status_filter}:"
        f"count={int(bool(count_total))}:"
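Reviewer note: the v2 to v3 bump in both cache keys works because a versioned key makes entries written under the old payload shape unreachable. The sketch below illustrates that idea only. It assumes a plain in-process dict cache; the real `_cached_payload()` is not shown in this diff and may behave differently, and `cached_payload`, `review_queue_key`, and `_CACHE` are hypothetical names.

```python
from typing import Any, Callable

_CACHE: dict[str, Any] = {}  # hypothetical stand-in for the real cache backend


def cached_payload(key: str, loader: Callable[[], Any]) -> Any:
    """Return the cached value for key, calling loader only on a miss."""
    if key not in _CACHE:
        _CACHE[key] = loader()
    return _CACHE[key]


def review_queue_key(schema_version: int, limit: int) -> str:
    """Build a cache key that embeds the payload schema version."""
    return f"review_queue:v{schema_version}:limit={limit}"


# An entry cached before the release has no decision_envelope.
old = cached_payload(review_queue_key(2, 12), lambda: [{"sku": "SKU-1"}])
# After the bump the key differs, so the loader runs again with the new shape.
new = cached_payload(
    review_queue_key(3, 12),
    lambda: [{"sku": "SKU-1", "decision_envelope": {}}],
)
```

The stale v2 entry is never overwritten; it simply stops being read and ages out according to whatever expiry the cache applies.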
--- a/tests/test_competitor_intel_cache.py
+++ b/tests/test_competitor_intel_cache.py
@@ -177,6 +177,46 @@ def test_competitor_review_reasons_prefer_json_payload_labels():
        "nail_tool_function_conflict",
        "schick_razor_line_conflict",
    ]
+    envelope = item["decision_envelope"]
+    assert envelope["decision_type"] == "pchome_match_review"
+    assert envelope["subject"]["sku"] == "SKU-1"
+    assert envelope["subject"]["competitor_product_id"] == "DABC123"
+    assert envelope["guardrails"]["can_auto_execute"] is False
+    assert envelope["guardrails"]["data_quality"] == "partial"
+    assert envelope["guardrails"]["match_type"] == "no_match"
+    assert envelope["recommended_action"]["requires_hitl"] is True
+    assert envelope["recommended_action"]["action"] == "verify_or_reject_identity"
+    assert any(evidence["metric"] == "reasons" for evidence in envelope["evidence"])
+
+
+def test_rescore_accepted_review_item_has_actionable_decision_envelope():
+    from services.competitor_intel_repository import _format_competitor_review_item
+
+    item = _format_competitor_review_item({
+        "sku": "10922465",
+        "name": "【Herbacin 德國小甘菊】小甘菊1號護手霜20ml",
+        "momo_price": 99,
+        "attempt_status": "rescore_accepted_current",
+        "candidate_count": 1,
+        "best_competitor_product_id": "DDAO4C-A79050612",
+        "best_competitor_product_name": "小甘菊經典護手霜20ml",
+        "best_competitor_price": 89,
+        "best_match_score": 0.872,
+        "match_diagnostic_json": {
+            "match_type": "exact",
+            "price_basis": "total_price",
+            "alert_tier": "identity_review",
+            "reasons": ["focused_exact_identity_herbacin_classic_hand_cream_20ml_brandless"],
+        },
+    })
+
+    envelope = item["decision_envelope"]
+    assert envelope["severity"] in {"P1", "P2"}
+    assert envelope["recommended_action"]["action"] == "review_accept_identity"
+    assert envelope["guardrails"]["data_quality"] == "complete"
+    assert envelope["expected_impact"]["gap_amount"] == 10
+    assert envelope["expected_impact"]["candidate_gap_pct"] == 11.2
+    assert any(evidence["metric"] == "candidate_gap_pct" for evidence in envelope["evidence"])


 def test_competitor_ppt_prompt_uses_neutral_ewooc_viewpoint():
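Reviewer note: the `gap_amount == 10` and `candidate_gap_pct == 11.2` assertions in the new test follow from the gap formula in `_build_review_decision_envelope()`. The standalone sketch below repeats that arithmetic for the Herbacin fixture (MOMO 99, PChome 89); `price_gap` is a hypothetical helper written for illustration, not a function in the repository.

```python
from typing import Optional, Tuple


def price_gap(momo_price: float, candidate_price: float) -> Tuple[Optional[float], Optional[float]]:
    """Return (gap_amount, gap_pct) relative to the candidate price, or (None, None)."""
    if momo_price <= 0 or candidate_price <= 0:
        return None, None  # no gap when either price is missing
    gap_amount = round(momo_price - candidate_price, 2)
    # Same guard as the commit: divide by at least 1 to avoid tiny denominators.
    gap_pct = round((momo_price - candidate_price) / max(candidate_price, 1) * 100, 1)
    return gap_amount, gap_pct


amount, pct = price_gap(99, 89)
label = f"{pct:+.1f}%"  # the string form stored in the price evidence entry
```

Here `amount` is 10 and `pct` is 11.2 (10 / 89 ≈ 0.11236), so `label` is `+11.2%`; because the gap is at least 10% and the status is `rescore_accepted_current`, `_review_severity()` would rate this fixture P1.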