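diff --git a/TODO_NEXT_STEPS.txt b/TODO_NEXT_STEPS.txt
index 3e8d6c9..6e71a1f 100644
--- a/TODO_NEXT_STEPS.txt
+++ b/TODO_NEXT_STEPS.txt
@@ -4,6 +4,8 @@
 ================================================================================
 [Completed]
+  - V10.574 Wire up the PChome catalog / mix-and-match comparable review queue: reusing the `catalog_comparable_count` safety criteria from V10.572, `true_low_confidence` candidates that are high-scoring, free of hard vetoes, and backed by same-product-line identity evidence, but that still have mix-and-match / catalog / commercial conditions to confirm, are split into a separate `catalog_comparable` filter and decision envelope. This queue stays HITL, writes nothing to the official `competitor_prices`, and does not count as exact matched. It separates "catalog comparable" from genuine "insufficient evidence", so operations can batch-process first the candidates most likely to convert to a unit price or a formal identity.
+  - V10.573 Add the market intelligence Source Governance → Fetch Target bridge: adds `/api/market_intel/mcp_fetch_target_source_governance_review`, a bridge panel on the market intelligence page, and a deployment readiness smoke target. It cross-reviews Professional Source Governance against MCP Fetch Target Review and requires every target `platform_code/source_key` to map to a governance-approved public source contract; it still fetches no external sites, reads no robots/sitemap, opens no DB, writes no files, runs no CLI, and attaches no scheduler.
   - V10.572 Add PChome decision-support coverage: without relaxing the exact identity threshold for `matched` / `decision_ready`, candidates that are high-scoring, free of hard vetoes, and backed by same-product-line and spec evidence, but that still need review because of "mix-and-match / shade / catalog / near-expiry", are counted in `catalog_comparable_count` and `decision_support_rate`. The Dashboard, daily sales, growth analysis, and backfill status summary all show "decision-support coverage / precise alertable coverage / catalog comparable / unit price", so coverage gains rest on explainable intelligence tiers rather than on forcing non-exact products into official same-item matches.
   - V10.571 Improve search recall for PChome pending coverage: the `PCHOME_FEEDER_MAX_SEARCH_TERMS` default rises from 5 to 6, and the new `PCHOME_FEEDER_SEARCH_COVERAGE_RESCUE_ENABLED` inserts narrow coverage rescue terms between the primary search terms and the original-name fallback. Search terms keep decimal specs such as `5.5g` and `2.4g` instead of turning them into `5 5g` / `2 4g`, and drop noise unrelated to core identity, such as on-the-go cleansing, dirt removal, and sunscreen removal. The production pilot showed that bilingual-brand products such as CeraVe / TUNEMAKERS / Embryolisse / Neogence / NIVEA often stall on PChome search recall, so a narrow "English brand + Chinese brand + core identity + spec" search term was added; "brand + category + spec" remains enabled only for safe categories, to avoid introducing false positives for the sake of pending coverage.
   - V10.570 Add the PChome identity / offer evidence contract: the matcher's `match_diagnostic_json` gains `identity_evidence` and `offer_evidence`, breaking brand, category, identity anchor, model, spec, pack count, and variant guardrail into structured evidence; the review queue and decision envelope gain `difference_highlights`, which point directly at differences in volume, pack count, shade, scent, style, refill pack, campaign bundle, and so on. Price is explicitly marked as offer evidence and is no longer mistaken for identity evidence, so Dashboard / PPT / OpenClaw / Webcrumbs can share the same match evidence.
diff --git a/config.py b/config.py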
index a341dc8..dccc9e5 100644
--- a/config.py
+++ b/config.py
@@ -402,7 +402,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
 # ==========================================
 # System version and paths
 # ==========================================
-SYSTEM_VERSION = "V10.572"
+SYSTEM_VERSION = "V10.574"
 LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
 public_url = PUBLIC_URL  # used for template display
diff --git a/docs/adr/ADR-035-cross-platform-market-campaign-intelligence.md b/docs/adr/ADR-035-cross-platform-market-campaign-intelligence.md
index 2af64de..beeca31 100644
--- a/docs/adr/ADR-035-cross-platform-market-campaign-intelligence.md
+++ b/docs/adr/ADR-035-cross-platform-market-campaign-intelligence.md
@@ -179,6 +179,7 @@ EwoooC currently has MOMO EDM / festival campaign data, `promo_products`, PChome
 - 2026-05-31: added the MCP fetch candidate queue writer review decision approval gate: `services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_approval`, `services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_approval_gates`, `services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_approval_sample`, and `/api/market_intel/mcp_fetch_candidate_queue_writer_review_decision_approval` review only the operator human approval summary once the review decision has passed, confirming decision linkage, approval identity, target table, row count, dedupe keys, the `approved_for_writer_preflight` approval result, decision/approval evidence refs, artifact paths, the matched row exact-identity/variant/overwrite guard, operator confirmations, and forbidden API actions; the API/UI reads no approval token, runs no CLI, opens no DB, writes no approval record, writes no decision record, does not update review_state, writes no match result, does not backfill the queue, and attaches no scheduler, and only releases to the later writer preflight design. This endpoint has been split into `routes.market_intel_mcp_review_routes` so that `routes.market_intel_mcp_run_routes` stays under the 800-line governance threshold.
 - 2026-05-31: added the MCP fetch candidate queue writer review decision approval writer preflight gate: `services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight`, the corresponding gates/sample, and `/api/market_intel/mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight` review only the operator writer preflight summary once human approval has passed, confirming approval linkage, writer_preflight_id, target operation, row count, dedupe keys, the row-by-row mapping from approved decision to target review_state, decision/approval/preflight evidence refs, the matched row exact-identity/variant/overwrite guard, and the operator boundary; the API/UI reads no approval token, runs no CLI, opens no DB, writes no preflight/approval/decision/match, does not update review_state, does not backfill the queue, and attaches no scheduler, and only releases to the later CLI review / run package design.
 - 2026-06-01: added the Professional Source Governance gate: `services.market_intel.mcp_professional_source_governance`, the corresponding gates/sample, and `/api/market_intel/mcp_professional_source_governance` organize robots/REP, sitemap/lastmod, JSON-LD / schema.org structured data, canonical URL, rate limit, public data boundary, provenance, snapshot hash, and idempotency key into a source contract. This gate reviews only the operator source governance summary; it fetches no external sites, reads no robots/sitemap, opens no DB, writes no files, and attaches no scheduler. Only after it passes may a later fetch target review reference governance-approved public sources.
+- 2026-06-03: added the Source Governance → Fetch Target bridge: `services.market_intel.mcp_fetch_target_source_governance_review` and `/api/market_intel/mcp_fetch_target_source_governance_review` only cross-review the governance-approved source contract against MCP Fetch Target Review, requiring every target `platform_code/source_key` to hit the governance summary; they still perform no external fetch, read no robots/sitemap, open no DB, write no files, run no CLI, and attach no scheduler.
 - 2026-05-18: added the scheduler attach plan preview: `services.market_intel.scheduler_plan` and `/api/market_intel/scheduler_plan` describe the cadence, gate, fallback, and safety boundary of three future jobs: `campaign_discovery_daily`, `campaign_product_probe`, and `product_match_review_seed`. This stage registers no scheduler job, starts no crawler, makes no outbound connection, and writes no DB; scheduler attachment must wait until migration, seed, MCP fetch gate, manual sample, and human approval have all passed.
 - 2026-05-18: added the match review plan preview: `services.market_intel.match_review_plan` and `/api/market_intel/match_review_plan` define product match signals, score thresholds, the `needs_review → confirmed/rejected` HITL flow, and safety boundaries. This stage creates no review queue, never auto-confirms, writes nothing to `market_product_matches`, and calls no MCP; price may serve only as an auxiliary signal and can never decide a same-item match on its own.
 - 2026-05-18: added the opportunity plan preview: `services.market_intel.opportunity_plan` and `/api/market_intel/opportunity_plan` define four rule classes (competitor low-price threat, promotion gap, deep-discount overlap, campaign ending soon) along with the tiering strategy. This stage creates no opportunity queue, sends no Telegram messages, produces no AI summary, and writes no DB; a high-risk item must have a confirmed match and DB evidence before it can be escalated.
diff --git a/docs/memory/code_modularization_inventory_20260430.md b/docs/memory/code_modularization_inventory_20260430.md
index 4953b5b..ce423cd 100644
--- a/docs/memory/code_modularization_inventory_20260430.md
+++ b/docs/memory/code_modularization_inventory_20260430.md
@@ -56,6 +56,7 @@
 - 2026-05-31 note: synced the line count of `services/market_intel/deployment_readiness.py` after the market intelligence MCP fetch candidate queue writer review decision gate; this change adds `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision.py` (498 lines), `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision_gates.py` (241 lines), and `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision_sample.py` (118 lines), all under the 600-line warning threshold; `routes/market_intel_mcp_run_routes.py` is now 772 lines, still under 800 but close to the threshold, so the next MCP route work should first split out a second route extension.
 - 2026-05-31 note: synced the line count of `services/market_intel/deployment_readiness.py` after the market intelligence MCP fetch candidate queue writer review decision approval gate; this change adds `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision_approval.py` (560 lines), `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision_approval_gates.py` (255 lines), `services/market_intel/mcp_fetch_candidate_queue_writer_review_decision_approval_sample.py` (140 lines), and `routes/market_intel_mcp_review_routes.py` (64 lines), all under the 600-line warning threshold; `routes/market_intel_mcp_run_routes.py` stays at 770 lines because no endpoint was added to it this time, and the second MCP review route extension carries the new endpoint instead.
 - 2026-06-01 note: synced the line count of `services/market_intel/deployment_readiness.py` after the market intelligence Professional Source Governance gate; this change adds `services/market_intel/mcp_professional_source_governance.py` (391 lines), `services/market_intel/mcp_professional_source_governance_gates.py` (266 lines), `services/market_intel/mcp_professional_source_governance_sample.py` (175 lines), and `routes/market_intel_mcp_review_routes.py` (165 lines), all under the 600-line warning threshold; `services/market_intel/deployment_readiness.py` remains an existing P2 large file and only gains a preview-safe check and a smoke target, so later work should continue the small service + route extension pattern.
+- 2026-06-03 note: added `services/market_intel/mcp_fetch_target_source_governance_review.py` (237 lines), and grew `mcp_professional_source_governance_sample.py` to 307 lines and `routes/market_intel_mcp_review_routes.py` to 207 lines; the new service is still under the 600-line warning threshold. `services/market_intel/deployment_readiness.py` grew to 2010 lines and remains an existing P2 large file; follow-up work should prioritize splitting out the readiness smoke/check registration.
 - 2026-05-24 note: synced the line count of `services/code_review_pipeline_service.py` after merging the background Code Review 111 fallback protection; this only updates the inventory and does not change Code Review behavior.
 - 2026-05-21 note: synced the line count of `services/marketplace_product_matcher.py` after the PChome/LUDEYA product-line name drift matching update; this only updates the inventory and does not change modularization decisions.
 - 2026-05-21 note: synced the line count of `services/marketplace_product_matcher.py` after the MAC/Yuskin/AHC name drift and bundle equivalent matcher update; this only updates the inventory and does not change modularization decisions.
diff --git a/docs/memory/current_execution_queue_20260524.md b/docs/memory/current_execution_queue_20260524.md
index cd71d10..048e99c 100644
--- a/docs/memory/current_execution_queue_20260524.md
+++ b/docs/memory/current_execution_queue_20260524.md
@@ -104,6 +104,7 @@
 - Since 2026-05-31, `V10.506` adds the market intelligence MCP Fetch Candidate Queue Writer Review Decision Approval gate: once the review decision has passed, it reviews only the operator human approval summary, requiring decision linkage, approval identity, target table, row count, dedupe keys, the `approved_for_writer_preflight` approval result, decision/approval evidence refs, artifact paths, the matched row exact-identity/variant/overwrite guard, and operator confirmation to line up; it still reads no token, runs no CLI, opens no DB, writes no approval record, writes no decision record, does not update review_state, writes no match result, does not backfill the queue, and attaches no scheduler, and only releases to the later writer preflight design.
 - Since 2026-05-31, `V10.509` adds the market intelligence MCP Fetch Candidate Queue Writer Review Decision Approval Writer Preflight gate: once human approval has passed, it reviews only the operator writer preflight summary, requiring approval linkage, writer_preflight_id, target operation, row count, dedupe keys, the row-by-row mapping from approved decision to target review_state, decision/approval/preflight evidence refs, the matched row exact-identity/variant/overwrite guard, and the operator boundary to line up; it still reads no token, runs no CLI, opens no DB, writes no preflight/approval/decision/match, does not update review_state, does not backfill the queue, and attaches no scheduler, and only releases to the later CLI review / run package design.
 - Since 2026-06-01, `V10.566` adds the market intelligence Professional Source Governance gate: it brings robots/REP, sitemap/lastmod, JSON-LD / schema.org structured data, canonical URL, rate limit, public data boundary, provenance, snapshot hash, and idempotency key into the source contract, and wires up `/api/market_intel/mcp_professional_source_governance`, a UI preview panel, a deployment readiness check, and a production smoke target; it still fetches no external sites, reads no robots/sitemap, opens no DB, writes no files, and attaches no scheduler.
+- Since 2026-06-03, `V10.573` adds the market intelligence Source Governance → Fetch Target bridge: `/api/market_intel/mcp_fetch_target_source_governance_review` cross-reviews Professional Source Governance against MCP Fetch Target Review, requiring every target `platform_code/source_key` to hit a governed source contract; it still fetches no external sites, reads no robots/sitemap, opens no DB, writes no files, runs no CLI, and attaches no scheduler, and only releases to the later manual fetch run package review.
 - Since 2026-06-02, `V10.567` narrows the MCP market insight fallback to GCP-A / GCP-B only, so 111 no longer takes on long-running, non-real-time market analysis tasks; the default timeout is 25 seconds with `num_predict` 500, and when GCP is unavailable the service degrades conservatively right away, avoiding the Elephant Alpha 60-second timeout and 111 load spikes.
 - Since 2026-06-02, `V10.568` turns the Telegram direct message for price-type `decision_envelope` into a professional brief with four sections: subject, price evidence, match evidence, and next manual step; the review queue envelope subject now also carries `momo_price` / `competitor_price`, so Telegram, PPT, Webcrumbs, and AI summaries share the same price evidence.
 - Since 2026-06-02, `V10.569` wires Webcrumbs host data into `summarize_review_decision_envelopes()`; the payload gains `reviewDecisionBrief` and review queue / HITL / auto-execute-blocked metadata; the shared UI runtime reads the same PChome review envelope summary and still only reads the DB, calls no LLM, fetches no external sites, and writes no data.
diff --git a/docs/memory/history_logs.md b/docs/memory/history_logs.md
index e4266e0..cdcb574 100644
--- a/docs/memory/history_logs.md
+++ b/docs/memory/history_logs.md
@@ -13,6 +13,8 @@
 ## 📅 Detailed changelog (archaeology archive)
 
 ### 2026-06-01: PChome price-comparison freshness operations loop
+- **V10.574 PChome catalog / mix-and-match comparable review queue**: formally wires the derived `catalog_comparable_count` criteria from V10.572 into the PChome review queue. `true_low_confidence` candidates that are high-scoring, free of hard vetoes, and backed by same-product-line identity evidence, but that still have mix-and-match / catalog / commercial conditions to confirm, enter a separate `catalog_comparable` filter, status label, and decision envelope; genuine `true_low_confidence` excludes this batch so the candidates do not appear a second time under "insufficient evidence". This change does not relax `MIN_MATCH_SCORE`, writes nothing to the official `competitor_prices`, and does not count as exact matched; it only turns the candidates most likely to be confirmed manually in batch into an actionable queue.
+- **V10.573 Market intelligence Source Governance → Fetch Target bridge**: adds `/api/market_intel/mcp_fetch_target_source_governance_review`, a preview service, and a bridge panel on the market intelligence page, cross-reviewing Professional Source Governance against MCP Fetch Target Review. This gate requires every target `platform_code/source_key` to map to a governance-approved public source contract, and is also added to the deployment readiness preview-safe check and the production smoke target; the API/UI still fetches no external sites, reads no robots/sitemap, opens no DB, writes no files, runs no CLI, and attaches no scheduler.
 - **V10.572 PChome decision-support coverage tiers**: coverage is no longer just the exact `decision_ready_rate`. The `fetch_competitor_coverage()` cache moves to v11 and adds `catalog_comparable_count`, `decision_support_count`, `decision_support_rate`, and the non-exact support count; it includes only candidates that are high-scoring, free of hard vetoes, and carry both catalog / mix-and-match / commercial-condition signals and strong identity evidence, and it excludes candidates with hard conflicts in category, product line, pack count, scent, model, or extreme price. The Dashboard, daily, growth, and backfill JS all show "decision-support coverage / precise alertable coverage / catalog comparable / unit price", raising usable intelligence coverage without polluting the official `matched`.
 - **V10.571 PChome pending coverage search recall**: `competitor_price_feeder` raises the default maximum number of search terms per product from 5 to 6 and adds `PCHOME_FEEDER_SEARCH_COVERAGE_RESCUE_ENABLED`. The backfill flow adds narrow coverage rescue terms between the primary matcher search terms and the original-name fallback, keeps decimal specs such as `5.5g` / `2.4g`, and filters out noise unrelated to core identity, such as on-the-go cleansing, dirt removal, and sunscreen removal. The production pilot showed that bilingual-brand products such as CeraVe / TUNEMAKERS / Embryolisse / Neogence / NIVEA often stall on PChome search recall, so a narrow "English brand + Chinese brand + core identity + spec" search term was added; `brand + category + spec` remains open only to safe categories. The goal is to raise the candidate acquisition rate for pending/no_result while keeping the matcher hard veto and `MIN_MATCH_SCORE` unchanged.
 - **V10.570 PChome identity / offer evidence contract**: `score_marketplace_match()` now emits `identity_evidence` and `offer_evidence` inside `match_diagnostic_json`, storing brand, category, identity anchor, model, spec, pack count, variant guardrail, and price offer in separate layers. `competitor_intel_repository` converts this evidence into `difference_highlights` and the decision envelope's identity / offer evidence, so the review page, PPT, OpenClaw, Webcrumbs, and Telegram summaries can all understand "why it is the same item / why it differs / price is offer evidence, not identity evidence".
diff --git a/routes/dashboard_routes.py b/routes/dashboard_routes.py
index
5833da8..4462f99 100644 --- a/routes/dashboard_routes.py +++ b/routes/dashboard_routes.py @@ -68,6 +68,7 @@ REVIEW_STATUS_OPTIONS = [ 'label': '需單位價', 'statuses': ('unit_comparable', 'refresh_unit_comparable'), }, + {'key': 'catalog_comparable', 'label': '型錄可比', 'statuses': ('true_low_confidence',)}, {'key': 'identity_veto', 'label': '已排除', 'statuses': ('identity_veto',)}, {'key': 'recoverable_low_score', 'label': '近門檻可救', 'statuses': ('recoverable_low_score',)}, {'key': 'true_low_confidence', 'label': '證據不足', 'statuses': ('true_low_confidence',)}, @@ -691,10 +692,19 @@ def _merge_competitor_review_context(overview, review_context): attempt_status = coverage.get('attempt_status') or {} review_status_counts = {} for option in REVIEW_STATUS_OPTIONS: - review_status_counts[option['key']] = sum( - int(attempt_status.get(status) or 0) - for status in option['statuses'] - ) + if option['key'] == 'catalog_comparable': + review_status_counts[option['key']] = int(coverage.get('catalog_comparable_count') or 0) + elif option['key'] == 'true_low_confidence': + review_status_counts[option['key']] = max( + int(attempt_status.get('true_low_confidence') or 0) + - int(coverage.get('catalog_comparable_count') or 0), + 0, + ) + else: + review_status_counts[option['key']] = sum( + int(attempt_status.get(status) or 0) + for status in option['statuses'] + ) overview.update({ 'total_active': int(coverage.get('active_with_price') or overview.get('total_active') or 0), 'matched_count': int(coverage.get('valid_matches') or overview.get('matched_count') or 0), diff --git a/routes/market_intel_mcp_review_routes.py b/routes/market_intel_mcp_review_routes.py index ce8eac1..735f644 100644 --- a/routes/market_intel_mcp_review_routes.py +++ b/routes/market_intel_mcp_review_routes.py @@ -13,6 +13,9 @@ from services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_appr from services.market_intel.mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight import ( 
build_mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight_preview, ) +from services.market_intel.mcp_fetch_target_source_governance_review import ( + build_mcp_fetch_target_source_governance_review_preview, +) from services.market_intel.mcp_professional_source_governance import ( build_mcp_professional_source_governance_preview, ) @@ -163,3 +166,42 @@ def market_intel_mcp_professional_source_governance(): phase=service.phase, ) ) + + +@market_intel_bp.route( + "/api/market_intel/mcp_fetch_target_source_governance_review", + methods=["GET", "POST"], +) +@login_required +def market_intel_mcp_fetch_target_source_governance_review(): + professional_source_governance_package = None + target_review_package = None + operator_confirmations = None + if request.method == "POST": + payload = request.get_json(silent=True) or {} + package = ( + payload.get("fetch_target_source_governance_review_package") + or payload.get("source_governed_target_review_package") + or payload + ) + professional_source_governance_package = ( + package.get("professional_source_governance_package") + or package.get("source_governance_package") + or package.get("operator_source_governance") + ) + target_review_package = ( + package.get("target_review_package") + or package.get("mcp_fetch_target_review") + or package.get("target_review") + ) + operator_confirmations = package.get("operator_confirmations", {}) + + service = MarketIntelService() + return jsonify( + build_mcp_fetch_target_source_governance_review_preview( + professional_source_governance_package=professional_source_governance_package, + target_review_package=target_review_package, + operator_confirmations=operator_confirmations, + phase=service.phase, + ) + ) diff --git a/services/competitor_intel_repository.py b/services/competitor_intel_repository.py index d5d5cad..15e2f01 100644 --- a/services/competitor_intel_repository.py +++ b/services/competitor_intel_repository.py @@ -87,6 +87,7 @@ 
REVIEW_QUEUE_ATTEMPT_STATUSES = ACTIONABLE_ATTEMPT_STATUSES | MANUAL_CLOSED_ATTE REVIEW_STATUS_FILTER_GROUPS = { "rescore_accepted": ("rescore_accepted_current",), "unit_comparable": ("unit_comparable", "refresh_unit_comparable"), + "catalog_comparable": ("true_low_confidence",), "identity_veto": ("identity_veto",), "low_score": ("low_score", "refresh_low_score", "recoverable_low_score", "true_low_confidence"), "recoverable_low_score": ("recoverable_low_score",), @@ -144,6 +145,7 @@ MANUAL_REVIEW_ACTION_LABELS = { DECISION_ACTION_LABELS = { "compare_existing_identity": "比較既有正式候選與新候選", "review_accept_identity": "人工確認身份後採用同款", + "review_catalog_comparable": "確認型錄 / 任選可比條件", "unit_price_required": "確認單位價 / 組合差異", "needs_research": "補搜尋詞或重新抓取", "verify_or_reject_identity": "確認身份或否決候選", @@ -303,6 +305,23 @@ def _parse_tag_list(value: Any) -> list[str]: return [] +def _jsonb_any_array_predicate(jsonb_expr: str, values: set[str]) -> str: + value_sql = ", ".join(repr(value) for value in sorted(values)) + return f"(COALESCE({jsonb_expr}, '[]'::jsonb) ?| ARRAY[{value_sql}])" + + +def _catalog_comparable_sql(alias: str = "la") -> str: + diagnostic_codes = f"{alias}.diagnostic_codes" + return f"""( + {alias}.attempt_status = 'true_low_confidence' + AND COALESCE({alias}.hard_veto, false) = false + AND COALESCE({alias}.best_match_score, 0) >= {CATALOG_COMPARABLE_SCORE_FLOOR} + AND {_jsonb_any_array_predicate(diagnostic_codes, CATALOG_COMPARABLE_SIGNAL_REASONS)} + AND {_jsonb_any_array_predicate(diagnostic_codes, CATALOG_COMPARABLE_IDENTITY_REASONS)} + AND NOT {_jsonb_any_array_predicate(diagnostic_codes, CATALOG_COMPARABLE_BLOCK_REASONS)} + )""" + + def _tag_suffix(tags: list[str], prefix: str) -> str: marker = f"{prefix}_" for tag in tags: @@ -614,6 +633,11 @@ def _parse_existing_match_conflict(error_message: Any) -> dict[str, Any]: def _build_review_decision_envelope(item: dict[str, Any]) -> dict[str, Any]: """Build the shared evidence contract for an operator review queue 
item.""" attempt_status = str(item.get("attempt_status") or "") + action_code = ( + "review_catalog_comparable" + if item.get("catalog_comparable") + else _review_action_code(attempt_status) + ) momo_price = _num(item.get("momo_price")) candidate_price = _num(item.get("candidate_pc_price")) gap_amount = None @@ -652,6 +676,13 @@ def _build_review_decision_envelope(item: dict[str, Any]) -> dict[str, Any]: "value": f"{gap_pct:+.1f}%", "basis": "MOMO latest price + PChome review candidate", }) + if item.get("catalog_comparable"): + evidence.append({ + "type": "review_bucket", + "metric": "catalog_comparable", + "value": "true", + "basis": "true_low_confidence + high score + identity anchor + catalog/variant review signal + no hard veto", + }) identity_evidence = item.get("identity_evidence") identity_summary = _build_identity_evidence_summary(identity_evidence) if identity_summary: @@ -736,7 +767,7 @@ def _build_review_decision_envelope(item: dict[str, Any]) -> dict[str, Any]: "offer_evidence": offer_evidence if isinstance(offer_evidence, dict) else {}, "difference_highlights": difference_highlights if isinstance(difference_highlights, list) else [], "recommended_action": { - "action": _review_action_code(attempt_status), + "action": action_code, "owner": "營運", "requires_hitl": True, }, @@ -762,6 +793,7 @@ def _build_review_decision_envelope(item: dict[str, Any]) -> dict[str, Any]: if isinstance(identity_evidence, dict) else "" ), + "catalog_comparable": bool(item.get("catalog_comparable")), "price_is_identity_evidence": False, }, "trace": { @@ -913,14 +945,24 @@ def _format_competitor_review_item(row: dict[str, Any]) -> dict[str, Any]: diagnostic_reasons = _extract_match_diagnostic_reasons(match_diagnostic, diagnostic_payload) difference_highlights = _build_review_difference_highlights(diagnostic_reasons, identity_evidence) existing_match_conflict = _parse_existing_match_conflict(match_diagnostic) + catalog_comparable = bool(item.get("catalog_comparable")) + 
status_label = _attempt_status_label(item.get("attempt_status")) + action_label = _attempt_action_label(item.get("attempt_status")) + review_bucket = str(item.get("attempt_status") or "") + if catalog_comparable: + status_label = "型錄/任選可比" + action_label = "人工確認型錄、任選與規格條件後,再轉單位價或採用身份" + review_bucket = "catalog_comparable" formatted = { "sku": str(item.get("sku") or ""), "name": item.get("name") or "", "category": item.get("category") or "", "momo_price": _num(item.get("momo_price")), "attempt_status": item.get("attempt_status") or "", - "status_label": _attempt_status_label(item.get("attempt_status")), - "action_label": _attempt_action_label(item.get("attempt_status")), + "review_bucket": review_bucket, + "status_label": status_label, + "action_label": action_label, + "catalog_comparable": catalog_comparable, "candidate_count": int(item.get("candidate_count") or 0), "candidate_pc_id": item.get("best_competitor_product_id"), "candidate_pc_name": item.get("best_competitor_product_name") or "", @@ -1018,7 +1060,7 @@ def _cached_payload(cache_key: str, producer, ttl_seconds: int = COMPETITOR_INTE def fetch_competitor_coverage(engine) -> dict: return _cached_payload( - f"coverage:v11:floor={PCHOME_MATCH_SCORE_FLOOR}:catalog_floor={CATALOG_COMPARABLE_SCORE_FLOOR}:manual_reviews=1:rescore=1:review_no_fresh=1:decision_ready=1:open_queue=1:unknown_freshness=1:decision_support=1", + f"coverage:v12:floor={PCHOME_MATCH_SCORE_FLOOR}:catalog_floor={CATALOG_COMPARABLE_SCORE_FLOOR}:manual_reviews=1:rescore=1:review_no_fresh=1:decision_ready=1:open_queue=1:unknown_freshness=1:decision_support=1", lambda: _fetch_competitor_coverage_uncached(engine), ) @@ -1165,12 +1207,7 @@ def _fetch_competitor_coverage_uncached(engine) -> dict: LEFT JOIN fresh_competitor fc ON fc.sku = lm.sku JOIN latest_attempt la ON la.sku = lm.sku WHERE fc.sku IS NULL - AND la.attempt_status = 'true_low_confidence' - AND COALESCE(la.hard_veto, false) = false - AND COALESCE(la.best_match_score, 0) >= 
{CATALOG_COMPARABLE_SCORE_FLOOR} - AND (COALESCE(la.diagnostic_codes, '[]'::jsonb) ?| ARRAY[{", ".join(repr(reason) for reason in sorted(CATALOG_COMPARABLE_SIGNAL_REASONS))}]) - AND (COALESCE(la.diagnostic_codes, '[]'::jsonb) ?| ARRAY[{", ".join(repr(reason) for reason in sorted(CATALOG_COMPARABLE_IDENTITY_REASONS))}]) - AND NOT (COALESCE(la.diagnostic_codes, '[]'::jsonb) ?| ARRAY[{", ".join(repr(reason) for reason in sorted(CATALOG_COMPARABLE_BLOCK_REASONS))}]) + AND {_catalog_comparable_sql("la")} ) AS catalog_comparable_count, COALESCE(la.attempt_status, 'never_attempted') AS attempt_status, COUNT(*) AS status_count @@ -1497,7 +1534,7 @@ def fetch_competitor_review_queue(engine, limit: int = 12) -> list[dict]: """可行動的 PChome 比對覆核隊列,供 Dashboard / AI / PPT 共用。""" limit = max(1, min(int(limit or 12), 50)) return _cached_payload( - f"review_queue:v3:limit={limit}:floor={PCHOME_MATCH_SCORE_FLOOR}", + f"review_queue:v4:limit={limit}:floor={PCHOME_MATCH_SCORE_FLOOR}:catalog=1", lambda: _fetch_competitor_review_queue_uncached(engine, limit=limit), ) @@ -1520,7 +1557,7 @@ def fetch_competitor_review_queue_page( if status_filter not in REVIEW_STATUS_FILTER_GROUPS: status_filter = "" cache_key = ( - "review_queue_page:v3:" + "review_queue_page:v4:" f"page={page}:per={per_page}:q={search_query.lower()}:cat={category}:" f"status={status_filter}:" f"count={int(bool(count_total))}:" @@ -1550,6 +1587,7 @@ def _review_queue_cte_and_filter( status_filter = (status_filter or "").strip() status_values = REVIEW_STATUS_FILTER_GROUPS.get(status_filter) or tuple(ACTIONABLE_ATTEMPT_STATUSES) status_sql = ", ".join(f"'{status}'" for status in status_values) + catalog_comparable_expr = _catalog_comparable_sql("la") filters = [ f"la.attempt_status IN ({status_sql})", f"""NOT EXISTS ( @@ -1564,6 +1602,10 @@ def _review_queue_cte_and_filter( AND COALESCE(cp.tags, '[]'::jsonb) ? 
'identity_v2' )""", ] + if status_filter == "catalog_comparable": + filters.append(catalog_comparable_expr) + elif status_filter == "true_low_confidence": + filters.append(f"NOT {catalog_comparable_expr}") if search_query: params["search_like"] = f"%{search_query.lower()}%" filters.append("(LOWER(p.name) LIKE :search_like OR LOWER(p.i_code) LIKE :search_like)") @@ -1582,6 +1624,8 @@ def _review_queue_cte_and_filter( cma.best_competitor_product_name, cma.best_competitor_price, cma.best_match_score, + cma.hard_veto, + cma.diagnostic_codes, cma.match_diagnostic_json, cma.error_message, cma.attempted_at @@ -1601,18 +1645,22 @@ def _review_queue_cte_and_filter( la.best_competitor_product_name, la.best_competitor_price, la.best_match_score, + la.hard_veto, + la.diagnostic_codes, la.match_diagnostic_json, la.error_message, la.attempted_at, + {catalog_comparable_expr} AS catalog_comparable, CASE WHEN la.attempt_status = 'rescore_accepted_current' THEN 0 WHEN la.attempt_status IN ('unit_comparable', 'refresh_unit_comparable') THEN 1 WHEN la.attempt_status = 'identity_veto' THEN 2 - WHEN la.attempt_status IN ('recoverable_low_score', 'low_score', 'refresh_low_score') THEN 3 - WHEN la.attempt_status = 'protected_existing_match' THEN 4 - WHEN la.attempt_status = 'true_low_confidence' THEN 5 - WHEN la.attempt_status = 'expired_match' THEN 6 - ELSE 7 + WHEN {catalog_comparable_expr} THEN 3 + WHEN la.attempt_status IN ('recoverable_low_score', 'low_score', 'refresh_low_score') THEN 4 + WHEN la.attempt_status = 'protected_existing_match' THEN 5 + WHEN la.attempt_status = 'true_low_confidence' THEN 6 + WHEN la.attempt_status = 'expired_match' THEN 7 + ELSE 8 END AS priority_rank FROM latest_attempt la JOIN products p @@ -1763,6 +1811,8 @@ def _fetch_competitor_review_queue_uncached(engine, limit: int = 12) -> list[dic cma.best_competitor_product_name, cma.best_competitor_price, cma.best_match_score, + cma.hard_veto, + cma.diagnostic_codes, cma.match_diagnostic_json, 
cma.error_message, cma.attempted_at @@ -1781,9 +1831,12 @@ def _fetch_competitor_review_queue_uncached(engine, limit: int = 12) -> list[dic la.best_competitor_product_name, la.best_competitor_price, la.best_match_score, + la.hard_veto, + la.diagnostic_codes, la.match_diagnostic_json, la.error_message, - la.attempted_at + la.attempted_at, + {_catalog_comparable_sql("la")} AS catalog_comparable FROM latest_momo lm JOIN latest_attempt la ON la.sku = lm.sku LEFT JOIN valid_competitor vc ON vc.sku = lm.sku @@ -1807,11 +1860,12 @@ def _fetch_competitor_review_queue_uncached(engine, limit: int = 12) -> list[dic WHEN la.attempt_status = 'rescore_accepted_current' THEN 0 WHEN la.attempt_status IN ('unit_comparable', 'refresh_unit_comparable') THEN 1 WHEN la.attempt_status = 'identity_veto' THEN 2 - WHEN la.attempt_status IN ('recoverable_low_score', 'low_score', 'refresh_low_score') THEN 3 - WHEN la.attempt_status = 'protected_existing_match' THEN 4 - WHEN la.attempt_status = 'true_low_confidence' THEN 5 - WHEN la.attempt_status = 'expired_match' THEN 6 - ELSE 7 + WHEN {_catalog_comparable_sql("la")} THEN 3 + WHEN la.attempt_status IN ('recoverable_low_score', 'low_score', 'refresh_low_score') THEN 4 + WHEN la.attempt_status = 'protected_existing_match' THEN 5 + WHEN la.attempt_status = 'true_low_confidence' THEN 6 + WHEN la.attempt_status = 'expired_match' THEN 7 + ELSE 8 END, lm.momo_price DESC NULLS LAST, la.best_match_score DESC NULLS LAST, diff --git a/services/market_intel/deployment_readiness.py b/services/market_intel/deployment_readiness.py index 5862bcf..62793c2 100644 --- a/services/market_intel/deployment_readiness.py +++ b/services/market_intel/deployment_readiness.py @@ -63,6 +63,9 @@ from services.market_intel.mcp_activation_evidence import build_mcp_activation_e from services.market_intel.mcp_fetch_target_review import ( build_mcp_fetch_target_review_preview, ) +from services.market_intel.mcp_fetch_target_source_governance_review import ( + 
build_mcp_fetch_target_source_governance_review_preview, +) from services.market_intel.mcp_fetch_run_package import ( build_mcp_fetch_run_package_preview, ) @@ -321,6 +324,11 @@ PRODUCTION_SMOKE_TARGETS = ( + ("/api/market_intel/mcp_professional_source_governance",) + PRODUCTION_SMOKE_TARGETS[-1:] ) +PRODUCTION_SMOKE_TARGETS = ( + PRODUCTION_SMOKE_TARGETS[:-1] + + ("/api/market_intel/mcp_fetch_target_source_governance_review",) + + PRODUCTION_SMOKE_TARGETS[-1:] +) def _run_review_preview_safe(payload, mode): @@ -439,6 +447,11 @@ def build_deployment_readiness_preview(*, service, market_intel_tables, schema_s phase=service.phase, ) ) + mcp_fetch_target_source_governance_review = ( + build_mcp_fetch_target_source_governance_review_preview( + phase=service.phase, + ) + ) scheduler_plan = service.build_scheduler_plan() manual_sample_plan = service.build_manual_sample_plan() manual_sample_acceptance = service.build_manual_sample_acceptance() @@ -1556,6 +1569,37 @@ def build_deployment_readiness_preview(*, service, market_intel_tables, schema_s and not mcp_professional_source_governance["payload_persisted"] and not mcp_professional_source_governance["scheduler_attached"] ), + "mcp_fetch_target_source_governance_review_preview_safe": bool( + mcp_fetch_target_source_governance_review["mode"] + == "mcp_fetch_target_source_governance_review_preview" + and not mcp_fetch_target_source_governance_review[ + "network_request_allowed" + ] + and not mcp_fetch_target_source_governance_review[ + "external_network_executed" + ] + and not mcp_fetch_target_source_governance_review["fetch_executed"] + and not mcp_fetch_target_source_governance_review["payload_persisted"] + and not mcp_fetch_target_source_governance_review[ + "bridge_review_persisted" + ] + and not mcp_fetch_target_source_governance_review[ + "api_fetches_robots_txt" + ] + and not mcp_fetch_target_source_governance_review["api_fetches_sitemap"] + and not mcp_fetch_target_source_governance_review[ + 
"api_fetches_source_url" + ] + and not mcp_fetch_target_source_governance_review[ + "api_opens_database_connection" + ] + and not mcp_fetch_target_source_governance_review[ + "api_writes_database" + ] + and not mcp_fetch_target_source_governance_review["api_writes_file"] + and not mcp_fetch_target_source_governance_review["api_executes_cli"] + and not mcp_fetch_target_source_governance_review["scheduler_attached"] + ), "candidate_queue_writer_postwrite_smoke_planned_safe": bool( candidate_queue_writer_postwrite_smoke["mode"] == "candidate_queue_writer_postwrite_smoke_planned" @@ -1888,6 +1932,7 @@ def build_deployment_readiness_preview(*, service, market_intel_tables, schema_s "mcp_fetch_candidate_queue_writer_review_decision_approval": mcp_fetch_candidate_queue_writer_review_decision_approval, "mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight": mcp_fetch_candidate_queue_writer_review_decision_approval_writer_preflight, "mcp_professional_source_governance": mcp_professional_source_governance, + "mcp_fetch_target_source_governance_review": mcp_fetch_target_source_governance_review, "scheduler_plan": scheduler_plan, "manual_sample_plan": manual_sample_plan, "manual_sample_acceptance": manual_sample_acceptance, diff --git a/services/market_intel/mcp_fetch_target_source_governance_review.py b/services/market_intel/mcp_fetch_target_source_governance_review.py new file mode 100644 index 0000000..e2098fa --- /dev/null +++ b/services/market_intel/mcp_fetch_target_source_governance_review.py @@ -0,0 +1,237 @@ +"""Bridge gate between source governance and fetch target review. + +This module only cross-checks two operator supplied review summaries: +professional source governance and MCP fetch target review. It does not +fetch external pages, read robots/sitemap, open DB connections, persist +payloads, execute CLI commands, or attach schedulers. 
+""" + +from services.market_intel.mcp_fetch_target_review import ( + build_mcp_fetch_target_review_preview, +) +from services.market_intel.mcp_professional_source_governance import ( + build_mcp_professional_source_governance_preview, +) + + +def _as_dict(value): + return value if isinstance(value, dict) else {} + + +def _as_list(value): + return value if isinstance(value, list) else [] + + +def _unwrap_source_governance_package(package): + package = _as_dict(package) + return ( + package.get("operator_source_governance") + or package.get("source_governance") + or package.get("market_source_governance") + or package + ) + + +def _target_package_from_input(package): + package = _as_dict(package) + return { + "handoff_package": package.get("handoff_package", {}), + "handoff_review": package.get("handoff_review"), + "target_review": ( + package + if "platform_targets" in package + else package.get("target_review", {}) + ), + } + + +def _target_sources(target_review): + target_review = _as_dict(target_review) + sources = [] + for target in _as_list(target_review.get("platform_targets")): + if not isinstance(target, dict): + continue + platform_code = str(target.get("platform_code") or "").lower() + source_keys = target.get("source_keys") or [] + if isinstance(source_keys, str): + source_keys = [source_keys] + for source_key in source_keys: + if platform_code and source_key: + sources.append( + { + "platform_code": platform_code, + "source_key": str(source_key), + } + ) + return sources + + +def _governed_source_index(source_governance): + return { + ( + str(source.get("platform_code")).lower(), + str(source.get("source_key")), + ) + for source in _as_list(_as_dict(source_governance).get("sources")) + if isinstance(source, dict) and source.get("platform_code") and source.get("source_key") + } + + +def _source_alignment(target_sources, source_governance): + governed = _governed_source_index(source_governance) + missing = [ + source + for source in target_sources + if (source["platform_code"], source["source_key"]) not in
governed + ] + return { + "target_sources": target_sources, + "governed_source_count": len(governed), + "target_source_count": len(target_sources), + "missing_governed_sources": missing, + "all_target_sources_governed": bool(target_sources and not missing), + } + + +def _sample_package(): + source_governance = build_mcp_professional_source_governance_preview() + target_review = build_mcp_fetch_target_review_preview() + return { + "professional_source_governance_package": source_governance[ + "sample_professional_source_governance_package" + ], + "target_review_package": target_review["sample_target_review_package"], + "operator_confirmations": { + "source_governance_reviewed": True, + "target_review_reviewed": True, + "all_target_sources_reference_governed_sources": True, + "no_api_external_fetch": True, + "no_database_write": True, + "no_scheduler_attach": True, + }, + } + + +def build_mcp_fetch_target_source_governance_review_preview( + *, + professional_source_governance_package=None, + target_review_package=None, + operator_confirmations=None, + phase=None, +): + """Review target/source governance alignment without side effects.""" + source_package_received = professional_source_governance_package is not None + target_package_received = target_review_package is not None + confirmations = _as_dict(operator_confirmations) + source_governance = build_mcp_professional_source_governance_preview( + operator_source_governance=_unwrap_source_governance_package( + professional_source_governance_package + ) + if source_package_received + else None, + phase=phase, + ) + target_package = _target_package_from_input(target_review_package) + target_review = build_mcp_fetch_target_review_preview( + handoff_package=target_package["handoff_package"], + handoff_review=target_package["handoff_review"], + target_review=target_package["target_review"], + phase=phase, + ) + alignment = _source_alignment( + _target_sources(target_package["target_review"]), + source_governance, + ) + 
confirmation_status = { + "source_governance_reviewed": bool( + confirmations.get("source_governance_reviewed") + ), + "target_review_reviewed": bool(confirmations.get("target_review_reviewed")), + "all_target_sources_reference_governed_sources": bool( + confirmations.get("all_target_sources_reference_governed_sources") + ), + "no_api_external_fetch": bool(confirmations.get("no_api_external_fetch")), + "no_database_write": bool(confirmations.get("no_database_write")), + "no_scheduler_attach": bool(confirmations.get("no_scheduler_attach")), + } + gates = [ + { + "key": "professional_source_governance_package_received", + "label": "已提供 Professional Source Governance package", + "passed": source_package_received, + }, + { + "key": "professional_source_governance_accepted", + "label": "來源治理已通過 robots/sitemap/structured-data/public boundary gate", + "passed": source_governance[ + "mcp_professional_source_governance_accepted" + ], + }, + { + "key": "fetch_target_review_package_received", + "label": "已提供 MCP Fetch Target Review package", + "passed": target_package_received, + }, + { + "key": "fetch_target_review_accepted", + "label": "Fetch target review 已通過 adapter/source/rate-limit gate", + "passed": target_review["mcp_fetch_target_review_accepted"], + }, + { + "key": "all_target_sources_governed", + "label": "每個 fetch target source_key 都已存在於通過治理的 source contract", + "passed": alignment["all_target_sources_governed"], + }, + { + "key": "operator_confirmations_complete", + "label": "操作員確認治理與 target 已人工覆核,且 API 不連外/不寫 DB/不掛 scheduler", + "passed": all(confirmation_status.values()), + }, + { + "key": "bridge_side_effect_free", + "label": "本 bridge API 只做交叉審核,不執行 fetch、DB、file、CLI 或 scheduler", + "passed": True, + }, + ] + blocked_reasons = [gate["key"] for gate in gates if not gate["passed"]] + accepted = bool(source_package_received and target_package_received and not blocked_reasons) + return { + "mode": ( + "mcp_fetch_target_source_governance_review" + if accepted + 
else "mcp_fetch_target_source_governance_review_preview" + ), + "phase": phase, + "source_governance_package_received": source_package_received, + "target_review_package_received": target_package_received, + "mcp_fetch_target_source_governance_review_accepted": accepted, + "ready_for_mcp_fetch_target_review_with_source_governance": accepted, + "ready_for_manual_fetch_run_package_review": accepted, + "gate_count": len(gates), + "passed_gate_count": sum(1 for gate in gates if gate["passed"]), + "blocked_reasons": blocked_reasons, + "gates": gates, + "source_alignment": alignment, + "operator_confirmation_status": confirmation_status, + "professional_source_governance": source_governance, + "mcp_fetch_target_review": target_review, + "sample_fetch_target_source_governance_review_package": _sample_package(), + "next_operator_steps": [ + "使用通過治理的 source contract 更新後續 fetch run package。", + "正式 fetch 仍只能由 operator run command 執行,API 不打外站。", + "fetch receipt、parser review、candidate handoff 與 queue writer 仍需各自 gate。", + ], + "network_request_allowed": False, + "external_network_executed": False, + "fetch_executed": False, + "payload_persisted": False, + "bridge_review_persisted": False, + "api_fetches_robots_txt": False, + "api_fetches_sitemap": False, + "api_fetches_source_url": False, + "api_opens_database_connection": False, + "api_writes_database": False, + "api_writes_file": False, + "api_executes_cli": False, + "scheduler_attached": False, + } diff --git a/services/market_intel/mcp_professional_source_governance_sample.py b/services/market_intel/mcp_professional_source_governance_sample.py index cf7da89..cff5844 100644 --- a/services/market_intel/mcp_professional_source_governance_sample.py +++ b/services/market_intel/mcp_professional_source_governance_sample.py @@ -46,6 +46,39 @@ _SAMPLE_PROFESSIONAL_SOURCE_GOVERNANCE_PACKAGE = { "platform_code:source_key:canonical_url_hash" ), }, + { + "platform_code": "momo", + "source_key": "momo_flash_sale", + "source_url": 
"https://www.momoshop.com.tw/category/DgrpCategory.jsp?d_code=2142500000", + "canonical_url": "https://www.momoshop.com.tw/category/DgrpCategory.jsp?d_code=2142500000", + "robots_url": "https://www.momoshop.com.tw/robots.txt", + "sitemap_url": "https://www.momoshop.com.tw/sitemap.xml", + "lastmod_source": "sitemap_or_http_last_modified", + "robots_policy_checked": True, + "robots_allowed": True, + "tos_public_page_checked": True, + "login_required": False, + "member_or_order_data": False, + "cart_order_or_pii": False, + "anti_bot_bypass_required": False, + "structured_data_preferred": True, + "json_ld_first": True, + "dom_selector_fallback_allowed": True, + "structured_data_types": ["ItemList", "Product", "Offer"], + "selector_version": "momo_flash_sale_source_v1", + "crawl_delay_seconds": 2.5, + "max_requests_per_run": 10, + "public_cache_ttl_hours": 12, + "evidence_artifact_path": ( + ARTIFACT_PREFIX + + "professional-source-governance-momo-flash-sale.json" + ), + "provenance_required": True, + "snapshot_hash_required": True, + "idempotency_key_strategy": ( + "platform_code:source_key:canonical_url_hash" + ), + }, { "platform_code": "pchome", "source_key": "pchome_home", @@ -78,9 +111,42 @@ _SAMPLE_PROFESSIONAL_SOURCE_GOVERNANCE_PACKAGE = { "platform_code:source_key:canonical_url_hash" ), }, + { + "platform_code": "pchome", + "source_key": "pchome_region_beauty", + "source_url": "https://24h.pchome.com.tw/region/DA", + "canonical_url": "https://24h.pchome.com.tw/region/DA", + "robots_url": "https://24h.pchome.com.tw/robots.txt", + "sitemap_url": "https://24h.pchome.com.tw/sitemap.xml", + "lastmod_source": "sitemap_or_http_last_modified", + "robots_policy_checked": True, + "robots_allowed": True, + "tos_public_page_checked": True, + "login_required": False, + "member_or_order_data": False, + "cart_order_or_pii": False, + "anti_bot_bypass_required": False, + "structured_data_preferred": True, + "json_ld_first": True, + "dom_selector_fallback_allowed": True, + 
"structured_data_types": ["ItemList", "Product", "Offer"], + "selector_version": "pchome_region_beauty_source_v1", + "crawl_delay_seconds": 2.0, + "max_requests_per_run": 8, + "public_cache_ttl_hours": 24, + "evidence_artifact_path": ( + ARTIFACT_PREFIX + + "professional-source-governance-pchome-region-beauty.json" + ), + "provenance_required": True, + "snapshot_hash_required": True, + "idempotency_key_strategy": ( + "platform_code:source_key:canonical_url_hash" + ), + }, { "platform_code": "coupang", - "source_key": "coupang_tw_home", + "source_key": "coupang_home", "source_url": "https://www.tw.coupang.com/", "canonical_url": "https://www.tw.coupang.com/", "robots_url": "https://www.tw.coupang.com/robots.txt", @@ -111,6 +177,72 @@ _SAMPLE_PROFESSIONAL_SOURCE_GOVERNANCE_PACKAGE = { "platform_code:source_key:canonical_url_hash" ), }, + { + "platform_code": "coupang", + "source_key": "coupang_global", + "source_url": "https://www.tw.coupang.com/categories/beauty", + "canonical_url": "https://www.tw.coupang.com/categories/beauty", + "robots_url": "https://www.tw.coupang.com/robots.txt", + "sitemap_url": "https://www.tw.coupang.com/sitemap.xml", + "lastmod_source": "sitemap_or_http_last_modified", + "robots_policy_checked": True, + "robots_allowed": True, + "tos_public_page_checked": True, + "login_required": False, + "member_or_order_data": False, + "cart_order_or_pii": False, + "anti_bot_bypass_required": False, + "structured_data_preferred": True, + "json_ld_first": True, + "dom_selector_fallback_allowed": True, + "structured_data_types": ["ItemList", "Product", "Offer"], + "selector_version": "coupang_global_source_v1", + "crawl_delay_seconds": 3.0, + "max_requests_per_run": 6, + "public_cache_ttl_hours": 24, + "evidence_artifact_path": ( + ARTIFACT_PREFIX + + "professional-source-governance-coupang-global.json" + ), + "provenance_required": True, + "snapshot_hash_required": True, + "idempotency_key_strategy": ( + "platform_code:source_key:canonical_url_hash" + ), 
+ }, + { + "platform_code": "shopee", + "source_key": "shopee_home", + "source_url": "https://shopee.tw/", + "canonical_url": "https://shopee.tw/", + "robots_url": "https://shopee.tw/robots.txt", + "sitemap_url": "https://shopee.tw/sitemap.xml", + "lastmod_source": "sitemap_or_http_last_modified", + "robots_policy_checked": True, + "robots_allowed": True, + "tos_public_page_checked": True, + "login_required": False, + "member_or_order_data": False, + "cart_order_or_pii": False, + "anti_bot_bypass_required": False, + "structured_data_preferred": True, + "json_ld_first": True, + "dom_selector_fallback_allowed": True, + "structured_data_types": ["ItemList", "Product", "Offer"], + "selector_version": "shopee_home_source_v1", + "crawl_delay_seconds": 3.0, + "max_requests_per_run": 6, + "public_cache_ttl_hours": 24, + "evidence_artifact_path": ( + ARTIFACT_PREFIX + + "professional-source-governance-shopee-home.json" + ), + "provenance_required": True, + "snapshot_hash_required": True, + "idempotency_key_strategy": ( + "platform_code:source_key:canonical_url_hash" + ), + }, { "platform_code": "shopee", "source_key": "shopee_mall", diff --git a/templates/market_intel/disabled.html b/templates/market_intel/disabled.html index 3c1c2f0..2f16777 100644 --- a/templates/market_intel/disabled.html +++ b/templates/market_intel/disabled.html @@ -1176,6 +1176,32 @@ +
MCP / SOURCE GOVERNED TARGET
+BRIDGE GATES
+SOURCE ALIGNMENT
+UPSTREAM REVIEWS
+NEXT
+