V10.527 收斂過期 identity 救援隊列
All checks were successful
CD Pipeline / deploy (push) Successful in 1m6s

This commit is contained in:
OoO
2026-06-01 01:03:25 +08:00
parent b9b3a410ff
commit 3a045fddfc
6 changed files with 123 additions and 7 deletions

View File

@@ -4,6 +4,7 @@
================================================================================
【已完成】
- V10.527 收斂 PChome 過期 identity 搜尋救援隊列:`recover-stale` 不再直接吃全部過期 `identity_v2`,改走 `_fetch_expired_identity_recovery_skus()`,只收既有正式診斷為 `exact_identity / total_price / price_alert_exact` 且無 variant、catalog、commercial condition、count、bundle、unit-price 等阻擋理由的舊配對;名稱含任選、多款、香味、色號、即期、融燭燈、香氛蠟燭等高風險訊號也先排除,避免慢速 fresh search 把人工覆核型 stale pair 全部掃進來。
- V10.526 將 PChome 近門檻重評池與過期 identity 搜尋救援變成可觀測、可操作產線:`preview_retryable_candidate_revalidation()` / `preview_expired_identity_recovery()` 都是 read-only不啟動 PChome 搜尋、不呼叫 LLM、不寫 DB`/api/ai/pchome-match/backfill/status` 回傳 `revalidation_preview` / `stale_recovery_preview`Dashboard 顯示「可重評 / 窄門 / 可救援」數字,並新增「救援過期 40 筆」按鈕呼叫 `/api/ai/pchome-match/recover-stale`,只在舊 PChome ID 缺失或低分時走受控 fresh-search recovery最後仍經 hard veto、auto price write safety 與 overwrite protection。
- V10.525 補高分 review-gated exact 舊候選重評入口:`run_retryable_candidate_revalidation()` 仍以 `low_score / refresh_low_score / recoverable_low_score` 為主,只額外允許 Beauty Foot / KAMERIA / TS6 / Vaseline 這批已補 focused exact 規則、舊分數 >= 0.95、無商業狀態 / 款式 / 入數 / 組合阻擋理由的 `true_low_confidence` 進窄門重評,讓 V10.523 的安全規則可以實際回收舊資料,不把所有人工審核候選打開。
- V10.524 將「待刷新」變成可操作入口:商品看板 PChome 補抓產線新增「刷新過期 120 筆」按鈕,呼叫 `/api/ai/pchome-match/refresh-stale` 背景執行 `run_expired_identity_refresh()`,只刷新既有 `identity_v2` 的 PChome product_id不跑 fresh search recovery、不呼叫 LLM完成後重算 AI 挑品並清除 Dashboard / 競價快取。

View File

@@ -402,7 +402,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
# ==========================================
# 系統版本與路徑
# ==========================================
SYSTEM_VERSION = "V10.526"
SYSTEM_VERSION = "V10.527"
LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
public_url = PUBLIC_URL # 用於模板顯示

View File

@@ -90,7 +90,7 @@ SQL漏斗(~300筆)
- 配對來源仍以 PChome crawler 真實搜尋結果為準;無競品資料時不生成挑品。
- 比對覆蓋率補強入口:`POST /api/ai/pchome-match/backfill`,優先補抓仍無有效 PChome 配對的高價 ACTIVE 商品,完成後自動重算 AI 挑品清單。
- 過期價格刷新入口:`POST /api/ai/pchome-match/refresh-stale`,只針對已建立 `identity_v2``expires_at` 過期的 PChome product_id 執行 `run_expired_identity_refresh()`;不得跑 fresh search recovery不得呼叫 LLM完成後重算 AI 挑品並清除 Dashboard / competitor intel cache。
- 過期 identity 搜尋救援入口:`POST /api/ai/pchome-match/recover-stale` 僅供操作員手動觸發,對已過期 `identity_v2` 先走既有 PChome product_id refresh只有舊 ID 查無商品或重評低於門檻時,才允許受控 fresh search recovery。這條路徑可抓 PChome但不得呼叫 LLM正式寫入仍必須通過 matcher、hard veto、auto price write safety 與 overwrite protection。
- 過期 identity 搜尋救援入口:`POST /api/ai/pchome-match/recover-stale` 僅供操作員手動觸發,對已過期 `identity_v2` 先走既有 PChome product_id refresh只有舊 ID 查無商品或重評低於門檻時,才允許受控 fresh search recovery。救援隊列必須先排除 variant、catalog、commercial condition、count、bundle、unit-price 與任選 / 多款 / 香味 / 色號 / 即期 / 融燭燈 / 香氛蠟燭等高風險名稱訊號。這條路徑可抓 PChome但不得呼叫 LLM正式寫入仍必須通過 matcher、hard veto、auto price write safety 與 overwrite protection。
- 補抓狀態入口:`GET /api/ai/pchome-match/backfill/status` 除背景任務狀態外,必須回傳 read-only coverage snapshot`active_with_price` / `valid_matches` / `match_rate` / `fresh_matches` / `fresh_match_rate` / `stale_matches` / `pending` / `actionable_review_count`,供 Dashboard 顯示目前該刷新過期價格或補抓未搜尋商品;此端點不寫 DB、不呼叫 LLM、不抓外站。
- 排程閉環:`run_pchome_match_backfill_task` 每日 10:30 執行,補抓 PChome 待比對商品、寫入歷史價格,再重算 `strategy='product_pick'` 清單。
- PChome / MOMO 競價摘要出口 `services/competitor_intel_repository.py` 使用 30 分鐘共享快取(`COMPETITOR_INTEL_CACHE_TTL_SECONDS` 可調),避免 `/growth_analysis``/daily_sales`、PPT/AI 報表每次請求重跑昂貴覆蓋率與價差趨勢查詢;`run_competitor_price_feeder_task` 與 PChome backfill 完成後會主動清除快取。快取只包摘要輸出,不改 matcher 的高信心門檻與 identity_v2 準確性規則。

View File

@@ -13,6 +13,7 @@
## 📅 詳細更新日誌 (考古存檔)
### 2026-06-01PChome 比價新鮮度操作閉環
- **V10.527 PChome 過期 identity 搜尋救援隊列收斂**: V10.526 production smoke 發現直接對全部過期 `identity_v2` 做 rescue 會把香氛 / 色號 / 目錄款 / 商業狀態差異等人工覆核型 stale pair 送進慢速 fresh search20 筆耗時 361 秒且 0 筆成功。新增 `_fetch_expired_identity_recovery_skus()` 作為救援專用隊列,只收既有正式診斷為 `exact_identity / total_price / price_alert_exact` 且無 variant、catalog、commercial condition、count、bundle、unit-price 等阻擋理由的舊配對;名稱含任選、多款、香味、色號、即期、融燭燈、香氛蠟燭等高風險訊號先排除。
- **V10.526 PChome 重評預覽與過期 identity 搜尋救援**: `/api/ai/pchome-match/backfill/status` 新增 60 秒快取的 `revalidation_preview``stale_recovery_preview`Dashboard 補抓產線顯示「可重評 / 窄門 / 可救援」數字;兩個 preview 都只讀 DB不啟動 PChome 搜尋、不呼叫 LLM、不寫 `competitor_match_attempts` 或正式價格表。另新增 `/api/ai/pchome-match/recover-stale` 與「救援過期 40 筆」按鈕,對過期 `identity_v2` 先查既有 product_id只有在舊 ID 缺失或低分時才走受控 fresh-search recovery最後仍經 hard veto、auto price write safety 與 overwrite protection 才能寫入正式比價。
- **V10.525 高分 review-gated exact 舊候選窄門重評**: `run_retryable_candidate_revalidation()` 保持主戰場為 `low_score / refresh_low_score / recoverable_low_score`,只額外收 Beauty Foot、KAMERIA、TS6、Vaseline 這批已由 V10.523 補 focused exact 規則的 `true_low_confidence` 舊候選。入口要求舊分數 >= 0.95、仍為 `exact_identity`、具備 `strong_exact_spec_match`,且不得含 `commercial_condition_gap`、variant、count、bundle、refill 等阻擋理由;讓已驗證真同款可被回刷,不把整個人工審核池自動打開。
- **V10.524 PChome 過期價格刷新手動入口**: 商品看板 PChome 補抓產線新增「刷新過期 120 筆」按鈕與 `/api/ai/pchome-match/refresh-stale`,背景執行既有 `run_expired_identity_refresh()`,只刷新已建立 `identity_v2` 的 PChome product_id不跑 fresh search recovery、不呼叫 LLM完成後重算 AI 挑品並清除 Dashboard / competitor intel cache`stale_matches` 從觀測指標變成可直接操作的任務。

View File

@@ -96,6 +96,39 @@ REVALIDATABLE_REVIEW_SQL_REASON_LIST = ", ".join(
REVALIDATABLE_REVIEW_BLOCK_SQL_REASON_LIST = ", ".join(
f"'{reason}'" for reason in sorted(REVALIDATABLE_REVIEW_BLOCK_REASONS)
)
STALE_IDENTITY_RECOVERY_BLOCK_REASONS = {
"accessory_case_conflict",
"aroma_lamp_style_selection_gap",
"aroma_scent_variant_conflict",
"bundle_offer_conflict",
"candle_catalog_selection_gap",
"catalog_count_omission",
"commercial_condition_gap",
"count_conflict",
"makeup_catalog_selection_gap",
"makeup_finish_conflict",
"makeup_usage_conflict",
"multi_component_conflict",
"multi_component_count_conflict",
"named_component_quantity_conflict",
"nail_tool_function_conflict",
"price_ratio_extreme",
"price_ratio_wide",
"product_line_conflict",
"refill_pack_conflict",
"romand_lip_line_conflict",
"unit_comparable",
"variant_descriptor_conflict",
"variant_option_conflict",
"variant_selection_review",
}
STALE_IDENTITY_RECOVERY_BLOCK_SQL_REASON_LIST = ", ".join(
f"'{reason}'" for reason in sorted(STALE_IDENTITY_RECOVERY_BLOCK_REASONS)
)
STALE_IDENTITY_RECOVERY_BLOCK_NAME_PATTERN = (
r"(任選|多款|色號|顏色|款式|香味|香調|即期|短效|航空版|"
r"融燭燈|融蠟燈|香氛蠟燭|精油蠟燭|蠟燭|限定|組合任選)"
)
# ── Feeder 結果 ───────────────────────────────────────
@dataclass
@@ -1259,7 +1292,7 @@ class CompetitorPriceFeeder:
不查 PChome、不重新比對、不寫 attempts / prices。
"""
preview_limit = max(1, min(int(limit), 120))
rows = self._fetch_expired_identity_skus(limit=preview_limit)
rows = self._fetch_expired_identity_recovery_skus(limit=preview_limit)
examples: list[dict] = []
for row in rows[:5]:
examples.append({
@@ -1280,6 +1313,85 @@ class CompetitorPriceFeeder:
"boundary": "read_only_no_crawl_no_llm_no_db_write",
}
def _fetch_expired_identity_recovery_skus(self, limit: int = 40) -> list:
"""
取得適合 fresh-search recovery 的過期 identity_v2 商品。
這比一般 expired refresh 更窄:只收過去已是 exact / total_price /
price_alert_exact 的正式配對,且排除款式、香味、型態、入數、商業狀態等
高風險診斷或名稱訊號。避免把本來應該人工覆核的 stale pair 送進慢速搜尋。
"""
if self.engine is None:
raise RuntimeError("需要注入 SQLAlchemy engine")
from sqlalchemy import text
sql = text(f"""
WITH latest_momo AS (
SELECT
p.id AS product_id,
p.i_code AS sku,
p.name,
p.category,
pr.price AS momo_price,
ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY pr.timestamp DESC) AS rn
FROM products p
JOIN price_records pr ON pr.product_id = p.id
WHERE p.status = 'ACTIVE'
)
SELECT
lm.product_id,
lm.sku,
lm.name,
lm.category,
lm.momo_price,
cp.competitor_product_id,
cp.competitor_product_name,
cp.match_score,
cp.expires_at
FROM latest_momo lm
JOIN competitor_prices cp
ON cp.sku = lm.sku
AND cp.source = 'pchome'
AND cp.competitor_product_id IS NOT NULL
AND cp.competitor_product_id <> ''
AND cp.expires_at IS NOT NULL
AND cp.expires_at <= CURRENT_TIMESTAMP
AND COALESCE(cp.match_score, 0) >= :match_score_floor
AND COALESCE(cp.tags, '[]'::jsonb) ? 'identity_v2'
WHERE lm.rn = 1
AND (
COALESCE(cp.tags, '[]'::jsonb) ? 'price_basis_total_price'
OR cp.match_diagnostic_json->>'price_basis' = 'total_price'
)
AND (
COALESCE(cp.tags, '[]'::jsonb) ? 'alert_tier_price_alert_exact'
OR cp.match_diagnostic_json->>'alert_tier' = 'price_alert_exact'
)
AND COALESCE(cp.match_diagnostic_json->>'comparison_mode', 'exact_identity') = 'exact_identity'
AND COALESCE(cp.hard_veto, false) = false
AND NOT (
COALESCE(cp.match_diagnostic_json->'reasons', '[]'::jsonb)
?| array[{STALE_IDENTITY_RECOVERY_BLOCK_SQL_REASON_LIST}]
)
AND COALESCE(lm.name, '') !~* :blocked_name_pattern
AND COALESCE(cp.competitor_product_name, '') !~* :blocked_name_pattern
ORDER BY
cp.expires_at ASC,
lm.momo_price DESC NULLS LAST,
lm.sku
LIMIT :limit
""")
with self.engine.connect() as conn:
rows = conn.execute(
sql,
{
"limit": max(1, min(int(limit), 120)),
"match_score_floor": MIN_MATCH_SCORE,
"blocked_name_pattern": STALE_IDENTITY_RECOVERY_BLOCK_NAME_PATTERN,
},
).fetchall()
return [dict(r._mapping) for r in rows]
def _fetch_expired_identity_skus(self, limit: int = 120) -> list:
"""
取得 identity_v2 已確認、但 PChome 價格快取過期的商品。
@@ -2497,7 +2609,7 @@ class CompetitorPriceFeeder:
safety、overwrite protection 才能寫入正式 competitor_prices。
"""
try:
skus = self._fetch_expired_identity_skus(limit=max(1, min(int(limit), 120)))
skus = self._fetch_expired_identity_recovery_skus(limit=max(1, min(int(limit), 120)))
except Exception as e:
logger.error(f"[Feeder] 讀取過期 identity_v2 搜尋救援商品失敗: {e}")
return FeederResult(0, 0, 0, 0, 1, 0.0)

View File

@@ -108,6 +108,8 @@ def test_competitor_feeder_persists_all_match_attempt_outcomes():
assert "_fetch_retryable_candidate_skus" in source
assert "preview_retryable_candidate_revalidation" in source
assert "preview_expired_identity_recovery" in source
assert "_fetch_expired_identity_recovery_skus" in source
assert "STALE_IDENTITY_RECOVERY_BLOCK_REASONS" in source
assert "read_only_no_crawl_no_llm_no_db_write" in source
assert "run_retryable_candidate_revalidation" in source
assert "run_expired_identity_search_recovery" in source
@@ -282,7 +284,7 @@ def test_competitor_feeder_expired_identity_recovery_preview_is_read_only(monkey
},
]
monkeypatch.setattr(feeder, "_fetch_expired_identity_skus", fake_fetch)
monkeypatch.setattr(feeder, "_fetch_expired_identity_recovery_skus", fake_fetch)
payload = feeder.preview_expired_identity_recovery(limit=2)
@@ -306,7 +308,7 @@ def test_competitor_feeder_expired_search_recovery_allows_fresh_recovery(monkeyp
calls.append((items, kwargs))
return FeederResult(1, 1, 0, 0, 0, 0.1)
monkeypatch.setattr(feeder, "_fetch_expired_identity_skus", fake_fetch)
monkeypatch.setattr(feeder, "_fetch_expired_identity_recovery_skus", fake_fetch)
monkeypatch.setattr(feeder, "_run_known_identity_refresh_items", fake_run)
result = feeder.run_expired_identity_search_recovery(limit=999)
@@ -1826,7 +1828,7 @@ def test_competitor_feeder_expired_recovery_allows_fresh_search(monkeypatch):
captured = {}
monkeypatch.setattr(
feeder,
"_fetch_expired_identity_skus",
"_fetch_expired_identity_recovery_skus",
lambda limit: [{"sku": "STALE-1", "competitor_product_id": "OLD-PID"}],
)