V10.560: Wire up the manual price-comparison backfill flow

OoO
2026-06-01 21:22:52 +08:00
parent 42b1c25418
commit 07db301c54
5 changed files with 77 additions and 13 deletions

@@ -4,6 +4,7 @@
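 ================================================================================
 [Completed]
+- V10.560 chains the manual PChome price-comparison backfill into a three-stage flow: `/api/ai/pchome-match/backfill` no longer runs only near-threshold revalidation and unmatched fetching. It first runs a small batch of `run_expired_identity_refresh()` to refresh stale prices on known `identity_v2` records, so a single operator click covers all three tracks: stale-identity freshness, near-threshold `low_score`, and pending identity. The result payload adds a per-stage `stale_identity_refresh` breakdown so downstream dashboards, briefings, and AI decisions can tell whether a coverage gain came from refresh, revalidation, or fetching.
 - V10.559 tightens the freshness rule for retryable valid identities: `_fetch_retryable_candidate_skus()` no longer treats legacy PChome `identity_v2` rows with `expires_at IS NULL` as a valid blocking condition. Only a fresh identity with an explicit `expires_at > CURRENT_TIMESTAMP` blocks near-threshold revalidation. Unknown freshness still goes through the V10.551 expired / recovery refresh entry point, and after revalidation a candidate must still pass the latest matcher, hard-veto, auto write safety, and the existing protection against overwriting production candidates, so accuracy is not sacrificed to raise coverage.
 - V10.558 adds a narrow gate for re-running legacy focused identity reasons: an old attempt that lacks the newer `focused_exact_total_price_safe` marker, but already carries a named `focused_exact_identity_*` whose identity belongs to the matcher's total-price safe set and whose old score already reaches the global `MIN_MATCH_SCORE`, may enter near-threshold revalidation. This covers historical data that was missed because the marker is absent. It still requires no hard veto, `exact_identity`, and no commercial / variant / count / bundle block, and the latest matcher makes the final call on whether an official price gap can be written.
 - V10.557 tightens the focused reason-based re-run guard: the previous version's reason-based re-run now requires not only `focused_exact_total_price_safe` but also a hit on a named `focused_exact_identity_*` whose identity belongs to the matcher's total-price safe set. This keeps old attempts that have only the master switch and lack named identity evidence out of the re-run. Review-only focused lines such as rom&nd / Solone / Summers Eve remain locked out of the automatic price-gap line by tests.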
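
The V10.559 rule above boils down to one predicate: an existing identity blocks revalidation only when it is provably fresh. The following is a minimal sketch of that predicate in plain Python, not the project's SQL. The function name `identity_blocks_revalidation` and the sample timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def identity_blocks_revalidation(expires_at: Optional[datetime], now: datetime) -> bool:
    """Hypothetical helper mirroring `expires_at > CURRENT_TIMESTAMP`.

    A missing expiry (SQL NULL, Python None) means unknown freshness,
    so it must not block near-threshold revalidation.
    """
    if expires_at is None:
        return False
    return expires_at > now


# Illustrative timestamps only; they are not taken from the commit.
now = datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc)
fresh = now + timedelta(hours=1)
expired = now - timedelta(hours=1)

for label, value in (("fresh", fresh), ("expired", expired), ("unknown", None)):
    print(label, identity_blocks_revalidation(value, now))
```

Only the fresh row blocks revalidation here. Expired and unknown rows stay eligible, which matches the intent that unknown freshness is handled by the refresh / recovery path rather than silently treated as valid.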

@@ -402,7 +402,7 @@ YOUTUBE_API_KEY = os.getenv('YOUTUBE_API_KEY', '')
 # ==========================================
 # System version and paths
 # ==========================================
-SYSTEM_VERSION = "V10.559"
+SYSTEM_VERSION = "V10.561"
 LOG_FILE_PATH = os.path.join(BASE_DIR, 'logs/system.log')
 public_url = PUBLIC_URL  # Used for template display

@@ -13,6 +13,7 @@
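 ## 📅 Detailed Changelog (Archive)
 ### 2026-06-01: Closing the operational loop on PChome price-comparison freshness
+- **V10.560 three-stage chaining of the manual PChome price-comparison backfill**: `/api/ai/pchome-match/backfill` is now aligned with the daily scheduler. A manual run first refreshes expired `identity_v2` records in a small batch, then revalidates near-threshold candidates, and finally fetches high-priority unmatched products. The response adds a per-stage `stale_identity_refresh` breakdown so downstream dashboards, briefings, and AI decisions can tell whether a coverage gain came from restoring stale-identity freshness, a matcher re-run, or a fresh-search fetch.
 - **V10.559 retryable valid-identity freshness tightening**: the existing-identity blocking condition in `_fetch_retryable_candidate_skus()` now accepts only `cp.expires_at > CURRENT_TIMESTAMP`, so legacy matches of unknown freshness (`expires_at IS NULL`) no longer hold back near-threshold candidate re-runs. Unknown freshness is still handled by the expired identity refresh / recovery path, and the final write must still pass the current matcher, hard-veto, auto write safety, and stronger existing production match protection.
 - **V10.558 legacy focused identity reason re-run gap fix**: on top of the V10.557 named identity guard, `_fetch_retryable_candidate_skus()` now covers historical data that lacks the marker. An old attempt without the newer `focused_exact_total_price_safe`, but with a named `focused_exact_identity_*` whose identity belongs to the matcher's total-price safe set and whose old score already reaches the global `MIN_MATCH_SCORE`, may enter near-threshold revalidation. It still requires no hard veto, `exact_identity`, and no commercial / variant / count / bundle block, and the latest matcher makes the final call on whether an official price gap can be written.
 - **V10.557 named identity guard for focused reason-based re-runs**: the structured reason re-run from V10.555 is tightened further. `_fetch_retryable_candidate_skus()` requires not only `focused_exact_total_price_safe` but also a hit on a named `focused_exact_identity_*` whose identity comes from the matcher's total-price safe set. This keeps old attempts with only the master switch and no identity evidence out of the re-run. Review-only focused lines such as rom&nd, Solone, and Summers Eve remain locked out of the automatic price-gap line by tests.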
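
The three-stage order described in the V10.560 entry can be sketched with a stub feeder that records calls instead of contacting PChome. This is an illustrative sketch under stated assumptions: `StubFeeder`, `StageResult`, `run_manual_backfill`, and the method name `run_retryable_revalidation` are hypothetical, and passing `limit` straight through as the unmatched limit is an assumption because the diff does not show how `unmatched_limit` is derived. Only `run_expired_identity_refresh`, `run_unmatched_priority`, and the two limit expressions come from the commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StageResult:
    """Hypothetical stand-in for the feeder's result object."""
    total_skus: int = 0


class StubFeeder:
    """Hypothetical feeder that only records which stage ran with which limit."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int]] = []

    def _record(self, stage: str, limit: int) -> StageResult:
        self.calls.append((stage, limit))
        return StageResult(total_skus=limit)

    def run_expired_identity_refresh(self, limit: int) -> StageResult:
        return self._record("stale_identity_refresh", limit)

    def run_retryable_revalidation(self, limit: int) -> StageResult:
        # Hypothetical method name; the real call is not visible in the diff.
        return self._record("retryable_candidate_revalidation", limit)

    def run_unmatched_priority(self, limit: int) -> StageResult:
        return self._record("unmatched_priority_backfill", limit)


def run_manual_backfill(feeder: StubFeeder, limit: int) -> Dict[str, StageResult]:
    """Run refresh, then revalidation, then unmatched fetching, in that order."""
    stale_limit = max(5, min(40, max(5, limit // 3)))  # expression from the commit
    revalidation_limit = min(limit, 80)                # expression from the commit
    unmatched_limit = limit                            # assumption, see lead-in

    stale = feeder.run_expired_identity_refresh(limit=stale_limit)
    revalidated = feeder.run_retryable_revalidation(limit=revalidation_limit)
    unmatched = feeder.run_unmatched_priority(limit=unmatched_limit)
    return {
        "stale_identity_refresh": stale,
        "retryable_candidate_revalidation": revalidated,
        "unmatched_priority_backfill": unmatched,
    }


feeder = StubFeeder()
stages = run_manual_backfill(feeder, limit=90)
print(feeder.calls)
```

Keeping each stage's result under its own key is what lets a later consumer attribute a coverage gain to refresh, revalidation, or fetching.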

@@ -1748,6 +1748,19 @@ def _feeder_result_payload(result):
     }
 
 
+def _empty_feeder_result_payload():
+    return {
+        'total_skus': 0,
+        'matched': 0,
+        'skipped_no_result': 0,
+        'skipped_low_score': 0,
+        'errors': 0,
+        'history_written': 0,
+        'attempts_written': 0,
+        'duration_sec': 0.0,
+    }
+
+
 def _pick_result_payload(result):
     return {
         'candidates': int(getattr(result, 'candidates', 0) or 0),
@@ -1756,18 +1769,53 @@ def _pick_result_payload(result):
     }
 
 
-def _combined_feeder_payload(revalidation_result, feeder_result):
+def _combined_feeder_payload(revalidation_result, feeder_result, stale_refresh_result=None):
+    stale_refresh_payload = (
+        _feeder_result_payload(stale_refresh_result)
+        if stale_refresh_result is not None
+        else _empty_feeder_result_payload()
+    )
     revalidation_payload = _feeder_result_payload(revalidation_result)
     feeder_payload = _feeder_result_payload(feeder_result)
     return {
-        'total_skus': revalidation_payload['total_skus'] + feeder_payload['total_skus'],
-        'matched': revalidation_payload['matched'] + feeder_payload['matched'],
-        'skipped_no_result': revalidation_payload['skipped_no_result'] + feeder_payload['skipped_no_result'],
-        'skipped_low_score': revalidation_payload['skipped_low_score'] + feeder_payload['skipped_low_score'],
-        'errors': revalidation_payload['errors'] + feeder_payload['errors'],
-        'history_written': revalidation_payload['history_written'] + feeder_payload['history_written'],
-        'attempts_written': revalidation_payload['attempts_written'] + feeder_payload['attempts_written'],
-        'duration_sec': round(revalidation_payload['duration_sec'] + feeder_payload['duration_sec'], 2),
+        'total_skus': (
+            stale_refresh_payload['total_skus']
+            + revalidation_payload['total_skus']
+            + feeder_payload['total_skus']
+        ),
+        'matched': (
+            stale_refresh_payload['matched']
+            + revalidation_payload['matched']
+            + feeder_payload['matched']
+        ),
+        'skipped_no_result': (
+            stale_refresh_payload['skipped_no_result']
+            + revalidation_payload['skipped_no_result']
+            + feeder_payload['skipped_no_result']
+        ),
+        'skipped_low_score': (
+            stale_refresh_payload['skipped_low_score']
+            + revalidation_payload['skipped_low_score']
+            + feeder_payload['skipped_low_score']
+        ),
+        'errors': stale_refresh_payload['errors'] + revalidation_payload['errors'] + feeder_payload['errors'],
+        'history_written': (
+            stale_refresh_payload['history_written']
+            + revalidation_payload['history_written']
+            + feeder_payload['history_written']
+        ),
+        'attempts_written': (
+            stale_refresh_payload['attempts_written']
+            + revalidation_payload['attempts_written']
+            + feeder_payload['attempts_written']
+        ),
+        'duration_sec': round(
+            stale_refresh_payload['duration_sec']
+            + revalidation_payload['duration_sec']
+            + feeder_payload['duration_sec'],
+            2,
+        ),
+        'stale_identity_refresh': stale_refresh_payload,
         'retryable_candidate_revalidation': revalidation_payload,
         'unmatched_priority_backfill': feeder_payload,
     }
@@ -1818,6 +1866,13 @@ def api_pchome_match_backfill():
     engine = create_engine(DATABASE_PATH)
     feeder = CompetitorPriceFeeder(engine=engine)
+    stale_refresh_limit = max(5, min(40, max(5, limit // 3)))
+    update_pchome_backfill_run(
+        run_id,
+        stage='refreshing_stale',
+        message=f'Refreshing {stale_refresh_limit} known PChome identities first to restore usable price-comparison freshness',
+    )
+    stale_refresh_result = feeder.run_expired_identity_refresh(limit=stale_refresh_limit)
     revalidation_limit = min(limit, 80)
     update_pchome_backfill_run(
         run_id,
@@ -1835,7 +1890,11 @@ def api_pchome_match_backfill():
         message=f'Fetching {unmatched_limit} high-priority products that have not been searched yet',
     )
     result = feeder.run_unmatched_priority(limit=unmatched_limit)
-    result_payload = _combined_feeder_payload(revalidation_result, result)
+    result_payload = _combined_feeder_payload(
+        revalidation_result,
+        result,
+        stale_refresh_result=stale_refresh_result,
+    )
     update_pchome_backfill_run(
         run_id,
         stage='generating_picks',
@@ -1859,7 +1918,7 @@ def api_pchome_match_backfill():
         result=result_payload,
         pick_result=pick_payload,
         message=(
-            f"PChome fetch complete: compared {result_payload['total_skus']} items, "
+            f"PChome price-comparison backfill complete: refreshed/revalidated/fetched {result_payload['total_skus']} items, "
             f"added/updated {result_payload['matched']} items, "
             f"AI product picks written: {pick_payload['written']}"
         ),
@@ -1887,7 +1946,7 @@ def api_pchome_match_backfill():
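     return jsonify({
         'success': True,
-        'message': f'Started the PChome unsearched-product fetch, prioritizing {limit} high-priced unmatched products; the AI product pick list will be recalculated on completion',
+        'message': f'Started the PChome price-comparison backfill: stale identities are refreshed first, then near-threshold candidates are revalidated and {limit} high-priced unmatched products are fetched; the AI product pick list will be recalculated on completion',
         'limit': limit,
         'data': _get_pchome_backfill_status_payload(),
     }), 202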
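
The combined payload in this file sums eight counters across three stages by hand. As a point of comparison only, the same aggregation could be written in a data-driven way. This is a sketch of an alternative formulation, not the commit's code: `COUNTER_KEYS`, `combine_stage_payloads`, and the sample numbers are hypothetical, while the counter names and stage keys are taken from the diff.

```python
from typing import Dict, Mapping

# Counter names as they appear in the feeder payload in the diff.
COUNTER_KEYS = (
    'total_skus',
    'matched',
    'skipped_no_result',
    'skipped_low_score',
    'errors',
    'history_written',
    'attempts_written',
)


def combine_stage_payloads(stages: Mapping[str, Mapping[str, float]]) -> Dict[str, object]:
    """Sum every counter across stages and keep each stage's own payload."""
    payloads = list(stages.values())
    combined: Dict[str, object] = {
        key: sum(payload[key] for payload in payloads) for key in COUNTER_KEYS
    }
    # Durations are floats, so round the total the same way the diff does.
    combined['duration_sec'] = round(sum(p['duration_sec'] for p in payloads), 2)
    # Per-stage payloads stay available under their stage names.
    combined.update(stages)
    return combined


def make_payload(total: int, matched: int, duration: float) -> Dict[str, float]:
    """Build an illustrative payload; the numbers are made up for the example."""
    payload: Dict[str, float] = {key: 0 for key in COUNTER_KEYS}
    payload.update({'total_skus': total, 'matched': matched, 'duration_sec': duration})
    return payload


sample = combine_stage_payloads({
    'stale_identity_refresh': make_payload(20, 12, 1.5),
    'retryable_candidate_revalidation': make_payload(60, 9, 2.25),
    'unmatched_priority_backfill': make_payload(60, 30, 0.5),
})
print(sample['total_skus'], sample['matched'], sample['duration_sec'])
```

The explicit version in the diff is longer but keeps every key greppable, which matters here because the test file asserts on source text.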

@@ -530,6 +530,9 @@ def test_ai_product_pick_agent_uses_real_competitor_data_and_dashboard_action():
     assert "stale_recovery_preview" in route_source
     assert "status['coverage'] = _build_pchome_backfill_coverage_payload()" in route_source
     assert "run_unmatched_priority(limit=unmatched_limit)" in route_source
+    assert "stale_refresh_limit = max(5, min(40, max(5, limit // 3)))" in route_source
+    assert "stale_refresh_result = feeder.run_expired_identity_refresh(limit=stale_refresh_limit)" in route_source
+    assert "stale_identity_refresh" in route_source
     assert "run_expired_identity_refresh(limit=limit)" in route_source
     assert "run_expired_identity_search_recovery(limit=limit)" in route_source
     assert "stage='refreshing_stale'" in route_source
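
The assertions above pin the clamp expression as source text. A behavioral check of the same expression might look like the sketch below. The wrapper names `stale_refresh_limit_for` and `simplified_clamp` are hypothetical; the expression inside the first one is copied from the commit.

```python
def stale_refresh_limit_for(limit: int) -> int:
    """Clamp one third of the requested limit into the range 5..40."""
    return max(5, min(40, max(5, limit // 3)))


def simplified_clamp(limit: int) -> int:
    """Same clamp without the inner max(5, ...), kept here only for comparison."""
    return max(5, min(40, limit // 3))


# Small requests are raised to the floor of 5, large ones capped at 40.
for requested in (0, 15, 30, 60, 120, 300):
    print(requested, stale_refresh_limit_for(requested))

# The inner max(5, ...) does not change the result on this range.
agree = all(
    stale_refresh_limit_for(n) == simplified_clamp(n) for n in range(-50, 500)
)
print(agree)
```

A check like this would survive a refactor of the expression, whereas a source-text assertion fails on any cosmetic rewrite.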