feat(AIOps): enable all P1-P6 feature flags + Nemotron + offline replay loop

- configmap: enable all AIOPS_P1~P6 master switches and sub-switches
- configmap: ENABLE_NEMOTRON_COLLABORATION=true (back to the 120s timeout)
- feature_flags.py: add the missing AIOPS_P6_GOVERNANCE_ENABLED field
- main.py: register run_offline_replay_loop (ADR-087 Phase 6)

2026-04-15 ogt + Claude Sonnet 4.6 (Asia-Pacific)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OG T
2026-04-15 21:59:51 +08:00
parent ecfb7148bf
commit 76558a3cd9
4 changed files with 45 additions and 3 deletions


@@ -173,6 +173,10 @@ class AIOpsFeatureFlags(BaseSettings):
         default=False,
         description="P6: whether Playbook trust distribution drift detection is enabled",
     )
+    AIOPS_P6_GOVERNANCE_ENABLED: bool = Field(
+        default=False,
+        description="P6: governance closed-loop master switch (guards offline_replay_service / model_rollback_service)",
+    )

     def is_phase_enabled(self, phase: int) -> bool:
         """


@@ -383,6 +383,15 @@ async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
     except Exception as e:
         logger.warning("proactive_inspector_schedule_failed", error=str(e))

+    # ADR-087 Phase 6: offline replay (every 7 days) for the decision-consistency-rate baseline
+    # 2026-04-15 ogt + Claude Sonnet 4.6 (Asia-Pacific): initial Phase 6 setup
+    try:
+        from src.jobs.offline_replay_service import run_offline_replay_loop
+        asyncio.create_task(run_offline_replay_loop())
+        logger.info("offline_replay_loop_scheduled", interval_sec=604800)
+    except Exception as e:
+        logger.warning("offline_replay_loop_schedule_failed", error=str(e))
+
     yield

     # Shutdown
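
The hunk above schedules a long-running loop with `asyncio.create_task` and does not keep the returned task. A minimal sketch of a periodic loop with a retained task reference and an explicit cancel at shutdown follows. `run_periodic`, `replay_once`, and `demo` are hypothetical names for illustration and are not taken from the repository; the real `run_offline_replay_loop` may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable


# Hypothetical helper, not taken from the repository.
async def run_periodic(job: Callable[[], Awaitable[None]], interval_sec: float) -> None:
    """Run `job` forever, sleeping `interval_sec` between runs."""
    while True:
        try:
            await job()
        except Exception as exc:  # a failed run should not end the loop
            print(f"periodic job failed: {exc}")
        await asyncio.sleep(interval_sec)


async def demo(interval_sec: float = 0.01, run_for_sec: float = 0.05) -> int:
    """Start the loop, let it run briefly, then cancel it as a shutdown hook would."""
    runs = 0

    async def replay_once() -> None:  # stand-in for one offline replay pass
        nonlocal runs
        runs += 1

    # Keep a reference: the event loop holds only weak references to tasks,
    # so an unreferenced task can be garbage-collected before it finishes.
    task = asyncio.create_task(run_periodic(replay_once, interval_sec))
    await asyncio.sleep(run_for_sec)

    task.cancel()  # what the code after `yield` in a lifespan handler could do
    try:
        await task
    except asyncio.CancelledError:
        pass
    return runs
```

In production the interval would be 604800 seconds (7 days), matching the `interval_sec` value logged in the hunk.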


@@ -119,3 +119,33 @@ data:
   SENTRY_MCP_ENABLED: "true"
   # Prometheus server is at 110:9090 (not 188)
   PROMETHEUS_URL: "http://192.168.0.110:9090"
+
+  # ============================================================================
+  # AIOps Phase 1-6 Feature Flags (2026-04-15 ogt: all enabled; write all data to the DB first)
+  # ============================================================================
+  AIOPS_P1_ENABLED: "true"
+  AIOPS_P1_PRE_DECISION_INVESTIGATOR: "true"
+  AIOPS_P1_POST_EXECUTION_VERIFIER: "true"
+  AIOPS_P2_ENABLED: "true"
+  AIOPS_P2_CRITIC_ENABLED: "true"
+  AIOPS_P2_AGENT_TIMEOUT_SEC: "15"
+  AIOPS_P3_ENABLED: "true"
+  AIOPS_P3_EVOLVER_ENABLED: "true"
+  AIOPS_P3_FINETUNE_EXPORT: "true"
+  AIOPS_P3_KNOWLEDGE_DECAY: "true"
+  AIOPS_P4_ENABLED: "true"
+  AIOPS_P4_DYNAMIC_BASELINE: "true"
+  AIOPS_P4_LOG_ANOMALY: "true"
+  AIOPS_P4_TREND_PREDICTOR: "true"
+  AIOPS_P4_PROACTIVE_INSPECTOR: "true"
+  AIOPS_P4_SHADOW_MODE: "true"
+  AIOPS_P5_ENABLED: "true"
+  AIOPS_P5_BLAST_RADIUS_CHECK: "true"
+  AIOPS_P5_GITOPS_PR: "false"
+  AIOPS_P5_DRY_RUN_ENFORCED: "true"
+  AIOPS_P6_ENABLED: "true"
+  AIOPS_P6_GOVERNANCE_ENABLED: "true"
+  AIOPS_P6_SELF_DEMOTION: "true"
+  AIOPS_P6_OFFLINE_REPLAY: "true"
+  AIOPS_P6_KB_ROT_CLEANER: "true"
+  AIOPS_P6_TRUST_DRIFT_DETECTOR: "true"
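
ConfigMap values are always strings, which is why every flag above is quoted. A rough sketch of how a master switch and a sub-switch might be combined once those strings reach the process environment is shown below. `parse_bool` and `is_feature_enabled` are hypothetical helpers, and the rule that a sub-switch only counts when its phase master switch is on is an assumption; the repository's `AIOpsFeatureFlags.is_phase_enabled` may behave differently.

```python
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}


def parse_bool(raw: str) -> bool:
    """Interpret an environment string such as "true" or "false" as a bool."""
    return raw.strip().lower() in _TRUE_VALUES


# Hypothetical gating rule (assumption): a sub-switch is effective only
# when the master switch of its phase is also on.
def is_feature_enabled(env: Mapping[str, str], phase: int, sub_flag: str) -> bool:
    master = parse_bool(env.get(f"AIOPS_P{phase}_ENABLED", "false"))
    sub = parse_bool(env.get(f"AIOPS_P{phase}_{sub_flag}", "false"))
    return master and sub


# Values copied from the P5 block of the ConfigMap hunk.
env = {
    "AIOPS_P5_ENABLED": "true",
    "AIOPS_P5_GITOPS_PR": "false",
    "AIOPS_P5_DRY_RUN_ENFORCED": "true",
}

dry_run_on = is_feature_enabled(env, 5, "DRY_RUN_ENFORCED")
gitops_pr_on = is_feature_enabled(env, 5, "GITOPS_PR")
```

With these values `dry_run_on` is `True` and `gitops_pr_on` is `False`, since `AIOPS_P5_GITOPS_PR` is the one sub-switch this commit leaves off.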


@@ -61,9 +61,8 @@ spec:
         - name: USE_AI_ROUTER
           value: "true"
         - name: ENABLE_NEMOTRON_COLLABORATION
-          # 2026-04-12 ogt: disabled because Ollama 111 tool_call 60s×2 > asyncio.wait_for 30s
-          # → expert_system fallback → confidence=0.0; restore once Ollama is stable
-          value: "false"
+          # 2026-04-15 ogt: re-enabled, asyncio.wait_for=120s now waits for the Ollama response
+          value: "true"
         - name: NEMOTRON_TIMEOUT_SECONDS
           value: "55"
         - name: TELEGRAM_ENABLE_POLLING
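
The removed comment describes an outer `asyncio.wait_for` budget (30s) that was shorter than the inner work (two 60s tool calls), so the call always timed out and fell through to the fallback. The sketch below reproduces that interaction with scaled-down durations. `call_with_fallback` and the `"model"` / `"fallback"` return values are hypothetical and only illustrate the timing; they are not the repository's code.

```python
import asyncio


# Hypothetical illustration of an outer timeout wrapping sequential inner calls.
async def call_with_fallback(inner_call_sec: float, attempts: int, outer_timeout_sec: float) -> str:
    async def model_call() -> str:
        # Each attempt stands in for one slow tool call.
        for _ in range(attempts):
            await asyncio.sleep(inner_call_sec)
        return "model"

    try:
        return await asyncio.wait_for(model_call(), timeout=outer_timeout_sec)
    except asyncio.TimeoutError:
        # Outer budget exhausted before the inner work finished.
        return "fallback"


# Old setting, scaled by 1/1000: 60s x 2 inner vs 30s outer.
old_result = asyncio.run(call_with_fallback(0.06, 2, 0.03))
# A generous outer budget lets the inner calls complete.
new_result = asyncio.run(call_with_fallback(0.005, 2, 0.5))
```

Note that if both attempts really take the full 60s, the total is 120s, which equals the new 120s outer budget and leaves no headroom; whether that matters depends on how the inner calls are actually bounded.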