awoooi

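fix(drift-narrator): Plan B, LLM-driven smart summaries; eliminate the str()[:30] brute-force truncation

2026-04-18 afternoon (Taipei time), ogt + Claude Opus 4.7 (1M)

Root cause:
_format_drift_summary() called str()[:30] directly on dict/list-typed
git_value/actual_value. That brute-force truncation produced garbage such as
"[{'name': 'repair-ssh-key', 's", cut off halfway through a dict key,
which runs directly against the "AI autonomy" principle.

Plan B architecture decision:
"Drop the hard-coded Python string-parsing logic. Pass the raw Config Diff
structure directly as context to Hermes/NemoTron, specify the output format
in the prompt, and let the LLM digest it and output a human-readable Top 5
summary with red/yellow indicators."

Implementation:
1. _NARRATIVE_PROMPT rewritten: requires the LLM to return {narrative, items[]} JSON
   - drift items are fed into the prompt as serialized JSON (keeping 200 characters of context)
   - items limited to 5, HIGH first
   - summary is 30 characters of colloquial Traditional Chinese (not a technical repr)
2. _generate_narrative_and_items(), new method: parses the LLM JSON and validates its structure
3. _format_drift_for_llm(), new method: structured JSON for the LLM (replaces the old str version)
4. _render_telegram_body(), new method: assembles a clean Telegram card
   Example output:
     🤖 AI assessment
     <4-5 lines of LLM narrative>

     📊 Drift details (HIGH: 1 | MEDIUM: 29)
     🔴 spec.template.spec.volumes: added 2 repair-ssh-key mounts
     🟡 spec.template.spec.serviceAccount: (unset) → awoooi-executor
     ... 27 more items (press 🔍 to view the diff)

5. Stronger fallback: _smart_shorten() + _fallback_items()
   When the LLM fails, use a type-aware Python summary (dict/list show their size, no brute-force repr)

Removed:
- _format_drift_summary(): the old brute-force truncation implementation
- _generate_narrative(): the old interface that returned only a string

Kept:
- _fallback_narrative() / _format_intent_summary(): still useful
- Redis cache / trigger conditions / DB update: logic unchanged

MVP stage:
This commit only changes the visual presentation; the automation_operation_log /
ai_collaboration_trace audit writes are untouched. Once the Telegram visuals are
verified, Phase 2 will add DB auditing.

Related:
  - feedback_ai_autonomous_direction.md North Star principle
  - 1ff3405, this morning's hotfix for raw JSON leaking through (fixed only narrative, not items)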

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-18 15:54:16 +08:00

.agents

docs(review): post chief-architect code review, ADR-064/067 + Skill 02 records completed

2026-04-11 21:35:25 +08:00

.claude

refactor(phase-s): Phase S tech-debt cleanup - five architecture improvements

2026-04-01 13:12:02 +08:00

.gitea/workflows

feat(cd): ADR-090-B CD injects 13 keys L2→L3, eliminating the K8s single-point blind spot

2026-04-18 15:26:28 +08:00

.github/workflows

feat(cicd): ADR-039 complete - GitHub Actions disabled, Gitea is the primary repo

2026-03-30 01:07:32 +08:00

.playwright-mcp

docs: Sprint 5R frontend refactor approved: ADR-065 + design drafts + Skills + LOGBOOK

2026-04-09 15:15:43 +08:00

apps

fix(drift-narrator): Plan B, LLM-driven smart summaries; eliminate the str()[:30] brute-force truncation

2026-04-18 15:54:16 +08:00

architecture

docs: Phase 14 red-zone governance + Skills 01/03 update

2026-03-26 09:55:47 +08:00

docs

feat(db): ADR-090 L4 AIOps foundation: asset inventory × 7-item automation coverage matrix persisted to DB

2026-04-18 13:18:46 +08:00

infra

feat(infra): B-1 Ansible Host IaC skeleton, complete version

2026-04-11 02:47:10 +08:00

k8s

chore(cd): deploy 7542e6e [skip ci]

2026-04-18 07:36:38 +00:00

ops

fix(alertmanager): point webhook URL at VIP 192.168.0.125:32334

2026-04-16 03:19:58 +08:00

packages

chore(types): sync auto-generated shared-types

2026-04-17 22:12:16 +08:00

scripts

feat(ops): ADR-090-B zero-trust wrap-up templates: wrapper / sudoers / migrator / CI

2026-04-18 13:23:39 +08:00

.awoooi-agent-rules.md

refactor: Rename ClawBot → OpenClaw across documentation

2026-03-23 14:05:53 +08:00

.dependency-cruiser.cjs

chore: tidy up uncommitted changes (API core + docs + scripts)

2026-03-26 19:10:12 +08:00

.dockerignore

fix(docker): .dockerignore whitelist allows scripts/cron_km_vectorize.py

2026-04-12 15:26:41 +08:00

.gitignore

feat(llmops): enable Langfuse LLMOps tracing + automatic key injection in CD

2026-04-01 22:19:22 +08:00

.npmrc

feat: add all application source code

2026-03-22 18:57:44 +08:00

.pre-commit-config.yaml

feat(phase6-9): Complete modular architecture and Agent Teams

2026-03-23 18:40:36 +08:00

.secrets.baseline

feat(phase6-9): Complete modular architecture and Agent Teams

2026-03-23 18:40:36 +08:00

.spectral.yaml

fix(ci): Add spectral config for OpenAPI validation

2026-03-24 09:22:49 +08:00

capabilities.json

feat(soul): OpenClaw v5.6: ADR-067 five major Ollama applications + Guardrail BLOCK layer

2026-04-10 21:50:37 +08:00

CLAUDE.md

docs: update CLAUDE.md + HARD_RULES.md v2.0 + LOGBOOK (2026-04-16)

2026-04-16 01:20:16 +08:00

deploy-infra.sh

feat(phase6-9): Complete modular architecture and Agent Teams

2026-03-23 18:40:36 +08:00

docker-compose.yml

docs: red-zone governance + deployment docs update

2026-03-26 09:55:58 +08:00

Gemini_Generated_Image_sxbfrvsxbfrvsxbf (2).png

feat(api): Add sync-from-approvals endpoint for incident backfill

2026-03-25 00:09:44 +08:00

Gemini_Generated_Image_sxbfrvsxbfrvsxbf.png

feat(api): Add sync-from-approvals endpoint for incident backfill

2026-03-25 00:09:44 +08:00

GLOBAL_RULES.md

docs: Phase 14 red-zone governance + Skills 01/03 update

2026-03-26 09:55:47 +08:00

package.json

feat(ci): Phase 14.2 dependency-cruiser integration

2026-03-26 09:18:51 +08:00

phase-r-r4-authorizations.png

refactor(phase-s): Phase S tech-debt cleanup - five architecture improvements

2026-04-01 13:12:02 +08:00

phase-r-r4-frontend-home.png

refactor(phase-s): Phase S tech-debt cleanup - five architecture improvements

2026-04-01 13:12:02 +08:00

pnpm-lock.yaml

feat(web): Sprint 5 Phase 0: install React Flow + elkjs + keep the classic home page

2026-04-08 18:07:59 +08:00

pnpm-workspace.yaml

fix: add root monorepo config files (pnpm-workspace.yaml)

2026-03-22 19:00:45 +08:00

README.md

feat(phase6-9): Complete modular architecture and Agent Teams

2026-03-23 18:40:36 +08:00

SOUL.md

feat(soul): OpenClaw v5.6: ADR-067 five major Ollama applications + Guardrail BLOCK layer

2026-04-10 21:50:37 +08:00

tsconfig.json

fix: add root monorepo config files (pnpm-workspace.yaml)

2026-03-22 19:00:45 +08:00

tsconfig.tsbuildinfo

feat: integrate Sentry + fix CI/CD issues

2026-03-24 15:19:52 +08:00

turbo.json

feat: integrate Sentry + fix CI/CD issues

2026-03-24 15:19:52 +08:00

verify_telegram_ui.py

fix(telegram): triple fix for dead card buttons + duplicate rendering + smart truncation

2026-04-17 13:57:42 +08:00

README.md

     █████╗ ██╗    ██╗ ██████╗  ██████╗  ██████╗ ██╗
    ██╔══██╗██║    ██║██╔═══██╗██╔═══██╗██╔═══██╗██║
    ███████║██║ █╗ ██║██║   ██║██║   ██║██║   ██║██║
    ██╔══██║██║███╗██║██║   ██║██║   ██║██║   ██║██║
    ██║  ██║╚███╔███╔╝╚██████╔╝╚██████╔╝╚██████╔╝██║
    ╚═╝  ╚═╝ ╚══╝╚══╝  ╚═════╝  ╚═════╝  ╚═════╝ ╚═╝

Zero-Touch Ops. Human-Centric Decisions.

AI-Powered Intelligent Operations Platform

Demo · Documentation · Contributing

The Future of Operations is Here

When your system breaks at 3 AM, AWOOOI doesn't just alert you—it analyzes the blast radius, calculates how much money you're burning, and presents a one-click fix. You approve. It executes. You go back to sleep.

┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│   ALERT: frontend 5xx rate > 15%                                            │
│                                                                             │
│   ┌─────────────┐      ┌─────────────┐      ┌─────────────┐                │
│   │  GraphRAG   │ ──▶  │  Dry-Run    │ ──▶  │  Multi-Sig  │                │
│   │  Analysis   │      │  Simulation │      │  Approval   │                │
│   └─────────────┘      └─────────────┘      └─────────────┘                │
│         │                    │                    │                        │
│         ▼                    ▼                    ▼                        │
│   Root Cause:          Blast Radius:        [x] devops-alice               │
│   postgres-db          1 pod, 0 data loss   [x] sre-bob                    │
│                                                                             │
│   Monthly Savings: $523.60 if fixed                                         │
│                                                                             │
│   [ APPROVE & EXECUTE ]                                                     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

AWOOOI (AI + WOOO Intelligent Operations) transforms reactive firefighting into proactive, AI-assisted decision-making—while keeping humans firmly in control of critical actions.

Enterprise Moats

Four pillars that make AWOOOI enterprise-ready from Day 1:

Privacy Shield

Your PII never leaves your premises. Period.

# Before: Raw sensitive data
"User 192.168.1.100 with email admin@company.com triggered alert"

# After: Consistent pseudonymization
"User [IP_1] with email [EMAIL_1] triggered alert"
# Same value → Same label (AI maintains context without seeing real data)

Regex-based detection: IP, Email, UUID, API Keys, JWT
Consistent hashing: [IP_1] always maps to the same IP within a session
Rehydration Engine: Labels restored only at MCP execution boundary
Zero PII in logs, zero PII to cloud LLMs
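
As a rough illustration of the pseudonymize-then-rehydrate flow described above, here is a minimal Python sketch. The class name `Pseudonymizer`, its methods, and the two regexes are assumptions made for this example and are not taken from the AWOOOI codebase; only IP and email detection are shown.

```python
import re

# Hypothetical detectors for illustration; the real patterns may differ
# and would also cover UUIDs, API keys and JWTs.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


class Pseudonymizer:
    """Session-scoped mapping between real values and stable labels."""

    def __init__(self):
        self._label_by_value = {}  # (kind, value) -> "[IP_1]"
        self._value_by_label = {}  # "[IP_1]" -> value
        self._counters = {}        # kind -> last index handed out

    def _label_for(self, kind, value):
        key = (kind, value)
        if key not in self._label_by_value:
            index = self._counters.get(kind, 0) + 1
            self._counters[kind] = index
            label = f"[{kind}_{index}]"
            self._label_by_value[key] = label
            self._value_by_label[label] = value
        return self._label_by_value[key]

    def redact(self, text):
        """Replace every detected value with its label (same value, same label)."""
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self._label_for(k, m.group(0)), text
            )
        return text

    def rehydrate(self, text):
        """Restore real values; intended only for the execution boundary."""
        for label, value in self._value_by_label.items():
            text = text.replace(label, value)
        return text
```

Calling `redact()` on the "Before" string above yields the "After" string, and a second message mentioning the same IP within the same `Pseudonymizer` instance gets `[IP_1]` again.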

GraphRAG: Topology-Aware Intelligence

AI that understands your microservices like a senior SRE.

                    ┌─────────────────────────────────────┐
                    │         BLAST RADIUS ANALYSIS       │
                    │         (Upstream Impact)           │
                    └─────────────────────────────────────┘

                         ┌─────────────┐
                         │   ingress   │  ← Will be affected
                         └──────┬──────┘
                                │ depends on
                                ▼
                         ┌─────────────┐
                         │  frontend   │  ← Target service
                         └──────┬──────┘
                                │ calls
                                ▼
        ┌───────────────────────┼───────────────────────┐
        │                       │                       │
        ▼                       ▼                       ▼
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│ auth-service │      │ product-api  │      │  order-api   │
└──────┬───────┘      └──────┬───────┘      └──────┬───────┘
       │                     │                     │
       └─────────────────────┼─────────────────────┘
                             ▼
                    ┌──────────────┐
                    │ postgres-db  │ X ROOT CAUSE
                    └──────────────┘

BFS-based traversal with configurable max_depth (default: 3)
Dual-direction analysis: Upstream (blast radius) + Downstream (root cause)
Priority ranking: DATABASE > CACHE > QUEUE for root cause identification
Multiple root causes: No single-point assumptions—collect ALL unhealthy dependencies
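
The traversal above can be sketched in a few lines of Python. The topology below mirrors the diagram; the names `DEPENDS_ON`, `KIND`, `blast_radius` and `root_causes` are hypothetical and chosen for this example, and treating unlisted service kinds as lowest priority is an assumption.

```python
from collections import deque

# Hypothetical topology matching the diagram: caller -> dependencies.
DEPENDS_ON = {
    "ingress": ["frontend"],
    "frontend": ["auth-service", "product-api", "order-api"],
    "auth-service": ["postgres-db"],
    "product-api": ["postgres-db"],
    "order-api": ["postgres-db"],
    "postgres-db": [],
}
KIND = {"postgres-db": "DATABASE"}
PRIORITY = {"DATABASE": 0, "CACHE": 1, "QUEUE": 2}


def bfs(graph, start, max_depth=3):
    """Nodes reachable from start within max_depth hops, in BFS order."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


def reverse(graph):
    """Flip every edge so we can walk from a service to its callers."""
    flipped = {node: [] for node in graph}
    for source, deps in graph.items():
        for dep in deps:
            flipped.setdefault(dep, []).append(source)
    return flipped


def blast_radius(target, max_depth=3):
    """Upstream walk: everything that depends on target."""
    return bfs(reverse(DEPENDS_ON), target, max_depth)


def root_causes(target, unhealthy, max_depth=3):
    """Downstream walk: ALL unhealthy dependencies, DATABASE > CACHE > QUEUE."""
    candidates = [n for n in bfs(DEPENDS_ON, target, max_depth) if n in unhealthy]
    return sorted(candidates, key=lambda n: PRIORITY.get(KIND.get(n), len(PRIORITY)))
```

With `frontend` as the target and `postgres-db` marked unhealthy, `blast_radius("frontend")` returns `["ingress"]` and `root_causes("frontend", {"postgres-db"})` returns `["postgres-db"]`.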

Multi-Sig & Dry-Run: Defense in Depth

Every critical action is simulated, validated, and co-signed.

┌────────────────────────────────────────────────────────────────┐
│                      RISK MATRIX                               │
├────────────┬─────────────┬─────────────────────────────────────┤
│ Risk Level │ Signatures  │ Required Roles                      │
├────────────┼─────────────┼─────────────────────────────────────┤
│ LOW        │ 0 (auto)    │ —                                   │
│ MEDIUM     │ 1           │ admin, devops, sre                  │
│ HIGH       │ 2           │ admin, devops, sre                  │
│ CRITICAL   │ 2           │ CTO + CISO (mandatory)              │
└────────────┴─────────────┴─────────────────────────────────────┘
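
One way to read the matrix is as a lookup table plus a signature check. The sketch below is an assumption about how such a check could look, not the project's approval engine; `POLICY` and `is_approved` are illustrative names, and "mandatory" is interpreted as requiring both the CTO and the CISO.

```python
# Policy table transcribed from the risk matrix above.
OPS_ROLES = {"admin", "devops", "sre"}
POLICY = {
    "LOW": {"signatures": 0, "roles": set(), "all_roles_required": False},
    "MEDIUM": {"signatures": 1, "roles": OPS_ROLES, "all_roles_required": False},
    "HIGH": {"signatures": 2, "roles": OPS_ROLES, "all_roles_required": False},
    # Assumption: "mandatory" means both CTO and CISO must sign.
    "CRITICAL": {"signatures": 2, "roles": {"CTO", "CISO"}, "all_roles_required": True},
}


def is_approved(risk, signers):
    """signers maps user -> role, so each user can sign only once."""
    rule = POLICY[risk]
    valid_roles = [role for role in signers.values() if role in rule["roles"]]
    if len(valid_roles) < rule["signatures"]:
        return False
    if rule["all_roles_required"]:
        return rule["roles"] <= set(valid_roles)
    return True
```

For example, `is_approved("HIGH", {"devops-alice": "devops", "sre-bob": "sre"})` passes, while two CTO signatures on a CRITICAL action do not.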

TOCTOU Protection (Time-of-Check to Time-of-Use):

1. User clicks "Approve"
2. System re-runs Dry-Run immediately before execution
3. If state changed → Status = VOIDED (not cleared!)
4. Full audit trail preserved for compliance
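
The four steps above amount to comparing the state seen at approval time with the state seen just before execution. A minimal sketch, assuming the dry-run result is a JSON-serializable dict; `fingerprint`, `execute_if_unchanged` and the `EXECUTED` status are hypothetical names for this example.

```python
import hashlib
import json


def fingerprint(state):
    """Stable hash of a dry-run result (assumed to be JSON-serializable)."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def execute_if_unchanged(approval, dry_run, execute):
    """Re-run the dry-run right before executing.

    approval: dict holding the 'fingerprint' captured when the user approved.
    dry_run / execute: callables supplied by the caller.
    """
    current = fingerprint(dry_run())
    if current != approval["fingerprint"]:
        # State drifted between check and use: void the approval but keep
        # the record so the audit trail stays intact.
        approval["status"] = "VOIDED"
        return approval
    execute()
    approval["status"] = "EXECUTED"  # status name is an assumption
    return approval
```

The approval record is mutated rather than deleted, which matches the "VOIDED (not cleared!)" rule.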

Dry-Run Checks:

RBAC Permission validation
Syntax & parameter validation
Resource existence verification
PodDisruptionBudget compliance
Blast radius calculation

Progressive Autonomy: Trust That Evolves

The more you approve, the less you need to.

┌─────────────────────────────────────────────────────────────────┐
│                    TRUST SCORE PROGRESSION                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Score: 0 ──────────────────────────────────────────────▶ 10+  │
│         │                    │                          │       │
│         ▼                    ▼                          ▼       │
│    ┌─────────┐         ┌─────────┐              ┌─────────┐    │
│    │  HIGH   │   ──▶   │ MEDIUM  │    ──▶      │   LOW   │    │
│    │ 2-sig   │  @10    │  1-sig  │    @5       │  auto   │    │
│    └─────────┘         └─────────┘              └─────────┘    │
│                                                                 │
│  ⚠️  CRITICAL operations NEVER auto-downgrade (enterprise law) │
│                                                                 │
│  Single REJECT → Trust score resets to 0 (instant collapse)    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Approve → +1 trust score
Reject → Score resets to 0 (trust collapses instantly)
Pattern-based: restart_pod:nginx-* builds trust separately from delete_pvc:*
CRITICAL operations (DROP TABLE, DELETE NAMESPACE) → Always require human dual-signature
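
The rules above could be modelled roughly as follows. This is a sketch under assumptions: the thresholds are one reading of the diagram (a HIGH pattern steps down to MEDIUM at score 10, a MEDIUM pattern steps down to LOW at score 5), and `TrustEngine` with its methods is an illustrative name rather than the actual `trust_engine.py` API.

```python
# Assumed reading of the diagram: base risk -> (downgraded risk, score needed).
DOWNGRADE = {"HIGH": ("MEDIUM", 10), "MEDIUM": ("LOW", 5)}


class TrustEngine:
    """Tracks one trust score per operation pattern."""

    def __init__(self):
        self._scores = {}  # e.g. "restart_pod:nginx-*" -> 3

    def score(self, pattern):
        return self._scores.get(pattern, 0)

    def approve(self, pattern):
        self._scores[pattern] = self.score(pattern) + 1

    def reject(self, pattern):
        # A single rejection collapses trust for this pattern.
        self._scores[pattern] = 0

    def effective_risk(self, pattern, base_risk):
        # LOW has nowhere to go; CRITICAL is never downgraded.
        if base_risk not in DOWNGRADE:
            return base_risk
        lower, threshold = DOWNGRADE[base_risk]
        return lower if self.score(pattern) >= threshold else base_risk
```

Because scores are keyed by pattern, approvals for `restart_pod:nginx-*` have no effect on `delete_pvc:*`.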

leWOOOgo Engine Architecture

AWOOOI is built on the leWOOOgo Engine—a modular, plugin-based architecture inspired by LEGO blocks:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           leWOOOgo Engine                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐     │
│   │  INPUT  │   │  BRAIN  │   │ OUTPUT  │   │ ACTION  │   │  DATA   │     │
│   │ ─────── │   │ ─────── │   │ ─────── │   │ ─────── │   │ ─────── │     │
│   │Webhooks │   │ Ollama  │   │  Slack  │   │   K8s   │   │ Postgres│     │
│   │  Kafka  │   │ OpenAI  │   │ Discord │   │  Shell  │   │  Redis  │     │
│   │Prometheus│   │ Claude  │   │  Email  │   │   MCP   │   │  S3     │     │
│   └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘     │
│        │             │             │             │             │           │
│        └─────────────┴─────────────┴─────────────┴─────────────┘           │
│                                    │                                        │
│                            ┌───────┴───────┐                               │
│                            │      UI       │                               │
│                            │ ───────────── │                               │
│                            │   Next.js     │                               │
│                            │ ApprovalCard  │                               │
│                            │ThinkingStream │                               │
│                            └───────────────┘                               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Module Overview

Module	Purpose	Key Components
INPUT	Event ingestion	Prometheus AlertManager, Kafka, Webhooks
BRAIN	AI reasoning	Ollama (local), OpenAI, Claude, GraphRAG
OUTPUT	Notifications	Slack, Discord, Email, Custom webhooks
ACTION	Execution	K8s API, Shell, MCP Bridge, Ansible
DATA	Persistence	PostgreSQL, Redis, S3, Vector DB
UI	Human interface	Next.js 14, ApprovalCard, ThinkingTerminal

MCP (Model Context Protocol) Support

// MCP enables AI to safely interact with external tools
await mcpBridge.callTool("kubernetes", "restart_pod", {
  pod_name: "[POD_1]",      // Redacted in logs
  namespace: "production",
  graceful: true,
});
// Rehydration happens at execution boundary only

FinOps: Day-1 ROI

Every wasted resource has a dollar sign. AWOOOI shows you exactly how much.

┌─────────────────────────────────────────────────────────────────┐
│                    FINOPS COST ANALYSIS                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   MONTHLY WASTE DETECTED: $523.60                               │
│                                                                 │
│   ┌──────────────────┬──────────────────┬──────────────────┐   │
│   │   REALIZABLE     │      FREED       │     ANNUAL       │   │
│   │   $480.00/mo     │    $43.60/mo     │   $5,760/yr      │   │
│   │   ────────────   │   ────────────   │   ────────────   │   │
│   │   PVC deletion   │   Pod cleanup    │   if all fixed   │   │
│   │   Node resize    │   (needs scale)  │                  │   │
│   └──────────────────┴──────────────────┴──────────────────┘   │
│                                                                 │
│   TOP RECOMMENDATIONS:                                          │
│   ├─ Delete orphaned PVC 'data-postgres-backup'    -$40.00 LOW │
│   ├─ Resize node 'worker-large-01'                -$340.00 HIGH│
│   └─ Delete zombie Pod 'legacy-api-5d7b8'          -$76.00 MED │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Scan Types:

Orphaned PVCs: Storage not mounted by any Pod
Zombie Pods: CPU < 1% for 7+ consecutive days
Over-provisioned Nodes: High request, low actual usage
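
As an example of how one of these scans might be expressed, the zombie-pod rule can be written as a check over daily CPU averages. `is_zombie` and its input format (one average CPU percentage per day, oldest first) are assumptions for illustration, and the check looks only at the most recent seven days.

```python
def is_zombie(daily_cpu_percent, threshold=1.0, min_days=7):
    """True if each of the last min_days daily averages is below threshold."""
    recent = daily_cpu_percent[-min_days:]
    return len(recent) >= min_days and all(v < threshold for v in recent)
```

A pod with fewer than seven days of history is never flagged, which avoids recommending deletion of something that was only just deployed.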

Safety Buffer: wasted = requested - (actual × 1.2). The 20% headroom above actual usage keeps recommendations from trimming resources so aggressively that workloads hit OOM.
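
A worked instance of the safety-buffer formula, as a small Python sketch. The function name `wasted` and the clamp at zero are assumptions; the 1.2 factor comes from the formula above.

```python
SAFETY_FACTOR = 1.2  # from the formula above


def wasted(requested, actual, factor=SAFETY_FACTOR):
    """Reclaimable amount after keeping headroom above actual usage.

    Clamping at zero is an assumption: a workload whose buffered usage
    already exceeds its request has nothing to reclaim.
    """
    return max(0.0, requested - actual * factor)
```

For a pod requesting 4 CPU cores while using 1.5, the buffered usage is 1.8 cores, so roughly 2.2 cores count as waste. A pod requesting 1 core and using 1 core reports no waste.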

Quick Start

Prerequisites

Python 3.11+
Node.js 18+
pnpm 8+
Docker (optional, for local Ollama)

Installation

# Clone the repository
git clone https://github.com/anthropics/awoooi.git
cd awoooi

# Install dependencies
pnpm install

# Setup Python environment
cd apps/api
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt

Run Tracer Bullet 2.0 (E2E Demo)

Experience the full AWOOOI loop in 30 seconds:

cd apps/api
python scripts/tracer_bullet_2.py

Expected Output:

============================================================
TRACER BULLET 2.0 - FULL LOOP TEST
Test ID: tb2-20260319143052
============================================================

[x] [trigger_alert] PASS
[x] [graphrag_analysis] PASS
[x] [generate_approval] PASS
[x] [multisig_approval] PASS
[x] [mcp_execution] PASS

============================================================
TEST SUMMARY
============================================================
  Total Steps: 5
  Passed: 5
  Failed: 0
  Status: ALL PASSED

Start Development Servers

# Terminal 1: API Server
cd apps/api
uvicorn src.main:app --reload --port 8000

# Terminal 2: Web Server
cd apps/web
pnpm dev

Open http://localhost:3000 to see the AWOOOI dashboard.

Project Structure

awoooi/
├── apps/
│   ├── api/                    # FastAPI Backend
│   │   ├── src/
│   │   │   ├── services/       # Core services
│   │   │   │   ├── approval.py     # Multi-Sig engine
│   │   │   │   ├── dry_run.py      # Dry-Run engine
│   │   │   │   ├── trust_engine.py # Progressive autonomy
│   │   │   │   └── graph_rag.py    # Topology analysis
│   │   │   └── plugins/
│   │   │       ├── security/       # Privacy Shield
│   │   │       ├── mcp/            # MCP Bridge
│   │   │       └── finops/         # Cost analyzer
│   │   └── scripts/
│   │       └── tracer_bullet_2.py  # E2E test
│   │
│   └── web/                    # Next.js Frontend
│       └── src/
│           ├── components/
│           │   └── agent/
│           │       ├── approval-card.tsx
│           │       └── thinking-terminal.tsx
│           └── stores/
│               └── agent.store.ts
│
├── packages/
│   └── lewooogo-core/          # Shared types & contracts
│
└── docs/
    └── adr/                    # Architecture Decision Records

Roadmap

Phase	Status	Description
Phase 0	Complete	Contracts & Scaffolding
Phase 1	Complete	Core Integration (Monorepo, SSE, Ollama)
Phase 2	Complete	HITL (ApprovalCard, Dry-Run, Multi-Sig)
Phase 3	Complete	Enterprise (Privacy Shield, GraphRAG, FinOps)
Phase 4	In Progress	Production Hardening & GA Release
Phase 5	Planned	Multi-cluster, Federation, SaaS

Contributing

We welcome contributions! Please see our Contributing Guide for details.

# Run tests
pnpm test

# Run linting
pnpm lint

# Format code
pnpm format

License

MIT License - see LICENSE for details.

Built with love by 岑洋國際行銷有限公司

Turning 3 AM pages into peaceful nights since 2026

    "The best incident is the one you never have to wake up for."
                                        — AWOOOI Philosophy

Languages

Python 78%

TypeScript 17.5%

Shell 3.3%

HTML 0.4%

PLpgSQL 0.4%

Other 0.2%
