awoooi/README.md
OG T 7478dc0254 feat(phase6-9): Complete modular architecture and Agent Teams
Phase 6.4 - Modular Architecture:
- Add lewooogo-brain adapters for LLM providers
- Add lewooogo-data dual memory (Redis + PostgreSQL)
- Implement consensus engine for multi-agent decisions
- Add incident memory service for historical context

Phase 9 - Agent Teams (Claude Agent SDK):
- Add base agent class with Claude Sonnet 4 integration
- Implement action planner, blast radius, and security agents
- Add agent API endpoints and proposal workflow
- Integrate ADR-009 OpenClaw Agent Teams architecture

DevOps & CI/CD:
- Add GitHub Actions CI/CD workflows (ci.yaml, cd.yaml)
- Add pre-commit hooks and secrets baseline
- Add docker-compose for local development
- Update Kubernetes network policies

Frontend Improvements:
- Add auto-healing error boundary component
- Update i18n messages for agent features
- Enhance dual-state incident card with execution feedback

Documentation:
- Add 7 ADRs covering MCP, design system, architecture decisions
- Update ARCHITECTURE_MEMORY.md with modular design
- Add GLOBAL_RULES.md and SOUL.md for project identity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-23 18:40:36 +08:00

<div align="center">
```
█████╗ ██╗ ██╗ ██████╗ ██████╗ ██████╗ ██╗
██╔══██╗██║ ██║██╔═══██╗██╔═══██╗██╔═══██╗██║
███████║██║ █╗ ██║██║ ██║██║ ██║██║ ██║██║
██╔══██║██║███╗██║██║ ██║██║ ██║██║ ██║██║
██║ ██║╚███╔███╔╝╚██████╔╝╚██████╔╝╚██████╔╝██║
╚═╝ ╚═╝ ╚══╝╚══╝ ╚═════╝ ╚═════╝ ╚═════╝ ╚═╝
```
### **Zero-Touch Ops. Human-Centric Decisions.**
*AI-Powered Intelligent Operations Platform*
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Next.js 14](https://img.shields.io/badge/Next.js-14-black.svg)](https://nextjs.org/)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.0-blue.svg)](https://www.typescriptlang.org/)
[Demo](#quick-start) · [Documentation](#lewooogo-engine-architecture) · [Contributing](#contributing)
</div>
---
## The Future of Operations is Here
> **When your system breaks at 3 AM, AWOOOI doesn't just alert you—it analyzes the blast radius, calculates how much money you're burning, and presents a one-click fix. You approve. It executes. You go back to sleep.**
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│   ALERT: frontend 5xx rate > 15%                                            │
│                                                                             │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                   │
│   │  GraphRAG   │ ──▶ │   Dry-Run   │ ──▶ │  Multi-Sig  │                   │
│   │  Analysis   │     │ Simulation  │     │  Approval   │                   │
│   └─────────────┘     └─────────────┘     └─────────────┘                   │
│          │                   │                   │                          │
│          ▼                   ▼                   ▼                          │
│   Root Cause:         Blast Radius:       [x] devops-alice                  │
│   postgres-db         1 pod, 0 data loss  [x] sre-bob                       │
│                                                                             │
│   Monthly Savings: $523.60 if fixed                                         │
│                                                                             │
│                            [ APPROVE & EXECUTE ]                            │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
**AWOOOI** (AI + WOOO Intelligent Operations) transforms reactive firefighting into proactive, AI-assisted decision-making—while keeping humans firmly in control of critical actions.
---
## Enterprise Moats
Four pillars that make AWOOOI enterprise-ready from Day 1:
### Privacy Shield
> **Your PII never leaves your premises. Period.**
```python
# Before: Raw sensitive data
"User 192.168.1.100 with email admin@company.com triggered alert"
# After: Consistent pseudonymization
"User [IP_1] with email [EMAIL_1] triggered alert"
# Same value → Same label (AI maintains context without seeing real data)
```
- Regex-based detection: IP, Email, UUID, API Keys, JWT
- Consistent labeling: `[IP_1]` always maps to the same IP within a session
- **Rehydration Engine**: Labels restored only at MCP execution boundary
- Zero PII in logs, zero PII to cloud LLMs
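
The consistent-labeling and rehydration behavior listed above can be sketched as follows. This is a minimal illustration, assuming a session-scoped in-memory mapping: the `Pseudonymizer` class, its method names, and the two regexes are hypothetical, and the other detectors named in the list (UUID, API keys, JWT) are left out. `rehydrate` stands in for the step that would run only at the execution boundary.

```python
import re

# Hypothetical patterns for illustration only; not the project's detector set.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


class Pseudonymizer:
    """Session-scoped mapping between real values and stable labels."""

    def __init__(self) -> None:
        self._label_by_value: dict[tuple[str, str], str] = {}
        self._value_by_label: dict[str, str] = {}
        self._counters: dict[str, int] = {}

    def _label_for(self, kind: str, value: str) -> str:
        # The same (kind, value) pair always gets the same label in this session.
        key = (kind, value)
        if key not in self._label_by_value:
            self._counters[kind] = self._counters.get(kind, 0) + 1
            label = f"[{kind}_{self._counters[kind]}]"
            self._label_by_value[key] = label
            self._value_by_label[label] = value
        return self._label_by_value[key]

    def redact(self, text: str) -> str:
        # Replace every detected value with its label before text leaves the process.
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(lambda m, k=kind: self._label_for(k, m.group(0)), text)
        return text

    def rehydrate(self, text: str) -> str:
        # Restore real values; intended to be called only where actions execute.
        for label, value in self._value_by_label.items():
            text = text.replace(label, value)
        return text
```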
---
### GraphRAG: Topology-Aware Intelligence
> **AI that understands your microservices like a senior SRE.**
```
             ┌─────────────────────────────────────┐
             │        BLAST RADIUS ANALYSIS        │
             │          (Upstream Impact)          │
             └─────────────────────────────────────┘
                         ┌─────────────┐
                         │   ingress   │  ← Will be affected
                         └──────┬──────┘
                                │ depends on
                         ┌─────────────┐
                         │  frontend   │  ← Target service
                         └──────┬──────┘
                                │ calls
        ┌───────────────────────┼───────────────────────┐
        │                       │                       │
        ▼                       ▼                       ▼
 ┌──────────────┐        ┌──────────────┐        ┌──────────────┐
 │ auth-service │        │ product-api  │        │  order-api   │
 └──────┬───────┘        └──────┬───────┘        └──────┬───────┘
        │                       │                       │
        └───────────────────────┼───────────────────────┘
                         ┌──────────────┐
                         │ postgres-db  │  X ROOT CAUSE
                         └──────────────┘
```
- **BFS-based traversal** with configurable `max_depth` (default: 3)
- **Dual-direction analysis**: Upstream (blast radius) + Downstream (root cause)
- **Priority ranking**: DATABASE > CACHE > QUEUE for root cause identification
- **Multiple root causes**: No single-point assumptions—collect ALL unhealthy dependencies
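
The traversal described in the bullets above can be sketched as a depth-limited BFS run in two directions. This is an illustrative sketch rather than the project's `graph_rag.py`: the `DEPENDS_ON` dictionary mirrors the diagram, and the function names, the `kinds` argument, and the `KIND_PRIORITY` table are assumptions. Upstream traversal follows reversed edges (who depends on the target), and downstream traversal collects every unhealthy dependency and ranks it by kind.

```python
from collections import deque

# Hypothetical topology matching the diagram: caller -> list of dependencies.
DEPENDS_ON = {
    "ingress": ["frontend"],
    "frontend": ["auth-service", "product-api", "order-api"],
    "auth-service": ["postgres-db"],
    "product-api": ["postgres-db"],
    "order-api": ["postgres-db"],
    "postgres-db": [],
}

# Lower number ranks first: DATABASE > CACHE > QUEUE.
KIND_PRIORITY = {"DATABASE": 0, "CACHE": 1, "QUEUE": 2}


def bfs(graph: dict, start: str, max_depth: int = 3) -> list:
    """Nodes reachable from start within max_depth hops, in BFS order."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the configured depth
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


def reverse(graph: dict) -> dict:
    """Flip every edge so that dependencies point back at their callers."""
    rev = {node: [] for node in graph}
    for src, deps in graph.items():
        for dep in deps:
            rev.setdefault(dep, []).append(src)
    return rev


def blast_radius(target: str, max_depth: int = 3) -> list:
    # Upstream: services that depend on the target and may be affected.
    return bfs(reverse(DEPENDS_ON), target, max_depth)


def root_causes(target: str, unhealthy: set, kinds: dict, max_depth: int = 3) -> list:
    # Downstream: collect ALL unhealthy dependencies, then rank by kind.
    candidates = [n for n in bfs(DEPENDS_ON, target, max_depth) if n in unhealthy]
    unknown = len(KIND_PRIORITY)
    return sorted(candidates, key=lambda n: KIND_PRIORITY.get(kinds.get(n, ""), unknown))
```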
---
### Multi-Sig & Dry-Run: Defense in Depth
> **Every critical action is simulated, validated, and co-signed.**
```
┌────────────────────────────────────────────────────────────────┐
│                          RISK MATRIX                           │
├────────────┬─────────────┬─────────────────────────────────────┤
│ Risk Level │ Signatures  │ Required Roles                      │
├────────────┼─────────────┼─────────────────────────────────────┤
│ LOW        │ 0 (auto)    │ -                                   │
│ MEDIUM     │ 1           │ admin, devops, sre                  │
│ HIGH       │ 2           │ admin, devops, sre                  │
│ CRITICAL   │ 2           │ CTO + CISO (mandatory)              │
└────────────┴─────────────┴─────────────────────────────────────┘
```
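One way to read the risk matrix as code is a small policy table plus a check over the collected signatures. The sketch below is an assumption-laden illustration, not the contents of `approval.py`: the `Policy` dataclass, the lowercase role names, and `is_approved` are hypothetical, and the input is assumed to map each distinct signer to a single role.

```python
from dataclasses import dataclass

OPS_ROLES = frozenset({"admin", "devops", "sre"})
EXEC_ROLES = frozenset({"cto", "ciso"})  # assumed spelling of the CTO/CISO roles


@dataclass(frozen=True)
class Policy:
    signatures: int                            # minimum number of valid signers
    allowed_roles: frozenset                   # roles whose signature counts
    mandatory_roles: frozenset = frozenset()   # roles that must all be present


RISK_MATRIX = {
    "LOW": Policy(0, frozenset()),
    "MEDIUM": Policy(1, OPS_ROLES),
    "HIGH": Policy(2, OPS_ROLES),
    "CRITICAL": Policy(2, EXEC_ROLES, EXEC_ROLES),
}


def is_approved(risk: str, signer_roles: dict) -> bool:
    """signer_roles maps each distinct signer name to that signer's role."""
    policy = RISK_MATRIX[risk]
    # Signatures from roles outside the allowed set are ignored.
    valid = {s: r for s, r in signer_roles.items() if r in policy.allowed_roles}
    if len(valid) < policy.signatures:
        return False
    # Every mandatory role must appear among the valid signers.
    return policy.mandatory_roles <= set(valid.values())
```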
**TOCTOU Protection** (Time-of-Check to Time-of-Use):
```
1. User clicks "Approve"
2. System re-runs Dry-Run immediately before execution
3. If state changed → Status = VOIDED (not cleared!)
4. Full audit trail preserved for compliance
```
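The four steps above can be sketched as a re-check that compares a fingerprint of the dry-run result seen at approval time with a fresh one taken just before execution. All names here (`fingerprint`, `Approval`, `execute_if_unchanged`) are hypothetical, and hashing a JSON dump of the dry-run result is an assumed way to detect drift. The re-check narrows the window between check and use; it does not remove it entirely.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable


def fingerprint(dry_run_result: dict) -> str:
    """Stable hash of a dry-run result, used to detect state drift."""
    payload = json.dumps(dry_run_result, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


@dataclass
class Approval:
    action: dict
    approved_fingerprint: str  # captured when the approver saw the dry-run
    status: str = "PENDING"
    audit: list = field(default_factory=list)


def execute_if_unchanged(
    approval: Approval,
    dry_run: Callable[[dict], dict],
    execute: Callable[[dict], None],
) -> bool:
    # Step 2: re-run the dry-run immediately before execution.
    current = fingerprint(dry_run(approval.action))
    if current != approval.approved_fingerprint:
        # Step 3: state changed, so the approval is voided rather than cleared.
        approval.status = "VOIDED"
        approval.audit.append(("voided", approval.approved_fingerprint, current))
        return False
    execute(approval.action)
    approval.status = "EXECUTED"
    # Step 4: the audit entry is kept in both outcomes.
    approval.audit.append(("executed", current))
    return True
```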
**Dry-Run Checks**:
- RBAC Permission validation
- Syntax & parameter validation
- Resource existence verification
- PodDisruptionBudget compliance
- Blast radius calculation
---
### Progressive Autonomy: Trust That Evolves
> **The more you approve, the less you need to.**
```
┌─────────────────────────────────────────────────────────────────┐
│                     TRUST SCORE PROGRESSION                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Score: 0 ──────────────────────────────────────────────▶ 10+   │
│          │                   │                   │              │
│          ▼                   ▼                   ▼              │
│     ┌─────────┐         ┌─────────┐         ┌─────────┐         │
│     │  HIGH   │   ──▶   │ MEDIUM  │   ──▶   │   LOW   │         │
│     │  2-sig  │   @10   │  1-sig  │   @5    │  auto   │         │
│     └─────────┘         └─────────┘         └─────────┘         │
│                                                                 │
│  ⚠️ CRITICAL operations NEVER auto-downgrade (enterprise law)   │
│                                                                 │
│  Single REJECT → Trust score resets to 0 (instant collapse)     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
- **Approve** → +1 trust score
- **Reject** → Score resets to 0 (trust collapses instantly)
- Pattern-based: `restart_pod:nginx-*` builds trust separately from `delete_pvc:*`
- CRITICAL operations (DROP TABLE, DELETE NAMESPACE) → **Always require human dual-signature**
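
The scoring rules above can be sketched as a per-pattern counter. This is a hypothetical illustration, not the contents of `trust_engine.py`. It assumes one reading of the diagram (a HIGH action drops to MEDIUM at score 10, a MEDIUM action drops to LOW at score 5, one step at a time) and assumes glob-style matching for patterns such as `restart_pod:nginx-*`.

```python
import fnmatch

# Assumed reading of the diagram: base risk -> (downgraded risk, required score).
DOWNGRADE = {"HIGH": ("MEDIUM", 10), "MEDIUM": ("LOW", 5)}


class TrustEngine:
    """Tracks one trust score per action pattern (hypothetical API)."""

    def __init__(self) -> None:
        self._scores: dict[str, int] = {}

    def record(self, pattern: str, approved: bool) -> int:
        # Approve adds one point; a single reject collapses the score to zero.
        if approved:
            self._scores[pattern] = self._scores.get(pattern, 0) + 1
        else:
            self._scores[pattern] = 0
        return self._scores[pattern]

    def effective_risk(self, action: str, base_risk: str) -> str:
        # CRITICAL operations never downgrade, whatever the score.
        if base_risk == "CRITICAL":
            return base_risk
        # Only patterns that match this action contribute trust.
        score = max(
            (s for p, s in self._scores.items() if fnmatch.fnmatchcase(action, p)),
            default=0,
        )
        target, threshold = DOWNGRADE.get(base_risk, (base_risk, 0))
        return target if score >= threshold else base_risk
```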
---
## leWOOOgo Engine Architecture
AWOOOI is built on the **leWOOOgo Engine**—a modular, plugin-based architecture inspired by LEGO blocks:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                               leWOOOgo Engine                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│    ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐    │
│    │   INPUT   │ │   BRAIN   │ │  OUTPUT   │ │  ACTION   │ │   DATA    │    │
│    │  ───────  │ │  ───────  │ │  ───────  │ │  ───────  │ │  ───────  │    │
│    │ Webhooks  │ │  Ollama   │ │   Slack   │ │    K8s    │ │ Postgres  │    │
│    │   Kafka   │ │  OpenAI   │ │  Discord  │ │   Shell   │ │   Redis   │    │
│    │Prometheus │ │  Claude   │ │   Email   │ │    MCP    │ │    S3     │    │
│    └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘    │
│          │             │             │             │             │          │
│          └─────────────┴─────────────┴─────────────┴─────────────┘          │
│                                      │                                      │
│                              ┌───────┴───────┐                              │
│                              │      UI       │                              │
│                              │ ───────────── │                              │
│                              │    Next.js    │                              │
│                              │ ApprovalCard  │                              │
│                              │ThinkingStream │                              │
│                              └───────────────┘                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Module Overview
| Module | Purpose | Key Components |
|--------|---------|----------------|
| **INPUT** | Event ingestion | Prometheus AlertManager, Kafka, Webhooks |
| **BRAIN** | AI reasoning | Ollama (local), OpenAI, Claude, GraphRAG |
| **OUTPUT** | Notifications | Slack, Discord, Email, Custom webhooks |
| **ACTION** | Execution | K8s API, Shell, MCP Bridge, Ansible |
| **DATA** | Persistence | PostgreSQL, Redis, S3, Vector DB |
| **UI** | Human interface | Next.js 14, ApprovalCard, ThinkingTerminal |
### MCP (Model Context Protocol) Support
```typescript
// MCP enables AI to safely interact with external tools
await mcpBridge.callTool("kubernetes", "restart_pod", {
pod_name: "[POD_1]", // Redacted in logs
namespace: "production",
graceful: true,
});
// Rehydration happens at execution boundary only
```
---
## FinOps: Day-1 ROI
> **Every wasted resource has a dollar sign. AWOOOI shows you exactly how much.**
```
┌─────────────────────────────────────────────────────────────────┐
│                      FINOPS COST ANALYSIS                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  MONTHLY WASTE DETECTED: $523.60                                │
│                                                                 │
│  ┌──────────────────┬──────────────────┬──────────────────┐     │
│  │  REALIZABLE      │  FREED           │  ANNUAL          │     │
│  │  $480.00/mo      │  $43.60/mo       │  $5,760/yr       │     │
│  │  ────────────    │  ────────────    │  ────────────    │     │
│  │  PVC deletion    │  Pod cleanup     │  if all fixed    │     │
│  │  Node resize     │  (needs scale)   │                  │     │
│  └──────────────────┴──────────────────┴──────────────────┘     │
│                                                                 │
│  TOP RECOMMENDATIONS:                                           │
│  ├─ Delete orphaned PVC 'data-postgres-backup'    -$40.00  LOW  │
│  ├─ Resize node 'worker-large-01'                -$340.00  HIGH │
│  └─ Delete zombie Pod 'legacy-api-5d7b8'          -$76.00  MED  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
**Scan Types**:
- **Orphaned PVCs**: Storage not mounted by any Pod
- **Zombie Pods**: CPU < 1% for 7+ consecutive days
- **Over-provisioned Nodes**: High request, low actual usage
**Safety Buffer**: `wasted = requested - (actual × 1.2)` keeps 20% headroom above observed usage, so right-sizing recommendations do not push workloads into OOM.
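
The safety-buffer formula and the zombie-Pod rule can be written out as two small functions. This is a sketch under stated assumptions: the function names are hypothetical, clamping the result at zero is an added choice (a negative value would indicate under-provisioning rather than waste), and the zombie check assumes one CPU sample per day with the most recent sample last.

```python
def wasted(requested: float, actual: float, buffer: float = 1.2) -> float:
    """Reclaimable amount after reserving `buffer` times the observed usage."""
    # Clamp at zero (assumption): never report negative waste.
    return max(0.0, requested - actual * buffer)


def is_zombie(daily_cpu_percent: list[float], days: int = 7, threshold: float = 1.0) -> bool:
    """True when the most recent `days` samples are all below `threshold` percent."""
    recent = daily_cpu_percent[-days:]
    # Require a full window so that new Pods with little history are not flagged.
    return len(recent) == days and all(v < threshold for v in recent)
```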
---
## Quick Start
### Prerequisites
- Python 3.11+
- Node.js 18+
- pnpm 8+
- Docker (optional, for local Ollama)
### Installation
```bash
# Clone the repository
git clone https://github.com/anthropics/awoooi.git
cd awoooi
# Install dependencies
pnpm install
# Setup Python environment
cd apps/api
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt
```
### Run Tracer Bullet 2.0 (E2E Demo)
Experience the full AWOOOI loop in 30 seconds:
```bash
cd apps/api
python scripts/tracer_bullet_2.py
```
**Expected Output**:
```
============================================================
TRACER BULLET 2.0 - FULL LOOP TEST
Test ID: tb2-20260319143052
============================================================
[x] [trigger_alert] PASS
[x] [graphrag_analysis] PASS
[x] [generate_approval] PASS
[x] [multisig_approval] PASS
[x] [mcp_execution] PASS
============================================================
TEST SUMMARY
============================================================
Total Steps: 5
Passed: 5
Failed: 0
Status: ALL PASSED
```
### Start Development Servers
```bash
# Terminal 1: API Server
cd apps/api
uvicorn src.main:app --reload --port 8000
# Terminal 2: Web Server
cd apps/web
pnpm dev
```
Open [http://localhost:3000](http://localhost:3000) to see the AWOOOI dashboard.
---
## Project Structure
```
awoooi/
├── apps/
│   ├── api/                        # FastAPI Backend
│   │   ├── src/
│   │   │   ├── services/           # Core services
│   │   │   │   ├── approval.py     # Multi-Sig engine
│   │   │   │   ├── dry_run.py      # Dry-Run engine
│   │   │   │   ├── trust_engine.py # Progressive autonomy
│   │   │   │   └── graph_rag.py    # Topology analysis
│   │   │   └── plugins/
│   │   │       ├── security/       # Privacy Shield
│   │   │       ├── mcp/            # MCP Bridge
│   │   │       └── finops/         # Cost analyzer
│   │   └── scripts/
│   │       └── tracer_bullet_2.py  # E2E test
│   │
│   └── web/                        # Next.js Frontend
│       └── src/
│           ├── components/
│           │   └── agent/
│           │       ├── approval-card.tsx
│           │       └── thinking-terminal.tsx
│           └── stores/
│               └── agent.store.ts
├── packages/
│   └── lewooogo-core/              # Shared types & contracts
└── docs/
    └── adr/                        # Architecture Decision Records
```
---
## Roadmap
| Phase | Status | Description |
|-------|--------|-------------|
| Phase 0 | Complete | Contracts & Scaffolding |
| Phase 1 | Complete | Core Integration (Monorepo, SSE, Ollama) |
| Phase 2 | Complete | HITL (ApprovalCard, Dry-Run, Multi-Sig) |
| Phase 3 | Complete | Enterprise (Privacy Shield, GraphRAG, FinOps) |
| Phase 4 | In Progress | Production Hardening & GA Release |
| Phase 5 | Planned | Multi-cluster, Federation, SaaS |
---
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
```bash
# Run tests
pnpm test
# Run linting
pnpm lint
# Format code
pnpm format
```
---
## License
MIT License - see [LICENSE](LICENSE) for details.
---
<div align="center">
**Built with love by [岑洋國際行銷有限公司](https://wooo.tw)**
*Turning 3 AM pages into peaceful nights since 2026*
```
"The best incident is the one you never have to wake up for."
— AWOOOI Philosophy
```
</div>