DevOps / Infra agent for a zero-human AI startup — the keeper of the CI/CD pipeline, cloud infrastructure, and observability stack. Automates everything, monitors everything, and keeps infrastructure costs controlled.

DevOps / Infra Agent

The keeper of the pipeline and the infrastructure for a zero-human AI startup. This agent ensures code gets from merge to production safely (CI/CD), cloud resources are provisioned and right-sized (infrastructure), everything is observable (monitoring/alerting), and infrastructure spend stays controlled (cost optimization). If the Lead Engineer builds the house, DevOps keeps the electricity and plumbing running.

Quick Start

  1. Deploy the agent using OpenClaw with the ClawPack bundle:

    clawpack install @agentebox/devops
    
  2. Configure communication channels. DevOps needs to exchange messages with the CTO (upstream), the Lead Engineer (deployment coordination), and QA/Testing (test infrastructure).

  3. Set up the Remote Project Board — primary tracking for infrastructure tasks, incidents, and cost optimization projects.

  4. Connect infrastructure tools — IaC (Terraform/Pulumi), CI/CD platform, monitoring stack, cloud provider APIs.

  5. Configure cadences — daily infrastructure check (morning, 10 min), weekly infrastructure planning (Monday, 30 min), monthly infrastructure review (last Monday, 60 min).

  6. Initialize monitoring — set up dashboards and alerts for all existing services before the agent starts its regular cadence.
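
The cadences in step 5 can be sketched as a small date check. This is a hypothetical illustration, not part of OpenClaw or ClawPack: the function names, the cadence labels, and the reading of "last Monday" as the last Monday of the calendar month are all assumptions.

```python
from datetime import date, timedelta


def is_last_monday_of_month(day: date) -> bool:
    """True if `day` is a Monday and the next Monday falls in another month."""
    return day.weekday() == 0 and (day + timedelta(days=7)).month != day.month


def cadences_due(day: date) -> list[str]:
    """Return the (hypothetical) cadence labels that should run on `day`."""
    due = ["daily-infrastructure-check"]  # every morning, 10 min
    if day.weekday() == 0:
        due.append("weekly-infrastructure-planning")  # Monday, 30 min
    if is_last_monday_of_month(day):
        due.append("monthly-infrastructure-review")  # last Monday, 60 min
    return due
```

On a last Monday all three cadences are due, so a scheduler built this way would need to run them back to back rather than concurrently.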

Environment Variables

| Variable | Description | Required |
|---|---|---|
| REMOTE_PROJECT_ID | Project ID on the Remote board | Yes |
| CTO_AGENT_ID | Session ID or label for the CTO agent | Yes |
| LEAD_ENGINEER_AGENT_ID | Session ID or label for the Lead Engineer agent | Yes |
| CLOUD_PROVIDER | Primary cloud provider (aws/gcp/azure) | Yes |
| INFRA_BUDGET_PER_SERVICE | Max monthly spend per service without CTO approval (default: $100) | No |
| ALERT_NOISE_TARGET | Maximum acceptable alert noise ratio (default: 0.10) | No |
| UPTIME_SLA_TARGET | Target uptime percentage (default: 99.5) | No |
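
One way to validate these variables at startup is sketched below. The `DevOpsConfig` class and `load_config` function are hypothetical names, not part of the package, and the sketch assumes the numeric variables are set as plain numbers (for example `100`, not `$100`).

```python
import os
from dataclasses import dataclass

REQUIRED = ("REMOTE_PROJECT_ID", "CTO_AGENT_ID", "LEAD_ENGINEER_AGENT_ID", "CLOUD_PROVIDER")
PROVIDERS = ("aws", "gcp", "azure")


@dataclass(frozen=True)
class DevOpsConfig:
    remote_project_id: str
    cto_agent_id: str
    lead_engineer_agent_id: str
    cloud_provider: str
    infra_budget_per_service: float  # monthly spend limit per service
    alert_noise_target: float        # ratio, e.g. 0.10
    uptime_sla_target: float         # percentage, e.g. 99.5


def load_config(env=None) -> DevOpsConfig:
    """Read the variables from `env` (defaults to os.environ), applying the documented defaults."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    provider = env["CLOUD_PROVIDER"].lower()
    if provider not in PROVIDERS:
        raise ValueError(f"CLOUD_PROVIDER must be one of {PROVIDERS}, got {provider!r}")
    return DevOpsConfig(
        remote_project_id=env["REMOTE_PROJECT_ID"],
        cto_agent_id=env["CTO_AGENT_ID"],
        lead_engineer_agent_id=env["LEAD_ENGINEER_AGENT_ID"],
        cloud_provider=provider,
        infra_budget_per_service=float(env.get("INFRA_BUDGET_PER_SERVICE", 100)),
        alert_noise_target=float(env.get("ALERT_NOISE_TARGET", 0.10)),
        uptime_sla_target=float(env.get("UPTIME_SLA_TARGET", 99.5)),
    )
```

Failing fast on a missing or misspelled variable keeps configuration errors out of the agent's first cadence run.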

File Listing

| File | Description |
|---|---|
| SOUL.md | Complete agent identity: behaviors, decision framework, communication protocols, boundaries, failure modes |
| IDENTITY.md | Quick-reference identity card (name, role, emoji) |
| manifest.json | Machine-readable configuration: skills, tools, cadences, autonomy levels |
| README.md | This file: setup guide and integration reference |
| skills/cicd-management/SKILL.md | Pipeline health, deployment configuration, fallback runbooks |
| skills/infrastructure-provisioning/SKILL.md | IaC-based provisioning, capacity planning, resource management |
| skills/monitoring-alerting/SKILL.md | Observability setup, alert tuning, incident detection, capacity monitoring |
| skills/cost-optimization/SKILL.md | Waste identification, right-sizing, commitment evaluation, cost-per-unit tracking |

Architecture

CTO
     ↕ (directives, incident reports, cost reports)
DevOps / Infra ──── 🔧
     ├── CI/CD Pipeline (automated build → test → deploy)
     ├── Cloud Infrastructure (compute, storage, networking)
     ├── Monitoring Stack (metrics, logs, traces, alerts)
     └── Cost Management (tracking, optimization, reporting)

Coordinates with:
     → Lead Engineer (deployment support, infrastructure requests)
     → QA / Testing (test infrastructure)
     → COO (cost reporting via CTO)

Framework Integration

OpenClaw (Native)

# openclaw.yaml
agent:
  name: devops
  soul: ./SOUL.md
  identity: ./IDENTITY.md
  skills:
    - ./skills/cicd-management/
    - ./skills/infrastructure-provisioning/
    - ./skills/monitoring-alerting/
    - ./skills/cost-optimization/
  heartbeat:
    interval: 15m
    file: ./HEARTBEAT.md

CrewAI

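from pathlib import Path

from crewai import Agent, Task, Crew

# Placeholder: supply your own CrewAI tool instances here (for example
# iac_tool, cicd_tool, monitoring_tool, remote_board_tool). They are not
# defined in this README.
tools = []

devops = Agent(
    role="DevOps / Infra",
    goal="Keep the pipeline running, infrastructure reliable, and costs controlled",
    # SOUL.md carries the full agent identity; read it as UTF-8 text.
    backstory=Path("SOUL.md").read_text(encoding="utf-8"),
    tools=tools,
    verbose=True,
)

daily_check = Task(
    description="Run daily infrastructure check: review alerts, verify service health, check deployment queue",
    agent=devops,
    expected_output="Infrastructure health report with any issues flagged",
)

# A single-agent crew that runs the daily check once and prints the report.
crew = Crew(agents=[devops], tasks=[daily_check], verbose=True)
result = crew.kickoff()
print(result)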

Monitoring

The DevOps agent is healthy when:

  • Deployment success rate stays above 95%
  • Mean time to deploy stays under 15 minutes
  • Alert noise ratio stays below 10% (most alerts require action)
  • Uptime stays above the 99.5% SLA target
  • Infrastructure costs stay within ±10% of budget
  • All production services have monitoring, alerts, and runbooks
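
The criteria above can be checked mechanically. The following is a minimal sketch under stated assumptions: `HealthSnapshot` and `health_violations` are hypothetical names, the alert noise ratio is taken to mean non-actionable alerts divided by total alerts, and the budget criterion is read as spend within 10% of budget in either direction.

```python
from dataclasses import dataclass


@dataclass
class HealthSnapshot:
    deploy_success_rate: float   # fraction of successful deployments, 0.0 to 1.0
    mean_deploy_minutes: float   # mean time from merge to production
    alert_noise_ratio: float     # assumed: non-actionable alerts / total alerts
    uptime_percent: float        # measured uptime, e.g. 99.9
    monthly_spend: float
    monthly_budget: float
    unmonitored_services: int    # production services lacking monitoring, alerts, or runbooks


def health_violations(s: HealthSnapshot) -> list[str]:
    """Return a message for every health criterion the snapshot fails."""
    checks = [
        (s.deploy_success_rate > 0.95, "deployment success rate not above 95%"),
        (s.mean_deploy_minutes < 15, "mean time to deploy not under 15 minutes"),
        (s.alert_noise_ratio < 0.10, "alert noise ratio not below 10%"),
        (s.uptime_percent > 99.5, "uptime not above 99.5%"),
        (abs(s.monthly_spend - s.monthly_budget) <= 0.10 * s.monthly_budget,
         "spend outside 10% of budget"),
        (s.unmonitored_services == 0, "production services without full coverage"),
    ]
    return [message for ok, message in checks if not ok]
```

An empty list means every criterion passed. The thresholds are hard-coded here for readability; in practice they would more likely come from `ALERT_NOISE_TARGET` and `UPTIME_SLA_TARGET`.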

Warning signs:

  • Deployment queue building up (pipeline bottleneck)
  • Alert volume increasing without corresponding incidents
  • Infrastructure costs rising faster than traffic/revenue
  • IaC drift detected (manual changes in production)
  • Any production service without monitoring coverage
  • Incident MTTR increasing (incidents are taking longer to resolve)
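
Several of these warning signs are trend comparisons rather than fixed thresholds. As one illustration, the sketch below flags costs rising faster than traffic between two periods. The function names are hypothetical, and using request counts as the traffic measure is an assumption.

```python
def growth(previous: float, current: float) -> float:
    """Relative change from `previous` to `current` (0.25 means +25%)."""
    if previous <= 0:
        raise ValueError("previous value must be positive")
    return (current - previous) / previous


def cost_outpacing_traffic(prev_cost: float, cost: float,
                           prev_requests: float, requests: float) -> bool:
    """True when cost grew by a larger fraction than traffic over the same period."""
    return growth(prev_cost, cost) > growth(prev_requests, requests)
```

The same `growth` helper could be reused for alert volume versus incident count, or for MTTR from one month to the next.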

Version History

| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-03-16 | Initial creation |

Install

clawpack pull agentebox/devops