QA / Testing agent for a zero-human AI startup — the quality gatekeeper who writes automated tests, catches bugs before production, and blocks regressions from reaching customers.
QA / Testing Agent
The quality gatekeeper for a zero-human AI startup. This agent writes automated test suites, triages bug reports with full reproduction steps, and runs regression testing before every production deployment. If it isn't tested, it doesn't ship. The last line of defense between code and customers.
Quick Start
-
Deploy the agent using OpenClaw with the ClawPack bundle:
clawpack install @agentebox/qa -
Configure communication channels — QA needs to send/receive messages to Lead Engineer (upstream) and coordinate with DevOps/Infra (deployment gating) and Customer Success (bug report intake).
-
Set up the Remote Project Board — primary tracking for bugs, test tasks, and quality metrics.
-
Connect code repository and CI/CD — tests live alongside code and run automatically in the pipeline.
-
Configure cadences — daily test monitoring (morning, 10 min), continuous sprint test execution.
-
Initialize the test suite — establish baseline coverage metrics and set up regression test infrastructure.
Environment Variables
| Variable | Description | Required |
|---|---|---|
REMOTE_PROJECT_ID | Project ID on the Remote board | Yes |
LEAD_ENGINEER_AGENT_ID | Session ID or label for the Lead Engineer agent | Yes |
DEVOPS_AGENT_ID | Session ID or label for the DevOps/Infra agent | Yes |
CODE_REPO_URL | Repository URL for the main codebase | Yes |
TEST_COVERAGE_TARGET | Target coverage on critical paths (default: 80%) | No |
MAX_FLAKY_TESTS | Maximum quarantined flaky tests before P1 escalation (default: 5) | No |
REGRESSION_TIMEOUT_MIN | Maximum regression suite execution time (default: 10) | No |
File Listing
| File | Description |
|---|---|
SOUL.md | Complete agent identity: behaviors, decision framework, communication protocols, boundaries, failure modes |
IDENTITY.md | Quick-reference identity card (name, role, emoji) |
manifest.json | Machine-readable configuration: skills, tools, cadences, autonomy levels |
README.md | This file — setup guide and integration reference |
skills/test-suite-management/SKILL.md | Write/maintain tests using the testing pyramid, cover unhappy paths, manage coverage |
skills/bug-triage/SKILL.md | Reproduce bugs, classify severity (P0-P3), route with minimal reproduction steps |
skills/regression-testing/SKILL.md | Pre-deploy regression runs, green/yellow/red release verdicts, flaky test management |
Architecture
Lead Engineer
↕ (PRs for testing, bug reports, release verdicts)
QA / Testing ──── 🔍
├── Test Suite (unit → integration → E2E, testing pyramid)
├── Bug Triage (intake → reproduce → classify → route)
└── Regression Gate (pre-deploy suite → verdict → block/approve)
Coordinates with:
→ DevOps / Infra (deployment gating, test infrastructure)
→ Customer Success (via Lead Engineer — bug report intake)
Framework Integration
OpenClaw (Native)
# openclaw.yaml
agent:
name: qa
soul: ./SOUL.md
identity: ./IDENTITY.md
skills:
- ./skills/test-suite-management/
- ./skills/bug-triage/
- ./skills/regression-testing/
heartbeat:
interval: 30m
file: ./HEARTBEAT.md
CrewAI
from crewai import Agent, Task, Crew
qa = Agent(
role="QA / Testing",
goal="Ensure software quality by writing comprehensive tests, catching bugs before production, and blocking regressions",
backstory=open("SOUL.md").read(),
tools=[repo_tool, cicd_tool, remote_board_tool, messaging_tool],
verbose=True
)
regression_check = Task(
description="Run full regression suite for the pending deployment, analyze failures, and issue a release verdict",
agent=qa,
expected_output="Green/yellow/red release verdict with evidence and risk assessment"
)
crew = Crew(agents=[qa], tasks=[regression_check], verbose=True)
crew.kickoff()
Monitoring
The QA agent is healthy when:
- Test coverage on critical paths stays ≥80%
- Bug escape rate stays below 5%
- Regression rate is 0 per release (regressions caught before deploy)
- Test suite execution time stays under 10 minutes
- Flaky test count stays ≤5 (quarantine budget not exceeded)
- Bug triage latency stays under 24 hours
Warning signs:
- Test coverage declining without corresponding code removal
- Bug escape rate rising (bugs reaching production that tests should catch)
- Flaky test count exceeding budget (>5 quarantined)
- Regression suite execution time growing (blocking deployment velocity)
- Bug reports piling up without triage (>48 hours old)
- Lead Engineer reporting "test suite is in the way" (friction, not quality)
Version History
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-03-16 | Initial creation |