Overview - 2501.ai

The 2501 Scenario Runner (2501-runner) is a benchmarking tool that lets you define infrastructure scenarios, run them against 2501 agents in a controlled sandbox, and measure how well those agents perform. Each scenario simulates a realistic IT problem: a broken service, a misconfigured daemon, a disk filling up. The runner sets up the problem, dispatches the task to an agent, waits for it to act, and scores the result against validation rules you define.

What Gets Validated

Every scenario run produces two independent scores.

Compliance answers: did the agent follow the right process? These are behavioral checks: did the agent test the nginx config before restarting the service, did it check disk usage before deleting files, did it avoid destructive commands? Compliance rules let you encode your operational standards and verify the agent respects them.

Task validation answers: did the agent actually fix the problem? This is ground-truth verification against the actual state of the target host after the agent finishes: is the service running, is the port responding, is the file in the right place?

A scenario passes only when both scores pass. This separation is intentional: an agent can fix the problem in a way that violates your processes, or follow all the right steps but leave the system broken.
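
The pass rule described above can be sketched as a small shell function. This is only an illustration: the function name scenario_verdict and the "pass"/"fail" strings are hypothetical and do not reflect the runner's actual output format.

```shell
# Hypothetical helper, not part of 2501-runner: a scenario passes only
# when BOTH the compliance score and the task validation score pass.
scenario_verdict() {
  compliance="$1"   # "pass" or "fail" (illustrative values)
  task="$2"         # "pass" or "fail" (illustrative values)
  if [ "$compliance" = "pass" ] && [ "$task" = "pass" ]; then
    echo "pass"
  else
    echo "fail"
  fi
}
```

For example, scenario_verdict pass fail prints fail: the agent followed the right process but left the system broken.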

Prerequisites

Before using the runner, make sure the following are in place on the machine where it’s installed:

A running 2501 instance reachable from the runner machine. The runner connects to the database directly (DATABASE_URL) and optionally to a gateway (ServiceNow) or the engine API.
Ansible installed and available in PATH. The runner uses ansible-playbook to run scenario playbooks against target hosts.
SSH access from the runner machine to your sandbox hosts.
Scenario files in the scenarios directory (default: /etc/2501/runner/scenarios). See Scenario Structure or Examples to get started.
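
Some of the prerequisites above can be checked locally before the first run. The sketch below is a hedged example, not a tool shipped with the runner: the preflight function is hypothetical, and it assumes DATABASE_URL is supplied as an environment variable. It does not test SSH access or whether the 2501 instance is reachable.

```shell
# Preflight sketch for the prerequisites listed above (hypothetical helper).
# Assumptions: DATABASE_URL is read from the environment, and the scenarios
# directory is passed as $1, defaulting to the documented default path.
preflight() {
  scenarios_dir="${1:-/etc/2501/runner/scenarios}"
  missing=0

  # ansible-playbook must be resolvable through PATH
  if ! command -v ansible-playbook >/dev/null 2>&1; then
    echo "missing: ansible-playbook not found in PATH"
    missing=1
  fi

  # DATABASE_URL must be set and non-empty
  if [ -z "${DATABASE_URL:-}" ]; then
    echo "missing: DATABASE_URL is not set"
    missing=1
  fi

  # The scenarios directory must exist
  if [ ! -d "$scenarios_dir" ]; then
    echo "missing: scenarios directory $scenarios_dir"
    missing=1
  fi

  # 0 when everything was found, 1 otherwise
  return "$missing"
}
```

Calling preflight with no argument checks the default scenarios directory; the function returns non-zero if any item is missing and prints one line per missing item.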

Quick Start

# Verify everything is configured correctly before running anything
2501-runner run -s nginx/001-broken-config --check

# Run a single scenario
2501-runner run -s nginx/001-broken-config

# Run all scenarios tagged with "nginx", exit non-zero on any failure
2501-runner run -s nginx --fail-on-error
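
The commands above can be chained into a simple gate, for instance in a CI job. The sketch below uses only the flags shown in this section. The run_gate function and the RUNNER variable are hypothetical names, and it assumes that --check exits non-zero when the configuration is invalid, which this page does not state explicitly.

```shell
# Sketch of a gate: validate the configuration first, then run.
# RUNNER is a hypothetical override so the command can be swapped out;
# only flags shown in the Quick Start above are used.
RUNNER="${RUNNER:-2501-runner}"

run_gate() {
  selector="$1"   # e.g. nginx/001-broken-config

  # Assumption: --check exits non-zero when the configuration is invalid.
  "$RUNNER" run -s "$selector" --check || return 1

  # --fail-on-error makes the run exit non-zero on any failure.
  "$RUNNER" run -s "$selector" --fail-on-error
}
```

A call such as run_gate nginx/001-broken-config then returns non-zero if either the check or the run fails, which lets a pipeline stop on the first problem.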

Next Steps

Scenario Structure: directory layout and the scenario.json format
Hosts & Agents: referencing your sandbox hosts and agents
Validation: rules, validators, and the scoring model
Playbooks: Ansible playbooks and inventory
Examples: complete worked examples (disk full, nginx, Kubernetes)
Run Scenarios: CLI reference, entry points, and environment setup