# Run Scenarios

> CLI reference, execution flow, entry points, environment setup, and CI integration

## Execution Flow

Understanding what the runner does on each invocation helps when writing scenarios and debugging failures. Every `2501-runner run` call goes through these phases:

### 1. Pre-flight

Before any scenario executes, the runner validates the full environment:

* Connects to the database and verifies that `ORG_ID`, `TENANT_ID`, and `USER_ID` exist
* If `--from gateway`: tests ServiceNow API connectivity
* If `--from ticket`: tests engine API connectivity
* Verifies `ansible-playbook` is available in PATH

Use `--check` to run only this step without executing any scenarios.

### 2. Scenario Discovery

The runner scans the scenarios directory, loads all `scenario.json` files, and resolves the `-s` argument against available keys and tags.

### 3. Per-Scenario Loop

For each matched scenario (and each iteration when `-i` > 1):

**Provision**

* Resolve host and agent records from the database by their IDs
* Flush agent memory

**Prepare**

* Run a silent, error-suppressed `restore.yml` first (to clear stale state from any previous failed run)
* Run `prepare.yml` with the Ansible inventory; abort the scenario if it fails

**Execute**

* Dispatch the scenario through the selected entry point (see [Entry Points](#entry-points) below)
* Poll for job or task completion; check `allowedAgents`/`allowedHosts` on every poll cycle

**Validate**

* Evaluate all validation rules in order
* Run the Ansible `validate.yml` playbook if declared
* Compute compliance score and pass/fail result

**Restore**

* Run `restore.yml` to reset the host to baseline
* Restore failures are non-fatal: a warning is logged and execution continues
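
The ordering above can be sketched as a small shell function. This only illustrates the documented control flow and is not the runner's implementation. The function name and its four phase arguments are hypothetical, and Provision is left out because it is database work. Two behaviours are assumptions the text does not confirm: validation is skipped when execution fails, and the final restore is skipped when `prepare` aborts the scenario.

```bash
# Sketch of the per-scenario control flow. Each argument is the name of a
# caller-supplied command or function (hypothetical), one per phase.
run_scenario_phases() {
  local prepare="$1" execute="$2" validate="$3" restore="$4" result=0

  # Silent, error-suppressed restore to clear stale state from a previous run
  "$restore" >/dev/null 2>&1 || true

  # A failed prepare aborts the scenario before anything is dispatched
  # (assumption: the final restore is not attempted in this case)
  if ! "$prepare"; then
    echo "prepare failed: scenario aborted" >&2
    return 1
  fi

  # Execute, then validate; a failure in either marks the scenario as failed
  # (assumption: validation is skipped when execution fails)
  "$execute" && "$validate" || result=1

  # Restore failures are non-fatal: warn and keep going
  "$restore" || echo "warning: restore failed" >&2

  return "$result"
}
```

Note that a failing restore changes only the log output, not the return status, which matches the "non-fatal" wording above.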

### 4. Report

After all scenarios complete, the runner:

* Prints a summary table (pass/fail, duration, token usage, validation details per rule)
* Persists a `ScenarioReport` to the database
* Exits `1` if any scenario failed and `--fail-on-error` is set; otherwise exits `0`

***

## Environment Setup

Environment variables are loaded automatically from `/etc/2501/env.runner`. Override the path with `--env-file`.

### Required

| Variable       | Description                                           |
| -------------- | ----------------------------------------------------- |
| `DATABASE_URL` | PostgreSQL connection string for your 2501 deployment |
| `ORG_ID`       | Your organization ID                                  |
| `TENANT_ID`    | Your tenant ID                                        |
| `USER_ID`      | The user ID under which scenarios run                 |

### ServiceNow (`--from gateway`)

| Variable                         | Description                          |
| -------------------------------- | ------------------------------------ |
| `SERVICENOW_API_URL`             | ServiceNow instance URL              |
| `SERVICENOW_USERNAME`            | ServiceNow username                  |
| `SERVICENOW_PASSWORD`            | ServiceNow password                  |
| `SERVICENOW_ASSIGNMENT_GROUP_ID` | Assignment group for created tickets |
| `SERVICENOW_CALLER_ID`           | Caller ID for created tickets        |

### Engine API (`--from ticket`)

| Variable         | Description                       |
| ---------------- | --------------------------------- |
| `ENGINE_API_URL` | Base URL of the 2501 engine API   |
| `ENGINE_API_KEY` | API key for engine authentication |
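
Before a run, it can help to confirm that the env file defines every variable the chosen entry point needs. The helper below is a minimal sketch, not part of the runner: `missing_env_vars` is a hypothetical name, and it assumes the env file uses plain `KEY=VALUE` lines (optionally prefixed with `export`), which this page does not specify.

```bash
# Hypothetical helper: print each variable from the tables above that the
# given env file does not define. Assumes plain KEY=VALUE lines.
missing_env_vars() {
  local env_file="$1" entry_point="$2" var
  local required="DATABASE_URL ORG_ID TENANT_ID USER_ID"

  if [ ! -r "$env_file" ]; then
    echo "cannot read env file: $env_file" >&2
    return 2
  fi

  case "$entry_point" in
    gateway) required="$required SERVICENOW_API_URL SERVICENOW_USERNAME SERVICENOW_PASSWORD SERVICENOW_ASSIGNMENT_GROUP_ID SERVICENOW_CALLER_ID" ;;
    ticket)  required="$required ENGINE_API_URL ENGINE_API_KEY" ;;
    task)    ;;  # no extra variables are documented for --from task
    *)       echo "unknown entry point: $entry_point" >&2; return 2 ;;
  esac

  for var in $required; do
    # A variable counts as defined when a line starts with NAME=
    grep -Eq "^(export[[:space:]]+)?${var}=" "$env_file" || echo "$var"
  done
}
```

For example, `missing_env_vars /etc/2501/env.runner ticket` prints nothing when the file is complete for `--from ticket`. This only checks that the names are present; the runner's own `--check` is what validates the values.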

***

## The `run` Command

```bash
2501-runner run -s <scenarios> [options]
```

### Options

| Option                        | Default                      | Description                                                                   |
| ----------------------------- | ---------------------------- | ----------------------------------------------------------------------------- |
| `-s, --scenarios <list>`      | required                     | Scenario keys or tags, comma-separated. Mix freely.                           |
| `--from <type>`               | `gateway`                    | Entry point: `gateway` \| `task` \| `ticket`                                  |
| `-g, --gateway <type>`        | -                            | Gateway type. Required with `--from gateway`. Must be `servicenow`.           |
| `-i, --iter <n>`              | `1`                          | Number of iterations per scenario                                             |
| `-p, --scenarios-path <path>` | `/etc/2501/runner/scenarios` | Path to the scenarios root directory                                          |
| `--env-file <path>`           | `/etc/2501/env.runner`       | Path to the env file                                                          |
| `--check`                     | `false`                      | Run pre-flight checks only: do not execute any scenarios                      |
| `--fail-on-error`             | `false`                      | Exit with code `1` if any scenario fails                                      |
| `-v` / `-vv`                  | -                            | `-v`: show only failed checks. `-vv`: show all checks with full debug output. |
| `--main-engine <engine>`      | -                            | Override the main LLM engine for all agents in all scenarios                  |
| `--secondary-engine <engine>` | -                            | Override the secondary LLM engine for all agents                              |

***

## Selecting Scenarios

The `-s` flag accepts scenario keys or tags, comma-separated, and the two can be mixed in one list. To run every scenario at once, give them all a common tag (e.g. `all`) and select that tag.

```bash
# Single scenario by key
2501-runner run -s nginx/001-broken-config

# Multiple scenarios by key
2501-runner run -s nginx/001-broken-config,nginx/002-high-load

# All scenarios with the "nginx" tag
2501-runner run -s nginx

# Mix: all "nginx" scenarios plus a specific "disk" scenario
2501-runner run -s nginx,disk/001-cleanup

# Run everything (requires every scenario to carry the "all" tag)
2501-runner run -s all
```
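
When the selector list is built in a script, for example from a list of changed services, the comma-separated form can be assembled with a small helper. This is a sketch: `join_selectors` is a hypothetical name, not a runner feature.

```bash
# Hypothetical helper: join scenario keys and tags into one -s argument.
join_selectors() {
  local IFS=,
  # "$*" joins all arguments using the first character of IFS
  echo "$*"
}
```

It would be used as `2501-runner run -s "$(join_selectors nginx disk/001-cleanup)"`, which is equivalent to passing `-s nginx,disk/001-cleanup` directly.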

***

## Entry Points

The `--from` flag controls how each scenario is dispatched. Each entry point exercises a different layer of the stack.

### `--from gateway` (default)

Creates a ticket in ServiceNow. The gateway bot processes it, creates a job, and the agent resolves it. The bot then marks the ticket resolved.

This is the most complete end-to-end path: it exercises the full integration between your ticketing system and your 2501 deployment.

```bash
2501-runner run -s nginx/001-broken-config --from gateway --gateway servicenow
```

Because the engine selects agents automatically, use `allowedAgents` in `validation` if you need to assert which agent was chosen. The runner checks this list on every poll cycle, so if an unauthorized agent is used, it kills the job as soon as the next poll detects it.

**Requires:** `--gateway servicenow` and the `SERVICENOW_*` env vars.

***

### `--from task`

Creates a task directly for a single agent, bypassing the job router entirely. This is the fastest and most direct path.

```bash
2501-runner run -s nginx/001-broken-config --from task
```

**Requirements:**

* The scenario must define exactly one agent and one host
* The agent must be referenced by `agent_id`

Use this when you want to benchmark a specific agent's response to an instruction without involving the gateway or job orchestration layer.

***

### `--from ticket`

POSTs directly to the 2501 engine's internal ticket endpoint. The engine creates a job and routes it to agents internally.

```bash
2501-runner run -s nginx/001-broken-config --from ticket
```

This exercises the engine's internal ticket-to-job flow without going through an external gateway.

**Requires:** `ENGINE_API_URL` and `ENGINE_API_KEY` env vars.

***

## Preflight Check

Pass `--check` to validate your full configuration without executing any scenarios. This is useful after environment changes or before a large batch run.

```bash
2501-runner run -s nginx/001-broken-config --check
```

The runner validates database connectivity, org/tenant/user IDs, gateway or engine API reachability, specialty keys, and Ansible availability. No scenarios are executed.

***

## Verbosity

```bash
# Show details for failed checks only
2501-runner run -s nginx -v

# Show all checks with full debug output
2501-runner run -s nginx -vv
```

***

## Iterating

Run each scenario multiple times to check for consistency and surface flaky behavior:

```bash
2501-runner run -s nginx/001-broken-config -i 5
```

The full prepare → execute → validate → restore cycle runs for each iteration.

***

## Overriding Engines

Override the LLM engines for all agents across all scenarios in a run. Useful for comparing how different models perform on the same scenario set.

```bash
2501-runner run -s nginx --main-engine claude-opus-4-6 --secondary-engine claude-opus-4-6
```
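
To compare several models on the same scenario set, the override can be wrapped in a loop. The function below is a sketch under stated assumptions: `compare_engines` and the `RUNNER` variable are hypothetical (the latter exists only so the loop can be dry-run with a stand-in command), and the engine names passed in must be ones your deployment actually accepts.

```bash
# Sketch: run the same selector once per main engine.
# RUNNER is a hypothetical override; it defaults to the real CLI name.
compare_engines() {
  local selector="$1" engine
  shift
  for engine in "$@"; do
    echo "== main engine: $engine =="
    # Keep going when one run exits non-zero so every engine gets its turn
    "${RUNNER:-2501-runner}" run -s "$selector" --main-engine "$engine" \
      || echo "run with $engine exited non-zero" >&2
  done
}
```

Calling `compare_engines nginx <engine-a> <engine-b>` runs the `nginx` selector twice, once per engine. Setting `RUNNER=echo` first prints the commands instead of executing them.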

***

## CI Integration

Use `--fail-on-error` to exit non-zero when any scenario fails:

```bash
2501-runner run -s nginx --fail-on-error
```

Combine it with `-i` to run multiple iterations for regression detection:

```bash
2501-runner run -s regression-suite -i 3 --fail-on-error
```
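
A CI step might run the pre-flight check first, so that a misconfigured environment is reported separately from a failing scenario. The wrapper below is a minimal sketch: `ci_run` and the `RUNNER` override are hypothetical, and it assumes that `--check` exits non-zero when a check fails, which this page does not state.

```bash
# Sketch of a CI step: pre-flight first, then the real run.
# RUNNER is a hypothetical override used for dry runs; it defaults to the CLI.
ci_run() {
  local selector="$1" iterations="${2:-1}"
  local runner="${RUNNER:-2501-runner}"

  # Stop early if the environment is misconfigured
  # (assumption: --check exits non-zero on a failed check)
  "$runner" run -s "$selector" --check || {
    echo "pre-flight failed" >&2
    return 2
  }

  # --fail-on-error makes a failed scenario surface as exit code 1
  "$runner" run -s "$selector" -i "$iterations" --fail-on-error
}
```

With this shape, `ci_run regression-suite 3` returns `2` for an environment problem and otherwise passes through the runner's own exit code, so the pipeline can tell the two failure modes apart.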

***

## The `validate` Command

Re-runs the validation rules for a scenario against an existing job or task, without re-executing the scenario. Use this when iterating on validation rules and you don't want to wait for another full agent run.

```bash
2501-runner validate -s nginx/001-broken-config --job-id <job-id>
2501-runner validate -s nginx/001-broken-config --task-id <task-id>
```

| Option                 | Description                                                            |
| ---------------------- | ---------------------------------------------------------------------- |
| `-s, --scenario <key>` | Required. The scenario whose rules to apply.                           |
| `-j, --job-id <id>`    | Job to validate against.                                               |
| `-t, --task-id <id>`   | Task to validate against.                                              |
| `-g, --gateway <type>` | Gateway type, if the original run used a gateway.                      |
| `--skip-ansible`       | Skip Ansible-based rules. Useful when the host is no longer reachable. |
| `-v, --verbose`        | Show detailed output.                                                  |

***

## The `restore` Command

Re-runs the `restore.yml` playbook for a scenario. Use this to manually reset a host that was left in a dirty state after a failed or interrupted run.

```bash
2501-runner restore -s nginx/001-broken-config
```
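
After an interrupted batch, several hosts may need resetting. The loop below is a sketch that calls `restore` once per scenario key, since this page only shows `restore -s` with a single key. `restore_all` and the `RUNNER` override are hypothetical names.

```bash
# Sketch: restore several scenarios one key at a time.
# RUNNER is a hypothetical override; it defaults to the real CLI name.
restore_all() {
  local key failed=0
  for key in "$@"; do
    "${RUNNER:-2501-runner}" restore -s "$key" || {
      echo "restore failed: $key" >&2
      failed=1
    }
  done
  # Non-zero when at least one restore failed
  return "$failed"
}
```

For example, `restore_all nginx/001-broken-config nginx/002-high-load` attempts both restores even if the first one fails, then reports the combined status.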
