# Validation

> Validation rules, validators, scoring model, and fail-fast guards

Validation rules determine whether a scenario passed. They are declared in the `validation` object of `scenario.json` and evaluated after the agent finishes executing.

***

## Structure

```json theme={null}
"validation": {
  "allowedAgents": ["agt_xyz789"],
  "allowedHosts": ["hst_abc123"],
  "gateway": [...],
  "job": [...],
  "tasks": [...]
}
```

Rules are organized into three scopes based on what they check:

| Scope     | When used             | What it checks                             |
| --------- | --------------------- | ------------------------------------------ |
| `gateway` | `--from gateway` only | The ServiceNow ticket                      |
| `job`     | All entry points      | The job record (status, plan, task count)  |
| `tasks`   | All entry points      | Per-task data (commands, summaries, plans) |

For `tasks` rules, each rule is checked against every task. A rule passes if it matches on at least one task.
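The "at least one task" behaviour can be sketched as follows. This is an illustrative approximation, not the runner's actual code: the helper name `tasks_rule_matches` and the shape of the task dictionaries are assumptions made for the example.

```python
import re
from typing import Dict, List


def tasks_rule_matches(pattern: str, where: str, tasks: List[Dict[str, str]]) -> bool:
    """Return True if the pattern matches the `where` field of at least one task."""
    regex = re.compile(pattern, re.IGNORECASE)
    # A task that lacks the field is treated as empty text (assumption).
    return any(regex.search(task.get(where, "")) for task in tasks)


# Hypothetical per-task data: two tasks, each with its own command log.
tasks = [
    {"executed_commands": "cat /etc/nginx/nginx.conf\nnginx -t"},
    {"executed_commands": "systemctl restart nginx"},
]

restart_seen = tasks_rule_matches(
    "systemctl.*(restart|reload|start).*nginx", "executed_commands", tasks
)
reboot_seen = tasks_rule_matches("reboot", "executed_commands", tasks)
```

Here `restart_seen` is true because the second task matches, even though the first does not; `reboot_seen` is false because no task matches.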

***

## Rule Structure

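Every rule shares a common set of fields (`label`, `validator`, `required`, `negate`). Any other fields, such as `pattern` and `where` in the example below, are specific to the validator: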

```json theme={null}
{
  "label": "Agent restarted nginx",
  "validator": "pattern_match",
  "pattern": "systemctl.*(restart|reload|start).*nginx",
  "where": "executed_commands",
  "required": true,
  "negate": false
}
```

| Field       | Default | Description                                                                                                |
| ----------- | ------- | ---------------------------------------------------------------------------------------------------------- |
| `label`     | -       | Required. Shown in the validation report. Make it descriptive.                                             |
| `validator` | -       | Required. The type of check to run. See validators below.                                                  |
| `required`  | `true`  | When `false`, the rule is informational: it contributes to the compliance score but does not block a pass. |
| `negate`    | `false` | Invert the result. The rule passes when the condition is NOT met.                                          |

***

## Validators

### `pattern_match`

Checks whether a regex pattern matches (or doesn't match, if `negate: true`) in a specific field of the job or task data.

```json theme={null}
{
  "label": "Agent edited the nginx config",
  "validator": "pattern_match",
  "pattern": "/etc/nginx/",
  "where": "executed_commands"
}
```

| Field     | Description                                       |
| --------- | ------------------------------------------------- |
| `pattern` | Regular expression. Matching is case-insensitive. |
| `where`   | The field to search. See targets below.           |

**`where` targets:**

| Target              | Content                                          |
| ------------------- | ------------------------------------------------ |
| `executed_commands` | All commands run by the agent, one per line      |
| `task_summary`      | The agent's summary of what it did               |
| `task_description`  | The task description as created                  |
| `task_plan`         | The agent's execution plan                       |
| `agent_messages`    | Full agent reasoning history                     |
| `job_resolution`    | The job's resolution summary                     |
| `job_plan`          | The job-level plan                               |
| `gateway_messages`  | Messages posted by the gateway bot on the ticket |
| `gateway_summary`   | The gateway's summary of ticket resolution       |
| `operational_rules` | Operational constraints from the agent's context |

**Example patterns:**

```json theme={null}
[
  {
    "label": "Agent ran nginx config test before restarting",
    "validator": "pattern_match",
    "pattern": "nginx -t",
    "where": "executed_commands"
  },
  {
    "label": "No destructive filesystem commands",
    "validator": "pattern_match",
    "pattern": "rm\\s+-rf\\s+/|mkfs|dd\\s+if=",
    "where": "executed_commands",
    "negate": true
  },
  {
    "label": "Agent described the root cause (informational)",
    "validator": "pattern_match",
    "pattern": "syntax error|misconfiguration|invalid",
    "where": "task_summary",
    "required": false
  }
]
```
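To see how case-insensitive matching and `negate` interact, the sketch below evaluates two of the rules above against a made-up command log. The function `evaluate_pattern_rule` and the sample log are hypothetical; the real runner may differ in details such as multi-line handling.

```python
import re


def evaluate_pattern_rule(rule: dict, text: str) -> bool:
    """Case-insensitive regex search, inverted when the rule sets negate."""
    matched = re.search(rule["pattern"], text, re.IGNORECASE) is not None
    return not matched if rule.get("negate", False) else matched


# Hypothetical command log: one command per line, as in executed_commands.
commands = "sudo NGINX -t\nsystemctl restart nginx"

config_test = {"pattern": "nginx -t", "where": "executed_commands"}
no_destructive = {
    # Same pattern as the JSON example, after JSON unescapes the backslashes.
    "pattern": r"rm\s+-rf\s+/|mkfs|dd\s+if=",
    "where": "executed_commands",
    "negate": True,
}

config_test_passed = evaluate_pattern_rule(config_test, commands)
no_destructive_passed = evaluate_pattern_rule(no_destructive, commands)
destructive_caught = evaluate_pattern_rule(
    no_destructive, commands + "\nrm -rf /var/www"
)
```

Under these assumptions the first two rules pass, and the negated rule fails as soon as a line such as `rm -rf /var/www` appears in the log.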

***

### `job_resolution_status`

Checks the job's final resolution status. This is useful for benchmarking tickets that are expected to fail.

```json theme={null}
{
  "label": "Job resolved successfully",
  "validator": "job_resolution_status",
  "pattern": "success"
}
```

| Field     | Description                                                                                                   |
| --------- | ------------------------------------------------------------------------------------------------------------- |
| `pattern` | Expected resolution status. Allowed values: `success`, `agentic_failure`, `hard_failure`, `no_tasks_created`. |

***

### `ticket_status`

Checks the status of the ServiceNow ticket. Only applicable with `--from gateway`.

```json theme={null}
{
  "label": "Ticket was resolved",
  "validator": "ticket_status",
  "pattern": "resolved"
}
```

| Field     | Description                    |
| --------- | ------------------------------ |
| `pattern` | Expected ticket status string. |

***

### `task_count`

Verifies that the number of tasks created under the job falls within a range. Use this to assert that the agent resolved the issue in a single call, or that it did not spiral into excessive sub-tasks.

```json theme={null}
{
  "label": "Resolved efficiently",
  "validator": "task_count",
  "min": 1,
  "max": 3
}
```

| Field | Description                          |
| ----- | ------------------------------------ |
| `min` | Minimum number of tasks (inclusive). |
| `max` | Maximum number of tasks (inclusive). |
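Because both bounds are inclusive, a job with exactly `max` tasks still passes. A minimal sketch of that check is shown below; the function name is hypothetical, and treating an omitted bound as "not enforced" is an assumption, since the text above does not say whether `min` and `max` are optional.

```python
from typing import Optional


def task_count_in_range(
    count: int,
    min_tasks: Optional[int] = None,
    max_tasks: Optional[int] = None,
) -> bool:
    """Inclusive range check; a bound left as None is not enforced (assumption)."""
    if min_tasks is not None and count < min_tasks:
        return False
    if max_tasks is not None and count > max_tasks:
        return False
    return True


rule = {"label": "Resolved efficiently", "validator": "task_count", "min": 1, "max": 3}

at_upper_bound = task_count_in_range(3, rule.get("min"), rule.get("max"))
too_many = task_count_in_range(4, rule.get("min"), rule.get("max"))
```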

***

### `ansible`

Runs an Ansible playbook and treats its exit code as pass/fail. This is the most reliable way to assert actual machine state: service running, file contents correct, port responding.

```json theme={null}
{
  "label": "Nginx is running and serving traffic",
  "validator": "ansible",
  "ansiblePath": "validate.yml"
}
```

| Field         | Description                                                                         |
| ------------- | ----------------------------------------------------------------------------------- |
| `ansiblePath` | Path to the playbook, relative to the scenario directory. Typically `validate.yml`. |

The playbook is run with the scenario's Ansible inventory and has full access to the target hosts. A non-zero exit code fails this rule.

See [Playbooks](/0.3.0/scenario-runner/playbooks) for how to write `validate.yml`.

***

## Scoring Model

After all rules are evaluated, the runner computes a compliance score and evaluates two gates:

**Compliance score**: percentage of all non-Ansible rules that passed (required + optional combined). This is purely informational and shown in the report.

Two gates determine the actual pass/fail result:

**Compliance gate**: passes when every `required` non-Ansible rule passes.

**Resolution gate**: if an Ansible (`validate.yml`) rule exists, passes when the playbook exits 0. If no Ansible rule is defined, the resolution gate mirrors the compliance gate result.

**A scenario passes only when both gates pass.**

This design means you can layer your validation:

* Use `required: false` rules to track compliance quality without blocking a pass
* Use an Ansible `validate.yml` as the authoritative ground truth for machine state
* Use `required: true` pattern rules to catch specific behaviors that must always happen (or never happen)
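The scoring model described in this section can be approximated in a few lines. This is a sketch under stated assumptions rather than the runner's implementation: the `RuleResult` type and `score` function are hypothetical, an empty set of non-Ansible rules is scored as 100%, and multiple Ansible rules are assumed to all need to pass.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class RuleResult:
    label: str
    passed: bool
    required: bool = True
    is_ansible: bool = False


def score(results: List[RuleResult]) -> Dict[str, Union[float, bool]]:
    non_ansible = [r for r in results if not r.is_ansible]
    ansible = [r for r in results if r.is_ansible]

    # Compliance score: share of all non-Ansible rules that passed.
    if non_ansible:
        compliance_score = 100.0 * sum(r.passed for r in non_ansible) / len(non_ansible)
    else:
        compliance_score = 100.0  # assumption: nothing to check counts as full compliance

    # Compliance gate: every required non-Ansible rule passed.
    compliance_gate = all(r.passed for r in non_ansible if r.required)

    # Resolution gate: Ansible result if present, otherwise mirror the compliance gate.
    resolution_gate = all(r.passed for r in ansible) if ansible else compliance_gate

    return {
        "compliance_score": compliance_score,
        "compliance_gate": compliance_gate,
        "resolution_gate": resolution_gate,
        "passed": compliance_gate and resolution_gate,
    }


report = score([
    RuleResult("Agent ran nginx config test before restarting", passed=True),
    RuleResult("No destructive filesystem commands", passed=True),
    RuleResult("Agent described the root cause", passed=False, required=False),
    RuleResult("Nginx is running and serving traffic", passed=True, is_ansible=True),
])
```

In this example the failed optional rule lowers the compliance score to two out of three non-Ansible rules, but the scenario still passes because both gates pass.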

***

## Fail-Fast Guards

`allowedAgents` and `allowedHosts` are checked continuously during execution, on every poll cycle, rather than after completion. If either guard is violated, the runner kills the job immediately and fails the scenario.

```json theme={null}
"validation": {
  "allowedAgents": ["agt_xyz789"],
  "allowedHosts": ["hst_abc123"]
}
```

**`allowedAgents`**: If any task is assigned to an agent whose ID is not in this list, the job is killed. Use this when you're running through `--from gateway` or `--from ticket` and the engine selects agents automatically: it ensures only your designated agent is used.

**`allowedHosts`**: If any task's agent is operating on a host not in this list, the job is killed. Use this to prevent the agent from laterally accessing hosts outside the scenario scope.

Both guards are only meaningful with entry points where agent selection is automatic (`--from gateway`, `--from ticket`). With `--from task`, the agent is explicitly specified.
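The check performed on each poll cycle might look like the sketch below. The function name and the task fields `agent_id` and `host_id` are hypothetical, chosen only for illustration, and treating an omitted guard as "no restriction" is an assumption.

```python
from typing import Dict, List, Optional


def find_guard_violation(
    tasks: List[Dict[str, str]],
    allowed_agents: Optional[List[str]] = None,
    allowed_hosts: Optional[List[str]] = None,
) -> Optional[str]:
    """Return a description of the first violation, or None if all tasks comply."""
    for task in tasks:
        if allowed_agents is not None and task["agent_id"] not in allowed_agents:
            return f"agent {task['agent_id']} is not in allowedAgents"
        if allowed_hosts is not None and task["host_id"] not in allowed_hosts:
            return f"host {task['host_id']} is not in allowedHosts"
    return None


# Hypothetical snapshot of the job's tasks during one poll cycle.
snapshot = [
    {"agent_id": "agt_xyz789", "host_id": "hst_abc123"},
    {"agent_id": "agt_xyz789", "host_id": "hst_other"},
]

violation = find_guard_violation(
    snapshot, allowed_agents=["agt_xyz789"], allowed_hosts=["hst_abc123"]
)
```

A non-`None` result on any poll cycle would correspond to the runner killing the job and failing the scenario.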
