Validation - 2501.ai

Validation rules determine whether a scenario passed. They are declared in the validation object of scenario.json and evaluated after the agent finishes executing.

Structure

"validation": {
  "gateway": [...],
  "job": [...],
  "tasks": [...]
}

Rules are organized into three scopes based on what they check:

Scope	When used	What it checks
`gateway`	`--from gateway` only	The ServiceNow ticket
`job`	All entry points	The job record (status, plan, task count)
`tasks`	All entry points	Per-task data (commands, summaries, plans)

For tasks rules, each rule is checked against every task. A rule passes if it matches on at least one task.
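The "at least one task" behavior can be sketched as an any-over-tasks check. This is a minimal illustration rather than the platform's implementation: the function name, the task dictionaries, and the example check are hypothetical, and the treatment of an empty task list (the rule fails) is an assumption.

```python
from typing import Callable

Task = dict  # hypothetical task record, keyed by target name


def tasks_rule_passes(check: Callable[[Task], bool], tasks: list[Task]) -> bool:
    """Return True when the check matches on at least one task."""
    # Assumption: with no tasks at all, nothing can match, so the rule fails.
    return any(check(task) for task in tasks)


def mentions_restart(task: Task) -> bool:
    """Example check: does the task summary mention a restart?"""
    return "restarted" in task["task_summary"].lower()


tasks = [
    {"task_summary": "Inspected nginx logs"},
    {"task_summary": "Restarted nginx and verified it is serving traffic"},
]
print(tasks_rule_passes(mentions_restart, tasks))  # True: the second task matches
```

Only one task needs to match, so a rule about restarting nginx still passes when other tasks in the same job did unrelated work.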

Rule Structure

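Every rule shares a common set of fields. The example below is a pattern_match rule, so it also carries the validator-specific pattern and where fields: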

{
  "label": "Agent restarted nginx",
  "validator": "pattern_match",
  "pattern": "systemctl.*(restart|reload|start).*nginx",
  "where": "executed_commands",
  "required": true,
  "negate": false
}

Field	Default	Description
`label`	-	Required. Shown in the validation report. Make it descriptive.
`validator`	-	Required. The type of check to run. See validators below.
`required`	`true`	When `false`, the rule is informational: it contributes to the compliance score but does not block a pass.
`negate`	`false`	Invert the result. The rule passes when the condition is NOT met.
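The interaction of required and negate can be sketched as follows. This is an illustration under assumptions: the function name and the returned keys are hypothetical, and only the defaults (required true, negate false) come from the table above.

```python
def apply_flags(condition_met: bool, required: bool = True, negate: bool = False) -> dict:
    """Combine a validator's raw result with the rule's required/negate flags."""
    # negate inverts the raw result: the rule passes when the condition is NOT met.
    passed = condition_met != negate
    # Only a failing required rule blocks a pass; optional rules are informational.
    return {"passed": passed, "blocking": required and not passed}


print(apply_flags(True))                   # {'passed': True, 'blocking': False}
print(apply_flags(True, negate=True))      # {'passed': False, 'blocking': True}
print(apply_flags(False, required=False))  # {'passed': False, 'blocking': False}
```

A negated rule is a natural fit for "this must not happen" checks, such as asserting that a destructive command never appears in executed_commands.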

Validators

`pattern_match`

Checks whether a regex pattern matches in a specific field of the job or task data.

{
  "label": "Agent edited the nginx config",
  "validator": "pattern_match",
  "pattern": "/etc/nginx/",
  "where": "executed_commands"
}

Field	Description
`pattern`	Regular expression. Matching is case-insensitive.
`where`	The field to search.

where targets:

Target	Content
`executed_commands`	All commands run by the agent, one per line
`task_summary`	The agent’s summary of what it did
`task_description`	The task description as created
`task_plan`	The agent’s execution plan
`agent_messages`	Full agent reasoning history
`job_resolution`	The job’s resolution summary
`job_plan`	The job-level plan
`gateway_messages`	Messages posted by the gateway bot on the ticket
`gateway_summary`	The gateway’s summary of ticket resolution
`operational_rules`	Operational constraints from the agent’s context
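To try a pattern before putting it in scenario.json, a small local check along these lines may help. It is a sketch that assumes unanchored, case-insensitive search using Python's regex engine. The engine the platform actually uses is not stated here, so syntax details may differ. The function name and the sample commands are hypothetical.

```python
import re


def pattern_match(pattern: str, text: str) -> bool:
    """Unanchored, case-insensitive regex search over the target text."""
    # In Python, "." does not match a newline by default, so in this sketch
    # a pattern such as "a.*b" has to match within a single command line.
    return re.search(pattern, text, re.IGNORECASE) is not None


# executed_commands holds one command per line.
executed_commands = "\n".join([
    "sudo vim /etc/nginx/nginx.conf",
    "sudo nginx -t",
    "sudo SYSTEMCTL restart nginx",
])

print(pattern_match("systemctl.*(restart|reload|start).*nginx", executed_commands))  # True
print(pattern_match("/etc/nginx/", executed_commands))  # True
print(pattern_match("apt-get", executed_commands))      # False
```

The first pattern matches despite the upper-case SYSTEMCTL because matching is case-insensitive.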

`job_resolution_status`

Checks the job’s final resolution status.

{
  "label": "Job resolved successfully",
  "validator": "job_resolution_status",
  "pattern": "success"
}

Allowed values: success, agentic_failure, hard_failure, partial, no_action.

`ticket_status`

Checks the status of the ServiceNow ticket. Only applicable with --from gateway.

{
  "label": "Ticket was resolved",
  "validator": "ticket_status",
  "pattern": "resolved"
}

`task_count`

Verifies that the number of tasks created under the job falls within a range.

{
  "label": "Resolved efficiently",
  "validator": "task_count",
  "min": 1,
  "max": 3
}

`ansible`

Runs an Ansible playbook and treats its exit code as pass/fail. This is the most reliable way to assert actual machine state.

{
  "label": "Nginx is running and serving traffic",
  "validator": "ansible",
  "ansiblePath": "validate.yml"
}

A non-zero exit code fails the resolution gate and marks the scenario as failed. See Playbooks for how to write validate.yml.
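The exit-code convention can be sketched as below. How the platform actually invokes the playbook (inventory, working directory, extra flags) is not described here, so the function name and the stand-in commands are assumptions made only so the sketch runs without Ansible installed.

```python
import subprocess
import sys


def exit_code_passes(command: list[str]) -> bool:
    """Run a command and map exit code 0 to pass, anything else to fail."""
    return subprocess.run(command).returncode == 0


# With Ansible installed, the command would be along the lines of
# ["ansible-playbook", "validate.yml"]. Stand-in commands are used here.
print(exit_code_passes([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(exit_code_passes([sys.executable, "-c", "raise SystemExit(2)"]))  # False
```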

Scoring Model

Compliance score: percentage of all non-Ansible rules that passed (required + optional combined). Informational.

Two gates determine the actual pass/fail result:

Compliance gate: passes when every required non-Ansible rule passes.

Resolution gate: if a validate.yml Ansible rule exists, passes when the playbook exits 0. If no validate.yml rule is defined, it passes when the compliance gate passes and the compliance score is at least 80%.

A scenario passes only when both gates pass.
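The scoring model can be worked through with a short sketch. The type and function names are hypothetical, and the handling of a scenario with no non-Ansible rules (scored as 100%) is an assumption, since that case is not specified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleResult:
    """Outcome of one non-Ansible rule (hypothetical type for this sketch)."""
    label: str
    passed: bool
    required: bool = True


def score_scenario(results: list[RuleResult],
                   ansible_exit_code: Optional[int] = None) -> dict:
    """Apply both gates; ansible_exit_code is None when no validate.yml rule exists."""
    # Compliance score: share of all non-Ansible rules that passed.
    # Assumption: an empty rule list scores 100.
    if results:
        score = 100.0 * sum(r.passed for r in results) / len(results)
    else:
        score = 100.0

    # Compliance gate: every required non-Ansible rule must pass.
    compliance_gate = all(r.passed for r in results if r.required)

    # Resolution gate: the playbook's exit code if present, else the 80% fallback.
    if ansible_exit_code is not None:
        resolution_gate = ansible_exit_code == 0
    else:
        resolution_gate = compliance_gate and score >= 80.0

    return {
        "compliance_score": score,
        "compliance_gate": compliance_gate,
        "resolution_gate": resolution_gate,
        "passed": compliance_gate and resolution_gate,
    }


results = [
    RuleResult("Agent restarted nginx", passed=True),
    RuleResult("Job resolved successfully", passed=True),
    RuleResult("Resolved efficiently", passed=False, required=False),
]
print(score_scenario(results)["passed"])                       # False
print(score_scenario(results, ansible_exit_code=0)["passed"])  # True
```

In the first call all required rules pass, but two of three rules is below the 80% fallback threshold, so the resolution gate fails. In the second call the playbook result decides the resolution gate instead, and the scenario passes.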
