# Playbooks

> Ansible playbooks for prepare, restore, and validate phases

Three optional Ansible playbooks, `prepare.yml`, `restore.yml`, and `validate.yml`, can live in a scenario directory. The runner discovers and invokes them automatically at the right point in the execution cycle.

***

## Execution Flow

```
2501-runner run -s <scenario>
        │
        ▼
┌───────────────┐
│  Pre-flight   │  verify DB, gateway/engine, ansible availability
└───────┬───────┘
        │
        ▼
┌───────────────┐
│   Provision   │  resolve host + agent from DB, flush agent memory
└───────┬───────┘
        │
        ├─── restore.yml ──► silent pre-clean (errors suppressed)
        │
        ▼
┌───────────────┐
│    Prepare    │  run prepare.yml  →  introduce the failure condition
└───────┬───────┘    (abort if fails)
        │
        ▼
┌───────────────┐
│    Execute    │  dispatch task via gateway / job / task / ticket
│               │  agent investigates and acts on the target host
└───────┬───────┘
        │
        ▼
┌───────────────┐
│   Validate    │  evaluate pattern_match / task_count / status rules
│               │  run validate.yml  →  check actual machine state
└───────┬───────┘
        │
        ▼
┌───────────────┐
│    Restore    │  run restore.yml  →  reset host to clean baseline
└───────┬───────┘    (non-fatal if fails)
        │
        ▼
┌───────────────┐
│    Report     │  print summary, persist ScenarioReport to DB
└───────────────┘
```

***

## prepare.yml

Runs **before** the agent executes. Use it to introduce the failure condition the scenario is benchmarking.

Before `prepare.yml` runs, the runner silently executes `restore.yml` with errors suppressed. This clears leftover state from any previous failed run so each execution starts from a known baseline.

If `prepare.yml` itself fails, execution is aborted and the scenario is marked as failed.

```yaml
---
- name: Deploy broken nginx configuration
  hosts: all
  become: true
  tasks:
    - name: Ensure nginx is installed
      apt:
        name: nginx
        state: present
        update_cache: true

    - name: Write a config with a syntax error (missing semicolon after listen 80)
      copy:
        dest: /etc/nginx/sites-available/default
        content: |
          server {
            listen 80
            root /var/www/html;
            index index.html;
          }

    # The name is quoted because an unquoted ": " inside a YAML scalar is a syntax error
    - name: "Attempt restart: this will fail, which is intentional"
      systemd:
        name: nginx
        state: restarted
      ignore_errors: true
```
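
Because a failing `prepare.yml` aborts the run, the playbook can also confirm that the fault was really injected before the agent starts. The play below is a minimal sketch under that assumption, not something the runner requires: it builds on the nginx example above, and the play name, task names, and the `nginx_check` variable are illustrative.

```yaml
# Sketch: could be added as a second play in prepare.yml (illustrative only)
- name: Confirm the failure condition is present
  hosts: all
  become: true
  tasks:
    - name: Run the nginx config test and capture the result
      ansible.builtin.command: nginx -t
      register: nginx_check      # hypothetical variable name
      changed_when: false        # read-only check, never reports "changed"
      failed_when: false         # a non-zero exit is expected here

    - name: Fail prepare if the config is unexpectedly valid
      ansible.builtin.assert:
        that:
          - nginx_check.rc != 0
        fail_msg: "nginx -t succeeded, so the broken config was not applied"
```

With a guard like this, a run where the fault silently failed to apply stops at the prepare phase instead of producing a misleading pass.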

***

## restore.yml

Runs **after** validation, during cleanup, and once more silently before `prepare.yml` as a pre-clean. Use it to reset the host to a clean baseline.

Restore failures are **non-fatal**: the runner logs a warning and continues. Write restore playbooks defensively with `ignore_errors: true` on steps that may fail on an already-clean host.

```yaml
---
- name: Reset nginx to clean state
  hosts: all
  become: true
  tasks:
    - name: Stop nginx
      systemd:
        name: nginx
        state: stopped
        enabled: false
      ignore_errors: true

    - name: Remove injected config
      file:
        path: /etc/nginx/sites-available/default
        state: absent
      ignore_errors: true
```
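
`ignore_errors: true` hides every failure, including ones you might want to see. As an alternative, a restore play can skip steps that do not apply to an already-clean host. The following is a sketch under the assumption that the target uses systemd; the play and task names are illustrative and not part of the runner.

```yaml
# Sketch: skip steps on an already-clean host instead of ignoring their errors
- name: Reset nginx only when it is present
  hosts: all
  become: true
  tasks:
    - name: Collect the list of services known to the host
      ansible.builtin.service_facts:

    - name: Stop nginx if the unit exists
      ansible.builtin.systemd:
        name: nginx
        state: stopped
        enabled: false
      # On systemd hosts, service_facts keys carry the ".service" suffix
      when: "'nginx.service' in ansible_facts.services"

    - name: Remove injected config (state absent succeeds if the file is already gone)
      ansible.builtin.file:
        path: /etc/nginx/sites-available/default
        state: absent
```

Since restore failures are non-fatal either way, this mainly keeps the warning log meaningful: a failure reported here points at a real problem rather than at a host that was already clean.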

***

## validate.yml

Runs **after** execution as part of the validation phase. Use it to verify that the agent's actions actually worked: service status, file contents, port availability, process list.

Declared in `scenario.json` as an `ansible` validator:

```json
{
  "label": "Nginx is healthy",
  "validator": "ansible",
  "ansiblePath": "validate.yml"
}
```

If the playbook exits with a non-zero code, the resolution gate fails and the scenario is marked as failed.

```yaml
---
- name: Verify nginx is healthy
  hosts: all
  become: true
  tasks:
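    - name: Config syntax is valid
      command: nginx -t
      changed_when: false  # read-only check, should not report "changed"

    - name: Service is active
      command: systemctl is-active nginx
      changed_when: false  # read-only check, should not report "changed"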

    - name: Port 80 is responding
      uri:
        url: http://localhost:80
        status_code: [200, 301, 302]
```
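
File contents can be checked the same way. The sketch below reads the site config from the target and fails the play, and therefore the validator, unless the `listen` directive is terminated. It assumes the nginx scenario above; the `site_conf` variable and the exact pattern are illustrative, and an agent may legitimately repair the file in a form this strict pattern does not match.

```yaml
# Sketch: validate file contents on the target host (illustrative only)
- name: Verify the listen directive was repaired
  hosts: all
  become: true
  tasks:
    - name: Read the site config from the target host
      ansible.builtin.slurp:
        src: /etc/nginx/sites-available/default
      register: site_conf        # hypothetical variable name

    - name: Fail unless "listen 80" is followed by a semicolon
      ansible.builtin.assert:
        that:
          # slurp returns base64, so decode before matching
          - "(site_conf.content | b64decode) is search('listen +80 *;')"
        fail_msg: "listen directive in the default site is still unterminated"
```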

***

## Using Environment Variables

Variables from the env file are available in all playbooks via `lookup('env', ...)`:

```yaml
- name: Clone a private repository
  git:
    repo: "https://{{ lookup('env', 'GH_TOKEN') }}@github.com/your-org/fixtures.git"
    dest: /opt/fixtures
```
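
`lookup('env', ...)` is evaluated on the machine running `ansible-playbook`, not on the target, and it returns an empty string when the variable is unset. A hedged sketch of how a playbook might guard against a missing token and keep it out of the task output follows; the `gh_token` variable name and the failure message are illustrative.

```yaml
# Sketch: fail early on a missing token and avoid logging it (illustrative only)
- name: Use a token from the env file without leaking it
  hosts: all
  vars:
    gh_token: "{{ lookup('env', 'GH_TOKEN') }}"   # hypothetical variable name
  tasks:
    - name: Stop early if GH_TOKEN is not set
      ansible.builtin.assert:
        that:
          - gh_token | length > 0
        fail_msg: "GH_TOKEN is empty; check the env file"

    - name: Clone a private repository
      ansible.builtin.git:
        repo: "https://{{ gh_token }}@github.com/your-org/fixtures.git"
        dest: /opt/fixtures
      no_log: true   # keeps the URL, and the token inside it, out of the output
```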

***

## inventory.ini

The runner needs an Ansible inventory to know which hosts to target and how to reach them.

**Host mode**: place an `inventory.ini` in the scenario directory. The runner detects it automatically and passes it to every `ansible-playbook` call.

**VM mode (incus/lima)**: the runner generates the inventory automatically from the provisioned VM's IP, port, and SSH key. You do not need an `inventory.ini`.

```ini
[web]
sandbox-web-01 ansible_host=10.0.1.10 ansible_user=ubuntu ansible_port=22

[all:vars]
ansible_ssh_private_key_file=/etc/2501/keys/sandbox.pem
```

The inventory hostnames (here `sandbox-web-01`) must match the `host_name` values used in `scenario.json`.

<Warning>
  If no `inventory.ini` is present in host mode, `ansible-playbook` runs without an explicit inventory. Ansible then only knows the implicit `localhost`, which `hosts: all` does not match, so your plays never reach the target hosts. Always include `inventory.ini` when your scenario has playbooks.
</Warning>
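
To check an inventory before wiring it into a scenario, a small connectivity playbook can help. This is a sketch to run by hand, not something the runner invokes; the file name `ping.yml` is hypothetical.

```yaml
# ping.yml (hypothetical file name): connectivity smoke test for inventory.ini
- name: Check that every inventory host is reachable
  hosts: all
  gather_facts: false
  tasks:
    - name: Verify SSH login and a usable Python on the target
      ansible.builtin.ping:

    - name: Show which inventory hostname was reached
      ansible.builtin.debug:
        msg: "Reached {{ inventory_hostname }}"
```

Running `ansible-playbook -i inventory.ini ping.yml` from the scenario directory should list each host, which also makes it easy to compare the names against the `host_name` values in `scenario.json`.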
