---
title: "Scheduler config reference"
description: "Every field of the Anyscale scheduler config, with types, constraints, and a complete example."
---

# Scheduler config reference

The scheduler config is a YAML document with up to three top-level keys, all optional. An empty config is valid: an org with no rules passes every workload through. Unknown keys are rejected at every level. The config is org-scoped and versioned: applying a config creates a new version and makes it the active one.

## Structure

| Key | Purpose |
| --- | --- |
| [`resource_flavors`](#resource_flavors) | Named compute categories, matched to workloads by label. |
| [`resource_queues`](#resource_queues) | Pools that hold workloads and cap usage with quotas. |
| [`scheduling_rules`](#scheduling_rules) | Ordered rules that route workloads to queues and assign priority. |

## `resource_flavors`

A list of named compute categories, matched to workloads by label selectors. A workload is assigned a flavor **per resource group**: within a group, the group's `flavors` are tried in listed order, and the workload gets the first flavor whose `selector` matches its labels and that has quota available. If no flavor matches, the workload is rejected.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | Yes | Unique flavor name (non-empty). Referenced from a queue's quota by this name. |
| `selector` | list of [match expressions](#match-expressions) | No | Label rules a workload must satisfy (all must match). Omit to match every workload (catch-all flavor). |
| `advanced_instance_config` | object | No | Opaque cloud-provider launch overrides, passed through verbatim (for example EC2 `CapacityReservationSpecification`, instance tags, or Kubernetes pod metadata). |

:::note
The `selector` shape is identical in `resource_flavors` and `scheduling_rules`, and both match the same set of [workload labels](#selector-keys). Keys are matched exactly and are case-sensitive; reserved keys are hyphenated (for example `accelerator-type`, not `accelerator_type`).
:::
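
As an illustration of the fields above, the sketch below defines one narrowly scoped flavor and one catch-all flavor. The flavor names are hypothetical; only the field names and the reserved label keys come from this reference.

```yaml
resource_flavors:
  # Matches only workloads whose labels satisfy BOTH expressions (logical AND).
  - name: gpu_h100_on_demand          # hypothetical flavor name
    selector:
      - { key: accelerator-type, operator: in, values: [H100] }
      - { key: market-type, operator: in, values: [ON_DEMAND] }

  # No selector: matches every workload (catch-all flavor).
  - name: any_compute                 # hypothetical flavor name
```

Because a resource group tries its `flavors` in listed order and stops at the first match with available quota, a catch-all flavor like `any_compute` would normally be listed last in a group.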

## `resource_queues`

A list of named pools that hold workloads and cap their resource usage with quotas. A queue that omits `resource_groups` is a **passthrough**: every workload routed to it is admitted with no quota enforcement.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | Yes | Unique queue name (non-empty). Referenced by a scheduling rule's `resource_queue`. |
| `preemption` | object | No | Whether and how the queue preempts running workloads. See [`preemption`](#preemption). |
| `resource_groups` | list of objects | No | Quotas, grouped by the resources they cover. See [`resource_groups`](#resource_groups). Omit entirely to make the queue a passthrough. |

:::note
A queue's spec is **immutable per name**. To change a queue's shape, use a new name (for example `team-a:v2`).
:::

### `preemption`

Controls when a queue evicts running workloads to free quota.

| Field | One of | Description |
| --- | --- | --- |
| `within_resource_queue` | `never`, `lower_priority` | When `lower_priority`, a workload may preempt strictly lower-priority workloads in the **same queue**. `never` (or unset) means no preemption. |
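
A minimal sketch of a queue with in-queue preemption enabled. The queue name, flavor name, and quota value are assumptions chosen for illustration.

```yaml
resource_flavors:
  - name: any_compute                 # hypothetical catch-all flavor

resource_queues:
  - name: batch_training              # hypothetical queue name
    preemption:
      # A workload may preempt strictly lower-priority workloads in this queue only.
      # A workload at the same priority is never preempted.
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: any_compute
            resources:
              - { name: gpu, nominal_quota: 8 }   # illustrative quota
```

With this config, a priority-80 workload that does not fit in the 8-GPU quota may evict a running priority-20 workload in `batch_training`, but not another priority-80 workload.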

### `resource_groups`

Each entry groups a set of resource axes governed together, with per-flavor quotas. Omit `resource_groups` from a queue to make it a passthrough (admit everything, no quota).

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `covered_resources` | list of string | Yes | The resource axes this group governs. Each must be one of `gpu`, `cpu`, `memory_gb`, `tpu` (any other name is rejected). Non-empty; no duplicates. Each resource belongs to exactly one group within a queue (covered sets are disjoint). Use `memory_gb`, not `memory`. |
| `flavors` | list of objects | Yes | Per-flavor quotas for the covered resources. Non-empty. Tried in listed order. |

Each `flavors` entry:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | Yes | A flavor name defined in [`resource_flavors`](#resource_flavors). Appears at most once per group, and at most once across a queue's groups. |
| `resources` | list of objects | No | Per-flavor quotas, a subset of the group's `covered_resources` — quota only the resources you want to cap. Any covered resource you omit (or list with no `nominal_quota`) is unlimited. Names unique within this list. |

Each `resources` entry (one quota for one resource of one flavor):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | Yes | Resource name; must be one of the group's `covered_resources`. |
| `nominal_quota` | float ≥ 0 | No | The quota ceiling for this (queue, flavor, resource), with at most 3 decimal places. Omit (or omit the whole resource entry) for **unlimited**; set to `0` to block the resource. |
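
The three quota states (capped, blocked, unlimited) can be sketched in a single flavor entry. The queue and flavor names and the numbers are hypothetical.

```yaml
resource_flavors:
  - name: cpu_general                 # hypothetical flavor name

resource_queues:
  - name: analytics                   # hypothetical queue name
    resource_groups:
      - covered_resources: [gpu, cpu, memory_gb]
        flavors:
          - name: cpu_general
            resources:
              - { name: cpu, nominal_quota: 31.5 }   # capped; up to 3 decimal places allowed
              - { name: gpu, nominal_quota: 0 }      # blocked: no GPU may be used
              # memory_gb is covered but not listed, so it is unlimited
```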

## `scheduling_rules`

An ordered list of rules that route a workload to a queue and set its priority. Rules are evaluated top to bottom, and the **first match wins**. The one exception: when a matching rule's `priority_policy` rejects the workload's requested priority, evaluation falls through to the next rule. A workload that matches no rule fails.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `selector` | list of [match expressions](#match-expressions) | No | Label rules applied over the workload's labels (all must match). Omit to match every workload (catch-all) — put such a rule last. |
| `resource_queue` | string | Yes | Name of the queue to route matched workloads to. |
| `priority_policy` | object | No | Priority bounds applied to matched workloads. See [`priority_policy`](#priority_policy). |

### `priority_policy`

Bounds and default for the priority assigned to matching workloads. Higher priority is more important (drives queue order and preemption).

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `default` | integer ≥ 0 | No | Priority used when the workload doesn't request one. Must be within `[min, max]`. If unset, `0`. |
| `min` | integer ≥ 0 | No | Lower bound of the allowed priority range. Must be ≤ `max` and ≤ `default`. |
| `max` | integer ≥ 0 | No | Upper bound of the allowed priority range. |
| `on_violation` | one of `reject`, `force_update` | No | Behavior when a requested priority is outside `[min, max]`: `reject` makes this rule fall through to the next; `force_update` clamps the priority into range and continues. |
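
The sketch below shows how the bounds might play out for a few requested priorities. The queue names, the `team` tag, and the numbers are assumptions; the outcomes in the comments follow from the field descriptions above.

```yaml
resource_queues:
  - name: shared                      # hypothetical passthrough queue
  - name: default                     # hypothetical passthrough queue

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }   # custom tag (hypothetical)
    resource_queue: shared
    priority_policy: { default: 50, min: 10, max: 100, on_violation: force_update }
    # Workload requests no priority -> assigned 50 (the default)
    # Workload requests 5           -> clamped up to 10
    # Workload requests 250         -> clamped down to 100
    # With on_violation: reject, a request of 5 or 250 would instead make this
    # rule fall through to the next rule.
  - resource_queue: default           # catch-all; keep it last
```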

## Match expressions

A Kubernetes-style label match, used by both `selector` on flavors and `selector` on scheduling rules. Each is written as `{ key, operator, values }`.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `key` | string | Yes | Label key to match (non-empty). |
| `operator` | one of `in`, `not_in`, `exists`, `does_not_exist` | Yes | How to compare the label against `values`. |
| `values` | list of string | Conditional | **Required** and non-empty for `in` / `not_in`; **must be omitted** for `exists` / `does_not_exist`. |

All expressions in a single list must match (logical AND).
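
For illustration, a selector that uses each operator once. The `team` and `experimental` keys are hypothetical custom tags; the other keys are reserved keys from this reference.

```yaml
selector:
  # in: the label's value must be one of the listed values
  - { key: accelerator-type, operator: in, values: [A100, H100] }
  # not_in: the label's value must not be any of the listed values
  - { key: market-type, operator: not_in, values: [SPOT] }
  # exists: the label must be present; values is omitted
  - { key: team, operator: exists }
  # does_not_exist: the label must be absent; values is omitted
  - { key: experimental, operator: does_not_exist }
```

A workload matches this selector only if all four expressions hold.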

## Selector keys

A selector (in both `resource_flavors` and `scheduling_rules`) matches against a workload's labels. A workload's labels are the custom tags you set on the job, service, or workspace, plus the reserved keys the platform always stamps onto the workload. Keys are matched exactly and are **case-sensitive**; reserved keys are hyphenated.

Reserved keys (always present; you cannot override them with your own tags):

| Key | Values |
| --- | --- |
| `instance-type` | The launched instance type (for example `m5.large`). |
| `accelerator-type` | The accelerator/GPU type (for example `A100`, `H100`, `L4`). |
| `market-type` | `SPOT` or `ON_DEMAND`. |
| `cloud-resource-id` | The ID of the cloud resource the workload runs on. |
| `workload-type` | `service`, `job`, or `workspace`. |

Any custom tag key you set on the workload is also selectable, as long as it does not collide with a reserved key.
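
A short sketch of a rule that combines a reserved key with a custom tag. The `env` tag and the `prod_services` queue are hypothetical.

```yaml
resource_queues:
  - name: prod_services               # hypothetical passthrough queue

scheduling_rules:
  - selector:
      # Reserved key: hyphenated and case-sensitive. `workload_type` or
      # `Workload-Type` would not match this reserved key.
      - { key: workload-type, operator: in, values: [service] }
      # Custom tag set on the workload (hypothetical).
      - { key: env, operator: in, values: [prod] }
    resource_queue: prod_services
```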

## Validation

`apply` validates your config and returns an error if anything is wrong: unknown keys (rejected at every level), invalid values, and the per-field constraints listed above. It also enforces these cross-field rules:

-   Flavor names are unique across `resource_flavors`; queue names are unique across `resource_queues`.
-   A queue's flavor name must reference a defined flavor; a rule's `resource_queue` must reference a defined queue.
-   `covered_resources` allows only `gpu`, `cpu`, `memory_gb`, `tpu`; covered sets are disjoint across a queue's groups (each resource belongs to one group), with no duplicates within a group.
-   A flavor name appears at most once per resource group, and at most once across a queue's groups.
-   A flavor's `resources` is a subset of its group's `covered_resources` (no resource outside it, names unique within the flavor), and each `nominal_quota` is ≥ 0 with at most 3 decimal places. An omitted covered resource — or a listed one with no `nominal_quota` — is unlimited; `0` blocks it.
-   `priority_policy` satisfies `min ≤ default ≤ max`.
-   Selector `values` are required and non-empty for `in` / `not_in` and must be omitted for `exists` / `does_not_exist`.
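
To make these rules concrete, here is a deliberately invalid config with each violation annotated. All names are hypothetical, and the exact error text `apply` returns is not shown because this reference does not specify it.

```yaml
# INVALID on purpose: every commented line breaks one rule listed above.
resource_flavors:
  - name: gpu_pool
  - name: gpu_pool                                  # duplicate flavor name

resource_queues:
  - name: training
    resource_groups:
      - covered_resources: [gpu, memory]            # `memory` is not allowed; use `memory_gb`
        flavors:
          - name: cpu_pool                          # references an undefined flavor
            resources:
              - { name: tpu, nominal_quota: 1 }         # `tpu` is not in covered_resources
              - { name: gpu, nominal_quota: 0.12345 }   # more than 3 decimal places

scheduling_rules:
  - selector:
      - { key: team, operator: exists, values: [research] }   # `values` must be omitted for `exists`
    resource_queue: inference                       # references an undefined queue
    priority_policy: { default: 5, min: 10, max: 100 }   # default is below min
```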

## Complete example

```yaml
# Named compute categories, matched to workloads by label. Within a resource group,
# flavors are tried in the order the group's `flavors` list gives them.
resource_flavors:
  - name: gpu_a100_spot
    selector:
      - { key: accelerator-type, operator: in, values: [A100] }
      - { key: market-type, operator: in, values: [SPOT] }
  - name: research_gpu_pool
    selector:
      - { key: team, operator: in, values: [research] }   # custom workload tag

resource_queues:
  - name: ml_training
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu, cpu]
        flavors:
          - name: gpu_a100_spot
            resources:
              - { name: gpu, nominal_quota: 64 }
              - { name: cpu, nominal_quota: 512 }
          - name: research_gpu_pool
            resources:
              - { name: gpu, nominal_quota: 16 }
              - { name: cpu, nominal_quota: 128 }

  - name: default        # passthrough: no resource_groups -> admit with no quota

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: ml_training
    priority_policy: { default: 50, min: 0, max: 100, on_violation: reject }
  - resource_queue: default     # catch-all: no selector -> matches everything not matched above
```

## Example: cap a specific instance type

Limit a particular instance type to a fixed number of instances, while letting everything else run uncapped. Define a flavor for just that instance type and give it a quota; route everything else to a passthrough queue, which admits without quota.

```yaml
resource_flavors:
  - name: m5-2xlarge
    selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }   # scopes the cap to m5.2xlarge

resource_queues:
  - name: q-m5-2xlarge-cap3
    preemption: { within_resource_queue: never }
    resource_groups:
      - covered_resources: [cpu, memory_gb]
        flavors:
          - name: m5-2xlarge                          # capped: 3 instances
            resources:
              - { name: cpu, nominal_quota: 24 }       # 3 x 8 vCPU
              - { name: memory_gb, nominal_quota: 96 } # 3 x 32 GiB
  - name: default        # passthrough for everything else

scheduling_rules:
  - selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }
    resource_queue: q-m5-2xlarge-cap3
    priority_policy: { default: 50, min: 0, max: 100, on_violation: force_update }
  # catch-all (no selector) -> passthrough queue. MUST be last; first match wins.
  - resource_queue: default
```

Only `m5.2xlarge` is capped (3 instances' worth), so at most three run at once. Every other workload matches the catch-all rule and is routed to the `default` passthrough queue, which admits it without quota.

## Example: use a capacity reservation

Route reservation-eligible workloads to a specific cloud capacity reservation and cap how much of it they consume. This example pins `a4-highgpu-8g` instances to a Google Cloud reservation. A flavor carries the reservation binding in [`advanced_instance_config`](#resource_flavors), and a queue quota caps reserved usage. Opt a workload in by tagging it `use-reservation=true`.

```yaml
resource_flavors:
  # a4-highgpu-8g pinned to a specific Google Cloud capacity reservation.
  - name: a4_reserved
    selector:
      - { key: instance-type, operator: in, values: [a4-highgpu-8g] }
    advanced_instance_config:
      instance_properties:
        reservation_affinity:
          key: compute.googleapis.com/reservation-name
          consume_reservation_type: SPECIFIC_RESERVATION
          values: [my-a4-reservation]   # replace with your reservation name

resource_queues:
  # Workloads tagged use-reservation=true land here and are capped to the reservation.
  - name: reservation
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: a4_reserved
            resources:
              # 8 GPUs/node x 10 nodes = 80. Caps reserved usage to ~10 a4-highgpu-8g.
              - { name: gpu, nominal_quota: 80 }
  # Everything else. No resource_groups => passthrough: admits everything, no quota.
  - name: default

scheduling_rules:
  - selector:
      - { key: use-reservation, operator: in, values: ["true"] }
    resource_queue: reservation
  - resource_queue: default   # no selector = matches all; MUST be last (first match wins)
```

A workload tagged `use-reservation=true` routes to the `reservation` queue, where the `a4_reserved` flavor binds it to the named Google Cloud reservation and the `gpu` quota of `80` caps usage at roughly ten `a4-highgpu-8g` nodes. Everything else falls through to the `default` passthrough queue and runs uncapped.

The `reservation` queue defines only the `a4_reserved` flavor, so a `use-reservation=true` workload that isn't an `a4-highgpu-8g` matches no flavor in the queue and fails with "no flavor matched" rather than falling through uncapped.

:::caution
Set `nominal_quota` to match the capacity reservation's size, and never above it. A quota larger than the reservation lets the scheduler admit more reserved workloads than the reservation can hold.
:::