Scheduler config reference

The scheduler config is a YAML document with up to three top-level keys, all optional. An empty config is valid: an org with no rules passes every workload through. Unknown keys are rejected at every level. The config is org-scoped and versioned: each apply creates a new version, which becomes the active one.
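As a minimal sketch of that shape, the config below uses only two of the three top-level keys. The queue name `default` is illustrative, and the sketch relies on the rule stated under Validation that a rule's `resource_queue` must name a defined queue.

```yaml
# Minimal sketch: one passthrough queue and one catch-all rule.
# resource_flavors is omitted because no queue here defines a quota.
resource_queues:
  - name: default              # no resource_groups -> passthrough, no quota

scheduling_rules:
  - resource_queue: default    # no selector -> matches every workload
```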

Structure

Key	Purpose
`resource_flavors`	Named compute categories, matched to workloads by label.
`resource_queues`	Pools that hold workloads and cap usage with quotas.
`scheduling_rules`	Ordered rules that route workloads to queues and assign priority.

`resource_flavors`

A list of named compute categories, matched to workloads by label selectors. A workload is assigned a flavor per resource group: within a group, the group's flavors are tried in listed order, and the workload gets the first flavor whose selector matches its labels and that has quota available. If no flavor matches, the workload is rejected.

Field	Type	Required	Description
`name`	string	Yes	Unique flavor name (non-empty). Referenced from a queue's quota by this name.
`selector`	list of match expressions	No	Label rules a workload must satisfy (all must match). Omit to match every workload (catch-all flavor).
`advanced_instance_config`	object	No	Opaque cloud-provider launch overrides, passed through verbatim (for example EC2 `CapacityReservationSpecification`, instance tags, or Kubernetes pod metadata).

Note: The selector shape is identical in `resource_flavors` and `scheduling_rules`, and both match the same set of workload labels. Keys are matched exactly and are case-sensitive; reserved keys are hyphenated (for example `accelerator-type`, not `accelerator_type`).
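The fragment below sketches the two kinds of flavor described above: one with a selector and one catch-all. The flavor names `h100_on_demand` and `any_compute` are hypothetical.

```yaml
resource_flavors:
  # Matches only on-demand H100 workloads: both expressions must hold.
  - name: h100_on_demand       # hypothetical name
    selector:
      - { key: accelerator-type, operator: in, values: [H100] }
      - { key: market-type, operator: in, values: [ON_DEMAND] }
  # No selector: matches every workload (catch-all flavor).
  - name: any_compute          # hypothetical name
```

Because a group's flavors are tried in listed order, a resource group that uses both would normally list `h100_on_demand` before `any_compute`; otherwise the catch-all would match first whenever it has quota available.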

`resource_queues`

A list of named pools that hold workloads and cap their resource usage with quotas. A queue that omits resource_groups is a passthrough: every workload routed to it is admitted with no quota enforcement.

Field	Type	Required	Description
`name`	string	Yes	Unique queue name (non-empty). Referenced by a scheduling rule's `resource_queue`.
`preemption`	object	No	Whether and how the queue preempts running workloads. See `preemption`.
`resource_groups`	list of objects	No	Quotas, grouped by the resources they cover. See `resource_groups`. Omit entirely to make the queue a passthrough.

Note: A queue's spec is immutable per name. To change a queue's shape, use a new name (for example `team-a:v2`).
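One way to read this, assuming quotas are part of a queue's spec, is sketched below: the changed queue is declared under a new name and the scheduling rule is pointed at it. The flavor name, the `team` tag, and the quota value are illustrative. Whether the old `team-a` entry may remain in the config alongside the new one is not stated in this reference.

```yaml
resource_flavors:
  - name: general                    # hypothetical catch-all flavor

resource_queues:
  - name: "team-a:v2"                # new name carries the new shape
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: general
            resources:
              - { name: gpu, nominal_quota: 32 }   # the changed quota

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [team-a] }
    resource_queue: "team-a:v2"      # the rule must reference the new name
```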

`preemption`

Controls when a queue evicts running workloads to free quota.

Field	One of	Description
`within_resource_queue`	`never`, `lower_priority`	When `lower_priority`, a workload may preempt strictly lower-priority workloads in the same queue. `never` (or unset) means no preemption.
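A small sketch of a queue with preemption enabled follows. The names `general` and `batch` and the quota of 8 are assumptions made for illustration.

```yaml
resource_flavors:
  - name: general                    # hypothetical catch-all flavor

resource_queues:
  - name: batch                      # hypothetical queue name
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: general
            resources:
              - { name: gpu, nominal_quota: 8 }
```

With this queue's 8 GPUs fully in use, an incoming priority-80 workload may preempt a running priority-20 workload, but not another priority-80 workload, because only strictly lower priorities are eligible. Which eligible workload is evicted first is not specified here.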

`resource_groups`

Each entry groups a set of resource axes governed together, with per-flavor quotas. Omit resource_groups from a queue to make it a passthrough (admit everything, no quota).

Field	Type	Required	Description
`covered_resources`	list of string	Yes	The resource axes this group governs. Each must be one of `gpu`, `cpu`, `memory_gb`, `tpu` (any other name is rejected). Non-empty; no duplicates. Each resource belongs to exactly one group within a queue (covered sets are disjoint). Use `memory_gb`, not `memory`.
`flavors`	list of objects	Yes	Per-flavor quotas for the covered resources. Non-empty. Tried in listed order.

Each flavors entry:

Field	Type	Required	Description
`name`	string	Yes	A flavor name defined in `resource_flavors`. Appears at most once per group, and at most once across a queue's groups.
`resources`	list of objects	No	Per-flavor quotas for a subset of the group's `covered_resources`: set a quota only for the resources you want to cap. Any covered resource you omit (or list with no `nominal_quota`) is unlimited. Names must be unique within this list.

Each resources entry (one quota for one resource of one flavor):

Field	Type	Required	Description
`name`	string	Yes	Resource name; must be one of the group's `covered_resources`.
`nominal_quota`	float ≥ 0	No	The quota ceiling for this (queue, flavor, resource), with at most 3 decimal places. Omit (or omit the whole resource entry) for unlimited; set to `0` to block the resource.
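The three quota states described above (capped, unlimited, blocked) can be sketched in one group. The flavor and queue names and all numbers are illustrative.

```yaml
resource_flavors:
  - name: general                    # hypothetical catch-all flavor

resource_queues:
  - name: team_a                     # hypothetical queue name
    resource_groups:
      - covered_resources: [gpu, cpu, memory_gb, tpu]
        flavors:
          - name: general
            resources:
              - { name: gpu, nominal_quota: 8 }      # capped at 8
              - { name: cpu, nominal_quota: 31.5 }   # fractional, <= 3 decimals
              - { name: tpu, nominal_quota: 0 }      # 0 blocks tpu entirely
              # memory_gb is covered but not listed -> unlimited
```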

`scheduling_rules`

An ordered list of rules that route a workload to a queue and set its priority. Rules are evaluated top to bottom and the first match wins, except that a rule whose `priority_policy` rejects the workload falls through to the next rule. A workload that matches no rule fails.

Field	Type	Required	Description
`selector`	list of match expressions	No	Label rules the workload's labels must satisfy (all must match). Omit to match every workload (catch-all); put such a rule last.
`resource_queue`	string	Yes	Name of the queue to route matched workloads to.
`priority_policy`	object	No	Priority bounds applied to matched workloads. See `priority_policy`.
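To illustrate first-match-wins ordering, here is a sketch with three rules, most specific first. The queue names are hypothetical, and the queues are passthroughs only to keep the fragment short.

```yaml
resource_queues:
  - name: prod_services              # hypothetical passthrough queues
  - name: batch
  - name: default

scheduling_rules:
  # Rule 1: on-demand services only (both expressions must match).
  - selector:
      - { key: workload-type, operator: in, values: [service] }
      - { key: market-type, operator: in, values: [ON_DEMAND] }
    resource_queue: prod_services
  # Rule 2: any job.
  - selector:
      - { key: workload-type, operator: in, values: [job] }
    resource_queue: batch
  # Rule 3: catch-all, last so it only sees what rules 1 and 2 did not match.
  - resource_queue: default
```

Under these rules a spot service skips rule 1 (its `market-type` is `SPOT`), skips rule 2 (it is not a job), and lands in `default`.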

`priority_policy`

Bounds and default for the priority assigned to matching workloads. Higher priority is more important (drives queue order and preemption).

Field	Type	Required	Description
`default`	integer ≥ 0	No	Priority used when the workload doesn't request one. Must be within `[min, max]`. If unset, `0`.
`min`	integer ≥ 0	No	Lower bound of the allowed priority range. Must be ≤ `max` and ≤ `default`.
`max`	integer ≥ 0	No	Upper bound of the allowed priority range.
`on_violation`	one of `reject`, `force_update`	No	Behavior when a requested priority is outside `[min, max]`: `reject` makes this rule fall through to the next; `force_update` clamps the priority into range and continues.
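The sketch below shows one rule with a priority policy, followed by how it would treat a few requested priorities. The `team` tag, queue names, and numbers are illustrative.

```yaml
resource_queues:
  - name: research                   # hypothetical passthrough queues
  - name: default

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: research
    priority_policy: { default: 10, min: 0, max: 100, on_violation: force_update }
  - resource_queue: default          # catch-all, last
```

For a workload tagged `team=research`: with no requested priority it gets 10; a requested priority of 40 is kept as 40; a requested priority of 150 is outside `[0, 100]` and is clamped to 100. If `on_violation` were `reject` instead, the 150 request would fall through to the catch-all rule and route to `default`.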

Match expressions

A Kubernetes-style label match, used by the `selector` field on both flavors and scheduling rules. Each is written as `{ key, operator, values }`.

Field	Type	Required	Description
`key`	string	Yes	Label key to match (non-empty).
`operator`	one of `in`, `not_in`, `exists`, `does_not_exist`	Yes	How to compare the label against `values`.
`values`	list of string	Conditional	Required and non-empty for `in` / `not_in`; must be omitted for `exists` / `does_not_exist`.

All expressions in a single list must match (logical AND).
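As an illustration, the flavor below uses each of the four operators once. The flavor name and the custom tags `team` and `experimental` are hypothetical.

```yaml
resource_flavors:
  - name: stable_gpu                 # hypothetical name
    selector:
      # accelerator-type is A100 or H100 ...
      - { key: accelerator-type, operator: in, values: [A100, H100] }
      # ... AND market-type is anything except SPOT ...
      - { key: market-type, operator: not_in, values: [SPOT] }
      # ... AND a team tag is set, whatever its value (no values field) ...
      - { key: team, operator: exists }
      # ... AND no experimental tag is set (no values field).
      - { key: experimental, operator: does_not_exist }
```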

Selector keys

A selector (in both resource_flavors and scheduling_rules) matches against a workload's labels. A workload's labels are the custom tags you set on the job, service, or workspace, plus the reserved keys the platform always stamps onto the workload. Keys are matched exactly and are case-sensitive; reserved keys are hyphenated.

Reserved keys (always present; you cannot override them with your own tags):

Key	Values
`instance-type`	The launched instance type (for example `m5.large`).
`accelerator-type`	The accelerator/GPU type (for example `A100`, `H100`, `L4`).
`market-type`	`SPOT` or `ON_DEMAND`.
`cloud-resource-id`	The ID of the cloud resource the workload runs on.
`workload-type`	`service`, `job`, or `workspace`.

Any custom tag key you set on the workload is also selectable, as long as it does not collide with a reserved key.
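A short sketch of a selector that mixes a reserved key with a custom tag follows. The `cost-center` tag, its value, and the queue names are assumptions for illustration.

```yaml
resource_queues:
  - name: l4_inference               # hypothetical passthrough queues
  - name: default

scheduling_rules:
  - selector:
      # Reserved key: hyphenated and matched exactly.
      - { key: accelerator-type, operator: in, values: [L4] }
      # Custom tag set on the workload (hypothetical key and value).
      - { key: cost-center, operator: in, values: [ml-platform] }
    resource_queue: l4_inference
  - resource_queue: default          # catch-all, last
```

Writing the first key as `accelerator_type` or `Accelerator-Type` would not refer to the reserved key, since keys are matched exactly and are case-sensitive.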

Validation

`apply` validates your config and returns an error if anything is wrong: unknown keys (rejected at every level), invalid values, or violations of the per-field constraints listed above. It also enforces these cross-field rules:

Flavor names are unique across resource_flavors; queue names are unique across resource_queues.
A queue's flavor name must reference a defined flavor; a rule's resource_queue must reference a defined queue.
covered_resources allows only gpu, cpu, memory_gb, tpu; covered sets are disjoint across a queue's groups (each resource belongs to one group), with no duplicates within a group.
A flavor name appears at most once per resource group, and at most once across a queue's groups.
A flavor's resources is a subset of its group's covered_resources (no resource outside it, names unique within the flavor), and each nominal_quota is ≥ 0 with at most 3 decimal places. An omitted covered resource — or a listed one with no nominal_quota — is unlimited; 0 blocks it.
priority_policy satisfies min ≤ default ≤ max.
A match expression's `values` is required and non-empty for `in` / `not_in` and must be omitted for `exists` / `does_not_exist`.
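To make the rules above concrete, the following config is intentionally invalid; each comment names the rule that line would violate. All names are hypothetical, and the exact error messages and the order in which errors are reported are not specified in this reference.

```yaml
# Intentionally INVALID: do not apply.
resource_flavors:
  - name: gpu
  - name: gpu                                  # duplicate flavor name

resource_queues:
  - name: q1
    resource_groups:
      - covered_resources: [gpu, memory]       # memory is not allowed; use memory_gb
        flavors:
          - name: missing_flavor               # not defined in resource_flavors
            resources:
              - { name: cpu, nominal_quota: 1.2345 }   # cpu not covered; 4 decimals

scheduling_rules:
  - selector:
      - { key: team, operator: exists, values: [a] }   # values given with exists
    resource_queue: q2                         # not a defined queue
    priority_policy: { default: 5, min: 10, max: 20 }  # default is below min
```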

Complete example

# Named compute categories, matched to workloads by label. Within a queue's resource
# group, flavors are tried in the order that group lists them.
resource_flavors:
  - name: gpu_a100_spot
    selector:
      - { key: accelerator-type, operator: in, values: [A100] }
      - { key: market-type, operator: in, values: [SPOT] }
  - name: research_gpu_pool
    selector:
      - { key: team, operator: in, values: [research] }   # custom workload tag

resource_queues:
  - name: ml_training
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu, cpu]
        flavors:
          - name: gpu_a100_spot
            resources:
              - { name: gpu, nominal_quota: 64 }
              - { name: cpu, nominal_quota: 512 }
          - name: research_gpu_pool
            resources:
              - { name: gpu, nominal_quota: 16 }
              - { name: cpu, nominal_quota: 128 }

  - name: default        # passthrough: no resource_groups -> admit with no quota

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: ml_training
    priority_policy: { default: 50, min: 0, max: 100, on_violation: reject }
  - resource_queue: default     # catch-all: no selector -> matches everything not matched above

Example: cap a specific instance type

Limit a particular instance type to a fixed number of instances, while letting everything else run uncapped. Define a flavor for just that instance type and give it a quota; route everything else to a passthrough queue, which admits without quota.

resource_flavors:
  - name: m5-2xlarge
    selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }   # scopes the cap to m5.2xlarge

resource_queues:
  - name: q-m5-2xlarge-cap3
    preemption: { within_resource_queue: never }
    resource_groups:
      - covered_resources: [cpu, memory_gb]
        flavors:
          - name: m5-2xlarge                          # capped: 3 instances
            resources:
              - { name: cpu, nominal_quota: 24 }       # 3 x 8 vCPU
              - { name: memory_gb, nominal_quota: 96 } # 3 x 32 GiB
  - name: default        # passthrough for everything else

scheduling_rules:
  - selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }
    resource_queue: q-m5-2xlarge-cap3
    priority_policy: { default: 50, min: 0, max: 100, on_violation: force_update }
  # catch-all (no selector) -> passthrough queue. MUST be last; first match wins.
  - resource_queue: default

Only m5.2xlarge is capped (3 instances' worth), so at most three run at once. Every other workload matches the catch-all rule and is routed to the default passthrough queue, which admits it without quota.

Example: use a capacity reservation

Route reservation-eligible workloads to a specific cloud capacity reservation and cap how much of it they consume. This example pins a4-highgpu-8g instances to a Google Cloud reservation. A flavor carries the reservation binding in advanced_instance_config, and a queue quota caps reserved usage. Opt a workload in by tagging it use-reservation=true.

resource_flavors:
  # a4-highgpu-8g pinned to a specific Google Cloud capacity reservation.
  - name: a4_reserved
    selector:
      - { key: instance-type, operator: in, values: [a4-highgpu-8g] }
    advanced_instance_config:
      instance_properties:
        reservation_affinity:
          key: compute.googleapis.com/reservation-name
          consume_reservation_type: SPECIFIC_RESERVATION
          values: [my-a4-reservation]   # replace with your reservation name

resource_queues:
  # Workloads tagged use-reservation=true land here and are capped to the reservation.
  - name: reservation
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: a4_reserved
            resources:
              # 8 GPUs/node x 10 nodes = 80. Caps reserved usage to ~10 a4-highgpu-8g.
              - { name: gpu, nominal_quota: 80 }
  # Everything else. No resource_groups => passthrough: admits everything, no quota.
  - name: default

scheduling_rules:
  - selector:
      - { key: use-reservation, operator: in, values: ["true"] }
    resource_queue: reservation
  - resource_queue: default   # no selector = matches all; MUST be last (first match wins)

A workload tagged use-reservation=true routes to the reservation queue, where the a4_reserved flavor binds it to the named Google Cloud reservation and the gpu quota of 80 caps usage at roughly ten a4-highgpu-8g nodes. Everything else falls through to the default passthrough queue and runs uncapped.

The reservation queue defines only the a4_reserved flavor, so a use-reservation=true workload that isn't an a4-highgpu-8g matches no flavor in the queue and fails with "no flavor matched" rather than falling through uncapped.

Caution: Set `nominal_quota` to match the capacity reservation's size, and never above it. A quota larger than the reservation lets the scheduler admit more reserved workloads than the reservation can hold.
