Scheduler config reference
The scheduler config is a YAML document with up to three top-level keys, all optional. An empty config is valid: an org with no rules passes every workload through. Unknown keys are rejected at every level. The config is org-scoped and versioned: applying a config creates a new version and makes it the active one.
Structure
| Key | Purpose |
|---|---|
| resource_flavors | Named compute categories, matched to workloads by label. |
| resource_queues | Pools that hold workloads and cap usage with quotas. |
| scheduling_rules | Ordered rules that route workloads to queues and assign priority. |
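As a rough sketch of the overall shape, the three top-level keys sit side by side at the root of the document. The names any_compute and batch are hypothetical placeholders, not values the platform defines.

```yaml
# Sketch only: flavor and queue names are hypothetical.
resource_flavors:
  - name: any_compute            # no selector -> matches every workload
resource_queues:
  - name: batch
    resource_groups:
      - covered_resources: [cpu]
        flavors:
          - name: any_compute    # no resources listed -> cpu is unlimited
scheduling_rules:
  - resource_queue: batch        # no selector -> catch-all rule
```

Each key is optional, so a config may carry any subset of the three.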
resource_flavors
A list of named compute categories, matched to workloads by label selectors. A workload is assigned
a flavor per resource group: within a group, the group's flavors are tried in listed order,
and the workload gets the first flavor whose selector matches its labels and that has quota
available. If no flavor matches, the workload is rejected.
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique flavor name (non-empty). Referenced from a queue's quota by this name. |
| selector | list of match expressions | No | Label rules a workload must satisfy (all must match). Omit to match every workload (catch-all flavor). |
| advanced_instance_config | object | No | Opaque cloud-provider launch overrides, passed through verbatim (for example EC2 CapacityReservationSpecification, instance tags, or Kubernetes pod metadata). |
The selector shape is identical in resource_flavors and scheduling_rules, and both match the
same set of workload labels. Keys are matched exactly and are case-sensitive;
reserved keys are hyphenated (for example accelerator-type, not accelerator_type).
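The first-match behavior described above might look like the following sketch, in which a narrow flavor is listed ahead of a catch-all flavor inside one resource group. The flavor names, queue name, and quota values are assumptions chosen for illustration.

```yaml
# Sketch only: names and quota values are hypothetical.
resource_flavors:
  - name: h100_spot
    selector:
      - { key: accelerator-type, operator: in, values: [H100] }
      - { key: market-type, operator: in, values: [SPOT] }
  - name: any_gpu                      # no selector -> catch-all flavor
resource_queues:
  - name: training
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: h100_spot            # tried first
            resources:
              - { name: gpu, nominal_quota: 8 }
          - name: any_gpu              # tried next: h100_spot did not match or has no quota left
            resources:
              - { name: gpu, nominal_quota: 32 }
```

The order that matters is the order of the group's flavors list inside the queue, so the narrower flavor is listed before the catch-all there.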
resource_queues
A list of named pools that hold workloads and cap their resource usage with quotas. A queue that
omits resource_groups is a passthrough: every workload routed to it is admitted with no quota
enforcement.
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique queue name (non-empty). Referenced by a scheduling rule's resource_queue. |
| preemption | object | No | Whether and how the queue preempts running workloads. See preemption. |
| resource_groups | list of objects | No | Quotas, grouped by the resources they cover. See resource_groups. Omit entirely to make the queue a passthrough. |
A queue's spec is immutable per name. To change a queue's shape, use a new name (for example
team-a:v2).
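One way to read the immutability rule, sketched under assumptions: to change a quota, define the queue under a new name and point the scheduling rule at that name. The names team-a:v2 and any_gpu and the quota values are hypothetical, and this sketch assumes a quota change counts as a change of shape. Whether the old queue must also be removed is not covered here.

```yaml
# Sketch only: names and quota values are hypothetical.
resource_flavors:
  - name: any_gpu
resource_queues:
  # Hypothetical earlier version: name team-a with a gpu quota of 8.
  - name: "team-a:v2"                 # new name, because the shape changed
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: any_gpu
            resources:
              - { name: gpu, nominal_quota: 16 }
scheduling_rules:
  - resource_queue: "team-a:v2"       # the rule references the new name
```

The name is quoted here only as a precaution because it contains a colon.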
preemption
Controls when a queue evicts running workloads to free quota.
| Field | One of | Description |
|---|---|---|
| within_resource_queue | never, lower_priority | When lower_priority, a workload may preempt strictly lower-priority workloads in the same queue. never (or unset) means no preemption. |
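A minimal sketch of a queue that enables preemption, assuming a hypothetical catch-all flavor any_gpu and an illustrative quota of 8 GPUs:

```yaml
# Sketch only: names and quota values are hypothetical.
resource_flavors:
  - name: any_gpu
resource_queues:
  - name: shared_gpu
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: any_gpu
            resources:
              - { name: gpu, nominal_quota: 8 }
```

Under this sketch, if priority-10 workloads hold all 8 GPUs, an arriving priority-50 workload may evict them. An arriving priority-10 workload may not, because preemption applies only to strictly lower priorities.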
resource_groups
Each entry groups a set of resource axes governed together, with per-flavor quotas. Omit
resource_groups from a queue to make it a passthrough (admit everything, no quota).
| Field | Type | Required | Description |
|---|---|---|---|
| covered_resources | list of string | Yes | The resource axes this group governs. Each must be one of gpu, cpu, memory_gb, tpu (any other name is rejected). Non-empty; no duplicates. Each resource belongs to exactly one group within a queue (covered sets are disjoint). Use memory_gb, not memory. |
| flavors | list of objects | Yes | Per-flavor quotas for the covered resources. Non-empty. Tried in listed order. |
Each flavors entry:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | A flavor name defined in resource_flavors. Appears at most once per group, and at most once across a queue's groups. |
| resources | list of objects | No | Per-flavor quotas, a subset of the group's covered_resources: set a quota only for the resources you want to cap. Any covered resource you omit (or list with no nominal_quota) is unlimited. Names unique within this list. |
Each resources entry (one quota for one resource of one flavor):
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Resource name; must be one of the group's covered_resources. |
| nominal_quota | float ≥ 0 | No | The quota ceiling for this (queue, flavor, resource), with at most 3 decimal places. Omit (or omit the whole resource entry) for unlimited; set to 0 to block the resource. |
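The three quota states (capped, blocked, unlimited) can be sketched in a single flavor entry. The flavor name, queue name, and values below are hypothetical.

```yaml
# Sketch only: names and quota values are hypothetical.
resource_flavors:
  - name: cpu_nodes
resource_queues:
  - name: analytics
    resource_groups:
      - covered_resources: [cpu, memory_gb, gpu]
        flavors:
          - name: cpu_nodes
            resources:
              - { name: cpu, nominal_quota: 12.5 }   # capped; fractional values allow up to 3 decimal places
              - { name: gpu, nominal_quota: 0 }      # 0 blocks gpu for this flavor
              # memory_gb is covered but not listed -> unlimited
```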
scheduling_rules
An ordered list of rules that route a workload to a queue and set its priority. Rules are evaluated
top to bottom, and the first match wins; a rule whose priority_policy rejects the workload falls
through to the next rule. A workload that matches no rule fails.
| Field | Type | Required | Description |
|---|---|---|---|
| selector | list of match expressions | No | Label rules applied over the workload's labels (all must match). Omit to match every workload (catch-all); put such a rule last. |
| resource_queue | string | Yes | Name of the queue to route matched workloads to. |
| priority_policy | object | No | Priority bounds applied to matched workloads. See priority_policy. |
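Rule ordering and fall-through might be sketched as follows, assuming three hypothetical passthrough queues named serving, batch, and default:

```yaml
# Sketch only: queue names and priority values are hypothetical.
resource_queues:
  - name: serving
  - name: batch
  - name: default
scheduling_rules:
  # Rule 1: services whose requested priority fits within [50, 100].
  - selector:
      - { key: workload-type, operator: in, values: [service] }
    resource_queue: serving
    priority_policy: { default: 80, min: 50, max: 100, on_violation: reject }
  # Rule 2: services that rule 1 rejected, plus all jobs.
  - selector:
      - { key: workload-type, operator: in, values: [service, job] }
    resource_queue: batch
  # Rule 3: catch-all, listed last.
  - resource_queue: default
```

In this sketch a service that requests priority 20 falls outside rule 1's range, so rule 1 rejects it and it is routed by rule 2 to batch. A workspace matches neither selector and lands in default.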
priority_policy
Bounds and default for the priority assigned to matching workloads. Higher priority is more important (drives queue order and preemption).
| Field | Type | Required | Description |
|---|---|---|---|
| default | integer ≥ 0 | No | Priority used when the workload doesn't request one. Must be within [min, max]. If unset, 0. |
| min | integer ≥ 0 | No | Lower bound of the allowed priority range. Must be ≤ max and ≤ default. |
| max | integer ≥ 0 | No | Upper bound of the allowed priority range. |
| on_violation | one of reject, force_update | No | Behavior when a requested priority is outside [min, max]: reject makes this rule fall through to the next; force_update clamps the priority into range and continues. |
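A short sketch of clamping with force_update, using a hypothetical passthrough queue named batch and illustrative bounds:

```yaml
# Sketch only: queue name and priority values are hypothetical.
resource_queues:
  - name: batch
scheduling_rules:
  - resource_queue: batch
    priority_policy:
      default: 10                 # used when the workload requests no priority
      min: 0
      max: 20
      on_violation: force_update  # out-of-range requests are clamped into [0, 20]
```

With these bounds, a workload that requests no priority gets 10, and a request for 500 would be clamped to 20 instead of causing the rule to fall through.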
Match expressions
A Kubernetes-style label match, used by both selector on flavors and selector on scheduling
rules. Each is written as { key, operator, values }.
| Field | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | Label key to match (non-empty). |
| operator | one of in, not_in, exists, does_not_exist | Yes | How to compare the label against values. |
| values | list of string | Conditional | Required and non-empty for in / not_in; must be omitted for exists / does_not_exist. |
All expressions in a single list must match (logical AND).
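The four operators can be sketched in one selector. The custom tag keys team and experimental and the queue name gpu_on_demand are assumptions for illustration.

```yaml
# Sketch only: custom tag keys and the queue name are hypothetical.
resource_queues:
  - name: gpu_on_demand
scheduling_rules:
  - selector:
      - { key: accelerator-type, operator: in, values: [A100, H100] }   # label is one of the values
      - { key: market-type, operator: not_in, values: [SPOT] }          # label is none of the values
      - { key: team, operator: exists }                                 # label is set; no values field
      - { key: experimental, operator: does_not_exist }                 # label is not set; no values field
    resource_queue: gpu_on_demand
```

Because the expressions are ANDed, a workload must satisfy all four to match this rule.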
Selector keys
A selector (in both resource_flavors and scheduling_rules) matches against a workload's labels.
A workload's labels are the custom tags you set on the job, service, or workspace, plus the reserved
keys the platform always stamps onto the workload. Keys are matched exactly and are
case-sensitive; reserved keys are hyphenated.
Reserved keys (always present; you cannot override them with your own tags):
| Key | Values |
|---|---|
| instance-type | The launched instance type (for example m5.large). |
| accelerator-type | The accelerator/GPU type (for example A100, H100, L4). |
| market-type | SPOT or ON_DEMAND. |
| cloud-resource-id | The ID of the cloud resource the workload runs on. |
| workload-type | service, job, or workspace. |
Any custom tag key you set on the workload is also selectable, as long as it does not collide with a reserved key.
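A sketch that mixes a reserved key with a custom tag in one selector. The flavor name research_spot and the tag key team are hypothetical.

```yaml
# Sketch only: the flavor name and the custom tag key are hypothetical.
resource_flavors:
  - name: research_spot
    selector:
      - { key: market-type, operator: in, values: [SPOT] }   # reserved key: hyphenated, exact spelling
      - { key: team, operator: in, values: [research] }      # custom tag set on the workload
      # A key spelled market_type or Market-Type would be a different key,
      # because keys are matched exactly and are case-sensitive.
```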
Validation
apply validates your config and returns an error if anything is wrong: unknown keys (rejected at
every level), invalid values, and the per-field constraints listed above. It also enforces these
cross-field rules:
- Flavor names are unique across resource_flavors; queue names are unique across resource_queues.
- A queue's flavor name must reference a defined flavor; a rule's resource_queue must reference a defined queue.
- covered_resources allows only gpu, cpu, memory_gb, tpu; covered sets are disjoint across a queue's groups (each resource belongs to one group), with no duplicates within a group.
- A flavor name appears at most once per resource group, and at most once across a queue's groups.
- A flavor's resources is a subset of its group's covered_resources (no resource outside it, names unique within the flavor), and each nominal_quota is ≥ 0 with at most 3 decimal places. An omitted covered resource, or a listed one with no nominal_quota, is unlimited; 0 blocks it.
- priority_policy satisfies min ≤ default ≤ max.
- Selector values are required for in / not_in and must be empty for exists / does_not_exist.
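To make the rules above concrete, here is a deliberately invalid sketch in which each comment names the rule that line would break. All names are hypothetical, and the exact error text that apply returns is not shown because it is not specified here.

```yaml
# Deliberately INVALID sketch: do not apply. All names are hypothetical.
resource_flavors:
  - name: a100
resource_queues:
  - name: train
    resource_groups:
      - covered_resources: [gpu, memory]                 # memory is not allowed; use memory_gb
        flavors:
          - name: h100                                   # no flavor named h100 is defined
            resources:
              - { name: cpu, nominal_quota: 4 }          # cpu is not in this group's covered_resources
scheduling_rules:
  - selector:
      - { key: team, operator: exists, values: [ml] }    # values must not be set with exists
    resource_queue: training                             # no queue named training is defined
    priority_policy: { default: 5, min: 10, max: 20 }    # default is below min
```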
Complete example
# Named compute categories, matched to workloads by label. Order matters within a
# group: flavors are tried in listed order.
resource_flavors:
  - name: gpu_a100_spot
    selector:
      - { key: accelerator-type, operator: in, values: [A100] }
      - { key: market-type, operator: in, values: [SPOT] }
  - name: research_gpu_pool
    selector:
      - { key: team, operator: in, values: [research] } # custom workload tag
resource_queues:
  - name: ml_training
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu, cpu]
        flavors:
          - name: gpu_a100_spot
            resources:
              - { name: gpu, nominal_quota: 64 }
              - { name: cpu, nominal_quota: 512 }
          - name: research_gpu_pool
            resources:
              - { name: gpu, nominal_quota: 16 }
              - { name: cpu, nominal_quota: 128 }
  - name: default # passthrough: no resource_groups -> admit with no quota
scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: ml_training
    priority_policy: { default: 50, min: 0, max: 100, on_violation: reject }
  - resource_queue: default # catch-all: no selector -> matches everything not matched above
Example: cap a specific instance type
Limit a particular instance type to a fixed number of instances, while letting everything else run uncapped. Define a flavor for just that instance type and give it a quota; route everything else to a passthrough queue, which admits without quota.
resource_flavors:
  - name: m5-2xlarge
    selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] } # scopes the cap to m5.2xlarge
resource_queues:
  - name: q-m5-2xlarge-cap3
    preemption: { within_resource_queue: never }
    resource_groups:
      - covered_resources: [cpu, memory_gb]
        flavors:
          - name: m5-2xlarge # capped: 3 instances
            resources:
              - { name: cpu, nominal_quota: 24 } # 3 x 8 vCPU
              - { name: memory_gb, nominal_quota: 96 } # 3 x 32 GiB
  - name: default # passthrough for everything else
scheduling_rules:
  - selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }
    resource_queue: q-m5-2xlarge-cap3
    priority_policy: { default: 50, min: 0, max: 100, on_violation: force_update }
  # catch-all (no selector) -> passthrough queue. MUST be last; first match wins.
  - resource_queue: default
Only m5.2xlarge is capped (3 instances' worth), so at most three run at once. Every other workload
matches the catch-all rule and is routed to the default passthrough queue, which admits it without
quota.
Example: use a capacity reservation
Route reservation-eligible workloads to a specific cloud capacity reservation and cap how much of it
they consume. This example pins a4-highgpu-8g instances to a Google Cloud reservation. A flavor
carries the reservation binding in advanced_instance_config, and a queue
quota caps reserved usage. Opt a workload in by tagging it use-reservation=true.
resource_flavors:
  # a4-highgpu-8g pinned to a specific Google Cloud capacity reservation.
  - name: a4_reserved
    selector:
      - { key: instance-type, operator: in, values: [a4-highgpu-8g] }
    advanced_instance_config:
      instance_properties:
        reservation_affinity:
          key: compute.googleapis.com/reservation-name
          consume_reservation_type: SPECIFIC_RESERVATION
          values: [my-a4-reservation] # replace with your reservation name
resource_queues:
  # Workloads tagged use-reservation=true land here and are capped to the reservation.
  - name: reservation
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: a4_reserved
            resources:
              # 8 GPUs/node x 10 nodes = 80. Caps reserved usage to ~10 a4-highgpu-8g.
              - { name: gpu, nominal_quota: 80 }
  # Everything else. No resource_groups => passthrough: admits everything, no quota.
  - name: default
scheduling_rules:
  - selector:
      - { key: use-reservation, operator: in, values: ["true"] }
    resource_queue: reservation
  - resource_queue: default # no selector = matches all; MUST be last (first match wins)
A workload tagged use-reservation=true routes to the reservation queue, where the a4_reserved
flavor binds it to the named Google Cloud reservation and the gpu quota of 80 caps usage at
roughly ten a4-highgpu-8g nodes. Everything else falls through to the default passthrough queue
and runs uncapped.
The reservation queue defines only the a4_reserved flavor, so a use-reservation=true workload
that isn't an a4-highgpu-8g matches no flavor in the queue and fails with "no flavor matched"
rather than falling through uncapped.
Set the quota to match the reservation, never above it: nominal_quota must not exceed the
capacity reservation's size. A quota larger than the reservation lets the scheduler admit more
reserved workloads than the reservation can hold.