Scheduler config reference

The scheduler config is a YAML document with up to three top-level keys, all optional. An empty config is valid: an org with no rules passes every workload through. Unknown keys are rejected at every level. The config is org-scoped and versioned: each apply creates a new version and makes it the active one.

Structure

| Key | Purpose |
| --- | --- |
| resource_flavors | Named compute categories, matched to workloads by label. |
| resource_queues | Pools that hold workloads and cap usage with quotas. |
| scheduling_rules | Ordered rules that route workloads to queues and assign priority. |
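
As a rough sketch of how these keys fit together, the smallest useful config might look like the following. The queue name default is an illustrative choice, not a required name, and resource_flavors is left out because every top-level key is optional.

```yaml
# Minimal sketch: one passthrough queue and one catch-all rule.
# "default" is a hypothetical queue name chosen for illustration.
resource_queues:
  - name: default              # no resource_groups -> passthrough, no quota

scheduling_rules:
  - resource_queue: default    # no selector -> matches every workload
```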

resource_flavors

A list of named compute categories, matched to workloads by label selectors. A workload is assigned a flavor per resource group: within a group, the group's flavors are tried in listed order, and the workload gets the first flavor whose selector matches its labels and that has quota available. If no flavor matches, the workload is rejected.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique flavor name (non-empty). Referenced from a queue's quota by this name. |
| selector | list of match expressions | No | Label rules a workload must satisfy (all must match). Omit to match every workload (catch-all flavor). |
| advanced_instance_config | object | No | Opaque cloud-provider launch overrides, passed through verbatim (for example EC2 CapacityReservationSpecification, instance tags, or Kubernetes pod metadata). |
Note: The selector shape is identical in resource_flavors and scheduling_rules, and both match the same set of workload labels. Keys are matched exactly and are case-sensitive; reserved keys are hyphenated (for example accelerator-type, not accelerator_type).
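
The fields above could be combined as in the sketch below: one narrowly scoped flavor and one catch-all flavor. The flavor names are hypothetical; the label keys are reserved keys from this reference. Note that the order in which flavors are tried comes from the flavors list of a queue's resource group, not from the order of this list.

```yaml
# Illustrative only: flavor names are made up for this sketch.
resource_flavors:
  - name: h100_on_demand
    selector:                  # all expressions must match
      - { key: accelerator-type, operator: in, values: [H100] }
      - { key: market-type, operator: in, values: [ON_DEMAND] }
  - name: any_compute          # no selector -> catch-all flavor
```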

resource_queues

A list of named pools that hold workloads and cap their resource usage with quotas. A queue that omits resource_groups is a passthrough: every workload routed to it is admitted with no quota enforcement.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique queue name (non-empty). Referenced by a scheduling rule's resource_queue. |
| preemption | object | No | Whether and how the queue preempts running workloads. See preemption. |
| resource_groups | list of objects | No | Quotas, grouped by the resources they cover. See resource_groups. Omit entirely to make the queue a passthrough. |
Note: A queue's spec is immutable per name. To change a queue's shape, use a new name (for example team-a:v2).
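
A sketch of the rename pattern, assuming a queue previously applied as team-a whose GPU quota needs to change. The flavor name gpu_pool and the quota value are hypothetical, and whether the old team-a entry has to stay in the config is not covered by this reference.

```yaml
resource_queues:
  # New shape published under a new name instead of editing "team-a" in place.
  - name: "team-a:v2"          # quoted defensively because of the colon
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: gpu_pool     # hypothetical flavor, assumed defined in resource_flavors
            resources:
              - { name: gpu, nominal_quota: 16 }

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [team-a] }   # hypothetical custom tag
    resource_queue: "team-a:v2"                         # rules must point at the new name
```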

preemption

Controls when a queue evicts running workloads to free quota.

| Field | One of | Description |
| --- | --- | --- |
| within_resource_queue | never, lower_priority | When lower_priority, a workload may preempt strictly lower-priority workloads in the same queue. never (or unset) means no preemption. |
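
For illustration, a queue that enables preemption might be declared as below. The queue name, flavor name, and quota are assumptions made for this sketch.

```yaml
resource_queues:
  - name: batch_training               # hypothetical queue name
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: gpu_pool             # hypothetical flavor
            resources:
              - { name: gpu, nominal_quota: 8 }
```

Under this sketch, if the 8-GPU quota is fully used by priority-10 workloads, an incoming priority-50 workload may evict them, while another priority-10 workload may not, because preemption only targets strictly lower priorities.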

resource_groups

Each entry groups a set of resource axes governed together, with per-flavor quotas. Omit resource_groups from a queue to make it a passthrough (admit everything, no quota).

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| covered_resources | list of string | Yes | The resource axes this group governs. Each must be one of gpu, cpu, memory_gb, tpu (any other name is rejected). Non-empty; no duplicates. Each resource belongs to exactly one group within a queue (covered sets are disjoint). Use memory_gb, not memory. |
| flavors | list of objects | Yes | Per-flavor quotas for the covered resources. Non-empty. Tried in listed order. |

Each flavors entry:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | A flavor name defined in resource_flavors. Appears at most once per group, and at most once across a queue's groups. |
| resources | list of objects | No | Per-flavor quotas, a subset of the group's covered_resources: set a quota only on the resources you want to cap. Any covered resource you omit (or list with no nominal_quota) is unlimited. Names unique within this list. |

Each resources entry (one quota for one resource of one flavor):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Resource name; must be one of the group's covered_resources. |
| nominal_quota | float ≥ 0 | No | The quota ceiling for this (queue, flavor, resource), with at most 3 decimal places. Omit (or omit the whole resource entry) for unlimited; set to 0 to block the resource. |
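
The three quota states described above (capped, unlimited, blocked) can be sketched in one resource group. The queue and flavor names are hypothetical and the flavors are assumed to be defined in resource_flavors.

```yaml
resource_queues:
  - name: shared_pool                  # hypothetical queue name
    resource_groups:
      - covered_resources: [gpu, cpu, memory_gb]
        flavors:                       # tried in listed order
          - name: on_demand            # hypothetical flavor
            resources:
              - { name: gpu, nominal_quota: 4 }       # capped at 4
              - { name: cpu, nominal_quota: 62.5 }    # fractional, within 3 decimal places
              # memory_gb omitted -> unlimited for this flavor
          - name: spot                 # hypothetical flavor
            resources:
              - { name: gpu, nominal_quota: 0 }       # 0 -> gpu is blocked on this flavor
              - { name: cpu }                         # listed without nominal_quota -> unlimited
```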

scheduling_rules

An ordered list of rules that route a workload to a queue and set its priority. Rules are evaluated top to bottom; the first match wins (a rule whose priority_policy rejects falls through to the next rule). A workload that matches no rule fails.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| selector | list of match expressions | No | Label rules applied over the workload's labels (all must match). Omit to match every workload (catch-all); put such a rule last. |
| resource_queue | string | Yes | Name of the queue to route matched workloads to. |
| priority_policy | object | No | Priority bounds applied to matched workloads. See priority_policy. |

priority_policy

Sets the bounds and the default for the priority assigned to matching workloads. A higher value means a more important workload; priority drives queue order and preemption.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| default | integer ≥ 0 | No | Priority used when the workload doesn't request one. Must be within [min, max]. If unset, 0. |
| min | integer ≥ 0 | No | Lower bound of the allowed priority range. Must be ≤ max and ≤ default. |
| max | integer ≥ 0 | No | Upper bound of the allowed priority range. |
| on_violation | one of reject, force_update | No | Behavior when a requested priority is outside [min, max]: reject makes this rule fall through to the next; force_update clamps the priority into range and continues. |
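
The two on_violation behaviors can be contrasted with a pair of rules. This is a sketch: the queue names and the team tag are hypothetical, and both queues are assumed to be defined in resource_queues.

```yaml
scheduling_rules:
  # Rule 1: research workloads, priorities 10..100. Out-of-range requests
  # are not accepted here; the rule falls through to the next one.
  - selector:
      - { key: team, operator: in, values: [research] }   # hypothetical custom tag
    resource_queue: research_queue                        # hypothetical queue
    priority_policy: { default: 50, min: 10, max: 100, on_violation: reject }

  # Rule 2: catch-all, priorities 0..20. Out-of-range requests are clamped.
  - resource_queue: default                               # hypothetical queue
    priority_policy: { default: 0, min: 0, max: 20, on_violation: force_update }
```

Reading the table above against this sketch: a research workload that requests no priority gets 50 in research_queue. A research workload that requests 150 violates rule 1, falls through to rule 2, and is routed to default with its priority clamped to 20.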

Match expressions

A match expression is a Kubernetes-style label match, used by the selector on both flavors and scheduling rules. Each is written as { key, operator, values }.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | Label key to match (non-empty). |
| operator | one of in, not_in, exists, does_not_exist | Yes | How to compare the label against values. |
| values | list of string | Conditional | Required and non-empty for in / not_in; must be omitted for exists / does_not_exist. |

All expressions in a single list must match (logical AND).
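
A selector exercising each of the four operators might look like this sketch. The team and experimental keys are hypothetical custom tags; the other keys are reserved keys from this reference.

```yaml
selector:
  - { key: accelerator-type, operator: in, values: [A100, H100] }   # label is one of these
  - { key: market-type, operator: not_in, values: [SPOT] }          # label is none of these
  - { key: team, operator: exists }                                 # label present; no values
  - { key: experimental, operator: does_not_exist }                 # label absent; no values
```

Because the list is a logical AND, a workload has to satisfy all four lines to match.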

Selector keys

A selector (in both resource_flavors and scheduling_rules) matches against a workload's labels. A workload's labels are the custom tags you set on the job, service, or workspace, plus the reserved keys the platform always stamps onto the workload. Keys are matched exactly and are case-sensitive; reserved keys are hyphenated.

Reserved keys (always present; you cannot override them with your own tags):

| Key | Values |
| --- | --- |
| instance-type | The launched instance type (for example m5.large). |
| accelerator-type | The accelerator/GPU type (for example A100, H100, L4). |
| market-type | SPOT or ON_DEMAND. |
| cloud-resource-id | The ID of the cloud resource the workload runs on. |
| workload-type | service, job, or workspace. |

Any custom tag key you set on the workload is also selectable, as long as it does not collide with a reserved key.
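
Reserved keys and custom tags can be mixed in one selector. In the sketch below, cost-center is a hypothetical custom tag and spot_jobs is a hypothetical queue assumed to be defined in resource_queues.

```yaml
scheduling_rules:
  - selector:
      - { key: workload-type, operator: in, values: [job] }          # reserved key
      - { key: market-type, operator: in, values: [SPOT] }           # reserved key
      - { key: cost-center, operator: in, values: [ml-platform] }    # hypothetical custom tag
    resource_queue: spot_jobs
```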

Validation

apply validates your config and returns an error if anything is wrong: unknown keys (rejected at every level), invalid values, and the per-field constraints listed above. It also enforces these cross-field rules:

  • Flavor names are unique across resource_flavors; queue names are unique across resource_queues.
  • A queue's flavor name must reference a defined flavor; a rule's resource_queue must reference a defined queue.
  • covered_resources allows only gpu, cpu, memory_gb, tpu; covered sets are disjoint across a queue's groups (each resource belongs to one group), with no duplicates within a group.
  • A flavor name appears at most once per resource group, and at most once across a queue's groups.
  • A flavor's resources is a subset of its group's covered_resources (no resource outside it, names unique within the flavor), and each nominal_quota is ≥ 0 with at most 3 decimal places. An omitted covered resource — or a listed one with no nominal_quota — is unlimited; 0 blocks it.
  • priority_policy satisfies min ≤ default ≤ max.
  • Selector values are required for in / not_in and must be empty for exists / does_not_exist.
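
To make the rules above concrete, here is a deliberately invalid config with each violation annotated. All names are hypothetical, and the comments describe which rule is broken, not the exact error text that apply returns.

```yaml
# INVALID on purpose: every commented line breaks one of the rules listed above.
resource_flavors:
  - name: gpu_pool
  - name: gpu_pool                     # duplicate flavor name

resource_queues:
  - name: training
    resource_groups:
      - covered_resources: [gpu, memory]   # "memory" is not allowed; use memory_gb
        flavors:
          - name: cpu_pool                 # not defined in resource_flavors
            resources:
              - { name: cpu, nominal_quota: 8 }        # cpu is outside covered_resources
              - { name: gpu, nominal_quota: 1.2345 }   # more than 3 decimal places

scheduling_rules:
  - selector:
      - { key: team, operator: exists, values: [research] }   # values not allowed with exists
    resource_queue: trainning                                 # no queue with this name
    priority_policy: { default: 5, min: 10, max: 100 }        # default is below min
```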

Complete example

# Named compute categories, matched to workloads by label. Order matters within a
# group: flavors are tried in listed order.
resource_flavors:
  - name: gpu_a100_spot
    selector:
      - { key: accelerator-type, operator: in, values: [A100] }
      - { key: market-type, operator: in, values: [SPOT] }
  - name: research_gpu_pool
    selector:
      - { key: team, operator: in, values: [research] } # custom workload tag

resource_queues:
  - name: ml_training
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [gpu, cpu]
        flavors:
          - name: gpu_a100_spot
            resources:
              - { name: gpu, nominal_quota: 64 }
              - { name: cpu, nominal_quota: 512 }
          - name: research_gpu_pool
            resources:
              - { name: gpu, nominal_quota: 16 }
              - { name: cpu, nominal_quota: 128 }

  - name: default # passthrough: no resource_groups -> admit with no quota

scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: ml_training
    priority_policy: { default: 50, min: 0, max: 100, on_violation: reject }
  - resource_queue: default # catch-all: no selector -> matches everything not matched above

Example: cap a specific instance type

Limit a particular instance type to a fixed number of instances, while letting everything else run uncapped. Define a flavor for just that instance type and give it a quota; route everything else to a passthrough queue, which admits without quota.

resource_flavors:
  - name: m5-2xlarge
    selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] } # scopes the cap to m5.2xlarge

resource_queues:
  - name: q-m5-2xlarge-cap3
    preemption: { within_resource_queue: never }
    resource_groups:
      - covered_resources: [cpu, memory_gb]
        flavors:
          - name: m5-2xlarge # capped: 3 instances
            resources:
              - { name: cpu, nominal_quota: 24 } # 3 x 8 vCPU
              - { name: memory_gb, nominal_quota: 96 } # 3 x 32 GiB
  - name: default # passthrough for everything else

scheduling_rules:
  - selector:
      - { key: instance-type, operator: in, values: [m5.2xlarge] }
    resource_queue: q-m5-2xlarge-cap3
    priority_policy: { default: 50, min: 0, max: 100, on_violation: force_update }
  # catch-all (no selector) -> passthrough queue. MUST be last; first match wins.
  - resource_queue: default

Only m5.2xlarge is capped (3 instances' worth), so at most three run at once. Every other workload matches the catch-all rule and is routed to the default passthrough queue, which admits it without quota.

Example: use a capacity reservation

Route reservation-eligible workloads to a specific cloud capacity reservation and cap how much of it they consume. This example pins a4-highgpu-8g instances to a Google Cloud reservation. A flavor carries the reservation binding in advanced_instance_config, and a queue quota caps reserved usage. Opt a workload in by tagging it use-reservation=true.

resource_flavors:
  # a4-highgpu-8g pinned to a specific Google Cloud capacity reservation.
  - name: a4_reserved
    selector:
      - { key: instance-type, operator: in, values: [a4-highgpu-8g] }
    advanced_instance_config:
      instance_properties:
        reservation_affinity:
          key: compute.googleapis.com/reservation-name
          consume_reservation_type: SPECIFIC_RESERVATION
          values: [my-a4-reservation] # replace with your reservation name

resource_queues:
  # Workloads tagged use-reservation=true land here and are capped to the reservation.
  - name: reservation
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: a4_reserved
            resources:
              # 8 GPUs/node x 10 nodes = 80. Caps reserved usage to ~10 a4-highgpu-8g.
              - { name: gpu, nominal_quota: 80 }
  # Everything else. No resource_groups => passthrough: admits everything, no quota.
  - name: default

scheduling_rules:
  - selector:
      - { key: use-reservation, operator: in, values: ["true"] }
    resource_queue: reservation
  - resource_queue: default # no selector = matches all; MUST be last (first match wins)

A workload tagged use-reservation=true routes to the reservation queue, where the a4_reserved flavor binds it to the named Google Cloud reservation and the gpu quota of 80 caps usage at roughly ten a4-highgpu-8g nodes. Everything else falls through to the default passthrough queue and runs uncapped.

The reservation queue defines only the a4_reserved flavor, so a use-reservation=true workload that isn't an a4-highgpu-8g matches no flavor in the queue and fails with "no flavor matched" rather than falling through uncapped.

Caution: Set nominal_quota to match the capacity reservation's size, and never above it. A quota larger than the reservation lets the scheduler admit more reserved workloads than the reservation can hold.