Get started

This guide creates a small scheduler config (an on-demand CPU flavor and one queue with a CPU quota and priority-based preemption), applies it, and routes a team's workloads into it.

note

Alpha. The anyscale scheduler CLI and anyscale.scheduler SDK are in alpha. A config is org-scoped and versioned: applying validates the config, creates a new active version, and never overwrites earlier versions.

Prerequisites

  • The anyscale CLI installed and authenticated, or the anyscale Python SDK.
  • The Org owner role, which is required to manage scheduler configs.

1. Write a config

Create scheduler-config.yaml:

# Define the kinds of compute this org has. Here: on-demand instances.
resource_flavors:
  - name: cpu-on-demand
    selector:
      - { key: market-type, operator: in, values: [ON_DEMAND] }

# A queue with a CPU quota on the on-demand flavor, with within-queue preemption.
resource_queues:
  - name: team-a
    preemption:
      within_resource_queue: lower_priority
    resource_groups:
      - covered_resources: [cpu]
        flavors:
          - name: cpu-on-demand
            resources:
              - { name: cpu, nominal_quota: 64 }

# Route workloads into the team-a queue.
scheduling_rules:
  # Production projects: high priority, may preempt.
  - selector:
      - { key: project, operator: in, values: [prod] }
    resource_queue: team-a
    priority_policy: { default: 100, min: 50, max: 100, on_violation: reject }
  # Everything else from team-a: low priority.
  - selector:
      - { key: team, operator: in, values: [team-a] }
    resource_queue: team-a
    priority_policy: { default: 10, min: 0, max: 49, on_violation: force_update }

This defines a cpu-on-demand flavor (matching on-demand instances), a team-a queue that can use 64 CPUs of that flavor with within-queue preemption, and routes prod-project workloads to high priority (50-100) and other team-a workloads to low priority (0-49). Workloads run on whatever CPU instance their compute config requests (for example, a small on-demand instance), and that usage counts against the cpu-on-demand flavor's quota.

tip

Both resource flavors and scheduling rules use a selector, a list of match expressions over the workload's labels. A full annotated example ships with the CLI at anyscale/scheduler/examples/scheduler-config.yaml.
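To make the selector idea concrete, here is a minimal Python sketch of how a list of match expressions could be evaluated against a workload's labels. It is an illustration, not the scheduler's implementation: the `matches` function is a hypothetical name, only the `in` operator used in this guide is modeled, and it assumes that all expressions in a selector must match.

```python
from typing import Mapping, Sequence, TypedDict


class MatchExpression(TypedDict):
    key: str
    operator: str
    values: Sequence[str]


def matches(selector: Sequence[MatchExpression], labels: Mapping[str, str]) -> bool:
    """Return True if every expression in the selector matches the labels.

    Assumptions (not confirmed by the docs): expressions are ANDed together,
    and only the `in` operator is modeled.
    """
    for expr in selector:
        if expr["operator"] != "in":
            raise ValueError(f"operator not modeled in this sketch: {expr['operator']}")
        # A missing label yields None, which is never in the allowed values.
        if labels.get(expr["key"]) not in expr["values"]:
            return False
    return True


prod_selector = [{"key": "project", "operator": "in", "values": ["prod"]}]
print(matches(prod_selector, {"project": "prod", "team": "team-a"}))  # True
print(matches(prod_selector, {"team": "team-a"}))  # False
```

Under these assumptions an empty selector matches every workload, which is consistent with a rule that has no selector acting as a catch-all.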

2. Apply it

anyscale scheduler config apply -f scheduler-config.yaml
Applied scheduler config (version 1).

apply validates the config in two stages. Schema and value checks run locally. Cross-reference checks (each rule names an existing queue, each flavor quota names an existing flavor) run on the server, and a failure is returned as a 4xx error. If validation passes, the config becomes the active version.
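If you want to catch dangling references before calling apply, a pre-flight check along the following lines may help. This is a sketch, not the server's validation logic: `find_dangling_references` is a hypothetical helper, and it assumes the config has been loaded into a plain dict with the nesting used in scheduler-config.yaml above (flavors listed under each queue's resource_groups).

```python
def find_dangling_references(config: dict) -> list[str]:
    """List references to queues or flavors that the config does not define."""
    flavor_names = {f["name"] for f in config.get("resource_flavors", [])}
    queue_names = {q["name"] for q in config.get("resource_queues", [])}
    problems = []

    # Flavor quotas -> flavors
    for queue in config.get("resource_queues", []):
        for group in queue.get("resource_groups", []):
            for flavor in group.get("flavors", []):
                if flavor["name"] not in flavor_names:
                    problems.append(
                        f"queue {queue['name']!r} references unknown flavor {flavor['name']!r}"
                    )

    # Rules -> queues
    for index, rule in enumerate(config.get("scheduling_rules", [])):
        if rule["resource_queue"] not in queue_names:
            problems.append(
                f"rule #{index} references unknown queue {rule['resource_queue']!r}"
            )
    return problems


config = {
    "resource_flavors": [{"name": "cpu-on-demand"}],
    "resource_queues": [
        {
            "name": "team-a",
            "resource_groups": [
                {"covered_resources": ["cpu"], "flavors": [{"name": "cpu-spot"}]}
            ],
        }
    ],
    "scheduling_rules": [{"selector": [], "resource_queue": "team-b"}],
}
for problem in find_dangling_references(config):
    print(problem)
```

The example config deliberately contains one unknown flavor and one unknown queue, so the loop prints two problems.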

3. Inspect configs

# Active config (default output is yaml)
anyscale scheduler config get

# A specific version, as json
anyscale scheduler config get --version 1 -o json

# Version history, newest first
anyscale scheduler config list
VERSION  CREATED AT
1        2026-06-23T10:00:00Z

In the Python SDK, get_config() returns a SchedulerConfigVersion with the fields version, is_active, created_at, creator_id, and config. Each entry in the version history has version, created_at, and creator_id.

note

The same methods are available on the client object: Anyscale().scheduler.apply_config(...), .get_config(...), .list_config_versions(...). SchedulerConfig helpers: from_yaml(path), from_dict(d), to_dict().

4. Submit a workload

Submit workloads as usual; set a priority (an integer from 0 to 1000) to influence ordering and preemption. For example, in a job config:

# job.yaml
name: train-prod
priority: 100 # within the matched rule's allowed range
tags:
  project: prod # custom tags the scheduler's rules match on
  team: team-a
# entrypoint, compute config, image, …

anyscale job submit -f job.yaml

The job's labels are your custom tags (here project and team) plus the reserved keys the platform attaches (workload-type, instance-type, …). The scheduler evaluates the rules top to bottom; train-prod's project: prod tag matches the first rule, routing it to team-a at priority 100.
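The top-to-bottom evaluation can be sketched in Python as follows. This is a simplified model, not the scheduler's code. The names `route` and `Placement` are hypothetical, and several behaviors are assumptions: the first matching rule wins, a workload without a priority gets the rule's default, `reject` refuses an out-of-range priority, and `force_update` clamps it into the allowed range (the real behavior of `force_update` may differ, for example by resetting to the default).

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# The two scheduling rules from this guide, as plain dicts.
RULES = [
    {
        "selector": [{"key": "project", "operator": "in", "values": ["prod"]}],
        "resource_queue": "team-a",
        "priority_policy": {"default": 100, "min": 50, "max": 100, "on_violation": "reject"},
    },
    {
        "selector": [{"key": "team", "operator": "in", "values": ["team-a"]}],
        "resource_queue": "team-a",
        "priority_policy": {"default": 10, "min": 0, "max": 49, "on_violation": "force_update"},
    },
]


@dataclass
class Placement:
    resource_queue: str
    priority: int


def _selector_matches(selector, labels: Mapping[str, str]) -> bool:
    # Assumption: every expression must match; only `in` is modeled.
    return all(labels.get(expr["key"]) in expr["values"] for expr in selector)


def route(labels: Mapping[str, str], requested_priority: Optional[int] = None) -> Optional[Placement]:
    """Return the placement from the first matching rule, or None if no rule matches."""
    for rule in RULES:
        if not _selector_matches(rule["selector"], labels):
            continue
        policy = rule["priority_policy"]
        low, high = policy["min"], policy["max"]
        if requested_priority is None:
            priority = policy["default"]
        elif low <= requested_priority <= high:
            priority = requested_priority
        elif policy["on_violation"] == "reject":
            raise ValueError(f"priority {requested_priority} is outside [{low}, {high}]")
        else:
            # Assumed meaning of force_update: clamp into the allowed range.
            priority = max(low, min(requested_priority, high))
        return Placement(rule["resource_queue"], priority)
    return None


# train-prod from job.yaml: matches the first rule and keeps priority 100.
print(route({"project": "prod", "team": "team-a"}, requested_priority=100))
# A non-prod team-a job asking for 100 is clamped to 49 under the assumption above.
print(route({"team": "team-a"}, requested_priority=100))
```

Because the prod rule comes first, a job tagged with both project: prod and team: team-a never reaches the second rule in this model.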

5. Update and roll back

apply never overwrites; it adds a version and makes it active, and earlier versions stay available. To roll back, re-apply an older version:

anyscale scheduler config get --version 1 -o yaml > rollback.yaml
anyscale scheduler config apply -f rollback.yaml

note

Queues and flavors are identified by name and aren't edited in place. To change a queue's or flavor's spec, give it a new name (for example team-a:v2) rather than mutating it, so in-flight workloads on the old name are unaffected.
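The append-only versioning described in this step can be modeled with a small in-memory store. This toy class is only a mental model, not the service's implementation: `VersionedConfigStore` is a hypothetical name, and it assumes version numbers are sequential integers starting at 1.

```python
import copy
from typing import Optional


class VersionedConfigStore:
    """Toy model: apply appends a version and never overwrites earlier ones."""

    def __init__(self) -> None:
        self._versions: list[dict] = []

    @property
    def active_version(self) -> int:
        # The most recently applied version is the active one (0 means none).
        return len(self._versions)

    def apply(self, config: dict) -> int:
        self._versions.append(copy.deepcopy(config))
        return self.active_version

    def get(self, version: Optional[int] = None) -> dict:
        number = self.active_version if version is None else version
        if not 1 <= number <= len(self._versions):
            raise LookupError(f"no such version: {number}")
        return copy.deepcopy(self._versions[number - 1])


store = VersionedConfigStore()
store.apply({"resource_queues": [{"name": "team-a"}]})     # version 1
store.apply({"resource_queues": [{"name": "team-a:v2"}]})  # version 2
rolled_back = store.apply(store.get(version=1))            # rollback creates version 3
print(rolled_back, store.get() == store.get(version=1))    # 3 True
```

In this model a rollback is just another apply: version 2 remains retrievable, and the active version carries the same content as version 1.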

Use a capacity reservation

Capacity reservations are configured in your compute config, per node type, through the advanced instance config. See the compute configuration docs for AWS and Google Cloud for the exact YAML (CapacityReservationSpecification on AWS, reservationAffinity on GCP).

You can also attach a reservation to a resource flavor through its advanced_instance_config, using the same YAML. When the scheduler assigns that flavor to a workload, it merges the flavor's config into the workload's launch request, so the workload runs against the reservation. A queue quota on that flavor then caps how much of the reservation the queue can consume. Where the flavor and the compute config set the same key, the flavor wins. See Example: use a capacity reservation.
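The "flavor wins" merge can be pictured with a short Python sketch. It is illustrative only: `merge_flavor_config` is a hypothetical helper, the reservation IDs are placeholders, and merging nested mappings key by key is an assumption (the actual merge may replace whole top-level keys instead).

```python
import copy


def merge_flavor_config(compute_config: dict, flavor_config: dict) -> dict:
    """Merge flavor_config over compute_config; the flavor wins on conflicts.

    Assumption: nested dicts are merged key by key; other values are replaced.
    """
    merged = copy.deepcopy(compute_config)
    for key, value in flavor_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_flavor_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder values; see the compute configuration docs for the exact YAML.
compute_config = {
    "EbsOptimized": True,
    "CapacityReservationSpecification": {
        "CapacityReservationTarget": {"CapacityReservationId": "cr-from-compute-config"}
    },
}
flavor_config = {
    "CapacityReservationSpecification": {
        "CapacityReservationTarget": {"CapacityReservationId": "cr-from-flavor"}
    }
}

launch_request = merge_flavor_config(compute_config, flavor_config)
print(launch_request["CapacityReservationSpecification"])
print(launch_request["EbsOptimized"])
```

Keys that only the compute config sets (here EbsOptimized) survive the merge, while the reservation target comes from the flavor.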

Run on Kubernetes with Kueue or KAI

On Kubernetes, the Anyscale scheduler works alongside an in-cluster queue, either Kueue or the NVIDIA KAI Scheduler. Anyscale generates the required manifests in your data plane and launches pods labeled for your in-cluster queue, which handles all queueing and prioritization. You add the queue label to your cluster's pod config. See Integrate with Kueue.

You can also drive the in-cluster queue from a resource flavor: give the flavor an advanced_instance_config carrying the pod metadata for your queue (for example a kueue.x-k8s.io/queue-name label under metadata.labels). When the scheduler assigns that flavor to a workload, it patches the flavor's metadata into the workload's pods, so the flavor a workload lands on decides which Kueue or KAI queue its pods join.

Good to know

  • No config = passthrough. With no config (or an empty one), every workload is admitted immediately. Once a config exists, add an explicit catch-all rule (a rule with no selector) for workloads you still want admitted.
  • Block a class of workloads by routing them to a queue whose relevant flavor has nominal_quota: 0.
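To see why a zero quota blocks a class of workloads, consider this deliberately simplified admission check in Python. It is a sketch under stated assumptions, not the scheduler's admission logic: `fits_quota` is a hypothetical helper, and it ignores preemption and any other mechanism that might affect admission.

```python
def fits_quota(requested_cpu: float, used_cpu: float, nominal_quota: float) -> bool:
    """Return True if the request fits within the flavor's remaining quota.

    Simplifying assumption: admission depends only on nominal_quota.
    """
    return used_cpu + requested_cpu <= nominal_quota


# team-a from this guide: 64 CPUs of cpu-on-demand.
print(fits_quota(requested_cpu=16, used_cpu=48, nominal_quota=64))  # True
print(fits_quota(requested_cpu=17, used_cpu=48, nominal_quota=64))  # False
# A flavor with nominal_quota: 0 never has room for a workload that needs CPU.
print(fits_quota(requested_cpu=1, used_cpu=0, nominal_quota=0))     # False
```

Under this model, any workload that requests a positive amount of the resource can never be admitted to a queue whose quota for that flavor is 0.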

Next steps