---
title: "Anyscale scheduler"
description: "Understand the Anyscale scheduler and how admins share pooled compute across teams with quotas, routing, and priority."
---

# Anyscale scheduler

The **Anyscale scheduler** acquires compute from your compute estate and distributes it across your Ray clusters. It decides which workloads get compute, in what order, and on which resources, so a fixed pool of expensive accelerators can be shared across many teams with predictable priority and fairness.

An admin writes one declarative **scheduler config**; developers keep submitting jobs, services, and workspaces as they always have, optionally with a priority.

The scheduler shares compute across Anyscale **jobs, services, and workspaces**, and supports a mix of compute types (**capacity reservations**, **spot instances**, and **on-demand instances**) that you model as [resource flavors](#resource-flavor). It works across:

-   **VM-based clouds** on **AWS** or **Google Cloud**
-   **Kubernetes-based clouds** on **Amazon EKS**, **Google GKE**, **Azure AKS**, or **neoclouds**

A single resource queue can pool capacity across these clouds and clusters at once, so workloads fall back across reservations, spot, on-demand, and providers according to the quotas you set.

The scheduler configuration is **modeled after [Kueue](https://kueue.sigs.k8s.io/)**, Kubernetes' job-queueing system, so resource flavors and queues will look familiar if you've used it. See [How this maps to Kueue](#how-this-maps-to-kueue).

:::note
**Alpha.** The scheduler is in alpha. Names and fields may change.
:::

## Who it's for

| Role | What they do |
| --- | --- |
| **Platform / org admins** | Author the scheduler config: flavors, queues, quotas, scheduling rules, priority. |
| **Developers** | Submit workloads with a compute config, custom tags, and an optional priority. They don't touch the scheduler config. |

## Core concepts

A scheduler config has three main sections (all optional):

```yaml
resource_flavors:   # the kinds of compute you have
resource_queues:    # pools with quotas that hold workloads
scheduling_rules:   # how workloads are routed to queues and prioritized
```

### Resource flavor

A named category of compute, defined by a **`selector`** over the workload's labels: for example, `spot` vs. `on-demand`, or a specific GPU reservation. A workload's usage counts against the quota of the flavor it's assigned to.

```yaml
resource_flavors:
  - name: spot
    selector:
      - { key: market-type, operator: in, values: [SPOT] }
```

Within a queue's resource group, flavors are tried in listed order and a workload takes the first matching flavor that has quota, so list more specific flavors first. If no flavor matches, the workload is rejected.
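
The ordering rule can be modeled in a few lines of Python. This is an illustrative sketch, not the scheduler's implementation: the `Flavor` class, the `pick_flavor` function, the `accelerator` label, and the `ON_DEMAND` value are hypothetical, the selector is reduced to exact label equality, and a workload that matches a flavor with no remaining quota is assumed to wait.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Flavor:
    name: str
    required_labels: dict          # simplified stand-in for a full selector
    nominal_quota: Optional[int]   # None models a resource with no nominal_quota
    used: int = 0


def pick_flavor(labels: dict, requested: int, flavors: list) -> Tuple[str, Optional[str]]:
    """Try flavors in listed order; take the first match that has quota."""
    matched_any = False
    for flavor in flavors:
        if any(labels.get(k) != v for k, v in flavor.required_labels.items()):
            continue  # selector does not match this workload
        matched_any = True
        if flavor.nominal_quota is None or flavor.used + requested <= flavor.nominal_quota:
            return ("admit", flavor.name)
    # Assumption: a match without quota waits; no match at all is rejected.
    return ("wait", None) if matched_any else ("reject", None)


# The more specific flavor is listed first, as recommended above.
flavors = [
    Flavor("h100-reservation", {"accelerator": "H100", "market-type": "ON_DEMAND"}, 8, used=8),
    Flavor("on-demand", {"market-type": "ON_DEMAND"}, 16),
]
workload = {"accelerator": "H100", "market-type": "ON_DEMAND"}
print(pick_flavor(workload, 4, flavors))  # reservation is full, so it falls back to on-demand
```

If the broader `on-demand` flavor were listed first, the reservation would never be tried for this workload, which is why order matters.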

### Resource queue

A named pool that holds workloads and caps their usage with **quotas**. A queue's quota is grouped by the resources it covers; for each flavor you set a **`nominal_quota`** per resource.

```yaml
resource_queues:
  - name: research
    preemption:
      within_resource_queue: lower_priority   # higher-priority work can evict lower
    resource_groups:
      - covered_resources: [gpu]
        flavors:
          - name: spot
            resources:
              - { name: gpu, nominal_quota: 64 }
```

Within a flavor, a covered resource with no `nominal_quota` is **unlimited**; set a `nominal_quota` to cap it, or `0` to block it.

A queue with no `resource_groups` is a **`passthrough`** queue: it admits everything routed to it with no quota. With no config at all, every workload is admitted immediately.
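
As a rough mental model, the three quota cases (unset, capped, zero) reduce to one comparison. The `fits_quota` helper below is hypothetical and only restates the rules above; it is not an Anyscale API.

```python
from typing import Optional


def fits_quota(requested: int, used: int, nominal_quota: Optional[int]) -> bool:
    """Return True if `requested` more units fit under the quota.

    `nominal_quota=None` models a covered resource with no nominal_quota.
    """
    if nominal_quota is None:
        return True                      # unlimited
    return used + requested <= nominal_quota


print(fits_quota(requested=8, used=60, nominal_quota=None))  # True: unlimited
print(fits_quota(requested=8, used=60, nominal_quota=64))    # False: 68 > 64
print(fits_quota(requested=1, used=0, nominal_quota=0))      # False: 0 blocks the resource
```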

### Scheduling rule

Routes an incoming workload to a queue and sets its priority. Each rule has a **`selector`** over the workload's labels, a target **`resource_queue`**, and an optional **`priority_policy`**. Rules are evaluated top to bottom and the **first match wins**; a rule with no selector is a catch-all.

```yaml
scheduling_rules:
  - selector:
      - { key: team, operator: in, values: [research] }
    resource_queue: research
    priority_policy: { default: 50, min: 0, max: 100, on_violation: reject }
  - resource_queue: research        # catch-all for everything else
    priority_policy: { default: 0 }
```

`priority_policy` supplies a `default` priority and bounds requests to `[min, max]`. When a requested priority is out of range, `on_violation: reject` skips the rule and falls through to the next one, while `on_violation: force_update` clamps the priority into range. Priority is a non-negative integer; higher is more important.
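
The routing and priority behavior described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the scheduler's code: `selector_matches`, `resolve_priority`, and `route` are hypothetical names, the matcher supports only the `in` operator, a missing `default` is assumed to be `0`, and any `on_violation` value other than `force_update` is treated as `reject`.

```python
from typing import Optional, Tuple


def selector_matches(labels: dict, selector: list) -> bool:
    """Simplified matcher: `in` only. An empty selector matches everything."""
    return all(
        expr["operator"] == "in" and labels.get(expr["key"]) in expr["values"]
        for expr in selector
    )


def resolve_priority(requested: Optional[int], policy: dict) -> Optional[int]:
    """Return the effective priority, or None if the rule should be skipped."""
    if requested is None:
        return policy.get("default", 0)  # assumption: 0 when no default is set
    lo, hi = policy.get("min"), policy.get("max")
    below = lo is not None and requested < lo
    above = hi is not None and requested > hi
    if not (below or above):
        return requested
    if policy.get("on_violation") == "force_update":
        return lo if below else hi       # clamp into [min, max]
    return None                          # reject: fall through to the next rule


def route(labels: dict, requested: Optional[int], rules: list) -> Optional[Tuple[str, int]]:
    """Evaluate rules top to bottom; the first rule that accepts the workload wins."""
    for rule in rules:
        if not selector_matches(labels, rule.get("selector", [])):
            continue
        priority = resolve_priority(requested, rule.get("priority_policy", {}))
        if priority is not None:
            return (rule["resource_queue"], priority)
    return None                          # no rule matched


rules = [
    {
        "selector": [{"key": "team", "operator": "in", "values": ["research"]}],
        "resource_queue": "research",
        "priority_policy": {"default": 50, "min": 0, "max": 100, "on_violation": "reject"},
    },
    {"resource_queue": "research", "priority_policy": {"default": 0}},  # catch-all
]
print(route({"team": "research"}, None, rules))  # ('research', 50)
print(route({"team": "research"}, 500, rules))   # ('research', 500), via the catch-all
```

In the second call the first rule skips the out-of-range request, and the catch-all has no bounds, so the requested priority passes through unchanged. Adding `min` and `max` to the catch-all closes that gap.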

:::note
**Flavor and rule selectors:** both resource flavors and scheduling rules use a **`selector`** over the workload's labels. Each is a list of match expressions: `{ key, operator, values }` with operators `in`, `not_in`, `exists`, `does_not_exist`.
:::
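
To make the four operators concrete, here is a minimal Python sketch of selector matching. It assumes Kubernetes-style semantics, which this page does not confirm: all expressions in a selector must match, and `not_in` also matches when the key is absent. The function names are hypothetical.

```python
def matches_expression(labels: dict, expr: dict) -> bool:
    """Evaluate one { key, operator, values } match expression."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "in":
        return key in labels and labels[key] in values
    if op == "not_in":
        # Assumption: an absent key satisfies not_in, as in Kubernetes.
        return key not in labels or labels[key] not in values
    if op == "exists":
        return key in labels
    if op == "does_not_exist":
        return key not in labels
    raise ValueError(f"unknown operator: {op!r}")


def matches_selector(labels: dict, selector: list) -> bool:
    """Assumption: expressions are ANDed; an empty selector matches everything."""
    return all(matches_expression(labels, expr) for expr in selector)


labels = {"team": "research", "market-type": "SPOT"}
selector = [
    {"key": "market-type", "operator": "in", "values": ["SPOT"]},
    {"key": "gpu-reservation", "operator": "does_not_exist"},
]
print(matches_selector(labels, selector))  # True
```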

### Preemption

A queue with `preemption.within_resource_queue: lower_priority` lets a higher-priority workload evict lower-priority ones in the same queue to free quota.
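
One way to picture within-queue preemption is as a victim-selection step. The sketch below is hypothetical throughout: this page does not say how victims are chosen, so evicting the lowest-priority workloads first, and preempting nothing when eviction cannot free enough quota, are both assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Running:
    name: str
    priority: int
    gpus: int


def pick_victims(incoming_priority: int, gpus_needed: int, free_gpus: int,
                 running: List[Running]) -> Optional[List[str]]:
    """Return names to evict, [] if nothing needs evicting, or None if impossible."""
    shortfall = gpus_needed - free_gpus
    if shortfall <= 0:
        return []
    # Only strictly lower-priority workloads are eligible.
    candidates = sorted(
        (w for w in running if w.priority < incoming_priority),
        key=lambda w: w.priority,        # assumption: lowest priority goes first
    )
    victims = []
    for w in candidates:
        victims.append(w.name)
        shortfall -= w.gpus
        if shortfall <= 0:
            return victims
    return None                          # cannot free enough quota; evict nothing


running = [Running("a", 10, 4), Running("b", 50, 8), Running("c", 90, 8)]
print(pick_victims(incoming_priority=60, gpus_needed=8, free_gpus=2, running=running))  # ['a', 'b']
```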

See the [Config reference](/admin/scheduler/config-reference.md) for every field.

## How scheduling works

1.  A user submits a workload; Anyscale attaches labels to it.
2.  The scheduler evaluates **scheduling rules** top to bottom and routes the workload to a queue with an assigned priority (first match wins; if no rule matches, the workload fails).
3.  The workload waits in the queue, ordered by **priority** (highest first), then **FIFO** by **submission time** (oldest first within the same priority).
4.  When a matching flavor has **quota**, the workload is **admitted** and runs, preempting lower-priority work first if the queue allows it.
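
The queue ordering in step 3 amounts to a two-part sort key. A minimal Python sketch, using a hypothetical `Pending` record with a numeric `submitted_at` timestamp:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pending:
    name: str
    priority: int
    submitted_at: float   # e.g. seconds since epoch; smaller means older


def queue_order(pending: List[Pending]) -> List[Pending]:
    """Highest priority first; oldest submission first within the same priority."""
    return sorted(pending, key=lambda w: (-w.priority, w.submitted_at))


pending = [
    Pending("nightly-eval", priority=10, submitted_at=100.0),
    Pending("prod-train", priority=50, submitted_at=300.0),
    Pending("ablation", priority=10, submitted_at=50.0),
]
print([w.name for w in queue_order(pending)])
# ['prod-train', 'ablation', 'nightly-eval']
```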

## How this maps to Kueue

The config mirrors [Kueue](https://kueue.sigs.k8s.io/), so admins who are familiar with Kueue may recognize it:

| Anyscale scheduler | Kueue |
| --- | --- |
| `resource_flavors` | `ResourceFlavor` |
| `resource_queues` | `ClusterQueue` |
| match expressions | label `matchExpressions` |
| `scheduling_rules` | _(Anyscale-specific)_: centralized routing, instead of users picking a queue |

## What's available

The scheduler supports the following:

-   Author, validate, and version configs through [CLI / SDK](/admin/scheduler/get-started.md) and the UI (config, stats, events).
-   One Kueue-compatible config for both VMs and Kubernetes.
-   Declarative compute configs with free pod shapes.
-   Routing with multi-queue and **flavor mixing**; custom resource types and tags.
-   **Integer priorities** and **within-queue preemption**.
-   **Fixed quotas** per queue and flavor.
-   **Capacity reservations** through advanced instance config.
-   **Multi-cloud / multi-cluster**: adding a cloud only requires adding a cloud resource; workloads can target any or all clouds.
-   **Kueue and KAI pass-through** on Kubernetes.
-   Observability: queue view, basic stats, real-time events.

## Next steps

-   [Get started](/admin/scheduler/get-started.md): write and apply your first config with the CLI or SDK.
-   [Config reference](/admin/scheduler/config-reference.md): every field, with an annotated example.