Integrate the Anyscale operator with Kueue Scheduler

Note: Kueue integration for Anyscale on Kubernetes is in developer preview.

This guide explains how to integrate your Anyscale Kubernetes deployments with Kueue to leverage Kubernetes-native queueing and gang scheduling capabilities.

Prerequisites

Before you begin, ensure that:

You've already configured the Kubernetes cluster with Kueue.
You've created at least one Kueue queue, for example, user-queue.

Configuring Anyscale to use Kueue

In the compute config, use the cluster-level advanced configuration to tell Anyscale which Kueue queue to use.

While it's possible to set Kueue configurations at the worker group level, doing so breaks gang scheduling guarantees if worker groups with a nonzero minimum pod count belong to different queues. All worker groups with a nonzero minimum pod count must belong to the same Kueue queue to ensure the cluster becomes healthy with the minimum expected resources. Worker groups can belong to other Kueue queues if they have a minimum of 0 pods, that is, if they're used only for autoscaling.
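The same-queue rule above can be checked before launching a cluster. The following is a minimal sketch, not part of the Anyscale API: the worker group dictionaries (with `name`, `min_count`, and `labels` keys), the `default_queue` fallback, and the function names are hypothetical and only illustrate the rule.

```python
from typing import Optional

QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def queues_for_min_groups(worker_groups: list, default_queue: Optional[str] = None) -> set:
    """Collect the queues used by worker groups with a nonzero minimum pod count.

    Each worker group is a hypothetical dict such as
    {"name": "gpu", "min_count": 2, "labels": {QUEUE_LABEL: "user-queue"}}.
    Groups without their own queue label fall back to default_queue.
    """
    queues = set()
    for group in worker_groups:
        if group.get("min_count", 0) > 0:
            labels = group.get("labels") or {}
            queues.add(labels.get(QUEUE_LABEL, default_queue))
    return queues


def check_single_queue(worker_groups: list, default_queue: Optional[str] = None) -> None:
    """Raise ValueError if groups with min_count > 0 span more than one queue."""
    queues = queues_for_min_groups(worker_groups, default_queue)
    if len(queues) > 1:
        raise ValueError(
            f"Worker groups with min_count > 0 use multiple queues: {sorted(map(str, queues))}"
        )
```

Groups with a minimum of 0 pods are ignored by the check, so they can point at any queue.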

Set the queue label

Specify the Kueue queue by adding the following label to your advanced configuration:

{
  "metadata": {
    "labels": {
      "kueue.x-k8s.io/queue-name": "user-queue"
    }
  }
}

Replace user-queue with the name of your queue.
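If you generate compute configs programmatically, the same fragment can be built from a queue name. This is a small sketch; the helper name `queue_label_config` is hypothetical, and only the label key and JSON shape come from the example above.

```python
import json


def queue_label_config(queue_name: str) -> dict:
    """Build the advanced configuration fragment that sets the Kueue queue label."""
    if not queue_name:
        raise ValueError("queue_name must be a non-empty string")
    return {
        "metadata": {
            "labels": {
                "kueue.x-k8s.io/queue-name": queue_name,
            }
        }
    }


# Print the fragment as JSON so it can be pasted into the advanced configuration.
print(json.dumps(queue_label_config("user-queue"), indent=2))
```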

How Anyscale works with Kueue

When you configure the queue label kueue.x-k8s.io/queue-name, Anyscale automatically integrates resource scheduling requests with Kueue in the following manner:

Batching requests:
Anyscale groups minimum-resource scaling requests (min_count) into a single batch request (gang scheduling).
Pod grouping:
Anyscale sets two critical values on pods to enable gang scheduling in Kueue:
- Label: kueue.x-k8s.io/pod-group-name. Set to {cluster-id}-min.
- Annotation: kueue.x-k8s.io/pod-group-total-count. Set to the total number of pods in the scheduled batch.
Automatic pod specifications:
Pods generated by Anyscale automatically include these required labels and annotations, in addition to maintaining user-specified queue labels.
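To make the resulting pod metadata concrete, the sketch below assembles the labels and annotations described above for one pod in the minimum-resource batch. It only illustrates the expected shape and is not Anyscale's implementation: the function name, the example cluster ID, and the way `batch_size` is supplied are assumptions. The count is written as a string because Kubernetes annotation values are strings.

```python
def min_batch_pod_metadata(cluster_id: str, queue_name: str, batch_size: int) -> dict:
    """Return the Kueue-related metadata expected on a pod in the minimum batch.

    batch_size is the total number of pods in the scheduled batch; how Anyscale
    computes it is not specified here, so the caller passes it in.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "labels": {
            # User-specified queue label, preserved on the pod.
            "kueue.x-k8s.io/queue-name": queue_name,
            # Pod group name follows the {cluster-id}-min pattern.
            "kueue.x-k8s.io/pod-group-name": f"{cluster_id}-min",
        },
        "annotations": {
            "kueue.x-k8s.io/pod-group-total-count": str(batch_size),
        },
    }


# Hypothetical cluster ID and batch size, for illustration only.
metadata = min_batch_pod_metadata("cluster-abc123", "user-queue", 4)
```

Every pod in the batch carries the same group name and total count, which is what lets Kueue treat the batch as a single scheduling unit.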

Managed labels and annotations

Anyscale explicitly manages the following labels and annotations required by Kueue:

Type	Key	Description
Label	`kueue.x-k8s.io/pod-group-name`	Identifier for a group of pods scheduled together.
Annotation	`kueue.x-k8s.io/pod-group-total-count`	Total number of pods in the scheduled pod group.

You can set additional labels or annotations described in the Kueue documentation as needed.

Autoscaling behavior

For autoscaling requests, that is, non-minimum resource requests, Anyscale explicitly leaves these values unset:

Label: kueue.x-k8s.io/pod-group-name
Annotation: kueue.x-k8s.io/pod-group-total-count

This approach causes each autoscaled pod to become an independent scheduling unit in the Kueue queue.
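When inspecting pods, the presence or absence of these two keys distinguishes a pod in the minimum batch from an autoscaled pod. The helper below is a hypothetical sketch that operates on a plain dictionary shaped like a pod's `metadata` field; it doesn't call the Kubernetes API.

```python
GROUP_LABEL = "kueue.x-k8s.io/pod-group-name"
COUNT_ANNOTATION = "kueue.x-k8s.io/pod-group-total-count"


def is_gang_scheduled(pod_metadata: dict) -> bool:
    """Return True if the pod carries both pod-group keys, False otherwise.

    Autoscaled pods leave both keys unset, so each one is queued independently.
    """
    labels = pod_metadata.get("labels") or {}
    annotations = pod_metadata.get("annotations") or {}
    return GROUP_LABEL in labels and COUNT_ANNOTATION in annotations


# An autoscaled pod keeps only the user-specified queue label.
autoscaled_pod = {"labels": {"kueue.x-k8s.io/queue-name": "user-queue"}}
assert not is_gang_scheduled(autoscaled_pod)
```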

Limitations and notes

Resource constraint limitations:
When integrating Anyscale with Kueue, the Ray cluster's compute configuration can't include resource-based minimum flags, such as min_cpu or min_memory. Anyscale only supports min_count. Any update to min_count requires relaunching the cluster.
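A pre-launch check for this limitation could look like the following sketch. It assumes the compute config has been loaded into nested dictionaries and lists, and it only looks for the two flags named above; the layout, the function name, and the flag set are assumptions, so extend `UNSUPPORTED_MIN_FLAGS` with any other resource-based minimums your configs use.

```python
# Only the flags named in this guide; not an exhaustive list.
UNSUPPORTED_MIN_FLAGS = {"min_cpu", "min_memory"}


def find_unsupported_min_flags(config, path: str = "") -> list:
    """Recursively collect the paths of resource-based minimum flags in a config."""
    found = []
    if isinstance(config, dict):
        for key, value in config.items():
            child_path = f"{path}.{key}" if path else str(key)
            if key in UNSUPPORTED_MIN_FLAGS:
                found.append(child_path)
            found.extend(find_unsupported_min_flags(value, child_path))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            found.extend(find_unsupported_min_flags(item, f"{path}[{index}]"))
    return found


# Hypothetical compute config layout, for illustration only.
example = {"worker_groups": [{"name": "cpu", "min_count": 2, "min_cpu": 8}]}
problems = find_unsupported_min_flags(example)
```

An empty result means none of the listed flags are present; it doesn't prove the config is otherwise valid for Kueue.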
