What is the global resource scheduler?
The global resource scheduler maximizes utilization and maintains fairness while assigning compute resources from Anyscale machine pools to jobs, workspaces, and services. The global resource scheduler controls the following behaviors for workloads configured to use machine pools:
- Allocates machines from the machine pool to a workload.
- Schedules workloads based on priority, required resources, time, and workload type.
- Evicts machines being used by other workloads to make room for an incoming workload.
- Queues workloads until machines are available.
See Share reserved compute resources with Anyscale machine pools.
How does the global resource scheduler work?
The global resource scheduler is a service in the Anyscale control plane that manages scheduling for machine pool resources.
When a workload requests machine pool instances, the global resource scheduler evaluates the request based on user-defined rules. It then makes a scheduling decision based on workload priority and available resources.
You configure scheduling priority rules on your machine pools, then set complementary rules in the compute configs for your workloads. For example, you can configure the global resource scheduler to reserve up to 25% of the GPUs in a machine pool for development work, but let production jobs or services expand to 100% of GPU capacity when development resources aren't in use.
How does Anyscale queue workloads?
If the machine pool doesn't have sufficient machines for a workload, the global resource scheduler places the workload in a first-in, first-out (FIFO) queue. The workload state reports as `STARTING` or `PENDING`. Workloads can bypass the queue if they're runnable while earlier submissions aren't.
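To check where a queued workload stands, you can inspect its state from the CLI. The following sketch assumes the `anyscale job status` command and uses a placeholder job name:

```shell
# Report the current state of a submitted job. A job waiting on machine pool
# capacity reports STARTING or PENDING until the scheduler assigns machines.
anyscale job status --name my-queued-job
```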
The scheduler preempts running jobs when a job with higher priority requires resources. When preempted, a job enters the queue with the following behaviors:
- The scheduler enqueues the job using its start time. Because the job was previously running, this typically places it at the front of the queue relative to other jobs of the same priority.
- The job attempts to acquire the resources defined by the `min_nodes` parameter.
- The `workload_recovering_timeout` parameter defines how long the job continues trying to acquire resources. If the job can't acquire enough resources within that window, it terminates due to timeout.
- If the `max_retries` parameter is greater than 0, the scheduler adds a new job to the end of the queue with a new workload start time (see the sketch after this list).
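As a rough sketch of how the retry behavior fits together, the following job configuration permits one automatic re-queue after a preemption-related timeout. The compute config name is a placeholder, and the values are illustrative only:

```yaml
# Illustrative job config. If the job times out while recovering from
# preemption, max_retries > 0 lets the scheduler add a new copy of the job
# to the end of the queue with a new workload start time.
name: nightly-training
entrypoint: python train.py
compute_config: reserved-pool-compute  # compute config that defines min_nodes and timeout flags
max_retries: 1
```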
Set a workload timeout
Anyscale respects the `workload_recovering_timeout` and `workload_starting_timeout` parameters when scheduling with machine pools.
Set timeout flags in the compute configuration to help manage queuing and recovery, as in the following example:
```yaml
flags:
  # If the cluster doesn't satisfy min_resources for > 5 minutes while
  # in a RUNNING state, then terminate the workload (and re-queue
  # it if it's a job).
  workload_recovering_timeout: 5m
  # If the cluster doesn't satisfy min_resources for > 24h while in
  # a STARTING state, then terminate the workload.
  workload_starting_timeout: 24h
```
These timeouts apply only to jobs and workspaces, not to services.
Fall back to cloud capacity
You can optionally configure the global resource scheduler to fall back to spot or on-demand cloud capacity.
Anyscale recommends configuring workloads using the following ranking strategy for machines:
- Machine pool instances
- Spot instances
- On-demand instances
This configuration maximizes utilization for reserved instances while permitting additional dynamic resource allocation from the cloud.
Use the `instance_ranking_strategy` flag to configure this behavior.
The following example defines two GPU worker node groups, one backed by a machine pool and one that falls back to cloud capacity, and ranks the resulting capacity sources to favor the machine pool first:
```yaml
cloud: CLOUD_NAME
head_node:
  instance_type: "m5.2xlarge"
worker_nodes:
  - name: reserved-H100
    instance_type: RESERVED-H100
    min_nodes: 0
    max_nodes: 2
    cloud_deployment:
      machine_pool: reserved-capacity
  - name: fallback-H100
    min_nodes: 0
    max_nodes: 2
    instance_type: "p5.48xlarge"
    market_type: PREFER_SPOT
flags:
  min_resources:
    GPU: 8
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - reserved-H100
          - fallback-H100/spot
          - fallback-H100/on-demand
  enable_replacement: true
  replacement_threshold: 30m
```
Because cloud GPU capacity is often limited, spot and on-demand fallback configurations are typically most practical for CPU-only worker groups.
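For example, a CPU-only fallback configuration can mirror the GPU example above. The following sketch shows only the worker node groups and ranking flags; group names, instance types, and node counts are placeholders:

```yaml
worker_nodes:
  - name: reserved-cpu
    instance_type: RESERVED-CPU        # machine pool instance type (placeholder)
    min_nodes: 0
    max_nodes: 10
    cloud_deployment:
      machine_pool: reserved-capacity
  - name: fallback-cpu
    instance_type: "m5.2xlarge"
    min_nodes: 0
    max_nodes: 10
    market_type: PREFER_SPOT           # try spot first, then fall back to on-demand
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - reserved-cpu               # machine pool capacity first
          - fallback-cpu/spot          # then cloud spot capacity
          - fallback-cpu/on-demand     # then cloud on-demand capacity
```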
Machine pool observability
Use the following Anyscale CLI syntax to describe your machine pool:
```shell
anyscale machine-pool describe --name <machine-pool-name>
```
Returned details can help you debug scheduling behavior, view pending machine requests, and view recent cloud instance launch failures.
When the global resource scheduler evicts a machine from a workload, it sends an eviction notice to the impacted workload. Eviction notices detail the cloud, project, user, and workload name that caused the eviction event. You can use these details to further tune priority ranking.
Scheduling rules
Scheduling rules use Kubernetes label selector syntax. The following labels are available:
| Label | Description |
|---|---|
| `workload-type` | Either `job`, `service`, or `workspace`. |
| `cloud` | The name of an Anyscale cloud. |
| `project` | The name of a project in an Anyscale cloud. |
The following table contains examples of using this syntax:
| Objective | Example selector syntax |
|---|---|
| Select workloads of type `workspace` in the cloud named `dev-cloud` | `workload-type in (workspace), cloud in (dev-cloud)` |
| Select workloads of type `job` | `workload-type=job` |
| Select workloads of type `job` or `service` | `workload-type in (job,service)` |
| Select all workloads (all workloads have a `workload-type` label) | `workload-type` |
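As an illustration of how these selectors relate to the capacity-sharing example earlier on this page, the following sketch shows one possible rule shape. Only the selector strings follow the documented syntax; the surrounding structure and the `max_gpu_fraction` field are hypothetical placeholders, not the actual machine pool rule schema:

```yaml
# Hypothetical rule structure. Only the selector values use the documented
# Kubernetes label selector syntax.
rules:
  - name: dev-workspaces
    selector: "workload-type in (workspace), cloud in (dev-cloud)"
    max_gpu_fraction: 0.25   # hypothetical field: cap development usage at 25% of pool GPUs
  - name: production
    selector: "workload-type in (job,service)"
    max_gpu_fraction: 1.0    # hypothetical field: production can expand to full capacity
```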