Instance ranking

When you use the global resource min/max feature, some workloads need to prioritize specific worker groups over others. For example, you may want to use on-demand instances backed by capacity reservations before scaling out to spot instances, or to fill one node type before another. Instance ranking lets you specify a ranking order for worker groups so that the autoscaler uses your preferred, typically more cost-effective, capacity first.

Info: This feature requires Ray 2.11+.

Usage

The instance ranking feature requires the Anyscale CLI or the Anyscale Python SDK, together with a compute config YAML definition file. In that file, add a top-level flags section that sets the ranking order and the selection strategy:

flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - worker-node-type-2
          - worker-node-type-1
          - worker-node-type-0
          - worker-node-type-5
  instance_selection_strategy: force_smallest_fit | relaxed

instance_ranking_strategy

The instance_ranking_strategy field takes a list of ranker definitions. Anyscale provides one ranker type: custom_group_order.

Under group_order, list the worker group names in the order the workload should prefer them, most preferred first.
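
A mistyped name in group_order is easy to miss, so it can help to check the list against the worker groups defined in the same file before creating the compute config. The following is a minimal sketch of such a check. The helper check_group_order is hypothetical and not part of the Anyscale CLI or SDK, and it assumes the compute config has already been loaded into a Python dict.

```python
def check_group_order(compute_config: dict) -> list[str]:
    """Return worker group names that are defined but not ranked.

    Raises ValueError if group_order names a worker group that is not
    defined under worker_node_types.
    """
    defined = [g["name"] for g in compute_config.get("worker_node_types", [])]
    rankers = compute_config.get("flags", {}).get("instance_ranking_strategy", [])

    # Collect the names listed by every custom_group_order ranker.
    ranked: list[str] = []
    for ranker in rankers:
        if ranker.get("ranker_type") != "custom_group_order":
            continue
        ranked.extend(ranker.get("ranker_config", {}).get("group_order", []))

    unknown = [name for name in ranked if name not in defined]
    if unknown:
        raise ValueError(f"group_order references undefined worker groups: {unknown}")

    return [name for name in defined if name not in ranked]


# Hypothetical config: two ranked GPU groups and one unranked CPU group.
config = {
    "worker_node_types": [
        {"name": "gpu_worker_1"},
        {"name": "gpu_worker_2"},
        {"name": "cpu_worker_1"},
    ],
    "flags": {
        "instance_ranking_strategy": [
            {
                "ranker_type": "custom_group_order",
                "ranker_config": {"group_order": ["gpu_worker_1", "gpu_worker_2"]},
            }
        ]
    },
}

print(check_group_order(config))  # ['cpu_worker_1']
```

The returned list shows which worker groups fall back to the default ranking.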

instance_selection_strategy

There are two options for the instance_selection_strategy.

  • force_smallest_fit (default): Forces the Anyscale autoscaler to always choose the smallest node that fits the request, even if a larger node ranks higher in the instance_ranking_strategy. It also prevents GPU workers from being scaled up to satisfy a CPU-only request.
  • relaxed: The Anyscale autoscaler follows the instance_ranking_strategy order, even if a smaller node is available farther down the group order.
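
The difference between the two options can be illustrated with a small model. This is only a sketch of the behavior described above for a CPU-only request, not the Anyscale autoscaler's implementation. The Group class, the pick_group function, and the group names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Group:
    name: str
    cpus: int
    gpus: int = 0


def pick_group(
    groups: Sequence[Group],
    group_order: Sequence[str],
    cpus_needed: int,
    strategy: str = "force_smallest_fit",
) -> Optional[str]:
    """Pick a worker group for a CPU-only request (illustrative model)."""
    rank = {name: i for i, name in enumerate(group_order)}
    # Unranked groups are ignored in this sketch.
    fits = [g for g in groups if g.name in rank and g.cpus >= cpus_needed]

    if strategy == "force_smallest_fit":
        # GPU groups are excluded for a CPU-only request; the smallest
        # remaining node wins, and the ranking only breaks ties.
        fits = [g for g in fits if g.gpus == 0]
        chosen = min(fits, key=lambda g: (g.cpus, rank[g.name]), default=None)
    elif strategy == "relaxed":
        # The ranking decides; node size is not considered.
        chosen = min(fits, key=lambda g: rank[g.name], default=None)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return chosen.name if chosen else None


groups = [Group("cpu_large", 32), Group("cpu_small", 8), Group("gpu", 96, gpus=8)]
order = ["cpu_large", "gpu", "cpu_small"]

print(pick_group(groups, order, cpus_needed=4))                      # cpu_small
print(pick_group(groups, order, cpus_needed=4, strategy="relaxed"))  # cpu_large
```

In this model, force_smallest_fit picks the 8-CPU group for a 4-CPU request even though it ranks last, while relaxed picks the first ranked group that fits.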

Examples

Reservations and spot instances

Requirements: A workload requires between 20 and 50 instances with A100 GPUs. You have a capacity reservation for 20 instances with A100s and would like to utilize spot instances for the additional nodes. The worker nodes are all the same instance type.

Implementation:

  1. Define a compute-config.yaml that has a worker group for your capacity reservation and a worker group for spot instances.
  2. Add a section for the global min/max feature. Set the minimum to the number of GPUs provided by the 20 reserved instances (160 GPUs, at 8 GPUs per p4d.24xlarge) and the maximum to the number of GPUs in 50 instances (400 GPUs).
  3. Create the compute config with Anyscale CLI or SDK.

cloud: anyscale_v2_aws_useast2
allowed_azs:
  - us-east-2a
  - us-east-2b
  - us-east-2c
head_node_type:
  name: head_node_type
  instance_type: m6i.2xlarge
worker_node_types:
  - name: gpu_worker_1
    instance_type: p4d.24xlarge
    min_workers: 0
    max_workers: 20
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
  - name: gpu_worker_2
    instance_type: p4d.24xlarge
    min_workers: 0
    max_workers: 50
    use_spot: true
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-cross-group-min-count-GPU
          Value: "160"
        - Key: as-feature-cross-group-max-count-GPU
          Value: "400"
        - Key: as-feature-multi-zone
          Value: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - gpu_worker_1
          - gpu_worker_2
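
The min and max tag values in this config are GPU counts, not instance counts. The short sketch below shows the arithmetic, using the fact that a p4d.24xlarge instance has 8 A100 GPUs. The helper gpu_count_tags is hypothetical and only builds the tag entries as plain Python dicts.

```python
GPUS_PER_P4D_24XLARGE = 8  # A100 GPUs per p4d.24xlarge instance


def gpu_count_tags(min_instances: int, max_instances: int, gpus_per_instance: int) -> list[dict]:
    """Build the cross-group GPU min/max tag entries from instance counts."""
    return [
        {
            "Key": "as-feature-cross-group-min-count-GPU",
            "Value": str(min_instances * gpus_per_instance),
        },
        {
            "Key": "as-feature-cross-group-max-count-GPU",
            "Value": str(max_instances * gpus_per_instance),
        },
    ]


# 20 reserved instances as the floor, 50 instances as the ceiling.
tags = gpu_count_tags(20, 50, GPUS_PER_P4D_24XLARGE)
for tag in tags:
    print(tag["Key"], tag["Value"])
# as-feature-cross-group-min-count-GPU 160
# as-feature-cross-group-max-count-GPU 400
```

The tag values are strings, which is why the YAML quotes them.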

Relaxed instance selection

Requirements: A workload requires a minimum of 150 CPUs and a maximum of 500. The workload can utilize a variety of different node types. Capacity reservations exist for 2 of the worker node types and you want to utilize those before switching to spot instances. The capacity reservations are also used for other workloads and may not have availability. The workload can also run in any available AZ.

Implementation:

  1. Define a compute-config.yaml that has worker groups for your capacity reservations and a worker group for spot instances.
  2. Add a section for the global min/max feature and specify a minimum of 150 CPUs and a maximum of 500 CPUs.
  3. Create the compute config with Anyscale CLI or SDK.

cloud: anyscale_aws_cloud_uswest2
allowed_azs:
  - us-west-2a
  - us-west-2b
  - us-west-2c
head_node_type:
  name: head_node_type
  instance_type: m5.2xlarge
worker_node_types:
  - name: cpu_worker_1
    instance_type: c6a.4xlarge
    min_workers: 0
    max_workers: 50
    use_spot: true
  - name: cpu_worker_2
    instance_type: c6i.8xlarge
    min_workers: 0
    max_workers: 10
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
  - name: cpu_worker_3
    instance_type: c6a.12xlarge
    min_workers: 0
    max_workers: 10
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-cross-group-min-count-CPU
          Value: "150"
        - Key: as-feature-cross-group-max-count-CPU
          Value: "500"
        - Key: as-feature-multi-zone
          Value: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - cpu_worker_2
          - cpu_worker_3
          - cpu_worker_1
  instance_selection_strategy: relaxed
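
To see how this ranking might play out when the reservations are partly used by other workloads, the sketch below walks the group order greedily and falls through to the next group when one runs out of capacity. This is a simplified illustration, not the Anyscale autoscaler's algorithm. The plan_nodes function and the availability numbers are hypothetical; the vCPU counts are the published sizes of the three AWS instance types.

```python
import math

# vCPUs per node: c6a.4xlarge, c6i.8xlarge, c6a.12xlarge.
VCPUS = {"cpu_worker_1": 16, "cpu_worker_2": 32, "cpu_worker_3": 48}
GROUP_ORDER = ["cpu_worker_2", "cpu_worker_3", "cpu_worker_1"]


def plan_nodes(cpus_needed: int, available: dict) -> dict:
    """Greedily cover cpus_needed by walking GROUP_ORDER.

    `available` maps a group name to how many nodes can be launched now.
    """
    plan = {}
    remaining = cpus_needed
    for name in GROUP_ORDER:
        if remaining <= 0:
            break
        wanted = math.ceil(remaining / VCPUS[name])
        count = min(wanted, available.get(name, 0))
        if count:
            plan[name] = count
            remaining -= count * VCPUS[name]
    return plan


# Reservations fully available: the first ranked group covers the minimum.
print(plan_nodes(150, {"cpu_worker_2": 10, "cpu_worker_3": 10, "cpu_worker_1": 50}))
# {'cpu_worker_2': 5}

# Reservations mostly in use elsewhere: the plan spills over to spot.
print(plan_nodes(150, {"cpu_worker_2": 2, "cpu_worker_3": 1, "cpu_worker_1": 50}))
# {'cpu_worker_2': 2, 'cpu_worker_3': 1, 'cpu_worker_1': 3}
```

In the second case the reserved groups contribute 112 CPUs and three spot nodes supply the rest, for 160 CPUs in total.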

FAQ

Question: What happens if I have a worker node that is not part of a ranking order?

The default ranking is applied to that worker group. For example, a workload may have several CPU worker groups backed by capacity reservations that you want to rank, and also require a specific GPU instance type. You can list only the CPU worker groups in group_order, and the GPU worker group uses the default ranking.