Instance ranking
When you use the Global resource min/max feature, some workloads need to prioritize specific worker groups over others. For example, you might want to consume capacity reservations assigned as on-demand instances before scaling out to spot instances, or fill one node type before another. The instance ranking functionality lets you specify a ranking order for the worker groups, so the autoscaler uses your preferred, typically more cost-effective, capacity first.
This feature requires Ray 2.11+.
Usage
The instance ranking feature requires the Anyscale CLI or the Anyscale Python SDK and a compute config YAML definition file. In the compute config YAML definition, add a new top-level flags section that specifies the ranking order and the selection strategy to use:
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - worker-node-type-2
          - worker-node-type-1
          - worker-node-type-0
          - worker-node-type-5
  instance_selection_strategy: force_smallest_fit | relaxed
instance_ranking_strategy
Anyscale supports multiple ranker definitions but has only one ranker type: custom_group_order.
Under group_order, specify the list of worker group names in the order best suited for the workload, with the highest-priority group first.
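Because group_order refers to worker groups by name, a typo is easy to miss. The following is a minimal sanity-check sketch, assuming the compute config has already been loaded into a plain Python dict with the same keys as the YAML. The check_group_order helper and the sample names are hypothetical and not part of the Anyscale CLI or SDK.

```python
from collections import Counter


def check_group_order(compute_config: dict) -> list[str]:
    """Return a list of human-readable problems found in group_order."""
    # Names of all worker groups defined in the compute config.
    defined = {group["name"] for group in compute_config.get("worker_node_types", [])}
    problems = []
    for ranker in compute_config.get("flags", {}).get("instance_ranking_strategy", []):
        order = ranker.get("ranker_config", {}).get("group_order", [])
        for name, count in Counter(order).items():
            if count > 1:
                problems.append(f"duplicate entry: {name}")
            if name not in defined:
                problems.append(f"unknown worker group: {name}")
    return problems


# Hypothetical config with a misspelled group name in group_order.
config = {
    "worker_node_types": [{"name": "worker-node-type-0"}, {"name": "worker-node-type-1"}],
    "flags": {
        "instance_ranking_strategy": [
            {
                "ranker_type": "custom_group_order",
                "ranker_config": {"group_order": ["worker-node-type-1", "worker-node-typo-0"]},
            }
        ]
    },
}
print(check_group_order(config))  # ['unknown worker group: worker-node-typo-0']
```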
instance_selection_strategy
There are two options for the instance_selection_strategy.
- force_smallest_fit (Default): This value forces the Anyscale autoscaler to always choose the smallest node, even if a larger node is ranked higher in the instance_ranking_strategy. It also does not allow GPU workers to be scaled up when a CPU is requested.
- relaxed: If this value is set, the Anyscale autoscaler enforces the instance_ranking_strategy order, even if a smaller node is available farther down the group order.
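The difference between the two options can be illustrated with a toy model. The sketch below is not the Anyscale autoscaler's logic; it only encodes the two descriptions above for a single CPU request. The Group class, the pick_group function, and the group names and sizes are hypothetical, and the GPU restriction of force_smallest_fit is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    name: str
    cpus: int  # CPUs provided by one node of this group


def pick_group(groups_in_rank_order: list[Group], cpus_needed: int, strategy: str) -> str:
    """Pick the worker group to launch for a single CPU request (toy model)."""
    fitting = [g for g in groups_in_rank_order if g.cpus >= cpus_needed]
    if not fitting:
        raise ValueError("no worker group can satisfy the request")
    if strategy == "relaxed":
        # Rank wins: take the first fitting group in group_order.
        return fitting[0].name
    if strategy == "force_smallest_fit":
        # Size wins: take the smallest fitting node; min() keeps the
        # earliest-ranked group when sizes tie.
        return min(fitting, key=lambda g: g.cpus).name
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical groups, listed in group_order (highest priority first).
ranked = [Group("large-reserved", 48), Group("medium-reserved", 32), Group("small-spot", 16)]
print(pick_group(ranked, 8, "force_smallest_fit"))  # small-spot
print(pick_group(ranked, 8, "relaxed"))             # large-reserved
```

In this model, relaxed is the option that keeps reserved capacity first even when a small request would fit on a smaller node ranked lower.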
Examples
Reservations and spot instances
Requirements: A workload requires between 20 and 50 instances with A100 GPUs. You have a capacity reservation for 20 instances with A100s and would like to utilize spot instances for the additional nodes. The worker nodes are all the same instance type.
Implementation:
- Define a compute-config.yaml that has your capacity reservation as well as spot instance worker groups.
- Add a section for the global resource min/max feature and specify the minimum number of GPUs required to obtain 20 of the reserved instances, and a maximum to scale up to 50 instances.
- Add a flags section that ranks the reserved worker group ahead of the spot worker group.
- Create the compute config with the Anyscale CLI or SDK.
AWS (EC2)
cloud: anyscale_v2_aws_useast2
allowed_azs:
  - us-east-2a
  - us-east-2b
  - us-east-2c
head_node_type:
  name: head_node_type
  instance_type: m6i.2xlarge
worker_node_types:
  - name: gpu_worker_1
    instance_type: p4d.24xlarge
    min_workers: 0
    max_workers: 20
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
  - name: gpu_worker_2
    instance_type: p4d.24xlarge
    min_workers: 0
    max_workers: 50
    use_spot: true
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-cross-group-min-count-GPU
          Value: "160"
        - Key: as-feature-cross-group-max-count-GPU
          Value: "400"
        - Key: as-feature-multi-zone
          Value: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - gpu_worker_1
          - gpu_worker_2

GCP (GCE)
cloud: anyscale_v2_gcp_uswest1
allowed_azs:
  - us-west1-a
  - us-west1-b
  - us-west1-c
head_node_type:
  name: head_node_type
  instance_type: n2-standard-32
worker_node_types:
  - name: gpu_worker_1
    instance_type: nvidia-tesla-a100
    min_workers: 0
    max_workers: 20
    gcp_advanced_configurations_json:
      instanceProperties:
        reservationAffinity:
          consumeReservationType: SPECIFIC_RESERVATION
          key: compute.googleapis.com/reservation-name
          values: [<RESERVATION_NAME>]
  - name: gpu_worker_2
    instance_type: nvidia-tesla-a100
    min_workers: 0
    max_workers: 50
    use_spot: true
gcp_advanced_configurations_json:
  instance_properties:
    labels:
      as-feature-cross-group-min-count-GPU: "160"
      as-feature-cross-group-max-count-GPU: "400"
      as-feature-multi-zone: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - gpu_worker_1
          - gpu_worker_2
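The GPU min and max values of 160 and 400 in the configs above are consistent with translating instance counts into GPU counts. A quick check of that arithmetic, assuming 8 A100 GPUs per node (as on an AWS p4d.24xlarge); the variable names are illustrative only:

```python
GPUS_PER_NODE = 8  # assumption: 8 A100 GPUs per node, as on an AWS p4d.24xlarge

reserved_nodes = 20  # nodes covered by the capacity reservation
max_nodes = 50       # upper bound on nodes for the workload

# The global min/max feature counts resources (GPUs), not instances.
min_gpus = reserved_nodes * GPUS_PER_NODE
max_gpus = max_nodes * GPUS_PER_NODE

print(min_gpus, max_gpus)  # 160 400
```

If your instance type exposes a different number of GPUs per node, the min and max values need to be scaled accordingly.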
Relaxed instance selection
Requirements: A workload requires a minimum of 150 CPUs and a maximum of 500. The workload can utilize a variety of different node types. Capacity reservations exist for 2 of the worker node types and you want to utilize those before switching to spot instances. The capacity reservations are also used for other workloads and may not have availability. The workload can also run in any available AZ.
Implementation:
- Define a compute-config.yaml that has your capacity reservations as well as spot instance worker groups.
- Add a section for the global resource min/max feature and specify a minimum of 150 CPUs and a maximum of 500 CPUs.
- Add a flags section that ranks the reserved worker groups ahead of the spot worker group and sets instance_selection_strategy to relaxed.
- Create the compute config with the Anyscale CLI or SDK.
AWS (EC2)
cloud: anyscale_aws_cloud_uswest2
allowed_azs:
  - us-west-2a
  - us-west-2b
  - us-west-2c
head_node_type:
  name: head_node_type
  instance_type: m5.2xlarge
worker_node_types:
  - name: cpu_worker_1
    instance_type: c6a.4xlarge
    min_workers: 0
    max_workers: 50
    use_spot: true
  - name: cpu_worker_2
    instance_type: c6i.8xlarge
    min_workers: 0
    max_workers: 10
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
  - name: cpu_worker_3
    instance_type: c6a.12xlarge
    min_workers: 0
    max_workers: 10
    aws_advanced_configurations_json:
      CapacityReservationSpecification:
        CapacityReservationTarget:
          CapacityReservationId: <RESERVATION_ID>
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-cross-group-min-count-CPU
          Value: "150"
        - Key: as-feature-cross-group-max-count-CPU
          Value: "500"
        - Key: as-feature-multi-zone
          Value: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - cpu_worker_2
          - cpu_worker_3
          - cpu_worker_1
  instance_selection_strategy: relaxed

GCP (GCE)
cloud: anyscale-gcp-uscentral1
region: us-central1
allowed_azs:
  - us-central1-a
  - us-central1-b
  - us-central1-c
head_node_type:
  name: head_node_type
  instance_type: n2-standard-16
worker_node_types:
  - name: cpu_worker_type_1
    instance_type: c3-standard-8
    min_workers: 0
    max_workers: 50
    use_spot: true
  - name: cpu_worker_type_2
    instance_type: c3-standard-22
    min_workers: 0
    max_workers: 10
    gcp_advanced_configurations_json:
      instanceProperties:
        reservationAffinity:
          consumeReservationType: SPECIFIC_RESERVATION
          key: compute.googleapis.com/reservation-name
          values: [<RESERVATION_NAME>]
  - name: cpu_worker_type_3
    instance_type: c3-standard-44
    min_workers: 0
    max_workers: 10
    gcp_advanced_configurations_json:
      instanceProperties:
        reservationAffinity:
          consumeReservationType: SPECIFIC_RESERVATION
          key: compute.googleapis.com/reservation-name
          values: [<RESERVATION_NAME>]
gcp_advanced_configurations_json:
  instance_properties:
    labels:
      as-feature-cross-group-min-count-CPU: "150"
      as-feature-cross-group-max-count-CPU: "500"
      as-feature-multi-zone: "true"
flags:
  instance_ranking_strategy:
    - ranker_type: custom_group_order
      ranker_config:
        group_order:
          - cpu_worker_type_2
          - cpu_worker_type_3
          - cpu_worker_type_1
  instance_selection_strategy: relaxed
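To see how a ranked order with limited reservation availability might play out for the 150 CPU minimum, here is a toy walk-through using the AWS group names. It is a sketch under stated assumptions, not the Anyscale autoscaler's algorithm: the plan_nodes function is hypothetical, the per-node CPU counts assume the vCPU sizes of c6i.8xlarge (32), c6a.12xlarge (48), and c6a.4xlarge (16), and the number of launchable nodes per group is made up for illustration.

```python
def plan_nodes(groups, target_cpus):
    """Walk groups in rank order, launching nodes until target_cpus is covered.

    groups: list of (name, cpus_per_node, launchable_nodes) in rank order.
    Returns a dict mapping group name to the number of nodes launched.
    """
    plan = {}
    remaining = target_cpus
    for name, cpus_per_node, launchable in groups:
        if remaining <= 0:
            break
        needed = -(-remaining // cpus_per_node)  # ceiling division
        count = min(needed, launchable)
        if count:
            plan[name] = count
            remaining -= count * cpus_per_node
    return plan


# (name, CPUs per node, nodes that can currently be launched), in group_order.
reservations_free = [
    ("cpu_worker_2", 32, 10),
    ("cpu_worker_3", 48, 10),
    ("cpu_worker_1", 16, 50),  # spot group
]
reservations_busy = [
    ("cpu_worker_2", 32, 2),   # reservation mostly used by other workloads
    ("cpu_worker_3", 48, 1),
    ("cpu_worker_1", 16, 50),  # spot group
]

print(plan_nodes(reservations_free, 150))
# {'cpu_worker_2': 5}
print(plan_nodes(reservations_busy, 150))
# {'cpu_worker_2': 2, 'cpu_worker_3': 1, 'cpu_worker_1': 3}
```

When the reservations have room, the minimum is met entirely from the top-ranked group; when they are partly consumed, the remainder spills over to the lower-ranked groups.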
FAQ
Question: What happens if I have a worker node that is not part of a ranking order?
The default ranking is applied to that worker group. For example, a workload may have several CPU worker groups backed by capacity reservations that you want to rank, while also requiring a specific GPU instance that you leave out of the ranking order.
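To find out which worker groups fall back to the default ranking, you can compare the defined group names with group_order. A minimal sketch, assuming both are available as plain Python lists; the unranked_groups helper and the gpu_worker_1 name are hypothetical:

```python
def unranked_groups(worker_group_names, group_order):
    """Return worker groups that have no explicit position in group_order."""
    ranked = set(group_order)
    return [name for name in worker_group_names if name not in ranked]


# CPU groups are ranked explicitly; the GPU group is left out on purpose.
workers = ["cpu_worker_1", "cpu_worker_2", "cpu_worker_3", "gpu_worker_1"]
order = ["cpu_worker_2", "cpu_worker_3", "cpu_worker_1"]
print(unranked_groups(workers, order))  # ['gpu_worker_1']
```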