
Multi-zone compute configs user guide

Check your docs version

This version of the Anyscale docs is deprecated. Go to the latest version for up-to-date information.

Warning

This feature will be deprecated soon. Please use the "Enable cross-zone scaling" checkbox in the UI or the enable_cross_zone_scaling field in the compute config instead.

Compute capacity is often difficult to find, especially for specific instance types like those with GPUs. The Anyscale platform offers several features to configure your compute config for multiple zones.

Once configured, Anyscale uses internal signals, including the instance type and the zones requested, among others, to increase the likelihood of provisioning the instances you want.

Multi-zone search

Usage

Step 1

In the Anyscale Console, the Compute Config screen has a drop-down in which you can select multiple Zones, plus an additional Any option.

Select two or more Zones (or Any if you want Anyscale to search the entire Region).

Multizone Configuration Cluster Wide

Step 2

In addition to the zone configuration, add the following cluster-level Advanced Configuration:

{
  "TagSpecifications": [
    {
      "ResourceType": "instance",
      "Tags": [
        {
          "Key": "as-feature-multi-zone",
          "Value": "true"
        }
      ]
    }
  ]
}
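If your compute config already carries an Advanced Configuration with its own tags, the multi-zone tag has to be merged in rather than pasted over it. The following Python is a minimal sketch of such a merge. The helper name `with_multi_zone_tag` is hypothetical and not part of any Anyscale tool; only the tag key and value come from the block above.

```python
import copy
import json

# Tag taken from the Advanced Configuration shown above.
MULTI_ZONE_TAG = {"Key": "as-feature-multi-zone", "Value": "true"}


def with_multi_zone_tag(advanced_config=None):
    """Return a copy of an advanced configuration with the multi-zone tag added.

    Hypothetical helper: existing TagSpecifications and tags are preserved,
    and the input dictionary is not mutated.
    """
    config = copy.deepcopy(advanced_config) if advanced_config else {}
    specs = config.setdefault("TagSpecifications", [])

    # Reuse the existing "instance" tag specification if there is one.
    instance_spec = next(
        (spec for spec in specs if spec.get("ResourceType") == "instance"), None
    )
    if instance_spec is None:
        instance_spec = {"ResourceType": "instance", "Tags": []}
        specs.append(instance_spec)

    tags = instance_spec.setdefault("Tags", [])
    # Drop any stale copy of the tag so the result holds exactly one entry.
    tags[:] = [tag for tag in tags if tag.get("Key") != MULTI_ZONE_TAG["Key"]]
    tags.append(dict(MULTI_ZONE_TAG))
    return config


if __name__ == "__main__":
    existing = {
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": [{"Key": "team", "Value": "ml"}]}
        ]
    }
    print(json.dumps(with_multi_zone_tag(existing), indent=2))
```

Calling the helper with no argument produces exactly the JSON block above; calling it on an existing configuration keeps the tags that were already there.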
If you prefer to use the Anyscale CLI or SDK, use a compute config YAML like the following example:
cloud: anyscale_v2_default_cloud_vpn_us_east_2 # You may specify `cloud_id` instead
allowed_azs:
  - us-east-2a
  - us-east-2b
head_node_type:
  name: head_node_type
  instance_type: m5.2xlarge
worker_node_types:
  - name: cpu_worker
    instance_type: m5.4xlarge
    min_workers: 2
    max_workers: 10
    use_spot: true
  - name: gpu_worker
    instance_type: g4dn.4xlarge
    min_workers: 0
    max_workers: 10
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-multi-zone
          Value: "true"

If you have already created a compute config containing the JSON block in the console, you can reference that compute config in your Anyscale cluster, job, or service YAML, for example “compute_config: multi-az”.

If you are using the Anyscale SDK, you can either pass your compute configuration ID as cluster_compute_id or pass cluster_compute_config as a dict based on ClusterComputeConfig.

Monitoring

The Autoscaler logs help you identify which availability zones your worker nodes were launched in:

Multizone Event Logs

FAQ

Question: How does it work?

The Anyscale platform examines all variables in the Compute Config and uses them to make the right sequence of provisioning requests to the underlying Cloud Service Provider. These variables include, but are not limited to:

  • Desired Instance Type for your Head/Worker Nodes.
  • On-demand, Spot, or Spot with Fallback to On-demand.
  • Regional/Zonal Instance Type availability.

For example, because GPUs are typically more difficult to acquire than pure CPU machines, the Anyscale platform tunes the above variables to seek GPU-based machines first. Then, depending on the success of provisioning and the zones configured, the Anyscale platform launches easier-to-acquire Instance Types in the same Zone.
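The "scarce instances first" idea can be illustrated with a short Python sketch. This is not Anyscale's algorithm, which is internal and changes over time; it is a simplified model built on assumptions. The function `place_cluster`, the `try_provision` callback, and the capacity table in the demo are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence


def place_cluster(
    scarce_types: Sequence[str],
    common_types: Sequence[str],
    zones: Sequence[str],
    try_provision: Callable[[str, str], bool],
) -> Optional[Dict[str, str]]:
    """Pick the first zone where every scarce type can be provisioned,
    then place the easier-to-acquire types in that same zone.

    Illustrative only: try_provision(instance_type, zone) is a stand-in for
    a request to the cloud provider and returns True on success.
    """
    for zone in zones:
        # Seek the hard-to-find instance types (for example GPUs) first.
        if all(try_provision(instance_type, zone) for instance_type in scarce_types):
            placement = {instance_type: zone for instance_type in scarce_types}
            # Easier-to-acquire types follow in the zone that was chosen.
            for instance_type in common_types:
                if try_provision(instance_type, zone):
                    placement[instance_type] = zone
            return placement
    return None  # No configured zone could supply the scarce types.


if __name__ == "__main__":
    # Made-up capacity table for the demo; real availability is not known up front.
    capacity = {
        ("g4dn.4xlarge", "us-east-2a"): False,
        ("g4dn.4xlarge", "us-east-2b"): True,
        ("m5.4xlarge", "us-east-2a"): True,
        ("m5.4xlarge", "us-east-2b"): True,
    }
    result = place_cluster(
        ["g4dn.4xlarge"],
        ["m5.4xlarge"],
        ["us-east-2a", "us-east-2b"],
        lambda instance_type, zone: capacity.get((instance_type, zone), False),
    )
    print(result)
```

In the demo, the GPU type is unavailable in us-east-2a, so both node types land in us-east-2b even though the CPU type was available in either zone.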

The exact algorithm is constantly evolving and may be different for every customer/use-case at any point in time.

If you need a more in-depth answer, reach out to your Anyscale account team to arrange a deeper conversation with Anyscale Engineering.

Question: For Clusters powering Services, does it take into account dimensions such as Replicas and traffic?

It does not. Currently, application-level resource utilization is not considered as part of zone selection. The pattern of Any/Multizone configurations for Anyscale Services differs from that for batch inference, training, or fine-tuning Jobs.

Question: What happens if I specify Any but with Multi-zone turned “off”?

Any controls which Zones Anyscale searches for capacity during initial Cluster startup. Multi-zone controls how “sticky” your Cluster is to the Zone it starts in (the Zone where the Head node runs).

With Multi-zone turned off, Anyscale looks only in the Zone where your Head node starts, and all subsequent capacity stays in that Zone. With Multi-zone on, the Anyscale platform can deploy Worker nodes in the best Zone with sufficient capacity.
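The interaction between the two settings can be modeled in a few lines of Python. This is a simplified sketch of the behavior described in this answer, not Anyscale code; the function `worker_search_zones` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def worker_search_zones(
    startup_zones: Sequence[str], head_zone: str, multi_zone: bool
) -> List[str]:
    """Return the zones that may be searched for Worker capacity.

    startup_zones: zones searched at Cluster startup (the selected Zones,
        or every zone in the Region when Any is selected).
    head_zone: the zone in which the Head node actually started.
    multi_zone: whether the Multi-zone setting is on.
    """
    if head_zone not in startup_zones:
        raise ValueError("head_zone must be one of the startup zones")
    if multi_zone:
        # Workers may land in any of the configured zones.
        return list(startup_zones)
    # Multi-zone off: the Cluster stays in the Head node's zone.
    return [head_zone]


if __name__ == "__main__":
    region_zones = ["us-east-2a", "us-east-2b", "us-east-2c"]  # "Any" selected
    print(worker_search_zones(region_zones, "us-east-2b", multi_zone=False))
    print(worker_search_zones(region_zones, "us-east-2b", multi_zone=True))
```

With Any selected and Multi-zone off, the first call returns only the Head node's zone; with Multi-zone on, the second call returns all three zones.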