Multi-AZ Services

Note: Support for multi-AZ services requires Ray 2.7.1+.

Why multi-AZ services

The main motivation for multi-AZ services is high availability. Running deployment replicas in multiple availability zones (AZs) lets the service tolerate the failure of a single zone.

How to enable multi-AZ services

Step 1

First, select the availability zones that the cluster can use. The default is Any, which means Anyscale can use any zone. See here for instructions on how to specify particular zones if you don't want Any.

Step 2

We need to add the following Advanced Instance Configurations to the cluster-wide instance configurations.

{
  "TagSpecifications": [
    {
      "ResourceType": "instance",
      "Tags": [
        {
          "Key": "as-feature-enable-multi-az-serve",
          "Value": "true"
        }
      ]
    }
  ]
}

This is what it looks like in the console:

[Image: Multi-AZ Services Config]
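
If the cluster-wide advanced configuration already contains other tags, the multi-AZ tag has to be merged in rather than pasted over them. The following is a minimal sketch of that merge in plain Python. The helper name with_multi_az_tag and the example "team" tag are hypothetical; only the tag key and value come from the configuration above, and the result still has to be pasted into the console by hand.

```python
import copy
import json

# Tag key and value taken from the configuration shown above.
MULTI_AZ_TAG = {"Key": "as-feature-enable-multi-az-serve", "Value": "true"}


def with_multi_az_tag(advanced_config: dict) -> dict:
    """Return a copy of advanced_config with the multi-AZ tag on instances.

    Hypothetical helper: it only edits a dict and does not call any
    Anyscale or AWS API.
    """
    config = copy.deepcopy(advanced_config)
    specs = config.setdefault("TagSpecifications", [])
    # Reuse the existing "instance" tag specification if there is one.
    for spec in specs:
        if spec.get("ResourceType") == "instance":
            instance_spec = spec
            break
    else:
        instance_spec = {"ResourceType": "instance", "Tags": []}
        specs.append(instance_spec)
    tags = instance_spec.setdefault("Tags", [])
    # Drop any earlier value for the same key so the tag appears exactly once.
    tags[:] = [t for t in tags if t.get("Key") != MULTI_AZ_TAG["Key"]]
    tags.append(dict(MULTI_AZ_TAG))
    return config


if __name__ == "__main__":
    # Example input with an unrelated, made-up tag.
    existing = {
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": [{"Key": "team", "Value": "ml"}]}
        ]
    }
    print(json.dumps(with_multi_az_tag(existing), indent=2))
```

The helper copies its input, so the original configuration is left unchanged and the two versions can be compared before saving.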

Once multi-AZ services are enabled, Anyscale tries to launch nodes in two availability zones as evenly as possible, and Ray Serve then tries to spread replicas across those two zones. To find the availability zone of a deployment replica on the Ray dashboard, locate the Ray node where the replica runs and check that node's anyscale.com/availability_zone label (this requires Ray 2.9.0+).

[Image: Multi-AZ Services Replica Node]
[Image: Multi-AZ Services Node Label]
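
The dashboard lookup described above (replica, then node, then node label) can be tallied by hand once the values are written down. The sketch below assumes you have copied the replica-to-node mapping and the node labels from the dashboard into plain dictionaries; the function name, the dictionary shapes, and the sample IDs and zones are illustrative assumptions, not a Ray or Anyscale API.

```python
from collections import Counter

# Node label named in the text above.
AZ_LABEL = "anyscale.com/availability_zone"


def replicas_per_zone(replica_to_node: dict, node_labels: dict) -> Counter:
    """Count replicas per availability zone.

    replica_to_node: replica name -> node ID (copied from the dashboard).
    node_labels: node ID -> dict of that node's labels.
    Both shapes are assumptions made for this sketch.
    """
    counts = Counter()
    for node_id in replica_to_node.values():
        # Nodes without the label are reported as "unknown".
        zone = node_labels.get(node_id, {}).get(AZ_LABEL, "unknown")
        counts[zone] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    replica_to_node = {"app#r1": "node-a", "app#r2": "node-b", "app#r3": "node-a"}
    node_labels = {
        "node-a": {AZ_LABEL: "us-west-2a"},
        "node-b": {AZ_LABEL: "us-west-2b"},
    }
    print(dict(replicas_per_zone(replica_to_node, node_labels)))
```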

To make use of multi-AZ services, it's recommended to have at least 2 replicas per deployment.
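
A small worked example shows why a single replica gains nothing from two zones. The sketch below assumes replicas are spread round-robin across the zones, which is a simplification for illustration and not Ray Serve's actual scheduler; the function names and zone names are hypothetical.

```python
def spread_evenly(num_replicas: int, zones: list) -> dict:
    """Place replicas round-robin across zones (a simplifying assumption)."""
    placement = {zone: 0 for zone in zones}
    for i in range(num_replicas):
        placement[zones[i % len(zones)]] += 1
    return placement


def worst_case_survivors(num_replicas: int, zones: list) -> int:
    """Replicas left if the zone holding the most replicas fails."""
    placement = spread_evenly(num_replicas, zones)
    return num_replicas - max(placement.values())


if __name__ == "__main__":
    zones = ["zone-1", "zone-2"]  # made-up zone names
    for n in (1, 2, 5):
        print(n, spread_evenly(n, zones), worst_case_survivors(n, zones))
```

With one replica, the worst-case zone failure leaves no replicas; with two or more, at least one replica remains in the other zone.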

Cost of multi-AZ services

Multi-AZ services improve availability but cost more, because cloud providers charge for cross-AZ data transfer (for example, AWS charges $0.01 per GB in each direction). Most cross-AZ traffic flows between Serve proxies and replicas. The Ray head node and worker nodes also exchange system messages, but the volume is low (tens of Kb per second). To reduce cost, Ray Serve implements AZ-aware routing: Serve proxies prefer to forward requests to replicas in the same AZ, which avoids cross-AZ traffic. This optimization can eliminate a significant amount of cross-AZ traffic, especially when the QPS is low.
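
The idea behind AZ-aware routing can be sketched as a two-tier choice: same-zone replicas first, any replica otherwise. This is an illustrative model, not Ray Serve's implementation. The Replica type, the pick_replica function, the max_ongoing capacity limit, and the rule of falling back to another zone only when every local replica is at capacity are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Replica(NamedTuple):
    replica_id: str
    zone: str
    ongoing_requests: int


def pick_replica(proxy_zone: str, replicas: list, max_ongoing: int = 2) -> Optional[Replica]:
    """Prefer a replica in the proxy's zone; fall back to any other zone.

    max_ongoing is a hypothetical per-replica capacity limit.
    """
    # Only replicas with spare capacity are eligible.
    available = [r for r in replicas if r.ongoing_requests < max_ongoing]
    # Same-zone replicas win whenever at least one has capacity.
    local = [r for r in available if r.zone == proxy_zone]
    candidates = local or available
    if not candidates:
        return None
    # Among the candidates, pick the least loaded (ties broken by ID).
    return min(candidates, key=lambda r: (r.ongoing_requests, r.replica_id))


if __name__ == "__main__":
    replicas = [
        Replica("r1", "us-west-2a", 1),
        Replica("r2", "us-west-2b", 0),
    ]
    # The proxy in us-west-2a stays local even though r2 is less loaded.
    print(pick_replica("us-west-2a", replicas))
```

Under this model, requests cross zones only when the local replicas are saturated, which matches the observation above that the savings are largest at low QPS.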