Skip to main content

Compute configurations for AWS

This page lists common cloud-specific compute configurations for Anyscale clouds deployed on AWS. You specify these options using the Instance config field in the Anyscale console at the node or cluster level.

For a general overview of compute configs, see Compute configuration on Anyscale.

For configuration settings applicable to all Anyscale clouds, see Advanced settings for compute configs on Anyscale.

For a complete reference on what might be configurable, see the AWS docs and contact Anyscale support.

Anyscale does not support setting the following fields:

ClientToken
DryRun
ImageId
InstanceType
Ipv6Addresses
LaunchTemplate
MaxCount
MinCount
PrivateIpAddress
SecurityGroups
UserData

Managing capacity reservations

Securing certain instance types from cloud providers can sometimes be challenging due to high demand or limited availability. With Anyscale, you can leverage your cloud provider capacity reservations, ensuring the availability of required node types for your workloads. To configure capacity reservations for a specific worker node type, modify the advanced configuration through the console UI or by editing the compute config YAML file.

To add a reservation with the console UI, navigate to a worker node and expand the Advanced config section. Under the Instance config tab, input the following JSON, substituting in your specific reservation ID:

{
"CapacityReservationSpecification": {
"CapacityReservationTarget": {
"CapacityReservationId": "RESERVATION_ID"
}
}
}

The following is an example YAML file you can use with the Anyscale CLI:

cloud: CLOUD_NAME
head_node:
instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
- instance_type: INSTANCE_TYPE_ACCELERATED
min_nodes: MIN_NODES
max_nodes: MAX_NODES
advanced_instance_config:
CapacityReservationSpecification:
CapacityReservationTarget:
CapacityReservationId: RESERVATION_ID

For additional details on utilizing capacity reservations on AWS, see the AWS Documentation.

Changing the default disk size

The default disk size for all nodes in an Anyscale cluster is 150 GB. You can change the default disk size for the entire cluster or an individual worker node type.

To modify the default disk size from the console UI, use the Advanced configuration section for the Worker node or the Advanced settings section for the entire cluster. This example increases the default to 500 GB.

{
"BlockDeviceMappings": [
{
"Ebs": {
"VolumeSize": 500,
"VolumeType": "gp3",
"DeleteOnTermination": true
},
"DeviceName": "/dev/sda1"
}
]
}

The following is an example YAML that modifies the disk for all nodes in the Anyscale cluster:

cloud: CLOUD_NAME
head_node:
instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
- instance_type: INSTANCE_TYPE_ACCELERATED
min_nodes: MIN_NODES
max_nodes: MAX_NODES
advanced_instance_config:
BlockDeviceMappings:
- Ebs:
- VolumeSize: 500
VolumeType: gp3
DeleteOnTermination: true
DeviceName: "/dev/sda1"

NVMe configuration

Anyscale supports Non-Volatile Memory Express (NVMe) interface to access SSD storage volumes, which provides additional temporary storage to the instances. This enables higher performance, lower latency, scalability, and support for versatile use cases across a variety of workloads. Anyscale exposes /mnt/local_storage as the mount path in the Ray container by default. For instance types that don't have NVMe, /mnt/local_storage just falls back to the root disk.

You can choose EC2 instance types that have NVMe. See AWS instance store documentation for more details. Anyscale then automatically detects these devices, formats them, and mounts them when the Ray Container starts.

Anyscale configures EC2 instance types that have multiple NVMe devices with a software RAID (RAID 0), which maximizes disk performance.

EC2 NVMe Example

Subnets, security groups, and instance IAM role

Specify the following configurations for the entire cluster (do not specify them for individual node groups unless you have to):

  • Security groups and subnets:

    • any security groups registered with the cloud.
    • any subset of the security groups registered with the cloud.
  • Instance IAM role:

    • any IAM role for the cluster to run with. It must have an instance profile with the same name as the role.

To modify them from the console UI, use the Advanced settings section for the entire cluster.

{
"IamInstanceProfile": {"Arn": "arn:aws:iam::<>:role/<>"},
"SecurityGroupIds": ["security-group-id"],
"SubnetId": "subnet-id",
}

This following sample YAML modifies the security groups, subnets, and instance IAM role for all nodes in the Anyscale cluster:

cloud: CLOUD_NAME
head_node:
instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
- instance_type: INSTANCE_TYPE_ACCELERATED
min_nodes: MIN_NODES
max_nodes: MAX_NODES
advanced_instance_config:
IamInstanceProfile:
Arn: "arn:aws:iam::<>:role/<>"
SecurityGroupIds:
- "security-group-id"
SubnetId: "subnet-id"

Configure Elastic Fabric Adapter (EFA) support

You can configure your Anyscale cluster to use Elastic Fabric Adapter (EFA) to accelerate network communication between nodes in your cluster.

EFA requirements

EFA support has the following requirements:

  • You configure inbound and outbound communication between EFA and the security group used by Anyscale.
  • You must set the environment variable FI_PROVIDER=efa.
  • You must use an image that contains the following libraries:
    • EFA libraries
    • Libfabric libraries
    • NCCL
    • AWS OFI NCCL
note

Anyscale slim base images don't include the necessary libraries by default.

Cluster-level EFA configuration

Anyscale recommends configuring EFA at the cluster level to simplify setup.

note

The console might prompt you to change your configuration if you enable EFA with the incorrect settings for zones.

Make sure you select a single zone and don't enable cross-zone scaling.

To configure EFA for your cluster, complete the following steps:

  1. In your compute config, click Advanced settings.
  2. Select a single zone in the Allowed zones field.
  3. Click Advanced features.
  4. To add EFA to your cluster and use the default values, add the efa_config property with an empty object, as in the following example:
{
"efa_config": {}
}

You can optionally specify the following parameters for efa_config using JSON syntax:

ParameterDescriptionDefault
device_count(Optional) The number of EFA interfaces to use per supported instance.

If 0, disables EFA.
The maximum supported number of EFA interfaces for each supported node type.
placement_group(Optional) The name of the placement group for the EFA interfaces. For recommendations, see AWS docs.None.
security_group_id(Optional) The security group used by EFA.The security group configured for the selected zone.

Instance-level EFA setup

You can optionally configure EFA at the instance level. This provides you with more flexibility to customize your configuration but can be prone to errors.

note

Configuring EFA at the instance level for any worker group overrides cluster-level configuration to all nodes.

To configure EFA for your head node or worker nodes, complete the following steps:

  1. In you compute config, click the name of the node.
  2. Click Advanced config.
  3. In the Instance config tab, add the NetworkInterfaces option using JSON syntax, as in the following example:
{
"NetworkInterfaces": [
{
"Groups": [
"sg-1234" # your security group id
],
"SubnetId": "subnet-1234",
"DeviceIndex": 0,
"InterfaceType": "efa",
"NetworkCardIndex": 0
},
{
"Groups": [
"sg-1234"
],
"SubnetId": "subnet-1234",
"DeviceIndex": 1,
"InterfaceType": "efa",
"NetworkCardIndex": 1
},
{
"Groups": [
"sg-1234"
],
"SubnetId": "subnet-1234",
"DeviceIndex": 1,
"InterfaceType": "efa",
"NetworkCardIndex": 2
},
{
"Groups": [
"sg-1234"
],
"SubnetId": "subnet-1234",
"DeviceIndex": 1,
"InterfaceType": "efa",
"NetworkCardIndex": 3
}
],
"Placement": { "GroupName": "my-placement-group" } # optional but recommended
}
note

You must specify the security group and subnet ID when configuring EFA at the instance level.

You must specify the correct DeviceIndex and NetworkCardIndex for each interface.