Storage configurations
This version of the Anyscale docs is deprecated. Go to the latest version for up-to-date information.
Anyscale provides out-of-the-box storage options, including:
- Local storage for a Node
- Storage shared across Nodes
- Object storage
Configuring local storage for a node
To configure the root disk or volume of a node, modify the compute config when you specify instance types. If the node has NVMe storage available, Anyscale automatically mounts it at /mnt/local_storage.
Root disk modifications
The default disk size for an Anyscale node is 150 GB. To change the default, use the Advanced Configuration section of the compute config.
On AWS (EC2), the following example sets the root drive for all nodes in the cluster to 500 GB:
{
  "BlockDeviceMappings": [
    {
      "Ebs": {
        "VolumeSize": 500,
        "VolumeType": "gp3",
        "DeleteOnTermination": true
      },
      "DeviceName": "/dev/sda1"
    }
  ]
}
On GCP (GCE), the following example sets the root drive for all nodes in the cluster to 500 GB:
{
  "instance_properties": {
    "disks": [
      {
        "boot": true,
        "auto_delete": true,
        "initialize_params": {
          "disk_size_gb": 500
        }
      }
    ]
  }
}
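If you generate compute configs programmatically, the two fragments above can be produced from a single size parameter. The following is a minimal sketch, not an Anyscale API: the function name `root_disk_config` and the provider strings `"aws"` and `"gcp"` are hypothetical, and the key names are copied from the examples above.

```python
import json


def root_disk_config(provider: str, size_gb: int) -> dict:
    """Return an advanced-configuration fragment that sets the root disk size.

    Hypothetical helper: key names mirror the documented examples only.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    if provider == "aws":
        return {
            "BlockDeviceMappings": [
                {
                    "Ebs": {
                        "VolumeSize": size_gb,
                        "VolumeType": "gp3",
                        "DeleteOnTermination": True,
                    },
                    "DeviceName": "/dev/sda1",
                }
            ]
        }
    if provider == "gcp":
        return {
            "instance_properties": {
                "disks": [
                    {
                        "boot": True,
                        "auto_delete": True,
                        "initialize_params": {"disk_size_gb": size_gb},
                    }
                ]
            }
        }
    raise ValueError(f"unknown provider: {provider!r}")


# Print the AWS fragment as JSON, ready to paste into the Advanced Configuration section.
print(json.dumps(root_disk_config("aws", 500), indent=2))
```

Keeping the size as a parameter makes it harder for the AWS and GCP variants of a config to drift apart.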
NVMe configuration
Anyscale supports the Non-Volatile Memory Express (NVMe) interface for accessing SSD storage volumes, which provide additional temporary storage on the instance. NVMe volumes offer higher performance and lower latency for disk-intensive workloads. By default, Anyscale exposes /mnt/local_storage as the mount path in the Ray container. For instance types that don't have NVMe, /mnt/local_storage falls back to the root disk.
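Because /mnt/local_storage can be either NVMe-backed or a fallback to the root disk, it can help to check at runtime which case applies. The sketch below uses only the Python standard library and compares device IDs. Treat it as a heuristic that assumes the fallback path lives on the same file system as the reference path; `describe_storage` is a hypothetical name, not part of Anyscale or Ray.

```python
import os
import shutil


def describe_storage(path: str = "/mnt/local_storage", root: str = "/"):
    """Report capacity for `path` and whether it sits on a different device than `root`.

    Returns None when `path` doesn't exist, for example outside an Anyscale node.
    """
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        # Different st_dev values mean the two paths are on different file systems.
        "separate_device": os.stat(path).st_dev != os.stat(root).st_dev,
        "total_gb": round(usage.total / 1e9, 1),
        "free_gb": round(usage.free / 1e9, 1),
    }


print(describe_storage())
```

If `separate_device` is false, data written to the path is probably consuming root disk space, so the root disk size matters more for that instance type.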
On AWS (EC2), choose an EC2 instance type that has NVMe (refer to the AWS instance store documentation for more details). Anyscale then automatically detects the devices, formats them, and mounts them when the Ray container starts.
For EC2 instance types that have multiple NVMe devices, Anyscale also configures them as a software RAID (RAID 0), which maximizes disk performance.
On GCP (GCE), you must manually specify the number of NVMe devices to attach to the instance (refer to the GCP local SSD documentation for more details). Each local SSD is 375 GB. Specify the number of NVMe devices to attach in the Advanced Configuration section of the compute config. The following example attaches two local SSDs:
{
  "instance_properties": {
    "disks": [
      {
        "boot": true,
        "type": "PERSISTENT",
        "initializeParams": {
          "diskSizeGb": 150
        }
      },
      {
        "type": "SCRATCH",
        "interface": "NVME",
        "autoDelete": true,
        "initializeParams": {
          "diskType": "local-ssd"
        }
      },
      {
        "type": "SCRATCH",
        "interface": "NVME",
        "autoDelete": true,
        "initializeParams": {
          "diskType": "local-ssd"
        }
      }
    ]
  }
}
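Since each local SSD is a fixed 375 GB, the number of SCRATCH entries determines the raw NVMe capacity. The sketch below builds the `disks` list for a given count and reports the resulting capacity. The helper name `gcp_local_ssd_config` is hypothetical, the key names are copied from the example above, and the code doesn't check whether a count is valid for a given machine type, which GCP restricts.

```python
LOCAL_SSD_GB = 375  # size of one GCP local SSD, as stated above


def gcp_local_ssd_config(count: int, boot_disk_gb: int = 150) -> dict:
    """Build an instance_properties fragment with `count` local NVMe SSDs."""
    if count < 0:
        raise ValueError("count must not be negative")
    boot_disk = {
        "boot": True,
        "type": "PERSISTENT",
        "initializeParams": {"diskSizeGb": boot_disk_gb},
    }
    # One SCRATCH entry per local SSD; each entry is a fresh dict.
    scratch_disks = [
        {
            "type": "SCRATCH",
            "interface": "NVME",
            "autoDelete": True,
            "initializeParams": {"diskType": "local-ssd"},
        }
        for _ in range(count)
    ]
    return {"instance_properties": {"disks": [boot_disk] + scratch_disks}}


config = gcp_local_ssd_config(2)
scratch = [d for d in config["instance_properties"]["disks"] if d["type"] == "SCRATCH"]
print(f"{len(scratch)} local SSDs, {len(scratch) * LOCAL_SSD_GB} GB raw capacity")
```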
Configuring storage shared across nodes
NFS mounts automatically on Workspace, Job, and Service Clusters. To opt out, follow the instructions in the "Opting out of mounting NFS" section below. The following mount paths are available:
- /mnt/shared_storage: accessible to all the Anyscale users of the same Anyscale Cloud. It's mounted on every Node of all the Clusters in the same Cloud.
- /mnt/user_storage: private to the Anyscale user but accessible from every Node of all their Workspace, Job, and Service Clusters.
- /mnt/cluster_storage: accessible to all Nodes of a Workspace, Job, or Service Cluster.
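Code that runs both with and without NFS (for example, Jobs that opt out of mounting) may want to check which of these paths exist before writing to them. The following is a minimal sketch using only the standard library. The `SHARED_MOUNTS` table and the `available_mounts` helper are hypothetical names, and the scope descriptions paraphrase the list above.

```python
import os

# Mount path -> who can access it, paraphrased from the list above.
SHARED_MOUNTS = {
    "/mnt/shared_storage": "all users of the same Anyscale Cloud",
    "/mnt/user_storage": "one Anyscale user, across all of their clusters",
    "/mnt/cluster_storage": "all nodes of a single cluster",
}


def available_mounts(paths=SHARED_MOUNTS):
    """Return the subset of `paths` that exist as directories on this node."""
    return [p for p in paths if os.path.isdir(p)]


present = available_mounts()
for path, scope in SHARED_MOUNTS.items():
    status = "present" if path in present else "missing"
    print(f"{path}: {status} (shared with {scope})")
```

The check uses `os.path.isdir` rather than `os.path.ismount` because this sketch makes no assumption about whether each path is its own mount point or a subdirectory of one.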
Limitations
NFS storage is a single file system per Anyscale Cloud. The limits on this storage depend on the underlying cloud provider.
On AWS (EC2), each file system has a hard limit of 25,000 “connections”. Within each Anyscale Cloud, at most 25,000 running instances or nodes can connect to the file system at any moment, across all clusters (Workspaces, Services, Jobs) and users.
On GCP (GCE), customers who deploy clouds using the Anyscale managed setup have access to the "Standard (basic)" Filestore tier, which has a limit of 500 connections and 1 TB of storage capacity. Within each Anyscale Cloud, at most 500 running instances or nodes can connect to the file system at any moment, across all clusters (Workspaces, Services, Jobs) and users.
GCP doesn't throttle or rate limit connections beyond this limit, but exceeding it is likely to make the instance unhealthy, with higher latencies for operations and possible unresponsiveness.
If you anticipate needing more than 500 connections or 1 TB of capacity, deploy your Anyscale Clouds using Anyscale cloud registration. To migrate a cloud to the “Enterprise” tier for up to 1,000 connections, or to increase the storage capacity, contact the support team. More information on both tiers is available at https://cloud.google.com/filestore
To increase the capacity of a GCP Filestore instance, refer to the GCP documentation.
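Because the connection limit applies to the whole Anyscale Cloud rather than to a single cluster, a capacity check has to sum nodes across every running cluster. The sketch below shows that arithmetic with the limits quoted above. The tier keys and the function name `remaining_connections` are hypothetical labels chosen for this example.

```python
# Connection limits per file system, taken from the text above.
CONNECTION_LIMITS = {
    "aws": 25_000,
    "gcp-basic": 500,
    "gcp-enterprise": 1_000,
}


def remaining_connections(tier: str, nodes_per_cluster: list) -> int:
    """Return how many more nodes can connect; a negative result means over the limit."""
    total_nodes = sum(nodes_per_cluster)
    return CONNECTION_LIMITS[tier] - total_nodes


# Three clusters in one Cloud: a Workspace, a Service, and a Job.
clusters = [200, 250, 100]
for tier in CONNECTION_LIMITS:
    headroom = remaining_connections(tier, clusters)
    print(f"{tier}: {headroom} connections of headroom")
```

With these example cluster sizes, the 550 nodes fit within the AWS and GCP Enterprise limits but exceed the GCP basic tier by 50.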
Opting out of mounting NFS
You can opt out of mounting NFS for Anyscale Jobs and Services, but not for Workspaces because Workspaces rely on NFS storage to work properly.
To opt out, turn off the NFS Mount flag in the Cluster-wide advanced configuration and use this compute config for the Jobs and Services.
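The first example below sets the opt-out tag on AWS (EC2) instances. The second sets the equivalent label on GCP (GCE) instances.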
{
  "TagSpecifications": [
    {
      "Tags": [
        {
          "Key": "as-feature-disable-nfs-mount",
          "Value": "true"
        }
      ],
      "ResourceType": "instance"
    }
  ]
}
{
  "instance_properties": {
    "labels": {
      "as-feature-disable-nfs-mount": "true"
    }
  }
}
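When compute configs are stored as JSON in a repository, a small check can confirm that a config meant for Jobs or Services carries the opt-out setting. The sketch below inspects both shapes shown above. The function name `nfs_mount_disabled` is hypothetical, and the requirement that the AWS tag sit under a `ResourceType` of `"instance"` is an assumption based on the example, not documented behavior.

```python
FLAG = "as-feature-disable-nfs-mount"


def nfs_mount_disabled(advanced_config: dict) -> bool:
    """Return True if the config sets the NFS opt-out tag (AWS) or label (GCP)."""
    # GCP shape: instance_properties.labels
    labels = advanced_config.get("instance_properties", {}).get("labels", {})
    if labels.get(FLAG) == "true":
        return True
    # AWS shape: TagSpecifications[].Tags[] for instance resources (assumed)
    for spec in advanced_config.get("TagSpecifications", []):
        if spec.get("ResourceType") != "instance":
            continue
        for tag in spec.get("Tags", []):
            if tag.get("Key") == FLAG and tag.get("Value") == "true":
                return True
    return False


aws_config = {
    "TagSpecifications": [
        {"Tags": [{"Key": FLAG, "Value": "true"}], "ResourceType": "instance"}
    ]
}
gcp_config = {"instance_properties": {"labels": {FLAG: "true"}}}
print(nfs_mount_disabled(aws_config), nfs_mount_disabled(gcp_config))
```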
Configuring object storage
For every Anyscale Cloud, Anyscale configures a default object storage bucket during Cloud deployment. You don't need any additional configuration in the compute config.