Storage
While file storage is optimal for storing your code, AI workloads often need access to large amounts of data, whether it's data for training and fine-tuning or common storage for model checkpoints.
This guide describes:
- Different types of storage available on Anyscale.
- How to access these different storage types.
- How to configure them.
Available storage options on Anyscale:
- Local storage for a node.
- Object storage.
- Network file system (NFS).
Local storage for a node
Anyscale provides each node with its own volume and disk and doesn't share them with other nodes. This storage option enables higher access speed, lower latency, and scalability. Access local storage at /mnt/local_storage. Normally, Anyscale deletes data in the local storage after instances are terminated. To provide a more seamless development workflow, Anyscale workspaces snapshot and persist your data in the project directory.
Anyscale supports the Non-Volatile Memory Express (NVMe) interface to access SSD storage volumes. This support provides additional temporary storage to the instances. For details on how to configure the NVMe for your cloud, see Compute configurations.
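As a quick sanity check, you can inspect how much local scratch space a node has. A minimal sketch, assuming a POSIX shell; /mnt/local_storage only exists inside an Anyscale cluster, so the sketch falls back to the root filesystem elsewhere:

```shell
# Check available space on node-local storage. /mnt/local_storage only
# exists inside an Anyscale cluster, so fall back to / to keep the
# sketch runnable anywhere.
TARGET=/mnt/local_storage
[ -d "$TARGET" ] || TARGET=/
df -h "$TARGET"
```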
Object storage
Anyscale requires users to set up a default object storage container during the deployment of each Anyscale cloud. Users can choose to let Anyscale create a storage container or bring their own. All workspace, job, and service clusters within an Anyscale cloud have permission to read and write to its default object storage container.
Use the following environment variables to access the default storage container:
- ANYSCALE_CLOUD_STORAGE_BUCKET: the name of the default storage container for the cloud.
  - The storage container name format is anyscale-production-data-{cloud_id} if Anyscale creates it. You can find the cloud ID in the cloud list page in the console UI.
  - Users provide the URI during cloud deployment if they bring their own storage container.
- ANYSCALE_CLOUD_STORAGE_BUCKET_REGION: the region of the default storage container for the cloud.
- ANYSCALE_ARTIFACT_STORAGE: the URI to the pre-generated path for storing your artifacts while keeping them separate from Anyscale-generated ones.
  - AWS: s3://<bucket_name>/<org_id>/<cloud_id>/artifact_storage/
  - Google Cloud: gs://<bucket_name>/<org_id>/<cloud_id>/artifact_storage/
  - Azure: abfss://<container-name>@<storage-account-name>.dfs.core.windows.net
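Putting the variables together, you can build a URI under the artifact path for your own files. A sketch with hypothetical bucket, organization, and cloud IDs; inside an Anyscale cluster, the platform sets these variables for you:

```shell
# Hypothetical values for illustration; inside an Anyscale cluster these
# variables are already set by the platform.
ANYSCALE_CLOUD_STORAGE_BUCKET="anyscale-production-data-cld_123"
ANYSCALE_ARTIFACT_STORAGE="s3://${ANYSCALE_CLOUD_STORAGE_BUCKET}/org_abc/cld_123/artifact_storage"

# Build a URI under the artifact prefix to keep your files separate from
# Anyscale-managed ones.
CKPT_URI="${ANYSCALE_ARTIFACT_STORAGE}/my-experiment/checkpoint-0001"
echo "$CKPT_URI"
```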
Within the bucket, Anyscale stores managed data in the {organization_id}/ path. For cloud-specific managed data, Anyscale further groups together the data into an {organization_id}/{cloud_id} path. Anyscale stores some log files in legacy folders.
Anyscale writes system- and user-generated files, for example, log files, to this bucket. Don't delete or edit the Anyscale-managed files; doing so may lead to unexpected data loss and degrades the Anyscale platform experience for features such as log viewing and log downloading. Use $ANYSCALE_ARTIFACT_STORAGE to separate your files from Anyscale-generated ones.
Anyscale offers 100 GB of free object storage. If you need more storage, contact Anyscale support.
Storage shared across nodes
Anyscale mounts three shared storage locations to each Anyscale cluster. These shared storage locations use the default object storage location configured for your Anyscale cloud.
The following table contains the paths for these locations and describes their scoping and example uses. Note that the indicated scope only refers to how the path maps back to a dedicated directory within the default storage location. Because all clusters in an Anyscale cloud have read and write access to this location, all files and data stored in shared storage locations are discoverable and readable by all users in an Anyscale cloud.
Anyscale previously recommended deploying all clouds with a Network File System (NFS), and continues to support optional NFS configuration. Clouds deployed with NFS continue to use it for some storage tasks, including shared storage locations.
| Path | Scoping | Description |
|---|---|---|
| /mnt/cluster_storage | The lifecycle of an Anyscale cluster. | Use this location to store files that all nodes in the cluster might need to access, such as TensorBoard logs or data files downloaded from the internet. |
| /mnt/user_storage | The ID of the user that launched the cluster. | Use this location to store files that you need to use with multiple workspaces, jobs, or services, such as custom init scripts or commonly used Python dependency management files. |
| /mnt/shared_storage | The Anyscale cloud deployment. | Use this location to store files that you want to share with other users in your Anyscale cloud, such as model checkpoints. |
Anyscale workspaces maintain the link to data stored in /mnt/cluster_storage between cluster restarts. When you clone a workspace, Anyscale doesn't clone this directory.
Anyscale service updates and job runs launch new clusters, and files in /mnt/cluster_storage don't persist.
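To illustrate the user-storage workflow described in the table, the sketch below stages a dependency file so other workspaces, jobs, or services launched by the same user can reuse it. /mnt/user_storage only exists inside an Anyscale cluster, so a temp directory stands in when the path is absent:

```shell
# Stage a commonly used file in user storage so other clusters launched by
# the same user can read it. /mnt/user_storage only exists inside an
# Anyscale cluster; fall back to a temp dir so the sketch runs anywhere.
USER_STORAGE=/mnt/user_storage
[ -d "$USER_STORAGE" ] || USER_STORAGE="$(mktemp -d)"

# Example dependency file; contents are illustrative.
printf 'ray==2.30.0\n' > requirements.txt
cp requirements.txt "$USER_STORAGE/requirements.txt"
```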
Access local storage and storage shared across nodes (NFS-based)
You can only interact with local or NFS-based storage inside running Anyscale clusters.
- When using workspaces, the easiest way to transfer small files to and from the cluster is with the VS Code web UI. Note: downloading a folder isn't supported.
- For files in the workspace's working directory, you can use the workspace CLI to pull and push files.
- You can commit files to git and run git pull from a workspace, job, or service cluster.
- For large files, use object storage, for example, Amazon S3 or Google Cloud Storage, and access the data from there.
Access object storage
Anyscale default storage container
Anyscale provides a default cloud storage path private to each cloud located at $ANYSCALE_ARTIFACT_STORAGE. All nodes launched within your cloud should have access to read or write files at this path. You can also access object storage outside Anyscale as long as you configure the permissions correctly.
To copy files from your workspace cluster into cloud storage, you can use standard cloud CLI commands such as aws s3 cp, gcloud storage cp, and azcopy copy.
Anyscale scopes permissions on the cloud storage bucket backing $ANYSCALE_ARTIFACT_STORAGE to only provide access to the specified path, so calls made to the root of the underlying bucket, for example, HeadObject, may be rejected by Anyscale with an ACCESS_DENIED error. Avoid making calls to any paths that don't explicitly have the $ANYSCALE_ARTIFACT_STORAGE/ prefix.
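One way to avoid accidental access-denied errors is to check URIs against the scoped prefix before issuing cloud CLI calls. The check_uri helper below is a hypothetical sketch, not part of Anyscale, and the default bucket value is a placeholder:

```shell
# Hypothetical guard: only allow URIs under the scoped artifact prefix,
# since calls outside it (for example, to the bucket root) can be denied.
# The default value here is a placeholder for when the variable is unset.
ANYSCALE_ARTIFACT_STORAGE="${ANYSCALE_ARTIFACT_STORAGE:-s3://example-bucket/org/cloud/artifact_storage}"

check_uri() {
  case "$1" in
    "${ANYSCALE_ARTIFACT_STORAGE}"/*) echo "allowed" ;;
    *) echo "denied" ;;
  esac
}

check_uri "${ANYSCALE_ARTIFACT_STORAGE}/ckpt.bin"   # allowed
check_uri "s3://some-other-bucket/ckpt.bin"         # denied
```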
Private storage containers
To access private cloud storage buckets that aren't managed by Anyscale, configure permissions using one of the patterns below.
Grant Anyscale clusters access to the private container
- Access blob storage and ADLS
- Access Amazon S3 buckets
- Access Google Cloud Storage buckets
- Object storage and IAM roles for Kubernetes deployments
This approach doesn't work for Anyscale-hosted clouds. Contact Anyscale support for help with accessing your private buckets securely.
Load credentials from your secret manager
- Follow the instructions to grant Anyscale clusters access to your secret manager. See Secret management on Anyscale.
- Load credentials to your private storage containers from the secret manager.
(Not recommended) Use environment variables or plaintext
Anyscale doesn't recommend storing credentials in files or code, or setting them as environment variables.
These patterns can be helpful for gaining access to storage quickly in development, but lack security. Anyscale recommends cloud IAM mapping for managing access to cloud object storage. See Anyscale cloud IAM mapping.
Anyscale sets default environment variables for storage credentials and uses these to access $ANYSCALE_ARTIFACT_STORAGE. You can set environment variables for your chosen cloud provider to access other object storage, but this overrides the default credentials and blocks access to $ANYSCALE_ARTIFACT_STORAGE.
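If you do set provider credentials as environment variables during development, scoping them to a subshell keeps the parent environment's default credentials intact. A sketch with placeholder values, not real credentials:

```shell
# Start from a clean slate for the demonstration.
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY

# Scope override credentials to a subshell so the parent environment keeps
# the default credentials that back $ANYSCALE_ARTIFACT_STORAGE access.
# EXAMPLE_KEY / EXAMPLE_SECRET are placeholders, not real credentials.
(
  export AWS_ACCESS_KEY_ID="EXAMPLE_KEY"
  export AWS_SECRET_ACCESS_KEY="EXAMPLE_SECRET"
  # Cloud CLI calls here would use the override credentials, for example:
  # aws s3 ls s3://my-private-bucket/
  echo "inside subshell: $AWS_ACCESS_KEY_ID"
)
echo "outside subshell: ${AWS_ACCESS_KEY_ID:-unset}"
```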
Choose which storage to use
The choice of storage depends on your performance expectations, file sizes, collaboration needs, and security requirements. Key considerations include:
- NFS is generally slower when a workload generates a large amount of disk I/O for both reads and writes.
- Don't put large files, like datasets at terabyte scale, in NFS storage. Use object storage, like an S3 bucket, for files larger than 10 GB.
- To share small files across different workspaces, jobs, or services, user and shared storage are good options.
- Use cluster storage, /mnt/cluster_storage, if you're developing or iterating, for example, if you want to keep model weights loaded without having to set up object storage. However, for production or at high scale, use object storage.
- For large-scale workloads, consider co-locating compute and storage to avoid large data egress costs.