
Storage

While file storage works well for storing your code, AI workloads often need access to large amounts of data, whether it's data for training and fine-tuning or shared storage for model checkpoints.

This guide describes:

  • Different types of storage available on Anyscale
  • How to access these different storage types
  • How to configure them

Available storage options on Anyscale:

  • Local storage for a node
  • Object storage
  • Network file system (NFS) shared across nodes

Local storage for a node

Anyscale provides each node with its own disk volume, which isn't shared with other nodes. This storage option offers higher access speed, lower latency, and scalability. Access local storage at /mnt/local_storage. Normally, Anyscale deletes data in local storage after an instance terminates. To provide a more seamless development workflow, Anyscale Workspaces snapshot and persist the data in your project directory.

Anyscale supports the Non-Volatile Memory Express (NVMe) interface to access SSD storage volumes. This support provides additional temporary storage to the instances. See NVMe configuration for details on how to configure it.
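As a minimal sketch, node-local scratch space can be used like any other directory. The path exists only on an Anyscale node, so this sketch falls back to a temporary directory elsewhere (an assumption added for illustration):

```shell
# /mnt/local_storage exists only on an Anyscale node; fall back to a
# temporary directory elsewhere so the sketch runs anywhere (assumption).
LOCAL="/mnt/local_storage"
[ -d "$LOCAL" ] && [ -w "$LOCAL" ] || LOCAL="$(mktemp -d)"

# Fast node-local scratch space: other nodes can't see these files, and
# Anyscale deletes them when the instance terminates.
echo "scratch" > "$LOCAL/scratch.txt"
cat "$LOCAL/scratch.txt"
```

Because this storage is per-node, use it for temporary, node-private data such as download caches or intermediate shards, not for anything another node or a later cluster needs to read.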

Object storage

Anyscale requires users to set up a default object storage bucket during the deployment of each Anyscale cloud. Users can either let Anyscale create a bucket or bring their own. All workspace, job, and service clusters within an Anyscale cloud have permission to read from and write to its default bucket.

Use the following environment variables to access the default bucket:

  • ANYSCALE_CLOUD_STORAGE_BUCKET: the name of the default bucket for the cloud.

    • The bucket name format is anyscale-production-data-{cloud_id} if Anyscale created the bucket. You can find the cloud ID on the cloud list page in the console UI.
    • If users bring their own bucket, the bucket name is the one they defined during cloud deployment.
  • ANYSCALE_CLOUD_STORAGE_BUCKET_REGION: the region of the default bucket for the cloud.

  • ANYSCALE_ARTIFACT_STORAGE: the URI to the pre-generated path for storing your artifacts while keeping them separate from Anyscale-generated ones.

    • AWS: s3://<bucket_name>/<org_id>/<cloud_id>/artifact_storage/
    • GCP: gs://<bucket_name>/<org_id>/<cloud_id>/artifact_storage/

Within the bucket, Anyscale stores managed data under the {organization_id}/ path. For cloud-specific managed data, Anyscale further groups the data under an {organization_id}/{cloud_id} path. Anyscale stores some log files in legacy folders.

caution

Anyscale writes system- and user-generated files, for example log files, to this bucket. Don't delete or edit Anyscale-managed files; doing so may lead to unexpected data loss and degrades the Anyscale platform experience for features such as log viewing and log downloading. Use $ANYSCALE_ARTIFACT_STORAGE to keep your files separate from Anyscale-generated ones.

Anyscale-hosted cloud users only

Anyscale offers 100 GB of free object storage. If you need more storage, contact Anyscale support.

Storage shared across nodes

Anyscale automatically mounts Network File System (NFS) storage on workspace, job, and service clusters. By default, Anyscale mounts three shared storage options for common permission groups.

  • /mnt/cluster_storage is accessible to all nodes of a workspace, job, or service cluster.
  • /mnt/user_storage is private to the Anyscale user but accessible from every node of all their workspace, job, and service clusters in the same cloud.
  • /mnt/shared_storage is accessible to all Anyscale users of the same Anyscale cloud. Anyscale mounts it on every node of all the clusters in the same cloud.
warning

NFS storage is accessible to all users in your Anyscale cloud. Don't store sensitive data or secrets there that you don't want other users in your cloud to access.

Cluster storage

/mnt/cluster_storage is a directory on NFS that Anyscale mounts on every node of the workspace, job, or service cluster and persists throughout the lifecycle of the cluster. This storage is useful for storing files that the head node and all the worker nodes need to access. For example:

  • TensorBoard logs
  • Common data files that all workers need to access with a stable path

Following are some behaviors to note about cluster storage:

  • Anyscale doesn’t clone the cluster storage when you clone a workspace.
  • New jobs and service updates launch new clusters. /mnt/cluster_storage doesn't persist in these cases.
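A minimal sketch of using cluster storage for files all nodes read by a stable path follows. The mount exists only inside an Anyscale cluster, so the sketch falls back to a temporary directory elsewhere (an assumption added for illustration):

```shell
# /mnt/cluster_storage exists only inside an Anyscale cluster; fall back
# to a temporary directory so the sketch also runs locally (assumption).
STORAGE="/mnt/cluster_storage"
[ -d "$STORAGE" ] && [ -w "$STORAGE" ] || STORAGE="$(mktemp -d)"

# Write TensorBoard-style logs where every worker can find them at the
# same path.
LOGDIR="$STORAGE/tensorboard_logs/run_1"
mkdir -p "$LOGDIR"
echo "step=100 loss=0.25" > "$LOGDIR/metrics.txt"
cat "$LOGDIR/metrics.txt"
```

Because this directory doesn't survive new clusters launched by jobs or service updates, treat it as per-cluster working space, not durable storage.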

User storage

/mnt/user_storage is a directory on NFS specific to an Anyscale user. The user who creates the workspace, job, or service cluster can access this storage from every node. This storage is useful for storing files you need to use with multiple workspace, job, or service clusters.

Shared storage

/mnt/shared_storage is a directory on NFS that all Anyscale users of the same Anyscale cloud can access. It's mounted on every node of every Anyscale cluster in the same cloud. This storage is useful for storing model checkpoints and other artifacts that you want to share with your team.

info

NFS storage usually has connection limits. Different cloud providers may have different limits. See Changing the default disk size for more information.

info

To increase the capacity of GCP Filestore instances, see the GCP documentation for more information.

note

Anyscale-hosted clouds use s3fs to mount the shared storage.

Access local storage and storage shared across nodes (NFS-based)

You can only interact with local or NFS-based storage inside running Anyscale clusters.

  • When using workspaces, the easiest way to transfer small files to and from the workspace is the VS Code web UI. Note that downloading a folder isn't currently supported.
    • For files in the workspace's working directory, you can use the workspace CLI to pull and push files.
  • You can commit files to git and run git pull from a workspace, job, or service cluster.
  • For large files, use object storage, for example, Amazon S3 or Google Cloud Storage, and access the data from there.

Access object storage

Anyscale default storage bucket

Anyscale provides a default cloud storage path private to each cloud located at $ANYSCALE_ARTIFACT_STORAGE. All nodes launched within your cloud should have access to read or write files at this path. You can also access object storage outside Anyscale as long as you configure the permissions correctly.

To copy files from your workspace cluster into cloud storage, use the standard aws s3 cp and gcloud storage cp commands. For example, on AWS:

echo "hello world" > /tmp/input.txt
aws s3 cp /tmp/input.txt $ANYSCALE_ARTIFACT_STORAGE/saved.txt
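On GCP clouds, where $ANYSCALE_ARTIFACT_STORAGE is a gs:// URI, the equivalent copy uses gcloud storage cp. The sketch below guards the upload so it's a no-op on machines without the gcloud CLI or the variable set (an assumption added for illustration):

```shell
echo "hello world" > /tmp/input.txt
# Upload only where the gcloud CLI and the artifact path are available;
# inside a GCP-backed Anyscale cluster both conditions hold.
if command -v gcloud >/dev/null 2>&1 && [ -n "${ANYSCALE_ARTIFACT_STORAGE:-}" ]; then
  gcloud storage cp /tmp/input.txt "${ANYSCALE_ARTIFACT_STORAGE}/saved.txt"
else
  echo "gcloud or ANYSCALE_ARTIFACT_STORAGE unavailable; skipping upload"
fi
```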
warning

Anyscale scopes permissions on the cloud storage bucket backing $ANYSCALE_ARTIFACT_STORAGE to only the specified path, so calls made to the root of the underlying bucket, for example HeadObject, may be rejected with an ACCESS_DENIED error. Avoid making calls to any path that doesn't explicitly have the $ANYSCALE_ARTIFACT_STORAGE/ prefix.

Private storage buckets

To access private cloud storage buckets that aren't managed by Anyscale, configure permissions using the patterns below.

Grant Anyscale clusters access to the private buckets

Anyscale-hosted Cloud users

This approach doesn't work for Anyscale-hosted clouds. Contact Anyscale support for help with accessing your private buckets securely.

Load credentials from your secret manager

  • Follow the instructions to grant Anyscale clusters access to your secret manager.
  • Load the credentials for your private buckets from the secret manager.
warning

Storing credentials directly in files or code isn't secure. If you must, use them only for development.

  • Option 1: Add the credentials as tracked environment variables (for example, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) under the Dependencies tab of a workspace, or in the Job or Service APIs.
  • Option 2: Bake the credentials as environment variables into container images.

Anyscale automatically propagates these environment variables to all Ray workloads run through the workspace, job, or service, so you can access the bucket directly in your Ray application code.

warning

Access to $ANYSCALE_ARTIFACT_STORAGE is blocked when you override the default AWS/GCP credentials. To use private cloud storage buckets alongside $ANYSCALE_ARTIFACT_STORAGE, pass the generated access keys or service account directly into the call to the cloud provider API, instead of setting them as process-wide environment variables.
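At the shell level, one analog of this advice is to prefix a single command with the credential assignments instead of exporting them. The variable is visible only to that command and never leaks into the surrounding environment (key value below is a placeholder, not a real credential):

```shell
# Placeholder key value -- never hard-code real credentials.
# The variable is visible only inside the single prefixed command...
INSIDE="$(AWS_ACCESS_KEY_ID="AKIAEXAMPLE" sh -c 'echo "$AWS_ACCESS_KEY_ID"')"
echo "inside the command: $INSIDE"
# ...and is not set in the surrounding shell afterward.
echo "outside the command: ${AWS_ACCESS_KEY_ID:-unset}"
```

Scoping credentials this way leaves the default credentials, and therefore access to $ANYSCALE_ARTIFACT_STORAGE, intact for every other command in the session.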

Choosing which storage to use

Which storage to use depends on your performance expectations, file sizes, collaboration needs, and security requirements. Key considerations include:

  • NFS is generally slower for workloads that generate large amounts of disk I/O, for both reads and writes.
  • Don't put large files, such as terabyte-scale datasets, in NFS storage. Use object storage, like an S3 bucket, for files larger than 10 GB.
  • To share small files across different workspaces, jobs, or services, user and shared storage are good options.
  • Use cluster storage, /mnt/cluster_storage, while developing or iterating, for example, to keep model weights loaded without setting up object storage. For production or high-scale workloads, use object storage.
  • For large-scale workloads, consider co-locating compute and storage to avoid large data egress costs.