
Quick method for adding an S3 bucket


Many Ray workloads on Anyscale need an S3 bucket, including:

  • General data storage that persists beyond the lifetime of a cluster
  • Model checkpointing with Ray Tune or RLlib

Use this page if you:

  • Bring your own compute (AWS).
  • Only need one bucket for your Ray application on Anyscale.

Setting up


Before you start, you will need:

  1. Your AWS Account ID. It appears in the upper right-hand corner of the AWS console when you open your username dropdown.
  2. An AWS user with EC2, IAM, and S3 access.
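
If you prefer to look up the account ID programmatically rather than in the console, the following is a minimal sketch. It assumes the boto3 package is installed and that AWS credentials for the user above are already configured locally; the helper name get_account_id is hypothetical and not part of Anyscale or Ray.

```python
def get_account_id(sts_client=None) -> str:
    """Return the AWS account ID that owns the active credentials.

    `sts_client` is any object exposing `get_caller_identity()`.
    When omitted, a boto3 STS client is created (assumption: boto3 is
    installed and credentials are configured).
    """
    if sts_client is None:
        import boto3  # imported lazily so the helper can be tested without boto3

        sts_client = boto3.client("sts")
    # STS returns a dict with "UserId", "Account" and "Arn" keys.
    return sts_client.get_caller_identity()["Account"]


# Usage (requires real credentials):
#   account_id = get_account_id()
```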

Now let's start the steps. Note that directions are slightly different depending on your cloud type.

Steps: Bring your own cloud (AWS)

  1. Log in to your AWS account.
  2. Visit this link to be guided through creating an S3 bucket that your Anyscale compute instances can read from and write to. You will input:
    • Region you want your S3 bucket in
    • Anyscale cloud ID (in lowercase)
      • Easy way to lowercase your cloud ID:
        • python -c "import sys; print(sys.argv[1].lower())" cld_XXXXXXXXXXXXXXXXXXXXXXXX
    • A "ClusterRole" name
      • Format: arn:aws:iam::<your_aws_account_id>:role/ray-autoscaler-v1
  3. In your compute config: add this "ClusterRole" information so your cloud can give your Ray worker nodes access.
    • JSON blob to add to the Advanced Configuration section of your compute config: { "IamInstanceProfile": { "Arn": "<ClusterRole ARN string with 'role' replaced by 'instance-profile'>" } }. Learn more here about advanced configuration of compute configs.
      • To clarify: the ARN in your compute config differs slightly from the one you entered in the form. It uses instance-profile in place of role: arn:aws:iam::<your_aws_account_id>:instance-profile/ray-autoscaler-v1
      • In sed parlance: s|:role/|:instance-profile/| (this rewrites only the resource-type segment, so a role name that itself contains "role" is left alone)
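
The string handling in steps 2 and 3 can be sketched in Python as follows. This is an illustrative helper under stated assumptions, not an Anyscale API: the function names are hypothetical, 123456789012 is a placeholder account ID, and ray-autoscaler-v1 is the role name used in the format shown above.

```python
import json


def lowercase_cloud_id(cloud_id: str) -> str:
    """Lowercase an Anyscale cloud ID, as the form in step 2 expects."""
    return cloud_id.lower()


def cluster_role_arn(account_id: str, role_name: str = "ray-autoscaler-v1") -> str:
    """Build the "ClusterRole" ARN entered into the form in step 2."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def instance_profile_arn(role_arn: str) -> str:
    """Convert a role ARN into the matching instance-profile ARN.

    Only the ':role/' resource-type segment is replaced, so a role name
    that contains the word 'role' is left unchanged.
    """
    prefix, sep, name = role_arn.partition(":role/")
    if not sep:
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return f"{prefix}:instance-profile/{name}"


def advanced_config_blob(role_arn: str) -> str:
    """Render the JSON blob for the Advanced Configuration section (step 3)."""
    blob = {"IamInstanceProfile": {"Arn": instance_profile_arn(role_arn)}}
    return json.dumps(blob, indent=2)


# Placeholder account ID for illustration only.
role_arn = cluster_role_arn("123456789012")
print(advanced_config_blob(role_arn))
```

Splitting on the ":role/" segment instead of replacing every occurrence of "role" keeps the conversion safe for custom role names.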

For example ...

What if I want a more advanced setup?

If you'd like to assign compute roles that have more complex setups, or bring your own IAM role for compute instances, that's totally fine too.