Accessing Google Cloud Storage buckets
This page describes how to directly interact with a GCS bucket from your self-hosted Anyscale cloud deployed on GCP.
Determining the cluster's Service Account
By default, Anyscale clusters run with a cloud-specific Service Account.
If your Anyscale clusters run with a custom Service Account, use that Service Account for the rest of the instructions.
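If you are unsure which Service Account a running node uses, you can ask the GCE metadata server from that node. The following is a minimal sketch, assuming the node is a Compute Engine VM that can reach the metadata server; the function name cluster_service_account is hypothetical and only used for illustration.

```shell
# Sketch: print the email of the Service Account attached to this node.
# Assumption: the node is a GCE VM with access to the metadata server.
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"

cluster_service_account() {
  # The Metadata-Flavor header is required by the GCE metadata server.
  curl -sf -H "Metadata-Flavor: Google" "$METADATA_URL"
}
```

Running cluster_service_account on a cluster node should print the email that you then use as the principal when granting bucket permissions.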
To interact with a private Google Cloud Storage bucket, you need both permissions on the bucket and a tool to access it.
Grant permissions to your GCP Service Account
To grant your Service Account (either the Anyscale default Service Account or your own) access to a bucket, follow the Google instructions:
- Go to the Permissions tab of the bucket.
- Click Add.
- Type the Service Account Email as a new principal. If you are using the Anyscale default cloud-specific Service Account, you can find the Service Account Email in the Clouds table on the Configurations page, in a column called Provider Identity.
- Select roles to grant to the Service Account. To give full read, write, and list access, grant Storage Object Admin and Storage Object Viewer on your bucket.
- Click Save.
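The same grant can also be made from the command line instead of the console. The sketch below uses gsutil iam ch and assumes you run it from a machine where gsutil is installed and your own account is allowed to change the bucket's IAM policy. The function name grant_bucket_roles and the GSUTIL override variable are hypothetical helpers, not part of any Anyscale or Google tooling.

```shell
# Sketch: grant Storage Object Admin and Storage Object Viewer on a bucket.
# Usage: grant_bucket_roles <service-account-email> <bucket-name>
# Assumption: the caller has permission to edit the bucket's IAM policy.
grant_bucket_roles() {
  # gsutil expands short role names such as objectAdmin
  # to roles/storage.objectAdmin.
  for role in objectAdmin objectViewer; do
    "${GSUTIL:-gsutil}" iam ch "serviceAccount:$1:${role}" "gs://$2" || return 1
  done
}
```

For example, grant_bucket_roles <service-account-email> <bucket> applies both roles in turn and stops at the first failure.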
Interacting with your bucket
To interact with a bucket from the CLI, install gsutil. Running the following command on a node downloads and extracts the Google Cloud SDK, which includes gsutil:
wget -qO- https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-359.0.0-linux-x86_64.tar.gz | tar xvz
Afterwards, you can use gsutil, for example, to copy a local file to the bucket, as follows:
./google-cloud-sdk/bin/gsutil cp <file> gs://<bucket>
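To check that the granted roles cover writing, listing, reading, and deleting, a short round trip against the bucket can help. This is a sketch only: the function name bucket_smoke_test and the object name anyscale-gcs-probe.txt are hypothetical, and the default gsutil path assumes the SDK was extracted into the current directory as shown above.

```shell
# Sketch: exercise write, list, read, and delete against a bucket.
# Usage: bucket_smoke_test <bucket-name>
GSUTIL="${GSUTIL:-./google-cloud-sdk/bin/gsutil}"

bucket_smoke_test() {
  bucket="$1"
  probe="anyscale-gcs-probe.txt"   # hypothetical object name
  tmp="$(mktemp)"
  echo "hello" > "$tmp"
  "$GSUTIL" cp "$tmp" "gs://${bucket}/${probe}" || return 1   # write
  "$GSUTIL" ls "gs://${bucket}/" || return 1                  # list
  "$GSUTIL" cat "gs://${bucket}/${probe}" || return 1         # read
  "$GSUTIL" rm "gs://${bucket}/${probe}" || return 1          # delete
  rm -f "$tmp"
}
```

If any step fails, the function returns a non-zero status, which usually points to a missing role on the Service Account.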
If you install gsutil with pip, as is the case with runtime_environments, you may need to add the following to ~/.boto:
[GoogleCompute]
service_account = default
You can create this file by running the following command. Note that it overwrites any existing ~/.boto:
printf "[GoogleCompute]\nservice_account = default\n" > ~/.boto
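Because redirecting with > replaces the whole file, a node that already has a ~/.boto would lose its other settings. One possible alternative, sketched here under the assumption that appending the section is acceptable for your setup, only adds it when it is missing. The function name ensure_boto_service_account is hypothetical.

```shell
# Sketch: add the [GoogleCompute] section to a boto config only if absent.
# Usage: ensure_boto_service_account [path]   (path defaults to ~/.boto)
ensure_boto_service_account() {
  boto_file="${1:-$HOME/.boto}"
  # Leave the file alone if a [GoogleCompute] section already exists.
  if [ -f "$boto_file" ] && grep -q '^\[GoogleCompute\]' "$boto_file"; then
    return 0
  fi
  # Append rather than overwrite so existing settings are preserved.
  printf '[GoogleCompute]\nservice_account = default\n' >> "$boto_file"
}
```

Calling ensure_boto_service_account more than once is safe, since the second call finds the section and does nothing.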