Deploy Anyscale on Google Kubernetes Engine (GKE)

Complete the following steps to configure and deploy a new Anyscale cloud on GKE.

1. Install the Anyscale CLI

pip install -U "anyscale[gcp]"
anyscale login # authenticate

2. Authenticate the gcloud CLI

Prepare a Google Cloud project for Anyscale to use. If you haven't already installed the gcloud CLI, follow the Google Cloud instructions for installing it, then authenticate it against your project.
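Before going further, it can help to confirm that every CLI this guide calls is on your PATH. The following is a minimal sketch; require_tools is a hypothetical helper name, and the tool list in the usage comment is taken from the commands used in the steps of this guide.

```shell
# Hypothetical helper: report each named CLI that is not found on PATH.
# Returns 0 when everything is present, 1 when at least one tool is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# require_tools anyscale gcloud git terraform helm kubectl
```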

note

Before you continue, make sure your Google Cloud credentials have the Owner role on the project you want Anyscale to use. See Configure Google Cloud resources for an Anyscale cloud.
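The step above doesn't list the authentication commands, so here is one possible sequence wrapped in two hypothetical helper functions. It assumes that Terraform reads Application Default Credentials, which is why the second login is included. Both login commands are interactive and open a browser.

```shell
# Sketch only: authenticate gcloud and set the default project.
authenticate_gcloud() {
  project_id="$1"
  # Credentials for the gcloud CLI itself.
  gcloud auth login
  # Application Default Credentials (assumed to be what Terraform uses).
  gcloud auth application-default login
  # Default project for later gcloud commands.
  gcloud config set project "$project_id"
}

# List the members that hold roles/owner on the project,
# so you can confirm your account is among them.
list_project_owners() {
  project_id="$1"
  gcloud projects get-iam-policy "$project_id" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/owner" \
    --format="value(bindings.members)"
}

# Usage:
# authenticate_gcloud <your_google_project_id>
# list_project_owners <your_google_project_id>
```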

3. Use the Anyscale Terraform module to create a GKE cluster

Anyscale provides a Terraform module to deploy a GKE cluster and supporting Google Cloud resources.

note

To use an existing GKE cluster, follow the existing GKE cluster example or see the Anyscale Operator documentation and the Anyscale Terraform repository.

Choose the Google Cloud project ID, region, and GKE cluster name for the deployment. You enter these values in the Terraform variable file below.

Clone the Terraform module and navigate to the GKE example:

git clone https://github.com/anyscale/terraform-kubernetes-anyscale-foundation-modules
cd terraform-kubernetes-anyscale-foundation-modules/examples/gcp/gke-new_cluster/

Run the following command to create and populate a Terraform variable file:

cat <<EOF > terraform.tfvars
google_project_id = "<your_google_project_id>"
google_region = "<your_google_region>"
gke_cluster_name = "<your_gke_cluster_name>"
EOF

note

The Terraform example enables GPU node pools (T4) by default. To customize or disable GPU pools, set gpu_instance_configs in your terraform.tfvars (for example, use an empty map {} to disable GPU pools).
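As a sketch of the GPU override described in the note, the following appends an empty gpu_instance_configs map to the variable file and then warns if any placeholder is still unreplaced. It assumes you run it from the example directory where terraform.tfvars lives.

```shell
# Append the override that disables GPU node pools.
cat <<'EOF' >> terraform.tfvars
gpu_instance_configs = {}
EOF

# Warn if any <your_...> placeholder was left in the file.
if grep -n '<your_' terraform.tfvars; then
  echo "terraform.tfvars still contains placeholders" >&2
fi
```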

Run the following commands to apply the Terraform configuration. This may take several minutes.

terraform init
terraform plan
terraform apply

note

You may need to enable some Google Cloud APIs before the Terraform configuration applies successfully.
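The guide doesn't say which APIs are required. The sketch below enables a set that GKE deployments commonly depend on; the API list is an assumption, and enable_gcp_apis is a hypothetical helper name. If terraform apply reports a disabled API, enable the one named in the error message instead.

```shell
# Assumption: this API list is a guess, not taken from the Terraform module.
enable_gcp_apis() {
  project_id="$1"
  gcloud services enable \
    container.googleapis.com \
    compute.googleapis.com \
    iam.googleapis.com \
    cloudresourcemanager.googleapis.com \
    --project "$project_id"
}

# Usage:
# enable_gcp_apis <your_google_project_id>
```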

Note the values in your Terraform output (run terraform output to print them again). You need them when you register the Anyscale cloud in step 5.

4. Install additional GKE components

In this step, you install the NGINX ingress controller to provide externally facing load balancing on your GKE cluster. For more information about customizing ingress, see the Anyscale Terraform repository.

Run the following command to connect your terminal to the GKE cluster:

gcloud container clusters get-credentials <your_gke_cluster_name> --region <your_google_region> --project <your_google_project_id>

note

You may need to install the gke-gcloud-auth-plugin if it isn't already installed.
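One way to install the plugin only when it is missing is sketched below; ensure_gke_auth_plugin is a hypothetical helper name. It assumes gcloud was installed with its component manager enabled. If you installed gcloud through an OS package manager, install the plugin through that package manager instead.

```shell
# Install gke-gcloud-auth-plugin only if it is not already on PATH.
ensure_gke_auth_plugin() {
  if command -v gke-gcloud-auth-plugin >/dev/null 2>&1; then
    echo "gke-gcloud-auth-plugin already installed"
  else
    gcloud components install gke-gcloud-auth-plugin
  fi
}

# Usage:
# ensure_gke_auth_plugin
```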

Install the NGINX ingress controller. The Terraform example repository includes a sample values file, sample-values_nginx_gke_public.yaml. You can use this file or supply your own.

note

The Anyscale operator chart can optionally install the NGINX Ingress Controller as a dependency (ingress-nginx.enabled: true in Helm values). This guide follows the Terraform example and installs NGINX manually so you can use the example's values file.

helm repo add nginx https://kubernetes.github.io/ingress-nginx
helm upgrade ingress-nginx nginx/ingress-nginx \
--version 4.12.1 \
--namespace ingress-nginx \
--values sample-values_nginx_gke_public.yaml \
--create-namespace \
--install
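After the Helm release installs, you may want to confirm that the controller is running before you continue. The sketch below waits for the controller pod and then lists the Services in the namespace. check_ingress_nginx is a hypothetical helper name, the label selector is the one the ingress-nginx chart normally applies, and the two-minute timeout is an arbitrary choice.

```shell
check_ingress_nginx() {
  # Block until the controller pod reports Ready, or give up after two minutes.
  kubectl wait --namespace ingress-nginx \
    --for=condition=ready pod \
    --selector=app.kubernetes.io/component=controller \
    --timeout=120s &&
  # List the Services; for a LoadBalancer Service, EXTERNAL-IP is filled in
  # once the load balancer has been provisioned.
  kubectl get service --namespace ingress-nginx
}

# Usage:
# check_ingress_nginx
```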

5. Register the Anyscale cloud resources

Run the following command with the values from your Terraform output. Verify all variables are entered correctly.

anyscale cloud register \
--name <your_cloud_name> \
--provider gcp \
--region <your_google_region> \
--compute-stack k8s \
--kubernetes-zones us-central1-a,us-central1-b \
--anyscale-operator-iam-identity anyscale-gke-nodes@<your_google_project_id>.iam.gserviceaccount.com \
--cloud-storage-bucket-name gs://<your_storage_bucket_name>
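The example command pairs a region placeholder with two literal us-central1 zones, so a mismatch is easy to introduce when you deploy to another region. The following hypothetical helper, zones_match_region, is a small sanity check that relies on Google Cloud zone names having the form <region>-<letter>.

```shell
# Return 0 if every zone in the comma-separated list belongs to the region.
# The body runs in a subshell so the IFS change does not leak to the caller.
zones_match_region() (
  region="$1"
  IFS=','
  for zone in $2; do
    case "$zone" in
      "$region"-?) ;;  # region name followed by a one-character zone suffix
      *) echo "zone $zone is not in region $region" >&2; exit 1 ;;
    esac
  done
)

# Usage:
# zones_match_region <your_google_region> us-central1-a,us-central1-b
```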

Record the cloud resource ID from the command output. You use it as the cloudDeploymentId value in the values.yaml file in the next step.

6. Install and deploy the Anyscale operator on your GKE cluster

In this step, you add the Anyscale operator Helm chart to your GKE cluster, create a values.yaml file that describes your cloud and Google Cloud identity, and install the operator with Helm.

Add the Anyscale operator Helm chart

Run the following commands to add the Anyscale operator Helm chart repository and refresh its index:

helm repo add anyscale https://anyscale.github.io/helm-charts
helm repo update anyscale

Create a values YAML file

Create a values.yaml file. The following example uses the values you provided in earlier steps for a minimal Google Cloud configuration.

global:
  cloudDeploymentId: <your_cloud_resource_id>
  cloudProvider: gcp
  auth:
    iamIdentity: anyscale-gke-nodes@<your_google_project_id>.iam.gserviceaccount.com

workloads:
  serviceAccount:
    name: anyscale-operator

To customize the Helm chart with custom patches or additional pod shapes, see Configure the Helm chart for the Anyscale operator. To enable TPU support, see Leverage Cloud TPUs on GKE.
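If you prefer to generate the file from the shell, as with the Terraform variable file earlier, a heredoc is one option. This sketch assumes the keys nest as global.auth.iamIdentity and workloads.serviceAccount.name. The two variables are placeholders to replace with your own values.

```shell
# Placeholder values; replace them before running.
CLOUD_RESOURCE_ID="<your_cloud_resource_id>"
GOOGLE_PROJECT_ID="<your_google_project_id>"

# Write values.yaml, substituting the two variables above.
cat <<EOF > values.yaml
global:
  cloudDeploymentId: ${CLOUD_RESOURCE_ID}
  cloudProvider: gcp
  auth:
    iamIdentity: anyscale-gke-nodes@${GOOGLE_PROJECT_ID}.iam.gserviceaccount.com

workloads:
  serviceAccount:
    name: anyscale-operator
EOF
```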

Install the Anyscale operator on GKE

Run the following command to install the Anyscale operator with Helm using your values.yaml file:

helm upgrade anyscale-operator anyscale/anyscale-operator \
--namespace anyscale-operator \
-f values.yaml \
--create-namespace \
--wait \
-i

Bind the workload identity

Run the following command to bind the Google Cloud service account to the Kubernetes service account for workload identity:

gcloud iam service-accounts add-iam-policy-binding anyscale-gke-nodes@<your_google_project_id>.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:<your_google_project_id>.svc.id.goog[anyscale-operator/anyscale-operator]" \
--project <your_google_project_id>
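The --member value in the command above follows the pattern serviceAccount:<project>.svc.id.goog[<namespace>/<Kubernetes service account>]. The sketch below builds that string and prints the service account's IAM policy so you can confirm the binding afterward. Both function names are hypothetical.

```shell
# Build the workload identity member string from its three parts.
workload_identity_member() {
  project_id="$1"
  namespace="$2"
  k8s_service_account="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' \
    "$project_id" "$namespace" "$k8s_service_account"
}

# Print the IAM policy of the Google Cloud service account;
# the member above should appear under roles/iam.workloadIdentityUser.
show_workload_identity_binding() {
  project_id="$1"
  gcloud iam service-accounts get-iam-policy \
    "anyscale-gke-nodes@${project_id}.iam.gserviceaccount.com" \
    --project "$project_id"
}

# Usage:
# workload_identity_member <your_google_project_id> anyscale-operator anyscale-operator
# show_workload_identity_binding <your_google_project_id>
```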

It may take several minutes for your Anyscale cloud to be ready. You can watch the deployment status with the following command:

kubectl get deployments anyscale-operator -n anyscale-operator -w
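As an alternative to watching the deployment by hand, the following sketch blocks until the rollout finishes and prints recent logs if it doesn't. wait_for_operator is a hypothetical helper name, and the ten-minute timeout and 50-line log tail are arbitrary choices.

```shell
wait_for_operator() {
  if kubectl rollout status deployment/anyscale-operator \
      --namespace anyscale-operator --timeout=10m; then
    echo "anyscale-operator is ready"
  else
    # Surface recent logs to help diagnose a stalled rollout.
    kubectl logs deployment/anyscale-operator \
      --namespace anyscale-operator --tail=50
    return 1
  fi
}

# Usage:
# wait_for_operator
```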

7. Verify your Anyscale cloud

After the operator is ready, verify that your cloud is registered and functional:

anyscale cloud verify --name <your_cloud_name>