Deploy Anyscale on Azure Kubernetes Service (AKS)

Complete the following steps to configure and deploy a new Anyscale cloud on AKS.

1. Install the Anyscale CLI

pip install -U anyscale
anyscale login # authenticate

2. Authenticate the Azure CLI

az login
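
Later steps also call git, terraform, helm, and kubectl. As a minimal sketch, a preflight check like the following can report which of those commands are not yet on your PATH. The helper name missing_tools is made up for this example and is not part of any of these CLIs.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are found.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The CLIs this guide calls across all steps.
missing_tools anyscale az git terraform helm kubectl
```

An empty result means every listed command resolved. Any name that is printed needs to be installed before you continue.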

3. Use the Anyscale Terraform module to create an AKS cluster

Anyscale provides a Terraform module to deploy an AKS cluster and supporting Azure resources.

Clone the Terraform module and navigate to the AKS example:

git clone https://github.com/anyscale/terraform-kubernetes-anyscale-foundation-modules
cd terraform-kubernetes-anyscale-foundation-modules/examples/azure/aks-new_cluster/

Run the following command to create and populate a Terraform variable file:

cat <<EOF > terraform.tfvars
aks_cluster_name = "<your_aks_cluster_name>"
azure_subscription_id = "<your_subscription_id>"
azure_tenant_id = "<your_tenant_id>"
azure_location = "<your_azure_region>"
EOF

Note: The Terraform example enables GPU node pools (T4 and A100) by default. To customize or disable them, set gpu_pool_configs in variables.tf or in your terraform.tfvars. For example, an empty map {} disables GPU pools.
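
As an illustration of the empty-map option, the following sketch appends the override to the terraform.tfvars file created earlier. It assumes you run it from the same example directory and that you want no GPU pools at all.

```shell
# Append an override to the existing variable file.
# An empty map disables the GPU node pools that the example enables by default.
cat <<'EOF' >> terraform.tfvars
gpu_pool_configs = {}
EOF
```

To customize the pools instead of disabling them, check the gpu_pool_configs definition in variables.tf for the expected structure.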

Run the following commands to apply the Terraform configuration. This may take several minutes.

terraform init
terraform plan
terraform apply
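
If you want the apply step to execute exactly the plan you reviewed, one common variation is to save the plan to a file and apply that file. This is a sketch, not part of the Anyscale example: the function name plan_and_apply and the file name tfplan are arbitrary choices.

```shell
# Initialize, write the reviewed plan to a file, then apply exactly that plan.
# Each step runs only if the previous one succeeded.
plan_and_apply() {
  terraform init &&
  terraform plan -out=tfplan &&
  terraform apply tfplan
}
```

Call plan_and_apply from the example directory. Applying a saved plan file does not prompt for confirmation, so review the plan output first.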

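Note the values in your Terraform output. Later steps use them to fill in placeholders such as <azure_resource_group_name>, <anyscale_operator_principal_id>, <anyscale_operator_client_id>, <blob_storage_name>, and <storage_account>.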
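
You can print the outputs again at any time after the apply finishes. The sketch below wraps the two relevant Terraform commands. The function names are hypothetical, and no specific output name is assumed here: use the names that the listing shows.

```shell
# List every output of the applied configuration with its value.
show_outputs() {
  terraform output
}

# Print one output value without surrounding quotes, for use in $(...).
# $1 is an output name taken from the listing printed by show_outputs.
output_value() {
  terraform output -raw "$1"
}
```

For example, after running show_outputs you could capture a value with value=$(output_value some_name), where some_name is one of the listed outputs.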

4. Install additional AKS components

In this step, you install the required components for ingress and GPU support on your AKS cluster.

Run the following command to connect your terminal to the AKS cluster:

az aks get-credentials --resource-group <azure_resource_group_name> --name <your_aks_cluster_name> --overwrite-existing
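
Before installing anything, it can help to confirm that kubectl now points at the new cluster. A minimal sketch, with check_cluster as a made-up helper name:

```shell
# Show the kubeconfig context kubectl is using, then list the cluster nodes.
check_cluster() {
  kubectl config current-context &&
  kubectl get nodes -o wide
}
```

The context name should match your AKS cluster, and the node list should show the node pools that Terraform created.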

Install the NGINX Ingress Controller. The Terraform example repo includes a sample values file, sample-values_nginx.yaml. You can use this file or supply your own.

Note: The Anyscale operator chart can optionally install the NGINX Ingress Controller as a dependency (ingress-nginx.enabled: true in Helm values). This guide follows the Terraform example and installs NGINX manually so you can use the example's values file.

helm repo add nginx https://kubernetes.github.io/ingress-nginx
helm upgrade ingress-nginx nginx/ingress-nginx \
--version 4.12.1 \
--namespace ingress-nginx \
--values sample-values_nginx.yaml \
--create-namespace \
--install
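
To check the result of the install, you can list what Helm created in the ingress-nginx namespace. This is a sketch with a hypothetical helper name, ingress_status. Whether the controller Service receives an external address depends on the values file you used.

```shell
# List the controller pods and Services in the namespace used above.
ingress_status() {
  kubectl get pods --namespace ingress-nginx &&
  kubectl get service --namespace ingress-nginx
}
```

The pods should reach the Running state. If the Service is of type LoadBalancer, its EXTERNAL-IP column shows an address once Azure finishes provisioning it.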

If you intend to use NVIDIA GPUs in your Anyscale workloads, install the NVIDIA device plugin:

helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm upgrade nvdp nvdp/nvidia-device-plugin \
--namespace nvidia-device-plugin \
--version 0.17.1 \
--values values_nvdp.yaml \
--create-namespace \
--install
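
Once the device plugin is running, GPU nodes advertise the nvidia.com/gpu resource. The following sketch lists how many GPUs each node reports as allocatable. The helper name gpu_capacity is an assumption for this example.

```shell
# Print each node name with its allocatable nvidia.com/gpu count.
# The dot in the resource name is escaped so kubectl reads it as one key.
gpu_capacity() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}
```

Nodes without GPUs show <none> in the GPU column. A GPU node pool that has scaled to zero has no nodes to list until a workload triggers a scale-up.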

5. Register the Anyscale cloud resources

Run the following command with the values from your Terraform output. Verify all variables are entered correctly.

anyscale cloud register \
--name <your_cloud_name> \
--region <your_azure_region> \
--provider azure \
--compute-stack k8s \
--azure-tenant-id <your_tenant_id> \
--anyscale-operator-iam-identity <anyscale_operator_principal_id> \
--cloud-storage-bucket-name 'abfss://<blob_storage_name>@<storage_account>.dfs.core.windows.net'
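
The storage argument is the easiest one to mistype, because it combines two values into one URI. As a sketch, a small helper can assemble it from its parts. The function name abfss_uri and the sample names are made up, and the sketch assumes <blob_storage_name> is the storage container name.

```shell
# Build the abfss:// URI from a container name ($1) and a storage account name ($2).
abfss_uri() {
  printf 'abfss://%s@%s.dfs.core.windows.net\n' "$1" "$2"
}

# Example with made-up names.
abfss_uri mycontainer mystorageacct
# prints abfss://mycontainer@mystorageacct.dfs.core.windows.net
```

You could then pass the result to the register command as --cloud-storage-bucket-name "$(abfss_uri <container> <account>)" with your own values substituted.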

Record the cloud resource ID from the command output. You use it as <your_cloud_resource_id> in the next step.

6. Install and deploy the Anyscale operator on your AKS cluster

In this step, you add the Anyscale operator Helm chart to your AKS cluster, create a values.yaml file that describes your cloud and Azure identity, and install the operator with Helm.

Add the Anyscale operator Helm chart

Run the following commands to add the Anyscale Helm chart repository and refresh its index:

helm repo add anyscale https://anyscale.github.io/helm-charts
helm repo update anyscale
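
After the repository is added, you can list the chart versions it offers. This sketch uses a hypothetical helper name, list_operator_versions. Pinning a version is optional and not something this guide requires.

```shell
# List all published versions of the operator chart in the anyscale repo.
list_operator_versions() {
  helm search repo anyscale/anyscale-operator --versions
}
```

If you want repeatable installs, you can pass one of the listed versions to helm upgrade with --version, the same flag used for the NGINX and NVIDIA charts earlier.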

Create a values YAML file

Create a values.yaml file. The following example uses the values you provided in earlier steps for a minimal Azure configuration.

global:
  cloudDeploymentId: <your_cloud_resource_id>
  controlPlaneURL: https://console.azure.anyscale.com
  cloudProvider: azure
  auth:
    iamIdentity: <anyscale_operator_client_id>
    audience: api://086bc555-6989-4362-ba30-fded273e432b/.default

workloads:
  serviceAccount:
    name: anyscale-operator

To customize the Helm chart with custom patches or additional pod shapes, see Configure the Helm chart for the Anyscale operator.
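
A values file that still contains an angle-bracket placeholder installs without complaint but leaves the operator misconfigured. As a sketch, the following check prints any line that still looks like <some_placeholder>. The helper name find_placeholders is made up for this example.

```shell
# Print every line of file $1 that still contains an unreplaced <placeholder>,
# prefixed with its line number.
# Returns 0 if at least one placeholder remains, 1 if the file is clean.
find_placeholders() {
  grep -n '<[A-Za-z_][A-Za-z0-9_]*>' "$1"
}
```

Run find_placeholders values.yaml before installing. No output means every placeholder was replaced.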

Install the Anyscale operator on AKS

Run the following command to install the Anyscale operator with Helm using your values.yaml file:

helm upgrade anyscale-operator anyscale/anyscale-operator \
--namespace anyscale-operator \
-f values.yaml \
--create-namespace \
--wait \
-i

It may take several minutes for your Anyscale cloud to be ready. You can watch the deployment status with the following command:

kubectl get deployments anyscale-operator -n anyscale-operator -w
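
If you prefer a command that exits on its own instead of watching, kubectl rollout status blocks until the deployment is ready or a timeout passes. A sketch, assuming the deployment and namespace names shown above. The helper name wait_for_operator and the 10-minute default are arbitrary choices.

```shell
# Block until the operator deployment finishes rolling out.
# $1 is an optional timeout such as 5m; it defaults to 10m.
wait_for_operator() {
  kubectl rollout status deployment/anyscale-operator \
    --namespace anyscale-operator \
    --timeout="${1:-10m}"
}
```

The command returns a non-zero exit status if the rollout does not complete within the timeout, which makes it convenient in scripts.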

7. Verify your Anyscale cloud

After the operator is ready, verify that your cloud is registered and functional:

anyscale cloud verify --name <your_cloud_name>
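
Because the cloud can take several minutes to become ready, a first verification attempt may fail simply because it ran too early. The following generic retry helper is a sketch of one way to handle that. It is not an Anyscale feature, and the name retry and its argument order are choices made for this example.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: try the verification up to 5 times, one minute apart.
# retry 5 60 anyscale cloud verify --name "$CLOUD_NAME"
```

In the commented example, CLOUD_NAME is a shell variable you would set to your own cloud name.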