Configure the Helm chart for the Anyscale operator
This page provides an overview of configuring Helm chart values to control how Helm installs the Anyscale operator on your Kubernetes cluster.
You must configure Helm chart parameters when deploying an Anyscale cloud to Kubernetes for the first time. You can also update settings and apply changes to an existing Anyscale cloud. See Deploy Anyscale on Kubernetes.
This page describes the Helm chart parameters introduced in Anyscale operator version 1.0.0.
If you have a deployment of the Anyscale operator configured with an earlier version, see Migrate from legacy Anyscale operator Helm charts.
Configuration workflow overview
Anyscale provides a values.yaml file with default settings required for the operator to function correctly. Don't modify the Anyscale-provided values file directly. Instead, create your own custom values file (for example, my-custom-values.yaml) to configure settings specific to your environment.
When you run helm install or helm upgrade with your custom values file, Helm merges your custom settings with Anyscale's defaults. This approach ensures that:
- You receive updated default values when upgrading to new operator versions.
- Your custom configurations persist across upgrades.
- You can version-control your configurations separately from Anyscale's defaults.
Anyscale requires default or Anyscale-provided values for some Helm chart parameters. These required values might change between operator versions.
For assistance configuring your Anyscale operator, contact Anyscale support.
See the following resources for more details:
- The Anyscale operator Helm chart GitHub repository includes a values.yaml file with parameters and docstrings.
- For parameter descriptions and default values, see Kubernetes Helm configuration reference.
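Before you create a custom values file, it can help to review the chart's current defaults. The following is a minimal sketch using standard Helm commands, assuming you've already added the Anyscale Helm repository:

```bash
# Export the chart's default values for review. Don't edit or reuse this
# file directly; keep only your overrides in my-custom-values.yaml.
helm show values anyscale/anyscale-operator > anyscale-defaults.yaml
```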
Apply custom configurations for your Anyscale operator
Complete the following steps to configure and deploy the Anyscale operator:
Specific configuration requirements and recommendations might differ based on how you've configured your Kubernetes cluster.
Some configurations might require updates to settings or IAM permissions in your cloud provider account.
To successfully customize the Anyscale operator, you should be familiar with your existing Kubernetes environment and have admin permissions in your cloud provider account.
- Create a custom values file (for example, my-custom-values.yaml) with your configuration:

  ```yaml
  # Required: Global configuration
  global:
    cloudDeploymentId: "cldrsrc_abcdefgh12345678ijklmnop12"
    cloudProvider: "aws"

  aws:
    region: "us-west-2"
    auth:
      iamIdentity: "arn:aws:iam::123456789012:role/anyscale-operator-role"

  # Optional: Add custom instance types
  workloads:
    instanceTypes:
      additional:
        16CPU-64GB-2xA100:
          resources:
            CPU: 16
            GPU: 2
            memory: 64Gi
          accelerators:
            - A100-40G
          nodeSelector:
            custom-node-selector: value
          tolerations:
            - key: "custom-taint"
              operator: "Exists"
              effect: "NoSchedule"

  # Optional: Enable Karpenter support
  enableKarpenterSupport: true
  ```
- For the initial installation, run the following command:

  ```bash
  helm install anyscale-operator anyscale/anyscale-operator \
    -n <namespace> \
    -f my-custom-values.yaml
  ```
- To upgrade to a new operator version with your existing configuration, run the following commands:

  ```bash
  helm repo update
  helm upgrade anyscale-operator anyscale/anyscale-operator \
    -n <namespace> \
    -f my-custom-values.yaml
  ```
Don't use --reuse-values alone when upgrading.
The --reuse-values flag keeps the default values from the previously installed chart version, so it doesn't pick up important updates from Anyscale (such as IngressClass collision prevention). Always use the -f flag with your custom values file, which Helm automatically merges with the latest chart defaults.
If you previously used --set flags and need to preserve those values while getting new defaults, use --reset-then-reuse-values instead of --reuse-values.
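After an install or upgrade, you can confirm what Helm actually applied. The following is a minimal sketch using standard Helm and kubectl commands; replace <namespace> with the namespace you used above:

```bash
# Show only the values you supplied in my-custom-values.yaml.
helm get values anyscale-operator -n <namespace>

# Show the fully merged values (your overrides plus the chart defaults).
helm get values anyscale-operator -n <namespace> --all

# Confirm that the operator pods are running.
kubectl get pods -n <namespace>
```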
Configure custom instance types
When running on Kubernetes, an Anyscale instance type maps to a Pod shape with specific CPU, memory, and accelerator resources. You define instance types through the Helm chart to make them available in the Anyscale console and for compute configs.
Anyscale recommends using the workloads.instanceTypes.additional parameter to define custom resources. See Add custom instance types.
You can disable the defaults entirely by setting workloads.instanceTypes.enableDefaults: false. You can also override the workloads.instanceTypes.defaults parameter, but doing so can make troubleshooting with Anyscale support more difficult.
For complete parameter details, see Instance types.
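For example, the following minimal sketch disables the Anyscale-provided defaults and defines a single custom instance type in their place. The instance type name and shape are illustrative; choose values that match your nodes:

```yaml
workloads:
  instanceTypes:
    # Disable all Anyscale-provided default instance types.
    enableDefaults: false
    # Define custom instance types here instead of overriding defaults.
    additional:
      8CPU-30GB:
        resources:
          CPU: 8
          memory: 30Gi
```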
How to size instance types for Anyscale on Kubernetes
You must correctly size your pod specs to ensure they respect CPU and memory reservations required by Anyscale and Kubernetes.
When the Anyscale operator applies a Pod spec to Kubernetes for an Anyscale workload, the operator uses the shapes defined in the Instance Type ConfigMap as an upper bound for the sum of all of the memory requests and limits across all containers in the pod. Anyscale reserves some memory and CPU for critical-path Anyscale sidecar containers and provides the rest to the Ray container to run the primary workload.
Kubernetes reserves a portion of CPU and memory for running the kubelet and other Kubernetes system components. To account for resource usage by Kubernetes and Anyscale, the pod shapes defined in the ConfigMap must be smaller than the actual node shape. See Reserve Compute Resources for System Daemons in the Kubernetes documentation for more details.
As an example, an AWS m5.4xlarge virtual machine has 16 vCPUs and 64 GiB memory. Anyscale recommends configuring this instance as a 14CPU-56GB pod. Attempting to define the pod as 16CPU-64GB might make the pod unschedulable.
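For example, a pod shape sized for that m5.4xlarge node might look like the following sketch, defined through the workloads.instanceTypes.additional parameter. The exact headroom to reserve depends on your node configuration and system daemons:

```yaml
workloads:
  instanceTypes:
    additional:
      # Sized for an AWS m5.4xlarge (16 vCPU, 64 GiB memory), leaving
      # roughly 2 vCPU and 8 GiB for Kubernetes and Anyscale overhead.
      14CPU-56GB:
        resources:
          CPU: 14
          memory: 56Gi
```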
Add custom instance types
Anyscale recommends using the workloads.instanceTypes.additional parameter to define all your custom instance types. This keeps your configurations separate from Anyscale's defaults.
Instance type names can include alphanumeric characters, dashes, and underscores. The Anyscale console updates with new instance types approximately every 30 seconds.
The following example shows a configuration for an A100 GPU on GKE:
```yaml
workloads:
  instanceTypes:
    additional:
      16CPU-64GB-2xA100:
        resources:
          CPU: 16
          GPU: 2
          memory: 64Gi
        accelerators:
          - A100-40G
        nodeSelector:
          cloud.google.com/gke-accelerator: nvidia-tesla-a100
        tolerations:
          - key: "nvidia.com/gpu"
            operator: "Exists"
            effect: "NoSchedule"
```
The accelerators list must contain Ray-supported accelerators.
Accelerators must map to the workloads.accelerator.nodeSelectors values for your cloud provider. Not all accelerators are available in all regions.
See Instance types for a list of supported accelerators by cloud and more details.
For an example configuring support for TPUs on GKE, see Leverage Cloud TPUs on GKE.
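For comparison, a similar configuration on EKS might look like the following sketch. The eks.amazonaws.com/nodegroup selector and the gpu-a100 node group name are illustrative assumptions; use whatever labels and taints identify your GPU nodes:

```yaml
workloads:
  instanceTypes:
    additional:
      16CPU-64GB-2xA100:
        resources:
          CPU: 16
          GPU: 2
          memory: 64Gi
        accelerators:
          - A100-40G
        # Hypothetical: target an EKS managed node group named gpu-a100.
        nodeSelector:
          eks.amazonaws.com/nodegroup: gpu-a100
        tolerations:
          - key: "nvidia.com/gpu"
            operator: "Exists"
            effect: "NoSchedule"
```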
Anyscale default instance types for Kubernetes
Anyscale recommends against modifying default instance types. Use the workloads.instanceTypes.additional field to configure your own instance types.
Anyscale defines the following instance types by default:
```yaml
workloads:
  instanceTypes:
    enableDefaults: true  # Set to false to disable all defaults
    defaults:
      2CPU-8GB:
        resources:
          CPU: 2
          memory: 8Gi
      4CPU-16GB:
        resources:
          CPU: 4
          memory: 16Gi
      8CPU-32GB:
        resources:
          CPU: 8
          memory: 32Gi
      8CPU-32GB-1xT4:
        resources:
          CPU: 8
          GPU: 1
          memory: 32Gi
        accelerators:
          - T4
```
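To verify which instance types the operator registered in your cluster, you can inspect the Instance Type ConfigMap described earlier. The ConfigMap name depends on your installation, so this sketch first lists the ConfigMaps in the operator namespace:

```bash
# List ConfigMaps in the namespace where the operator is installed.
kubectl get configmaps -n <namespace>

# Describe the instance type ConfigMap (substitute the name found above)
# to see the pod shapes available to Anyscale workloads.
kubectl describe configmap <instance-type-configmap-name> -n <namespace>
```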