Customize the Anyscale operator Helm chart
You can customize the Anyscale operator Helm chart to meet the needs of your specific deployment of Anyscale on Kubernetes.
This page describes some of the available configuration options; additional customization is possible.
Anyscale recommends primarily making changes to the Helm chart values before deploying new versions of the Anyscale operator. Inline arguments for variables listed in the values file are supported, as are live updates to resources such as the Patch ConfigMap and the Instance Type ConfigMap. However, subsequent deployments may overwrite inline or in-place updates.
Installation variables
Parameter | Description | Default |
---|---|---|
cloudDeploymentId | Specifies your Anyscale cloud deployment ID. Created during cloud register. | "" |
cloudProvider | Specifies your cloud provider (aws, gcp, or generic). Specifying a cloud provider enables identity verification between the operator and the Anyscale control plane. | |
anyscaleCLIToken | If not using AWS or GCP identity verification, use an Anyscale CLI token for verification of the operator. | "" |
region | The region where your Kubernetes cluster is running. | "" |
operatorReplicas | Number of Anyscale operator replicas. | 1 |
enableKarpenterSupport | Enables integration with the AWS Karpenter autoscaler. | false |
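For example, a minimal values file for an AWS deployment might look like the following sketch. The deployment ID and region are placeholders; substitute the values from your own cloud registration.

cloudDeploymentId: "<YOUR_CLOUD_DEPLOYMENT_ID>"   # created during cloud register
cloudProvider: "aws"                              # enables AWS identity verification with the control plane
region: "us-west-2"                               # region where your Kubernetes cluster runs
operatorReplicas: 1
enableKarpenterSupport: false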
Instance type configuration
When running on top of Kubernetes, an Anyscale "instance type" maps to a Pod shape. The cloud administrator defines instance types when setting up the Anyscale operator, either through the Helm chart options or out-of-band by editing the instance-types ConfigMap that the Helm chart creates.
Anyscale recommends using the Helm chart to manage instance types, because Helm upgrades can overwrite manual changes made directly to the ConfigMap.
Default instance types
Each instance type defined in the ConfigMap is visible in the Anyscale UI through a drop-down list. Users can select these instance types when submitting workloads. Users may also define a compute config that uses these instance types through the Anyscale CLI/SDK.
These pod shapes are preconfigured and immediately usable:
defaultInstanceTypes:
  2CPU-8GB:
    resources:
      CPU: 2
      memory: 8Gi
  4CPU-16GB:
    resources:
      CPU: 4
      memory: 16Gi
  8CPU-32GB:
    resources:
      CPU: 8
      memory: 32Gi
  8CPU-32GB-1xT4:
    resources:
      CPU: 8
      GPU: 1
      memory: 32Gi
      'accelerator_type:T4': 1
4CPU-16GB and 8CPU-32GB-1xT4 are names that follow an Anyscale naming convention. Cloud administrators may use a naming convention of their choice; valid characters include alphanumeric characters, dashes, and underscores.
Anyscale updates the console with the latest instance types defined in the ConfigMap roughly every 30 seconds.
Additional instance types
You can add additional instance types depending on the desired pod shapes, available node groups, and accelerators in your Kubernetes cluster. When specifying a new pod shape, you can specify the resources as well as node selectors to facilitate scheduling pods within your cluster. Ensure that any accelerators listed in your instance types map to the supportedAccelerators values and that Ray supports them.
additionalInstanceTypes: {}
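As an illustration, the following sketch adds a GPU instance type that targets an AWS g5.2xlarge node group (8 vCPUs, 32 GiB memory, one A10G GPU). The node selector is an assumption; replace it with a label that actually exists on your GPU nodes.

additionalInstanceTypes:
  8CPU-32GB-1xA10G:
    resources:
      CPU: 8
      GPU: 1
      memory: 32Gi
      'accelerator_type:A10G': 1
    nodeSelector:
      # Assumed label; use whatever label identifies your GPU node group.
      node.kubernetes.io/instance-type: g5.2xlarge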
Example TPU configurations (GKE only)
The following is an example of an instance type that specifies a single-host TPU pod.
For more information on configuring TPUs, see the guide for TPU support with the Anyscale operator.
8CPU-16GB-TPU-V5E-2x2-SINGLEHOST:
  resources:
    CPU: 8
    TPU: 4
    memory: 16Gi
    'accelerator_type:TPU-V5E': 1
    'anyscale/tpu_hosts': 1
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x2
For accelerators, the accelerator_type value should map to the list of Ray-supported accelerators. If an accelerator type isn't defined in this list, open an issue on the Ray GitHub repository and forward it to Anyscale support.
Resource allocation within the pod
When the Anyscale operator applies a Pod spec to Kubernetes for an Anyscale workload, the operator uses the shapes defined in the Instance Type ConfigMap as an upper bound for the sum of the memory requests and limits across all containers in the Pod. Anyscale reserves some memory and CPU for critical-path Anyscale sidecar containers and provides the rest to the Ray container to run the primary workload.
Similarly, Kubernetes (and managed Kubernetes providers such as EKS and GKE) reserves a portion of CPU and memory for running the kubelet and other Kubernetes system components. As such, the pod shapes defined in the ConfigMap should be smaller than the actual node shape. For example, if an m5.4xlarge node group exists in an EKS cluster (16 vCPUs and 64 GiB of memory), a 14CPU-56GB Pod may be schedulable, but a 16CPU-64GB Pod is most likely unschedulable. See Reserve Compute Resources for System Daemons in the Kubernetes documentation for more details.
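Continuing the m5.4xlarge example, a hedged sketch of an instance type that leaves headroom for system daemons and Anyscale sidecars might look like this; the exact headroom you need depends on how much your cluster reserves.

additionalInstanceTypes:
  14CPU-56GB:
    resources:
      CPU: 14
      memory: 56Gi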
Enable Karpenter support
Karpenter is an open-source node provisioning project built for Kubernetes. The Anyscale operator can tag pods with the appropriate labels and node selectors if you enable the following flag for Karpenter support.
enableKarpenterSupport: true # Default is false.
Karpenter requires that you set specific taints and labels on various Kubernetes node groups. See the Karpenter documentation or leverage the Anyscale Terraform provider for Kubernetes to ensure that you properly configure these taints and labels.
Prevent head pod eviction
Enable protection against head-node eviction (may block Kubernetes cluster upgrades):
enableAnyscaleRayHeadNodePDB: true # Default is false.
If enabled, manually delete the PodDisruptionBudget before upgrading the Kubernetes cluster.
Advanced configurations
Customize ingress and DNS resolution
By default, the Anyscale operator attempts to configure ingress for Ray clusters and services. If you don't want automatic resolution and prefer to manually configure ingress, you can specify the DNS ingress address:
ingressAddress: "<IP_ADDRESS_OR_HOSTNAME>"
To support Anyscale dynamic DNS resolution, you need to configure the appropriate ingresses and patches so that the Ray dashboard and Anyscale service endpoints are reachable.
Using the AWS Load Balancer Controller for ingress
First, create a temporary minimal ingress with a fixed group name, such as:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    alb.ingress.kubernetes.io/group.name: anyscale
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80
Describe the ingress (kubectl describe ingress minimal-ingress) to retrieve the status.loadBalancer.ingress.hostname attribute. It should resemble anyscale-1421684687.us-west-2.elb.amazonaws.com. This address stays consistent for all ingresses you create with this group name.
Use this value for the --ingress-address flag during cloud registration or for the ingressAddress value in the Helm chart.
After you retrieve the address, delete the temporary ingress.
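For example, using the illustrative hostname from above in the Helm chart values:

ingressAddress: "anyscale-1421684687.us-west-2.elb.amazonaws.com"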
Then, apply this set of additional patches.
additionalPatches:
  # Apply these patches to all `Ingress` resources created by Anyscale.
  - kind: Ingress
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1group.name
        value: "anyscale"
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1load-balancer-name
        value: "anyscale"
      # Uncomment this if you want to use an internal ALB, which is only accessible
      # from inside the VPC and requires a VPN to access from a laptop.
      # - op: add
      #   path: /metadata/annotations/alb.ingress.kubernetes.io~1scheme
      #   value: "internal"
  # NOTE: The rest of the patches are only required for Anyscale services functionality.
  # They aren't required for basic head node connectivity for workspaces and jobs.
  # When the Anyscale operator performs a service rollout, two ingress resources are
  # created for the NGINX Ingress Controller. The first ingress is the primary ingress,
  # and the second ingress is the canary ingress.
  # The Anyscale operator uses the following patches to convert from the NGINX Ingress Controller
  # scheme to the ALB Ingress Controller scheme, handling the case where a single ingress and service
  # exists and the case where a rollout is in progress and two ingresses and services exist.
  # The ALB Ingress Controller doesn't need two ingresses to manage canary deployments.
  # Instead, the ALB Ingress Controller can manage canary deployments through a single
  # ingress resource. This patch modifies the ingress resource created by the Anyscale
  # operator to use the ALB Ingress Controller scheme, by updating the annotations on
  # the primary ingress when a canary exists.
  - kind: Ingress
    selector: anyscale.com/ingress-type in (primary), anyscale.com/canary-exists in (true)
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1actions.anyscale
        value: >
          {
            "type": "forward",
            "forwardConfig": {
              "targetGroups": [
                {
                  "serviceName": "{{.PrimarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.PrimaryWeight}}
                },
                {
                  "serviceName": "{{.CanarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.CanaryWeight}}
                }
              ],
              "targetGroupStickinessConfig": {
                "enabled": false
              }
            }
          }
      # Update the serviceName and servicePort to point to the action name
      # so that the rules in the annotation are used. For more information,
      # see:
      # https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/annotations/#actions
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/name
        value: anyscale
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/port/name
        value: use-annotation
  # This patch handles the primary ingress when a canary doesn't exist (for
  # example, when a service rollout isn't in progress) by adding a set of actions
  # to forward traffic to the primary service.
  - kind: Ingress
    selector: anyscale.com/ingress-type in (primary), anyscale.com/canary-exists in (false)
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1actions.anyscale
        value: >
          {
            "type": "forward",
            "forwardConfig": {
              "targetGroups": [
                {
                  "serviceName": "{{.PrimarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.PrimaryWeight}}
                }
              ],
              "targetGroupStickinessConfig": {
                "enabled": false
              }
            }
          }
      # Update the serviceName and servicePort to point to the action name
      # so that the rules in the annotation are used. For more information,
      # see:
      # https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/annotations/#actions
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/name
        value: anyscale
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/port/name
        value: use-annotation
  # This patch handles the canary ingress by rewriting it into a no-op ingress. The ALB Ingress
  # Controller doesn't need a separate ingress for canary deployments, so this patch no-ops it.
  - kind: Ingress
    selector: "anyscale.com/ingress-type in (canary)"
    patch:
      - op: replace
        path: /spec
        value:
          defaultBackend:
            service:
              name: default-backend
              port:
                number: 80
Resource configuration for the Anyscale operator
Anyscale recommends the default values for getting started.
Define resource requests and limits for the operator and telemetry sidecar:
operatorResources:
  operator:
    requests:
      cpu: 1
      memory: 512Mi
    limits:
      memory: 2Gi
  vector:
    requests:
      cpu: 100m
      memory: 512Mi
    limits:
      memory: 512Mi
Custom patches
Kubernetes clusters vary in how they handle spot instances, accelerators, and other cluster-specific concerns.
The Patch API provides an escape hatch to handle custom integrations. This API allows for just-in-time patching of all Anyscale-managed resources as the operator applies them to the Kubernetes cluster. The syntax for the Patch API is JSON Patch (IETF RFC 6902). As an example, consider the patch below:
patches:
  - kind: Pod
    # See: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors
    selector: "anyscale.com/market-type in (ON_DEMAND)"
    # See: https://jsonpatch.com/
    patch:
      - op: add
        path: /spec/nodeSelector/eks.amazonaws.com~1capacityType # use ~1 to escape the forward-slash
        value: "ON_DEMAND"
The operator applies this set of patches to every Pod it creates that matches the Kubernetes selector. In this case, the operator adds the eks.amazonaws.com/capacityType node selector to the Pod spec.
The Helm chart generates a variety of patches using the default configuration options that should work on EKS or GKE out-of-the-box without additional configuration. The Helm chart also accepts additional patches to support custom autoscalers, ingresses, or other cluster-specific properties.
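As another sketch, the following patch adds a toleration to Pods that Anyscale schedules on spot capacity. The taint key and value are assumptions; adjust them to match the taints on your spot node group.

additionalPatches:
  - kind: Pod
    selector: "anyscale.com/market-type in (SPOT)"
    patch:
      # Appends to an existing tolerations list; if the Pod spec has no tolerations yet,
      # add the whole array at /spec/tolerations instead.
      - op: add
        path: /spec/tolerations/-
        value:
          key: "dedicated"       # assumed taint key
          operator: "Equal"
          value: "spot-workers"  # assumed taint value
          effect: "NoSchedule"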
View all labels provided by Anyscale that you can use for custom patches
The Anyscale control plane applies these labels to resources created by Anyscale.
Label Name | Possible Label Values | Description |
---|---|---|
anyscale.com/market-type | SPOT, ON_DEMAND | Users with workloads that support preemption may opt to run their workloads on spot node types through the compute config. All other workloads are run on on-demand node types. This should most likely be transformed into a node affinity. |
anyscale.com/zone | user-defined through cloud setup | For Pods that have a specific zone affinity, the Anyscale operator sets this label to the zone that the Pod should be launched into (us-west-2a , for example). Zones are provided as []string at cloud registration time and can be selected from the Anyscale UI. This should most likely be transformed into a node affinity. |
anyscale.com/accelerator-type | user-defined through instance type configuration | When requesting a GPU Pod, the Anyscale operator sets this label to one of the Anyscale accelerator types. |
anyscale.com/instance-type | user-defined through instance type configuration | The operator sets this value for all Pods created through Anyscale. |
anyscale.com/canary-weight, anyscale.com/canary-exists, anyscale.com/canary-svc, anyscale.com/ingress-type, anyscale.com/bearer-token, anyscale.com/primary-weight, anyscale.com/primary-svc | various | For advanced use only (when using an ingress other than NGINX for inference / serving workloads with Anyscale Services). Contact Anyscale for more details. |
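As noted in the table, the zone label should most likely be transformed into a node affinity or node selector. For example, a hedged patch that turns the zone label into a node selector for one zone might look like the following; the zone name is illustrative, and you need a similar patch for each zone registered with the cloud.

additionalPatches:
  - kind: Pod
    selector: "anyscale.com/zone in (us-west-2a)"
    patch:
      - op: add
        path: /spec/nodeSelector/topology.kubernetes.io~1zone  # well-known Kubernetes zone label
        value: "us-west-2a"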
View a sample of common advanced configuration options
{
  "metadata": {
    // Add a new label.
    "labels": {"new-label": "example-value"},
    // Add a new annotation.
    "annotations": {"new-annotation": "example-value"}
  },
  "spec": {
    // Add a node selector.
    "nodeSelector": {"disktype": "ssd"},
    "tolerations": [{
      "effect": "NoSchedule",
      "key": "dedicated",
      "value": "example-anyscale"
    }],
    "containers": [{
      // Add a PersistentVolumeClaim to the Ray container.
      "name": "ray",
      "volumeMounts": [{
        "name": "pvc-volume",
        "mountPath": "/mnt/pvc-data"
      }]
    },{
      // Add a sidecar for exporting logs and metrics.
      "name": "monitoring-sidecar",
      "image": "timberio/vector:latest",
      "ports": [{
        "containerPort": 9000
      }],
      "volumeMounts": [{
        "name": "vector-volume",
        "mountPath": "/mnt/vector-data"
      }]
    }],
    "volumes": [{
      "name": "pvc-volume",
      "persistentVolumeClaim": {
        "claimName": "my-pvc"
      }
    },{
      "name": "vector-volume",
      "emptyDir": {}
    }]
  }
}