Customize the Anyscale operator Helm chart
You can customize the Anyscale operator Helm chart to meet the needs of your specific deployment of Anyscale on Kubernetes.
This page describes some of the available configuration options; additional customization is possible.
Anyscale recommends primarily making changes to the Helm chart values before deploying new versions of the Anyscale operator. Inline arguments for variables listed in the values file are supported, as are live updates to resources such as the Patch ConfigMap and the Instance Type ConfigMap. However, subsequent deployments may overwrite inline or in-place updates.
Installation variables
Parameter | Description | Default |
---|---|---|
cloudDeploymentId | Specifies your Anyscale cloud deployment ID. Created during cloud register. | "" |
cloudProvider | Specifies your cloud provider (aws, gcp, or generic). Specifying a cloud provider enables identity verification between the operator and the Anyscale control plane. | |
anyscaleCLIToken | If not using AWS or GCP identity verification, use an Anyscale CLI token for verification of the operator. | "" |
region | The region where your Kubernetes cluster is running. | "" |
operatorReplicas | Number of Anyscale operator replicas. | 1 |
enableKarpenterSupport | Enables integration with the AWS Karpenter autoscaler. | false |
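For example, a minimal values file for an AWS deployment might look like the following sketch. The deployment ID and region are placeholders; substitute the values from your own cloud registration.

cloudDeploymentId: "<YOUR_CLOUD_DEPLOYMENT_ID>"   # created during cloud register
cloudProvider: "aws"                              # enables AWS identity verification with the control plane
region: "us-west-2"                               # region where your Kubernetes cluster runs
operatorReplicas: 1
enableKarpenterSupport: false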
Instance type configuration
When running on top of Kubernetes, an Anyscale "instance type" maps to a Pod shape. The cloud administrator defines instance types when setting up the Anyscale operator, either through the Helm chart options or out-of-band by editing the instance-types ConfigMap that the Helm chart creates.
Anyscale recommends using the Helm chart to manage instance types, because Helm upgrades can overwrite manual changes made directly to the ConfigMap.
Default instance types
Each instance type defined in the ConfigMap is visible in the Anyscale UI through a drop-down list. Users can select these instance types when submitting workloads. Users may also define a compute config that uses these instance types through the Anyscale CLI/SDK.
These pod shapes are preconfigured and immediately usable:
defaultInstanceTypes:
  2CPU-8GB:
    resources:
      CPU: 2
      memory: 8Gi
  4CPU-16GB:
    resources:
      CPU: 4
      memory: 16Gi
  8CPU-32GB:
    resources:
      CPU: 8
      memory: 32Gi
  8CPU-32GB-1xT4:
    resources:
      CPU: 8
      GPU: 1
      memory: 32Gi
      'accelerator_type:T4': 1
4CPU-16GB and 8CPU-32GB-1xT4 are names that follow an Anyscale naming convention. Cloud administrators may use a naming convention of their choice; valid characters include alphanumeric characters, dashes, and underscores.
Anyscale updates the console with the latest instance types defined in the ConfigMap roughly every 30 seconds.
Additional instance types
You can add additional instance types depending on the desired pod shapes, available node groups, and accelerators in your Kubernetes cluster. When specifying a new pod shape, you can specify the resources as well as node selectors to facilitate scheduling pods within your cluster. Ensure that any accelerators listed in your instance types map to the supportedAccelerators values and that Ray supports them.
additionalInstanceTypes: {}
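As an illustration, the following sketch adds a GPU instance type that targets an AWS g5.2xlarge node group (8 vCPUs, 32 GiB memory, one A10G GPU). The node selector is an assumption; replace it with a label that actually exists on your GPU nodes.

additionalInstanceTypes:
  8CPU-32GB-1xA10G:
    resources:
      CPU: 8
      GPU: 1
      memory: 32Gi
      'accelerator_type:A10G': 1
    nodeSelector:
      # Assumed label; use whatever label identifies your GPU node group.
      node.kubernetes.io/instance-type: g5.2xlarge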
Example TPU configurations (GKE only)
The following is an example of an instance type that specifies a single-host TPU pod.
For more information on configuring TPUs, see the guide for TPU support with the Anyscale operator.
8CPU-16GB-TPU-V5E-2x2-SINGLEHOST:
  resources:
    CPU: 8
    TPU: 4
    memory: 16Gi
    'accelerator_type:TPU-V5E': 1
    'anyscale/tpu_hosts': 1
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x2
For accelerators, the accelerator_type value should map to the list of Ray-supported accelerators. If an accelerator type isn't defined in this list, open an issue on the Ray GitHub repository and forward it to Anyscale support.
Resource allocation within the pod
When the Anyscale operator applies a Pod spec to Kubernetes for an Anyscale workload, the operator uses the shapes defined in the Instance Type ConfigMap as an upper bound for the sum of the memory requests and limits across all containers in the Pod. Anyscale reserves some memory and CPU for critical-path Anyscale sidecar containers and provides the rest to the Ray container to run the primary workload.
Similarly, Kubernetes (and managed Kubernetes providers such as EKS and GKE) reserves a portion of CPU and memory for running the kubelet and other Kubernetes system components. As such, the pod shapes defined in the ConfigMap should be smaller than the actual node shape. For example, if an m5.4xlarge node group exists in an EKS cluster (16 vCPUs and 64 GiB of memory), a 14CPU-56GB Pod may be schedulable, but a 16CPU-64GB Pod is most likely unschedulable. See Reserve Compute Resources for System Daemons in the Kubernetes documentation for more details.
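Continuing the m5.4xlarge example, a hedged sketch of an instance type that leaves headroom for system daemons and Anyscale sidecars might look like this; the exact headroom you need depends on how much your cluster reserves.

additionalInstanceTypes:
  14CPU-56GB:
    resources:
      CPU: 14
      memory: 56Gi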
Enable Karpenter support
Karpenter is an open-source node provisioning project built for Kubernetes. The Anyscale operator can tag pods with the appropriate labels and node selectors if you enable the following flag for Karpenter support.
enableKarpenterSupport: true # Default is false.
Karpenter requires that you set specific taints and labels on various Kubernetes node groups. See the Karpenter documentation or leverage the Anyscale Terraform provider for Kubernetes to ensure that you properly configure these taints and labels.
Prevent head pod eviction
Enable protection against head-node eviction (may block Kubernetes cluster upgrades):
enableAnyscaleRayHeadNodePDB: true # Default is false.
If enabled, manually delete the PodDisruptionBudget before upgrading the Kubernetes cluster.
Advanced configurations
Customize ingress and DNS resolution
By default, the Anyscale operator attempts to configure ingress for Ray clusters and services. If you don't want automatic resolution and prefer to manually configure ingress, you can specify the DNS ingress address:
ingressAddress: "<IP_ADDRESS_OR_HOSTNAME>"
To support Anyscale dynamic DNS resolution, you need to configure the appropriate ingresses and patches so that the Ray dashboard and Anyscale service endpoints are reachable.
Using the AWS Load Balancer Controller for ingress
First, create a temporary minimal ingress with a fixed group name, such as:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    alb.ingress.kubernetes.io/group.name: anyscale
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80
Describe the ingress (kubectl describe ingress minimal-ingress) to retrieve the status.loadBalancer.ingress.hostname attribute. It should resemble anyscale-1421684687.us-west-2.elb.amazonaws.com. This address stays consistent for all ingresses you create with this group name.
Use this value for the --ingress-address flag during cloud registration or for the ingressAddress value in the Helm chart.
After you retrieve the address, delete the temporary ingress.
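For example, using the illustrative hostname from above in the Helm chart values:

ingressAddress: "anyscale-1421684687.us-west-2.elb.amazonaws.com"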
Then, apply this set of additional patches.
additionalPatches:
  # Apply these patches to all `Ingress` resources created by Anyscale.
  - kind: Ingress
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1group.name
        value: "anyscale"
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1load-balancer-name
        value: "anyscale"
      # Uncomment this if you want to use an internal ALB, which is only accessible
      # from inside the VPC and requires a VPN to access from a laptop.
      # - op: add
      #   path: /metadata/annotations/alb.ingress.kubernetes.io~1scheme
      #   value: "internal"
  # NOTE: The rest of the patches are only required for Anyscale services functionality.
  # They aren't required for basic head node connectivity for workspaces and jobs.
  # When the Anyscale operator performs a service rollout, two ingress resources are
  # created for the NGINX Ingress Controller. The first ingress is the primary ingress,
  # and the second ingress is the canary ingress.
  # The Anyscale operator uses the following patches to convert from the NGINX Ingress Controller
  # scheme to the ALB Ingress Controller scheme, handling the case where a single ingress and service
  # exists and the case where a rollout is in progress and two ingresses and services exist.
  # The ALB Ingress Controller doesn't need two ingresses to manage canary deployments.
  # Instead, the ALB Ingress Controller can manage canary deployments through a single
  # ingress resource. This patch modifies the ingress resource created by the Anyscale
  # operator to use the ALB Ingress Controller scheme, by updating the annotations on
  # the primary ingress when a canary exists.
  - kind: Ingress
    selector: anyscale.com/ingress-type in (primary), anyscale.com/canary-exists in (true)
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1actions.anyscale
        value: >
          {
            "type": "forward",
            "forwardConfig": {
              "targetGroups": [
                {
                  "serviceName": "{{.PrimarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.PrimaryWeight}}
                },
                {
                  "serviceName": "{{.CanarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.CanaryWeight}}
                }
              ],
              "targetGroupStickinessConfig": {
                "enabled": false
              }
            }
          }
      # Update the serviceName and servicePort to point to the action name
      # so that the rules in the annotation are used. For more information,
      # see:
      # https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/annotations/#actions
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/name
        value: anyscale
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/port/name
        value: use-annotation
  # This patch handles the primary ingress when a canary doesn't exist (for
  # example, when a service rollout isn't in progress) by adding a set of actions
  # to forward traffic to the primary service.
  - kind: Ingress
    selector: anyscale.com/ingress-type in (primary), anyscale.com/canary-exists in (false)
    patch:
      - op: add
        path: /metadata/annotations/alb.ingress.kubernetes.io~1actions.anyscale
        value: >
          {
            "type": "forward",
            "forwardConfig": {
              "targetGroups": [
                {
                  "serviceName": "{{.PrimarySvc}}",
                  "servicePort": "8000",
                  "weight": {{.PrimaryWeight}}
                }
              ],
              "targetGroupStickinessConfig": {
                "enabled": false
              }
            }
          }
      # Update the serviceName and servicePort to point to the action name
      # so that the rules in the annotation are used. For more information,
      # see:
      # https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/annotations/#actions
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/name
        value: anyscale
      - op: replace
        path: /spec/rules/0/http/paths/0/backend/service/port/name
        value: use-annotation
  # This patch handles the canary ingress by rewriting it into a no-op ingress. The ALB Ingress
  # Controller doesn't need a separate ingress for canary deployments, so this patch no-ops it.
  - kind: Ingress
    selector: "anyscale.com/ingress-type in (canary)"
    patch:
      - op: replace
        path: /spec
        value:
          defaultBackend:
            service:
              name: default-backend
              port:
                number: 80
Resource configuration for the Anyscale operator
Anyscale recommends the default values for getting started.
Define resource requests and limits for the operator and telemetry sidecar:
operatorResources:
  operator:
    requests:
      cpu: 1
      memory: 512Mi
    limits:
      memory: 2Gi
  vector:
    requests:
      cpu: 100m
      memory: 512Mi
    limits:
      memory: 512Mi
Custom patches
Kubernetes clusters vary in how they handle spot instances, accelerators, and other cluster-specific concerns.
The Patch API provides an escape hatch to handle custom integrations. This API allows for just-in-time patching of all Anyscale-managed resources as the operator applies them to the Kubernetes cluster. The syntax for the Patch API is JSON Patch (IETF RFC 6902). As an example, consider the patch below:
patches:
  - kind: Pod
    # See: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors
    selector: "anyscale.com/market-type in (ON_DEMAND)"
    # See: https://jsonpatch.com/
    patch:
      - op: add
        path: /spec/nodeSelector/eks.amazonaws.com~1capacityType # use ~1 to escape the forward-slash
        value: "ON_DEMAND"
The operator applies this set of patches to every Pod it creates that matches the Kubernetes selector. In this case, the operator adds the eks.amazonaws.com/capacityType node selector to the Pod spec.
The Helm chart generates a variety of patches using the default configuration options that should work on EKS or GKE out-of-the-box without additional configuration. The Helm chart also accepts additional patches to support custom autoscalers, ingresses, or other cluster-specific properties.
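As another sketch, the following patch adds a toleration to Pods that Anyscale schedules on spot capacity. The taint key and value are assumptions; adjust them to match the taints on your spot node group.

additionalPatches:
  - kind: Pod
    selector: "anyscale.com/market-type in (SPOT)"
    patch:
      # Appends to an existing tolerations list; if the Pod spec has no tolerations yet,
      # add the whole array at /spec/tolerations instead.
      - op: add
        path: /spec/tolerations/-
        value:
          key: "dedicated"       # assumed taint key
          operator: "Equal"
          value: "spot-workers"  # assumed taint value
          effect: "NoSchedule"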
View all labels provided by Anyscale that you can use for custom patches
The Anyscale control plane applies these labels to resources created by Anyscale.
Label Name | Possible Label Values | Description |
---|---|---|
anyscale.com/market-type | SPOT, ON_DEMAND | Users with workloads that support preemption may opt to run their workloads on spot node types through the compute config. All other workloads are run on on-demand node types. This should most likely be transformed into a node affinity. |
anyscale.com/zone | user-defined through cloud setup | For Pods that have a specific zone affinity, the Anyscale operator sets this label to the zone that the Pod should be launched into (us-west-2a , for example). Zones are provided as []string at cloud registration time and can be selected from the Anyscale UI. This should most likely be transformed into a node affinity. |
anyscale.com/accelerator-type | user-defined through instance type configuration | When requesting a GPU Pod, the Anyscale operator sets this label to one of the Anyscale accelerator types. |
anyscale.com/instance-type | user-defined through instance type configuration | The operator sets this value for all Pods created through Anyscale. |
anyscale.com/canary-weight, anyscale.com/canary-exists, anyscale.com/canary-svc, anyscale.com/ingress-type, anyscale.com/bearer-token, anyscale.com/primary-weight, anyscale.com/primary-svc | various | For advanced use only (when using an ingress other than NGINX for inference / serving workloads with Anyscale Services). Contact Anyscale for more details. |
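As noted in the table, the zone label should most likely be transformed into a node affinity or node selector. For example, a hedged patch that turns the zone label into a node selector for one zone might look like the following; the zone name is illustrative, and you need a similar patch for each zone registered with the cloud.

additionalPatches:
  - kind: Pod
    selector: "anyscale.com/zone in (us-west-2a)"
    patch:
      - op: add
        path: /spec/nodeSelector/topology.kubernetes.io~1zone  # well-known Kubernetes zone label
        value: "us-west-2a"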
View a sample of common advanced configuration options
{
  "metadata": {
    // Add a new label.
    "labels": {"new-label": "example-value"},
    // Add a new annotation.
    "annotations": {"new-annotation": "example-value"}
  },
  "spec": {
    // Add a node selector.
    "nodeSelector": {"disktype": "ssd"},
    "tolerations": [{
      "effect": "NoSchedule",
      "key": "dedicated",
      "value": "example-anyscale"
    }],
    "containers": [{
      // Add a PersistentVolumeClaim to the Ray container.
      "name": "ray",
      "volumeMounts": [{
        "name": "pvc-volume",
        "mountPath": "/mnt/pvc-data"
      }]
    },{
      // Add a sidecar for exporting logs and metrics.
      "name": "monitoring-sidecar",
      "image": "timberio/vector:latest",
      "ports": [{
        "containerPort": 9000
      }],
      "volumeMounts": [{
        "name": "vector-volume",
        "mountPath": "/mnt/vector-data"
      }]
    }],
    "volumes": [{
      "name": "pvc-volume",
      "persistentVolumeClaim": {
        "claimName": "my-pvc"
      }
    },{
      "name": "vector-volume",
      "emptyDir": {}
    }]
  }
}