Compute configuration options for Kubernetes
Cloud-specific configurations are only available with self-hosted Anyscale Clouds.
Kubernetes-specific options
The compute configuration allows you to provide workload-scoped advanced configuration settings for either a specific node or the entire cluster. Anyscale applies this configuration as a strategic merge patch to the Pod specifications generated by Anyscale before sending them to the Kubernetes API.
Node-specific configurations override any cluster-wide configurations for that node type. For a full reference on configuring these properties, see the official Kubernetes documentation.
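Because the configuration is a strategic merge patch, Kubernetes merges map fields such as labels, and merges list fields such as `containers` by their `name` key, rather than replacing them outright. The following sketch illustrates these merge semantics with a simplified Pod spec; the actual spec Anyscale generates contains many more fields, and the image tag shown is hypothetical.

# Simplified Pod spec generated by Anyscale (illustrative only).
spec:
  containers:
    - name: ray
      image: anyscale/ray:2.9.0  # hypothetical image tag
      resources:
        limits:
          cpu: "4"

# Advanced configuration supplied as the patch.
spec:
  containers:
    - name: ray
      securityContext:
        privileged: true

# Merged result: the ray container keeps its image and resources and
# gains the securityContext, because containers merge by the name key.
spec:
  containers:
    - name: ray
      image: anyscale/ray:2.9.0
      resources:
        limits:
          cpu: "4"
      securityContext:
        privileged: true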
View a sample of common advanced configuration options.
- Console
- CLI
{
  "metadata": {
    // Add a new label.
    "labels": {"new-label": "example-value"},
    // Add a new annotation.
    "annotations": {"new-annotation": "example-value"}
  },
  "spec": {
    // Add a node selector.
    "nodeSelector": {"disktype": "ssd"},
    // Add tolerations.
    "tolerations": [{
      "effect": "NoSchedule",
      "key": "dedicated",
      "value": "example-anyscale"
    }],
    "containers": [{
      "name": "ray",
      // Enable a privileged container so that the Ray image_uri runtime
      // environment can run nested worker containers via podman.
      "securityContext": {
        "privileged": true
      },
      // Mount a PersistentVolumeClaim into the Ray container.
      "volumeMounts": [{
        "name": "pvc-volume",
        "mountPath": "/mnt/pvc-data"
      }]
    }, {
      // Add a sidecar for exporting logs and metrics.
      "name": "monitoring-sidecar",
      "image": "timberio/vector:latest",
      "ports": [{
        "containerPort": 9000
      }],
      "volumeMounts": [{
        "name": "vector-volume",
        "mountPath": "/mnt/vector-data"
      }]
    }],
    "volumes": [{
      "name": "pvc-volume",
      "persistentVolumeClaim": {
        "claimName": "my-pvc"
      }
    }, {
      "name": "vector-volume",
      "emptyDir": {}
    }]
  }
}
compute_config:
  cloud: your_anyscale_cloud_name
  head_node:
    instance_type: 4CPU-16GB
  worker_nodes:
    - instance_type: 4CPU-16GB
      min_nodes: 1
      max_nodes: 10
      advanced_instance_config:
        metadata:
          # Add a new label.
          labels:
            new-label: example-value
          # Add a new annotation.
          annotations:
            new-annotation: example-value
        spec:
          # Add a node selector.
          nodeSelector:
            disktype: ssd
          # Add tolerations.
          tolerations:
            - effect: NoSchedule
              key: dedicated
              value: example-anyscale
          containers:
            - name: ray
              # Enable a privileged container so that the Ray image_uri runtime
              # environment can run nested worker containers via podman.
              securityContext:
                privileged: true
              # Mount a PersistentVolumeClaim into the Ray container.
              volumeMounts:
                - name: pvc-volume
                  mountPath: /mnt/pvc-data
            # Add a sidecar for exporting logs and metrics.
            - name: monitoring-sidecar
              image: timberio/vector:latest
              ports:
                - containerPort: 9000
              volumeMounts:
                - name: vector-volume
                  mountPath: /mnt/vector-data
          volumes:
            - name: pvc-volume
              persistentVolumeClaim:
                claimName: my-pvc
            - name: vector-volume
              emptyDir: {}
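After you define the compute configuration in a file such as compute_config.yaml, one way to register it is with the Anyscale CLI, for example `anyscale compute-config create compute_config.yaml -n my-k8s-compute-config`. The exact arguments may differ across CLI versions, so check `anyscale compute-config create --help` before relying on this sketch.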
Advanced features
The following options are available on all clouds.
Adjustable downscaling
The Anyscale platform automatically downscales worker nodes that have been idle for a given period. By default, the timeout ranges from 30 seconds to 4 minutes and is dynamically adjusted for each node group based on the workload; for example, short, bursty workloads get shorter timeouts and more aggressive downscaling. Adjustable downscaling lets you set this timeout at the cluster level based on your workload's needs.
- Console
- CLI
To adjust the timeout value from the Anyscale console, use the Advanced features tab under the Advanced settings for the cluster. This example sets the timeout to 60 seconds for all nodes in the cluster.
{
  "idle_termination_seconds": 60
}
To adjust the timeout value from the Anyscale CLI, use the `flags` field. This example YAML sets the timeout value to 60 seconds for all nodes in the cluster.
cloud: CLOUD_NAME
head_node:
  instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
  - instance_type: INSTANCE_TYPE_WORKER
    min_nodes: MIN_NODES
    max_nodes: MAX_NODES
flags:
  idle_termination_seconds: 60
Cross-zone scaling
Cross-zone scaling allows Anyscale to launch your Ray cluster across multiple availability zones. By default, all worker nodes launch in the same availability zone. With cross-zone scaling enabled, Anyscale first attempts to launch worker nodes in existing zones; if that fails, it tries the next-best zone based on availability.
Use this feature if:
- You want to maximize the chances of provisioning desired instance types.
- You want to spread Serve app replicas across multiple zones for better resilience and availability.
- Your workloads don't involve heavy inter-node communication, or the added cross-zone data transfer cost is acceptable.
- Console
- CLI
To enable or disable this feature from the Anyscale console, use the "Enable cross-zone scaling" checkbox under the Advanced settings for the cluster.
To enable or disable this feature from the Anyscale CLI, use the `enable_cross_zone_scaling` field. This example YAML enables cross-zone scaling for all nodes in the cluster.
cloud: CLOUD_NAME
head_node:
  instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
  - instance_type: INSTANCE_TYPE_WORKER
    min_nodes: MIN_NODES
    max_nodes: MAX_NODES
enable_cross_zone_scaling: true
Resource limits
Cluster-wide resource limits allow you to define minimum and maximum values for any resource across all nodes in the cluster. There are two common use cases for this feature:
- Specifying the maximum number of GPUs to avoid unintentionally launching a large number of expensive instances.
- Specifying a custom resource for specific worker nodes and using that custom resource value to limit the number of nodes of those types.
- Console
- CLI
To set the maximum number of CPUs and GPUs in a cluster from the Anyscale console, use the "Maximum CPUs" and "Maximum GPUs" fields under the Advanced settings for the cluster.
To set other resource limits, use the Advanced features tab under the Advanced settings for the cluster. To add a custom resource to a node group, use the Ray config tab under the Advanced config section for that node group.
This example limits the minimum resources to 1 GPU and 1 `CUSTOM_RESOURCE`, and limits the maximum resources to 5 `CUSTOM_RESOURCE`.
{
  "min_resources": {
    "GPU": 1,
    "CUSTOM_RESOURCE": 1
  },
  "max_resources": {
    "CUSTOM_RESOURCE": 5
  }
}
To set resource limits for a cluster from the Anyscale CLI, use the `min_resources` and `max_resources` fields. This example YAML adds a custom resource to a worker node, limits the minimum resources to 8 CPU and 1 `CUSTOM_RESOURCE`, and limits the maximum resources to 1 GPU and 5 `CUSTOM_RESOURCE`.
cloud: CLOUD_NAME
head_node:
  instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
  - instance_type: INSTANCE_TYPE_WORKER
    resources:
      CUSTOM_RESOURCE: 1
    min_nodes: MIN_NODES
    max_nodes: MAX_NODES
min_resources:
  CPU: 8
  CUSTOM_RESOURCE: 1
max_resources:
  GPU: 1
  CUSTOM_RESOURCE: 5
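Because each worker in this group contributes 1 `CUSTOM_RESOURCE`, the `max_resources` limit of 5 effectively caps the group at five nodes, even if MAX_NODES is higher.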
Workload starting and recovering timeouts
The workload starting timeout allows you to configure how long a workload should attempt to acquire the minimum resources when it first starts, before Anyscale terminates it.
After a workload is running, it may enter the `RECOVERING` state if it's attempting to recover the minimum resources, for example, due to spot preemption. The workload recovering timeout allows you to configure how long a workload may remain in the `RECOVERING` state, to avoid the cost of idling existing nodes.
By default, Anyscale sets both timeouts to 25 minutes.
These timeouts only apply to jobs and workspaces, not services.
- Console
- CLI
To configure the workload starting and recovering timeouts from the Anyscale console, use the Advanced features tab under the Advanced settings for the cluster. This example increases the workload starting timeout to 1 hour and decreases the workload recovering timeout to 10 minutes.
Valid time units are `s`, `m`, and `h`. For example, `1h30m`.
{
  "workload_starting_timeout": "1h",
  "workload_recovering_timeout": "10m"
}
To configure the workload starting and recovering timeouts from the Anyscale CLI, use the `flags` field. This example increases the workload starting timeout to 1 hour and decreases the workload recovering timeout to 10 minutes.
Valid time units are `s`, `m`, and `h`. For example, `1h30m`.
cloud: CLOUD_NAME
head_node:
  instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
  - instance_type: INSTANCE_TYPE_WORKER
flags:
  workload_starting_timeout: 1h
  workload_recovering_timeout: 10m
Zonal startup timeout
Anyscale attempts to pack cluster and worker group nodes into the same zone to improve cluster communication performance and minimize cross-zone data transfer.
However, machines may not always be available because of capacity constraints, IP address exhaustion, or other reasons. If Anyscale can't acquire the requested minimum resources, it terminates any existing nodes and sequentially retries the request in a different zone within the cloud deployment.
By default, Anyscale sets the zonal startup timeout to 10 minutes.
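For example, with the default timeout, a cluster that can't acquire its minimum resources in its first zone waits up to 10 minutes before Anyscale tears down any partially provisioned nodes and retries in the next zone.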
- Console
- CLI
To configure the workload zonal startup timeout from the Anyscale console, use the Advanced features tab under the Advanced settings for the cluster. This example decreases the workload zonal startup timeout to 60 seconds.
Valid time units are `s`, `m`, and `h`. For example, `1h30m`.
{
  "default_zone_starting_timeout": "60s"
}
To configure the workload zonal startup timeout from the Anyscale CLI, use the `flags` field. This example decreases the workload zonal startup timeout to 60 seconds.
Valid time units are `s`, `m`, and `h`. For example, `1h30m`.
cloud: CLOUD_NAME
head_node:
  instance_type: INSTANCE_TYPE_HEAD
worker_nodes:
  - instance_type: INSTANCE_TYPE_WORKER
flags:
  default_zone_starting_timeout: 60s