Adjustable Down-scaling Compute Configs User Guide
This version of the Anyscale docs is deprecated. Go to the latest version for up-to-date information.
This feature will be deprecated soon. Please set the idle_termination_seconds
cluster-level flag in the compute config instead.
The Autoscaler is the component that adjusts the number of running nodes based on real-time workload and resource utilization, keeping the cluster both performant and cost-effective.
When a node is not running any workload and the number of alive workers is above the node type's min count, the Autoscaler terminates the node after it has been idle for 4 minutes by default. Adjustable down-scaling lets users change this idle period to match their workload. For instance, if the workload is short and bursty, a shorter period gives faster idle termination.
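The termination rule described above can be sketched in a few lines of Python. This is only an illustration of the two conditions (idle long enough, and more alive workers than the min count), not the Autoscaler's actual implementation. `WorkerGroup` and `should_terminate` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# The 4-minute default idle period described in this guide, in seconds.
DEFAULT_IDLE_TERMINATE_SECONDS = 240


@dataclass
class WorkerGroup:
    """Hypothetical stand-in for one worker node type (not an Anyscale class)."""
    min_workers: int
    alive_workers: int


def should_terminate(group: WorkerGroup, idle_seconds: float,
                     threshold_seconds: float = DEFAULT_IDLE_TERMINATE_SECONDS) -> bool:
    """Return True when an idle node is eligible for termination.

    A node qualifies only if removing it keeps the group at or above its
    min count and it has been idle for at least the threshold.
    """
    above_min = group.alive_workers > group.min_workers
    idle_long_enough = idle_seconds >= threshold_seconds
    return above_min and idle_long_enough


group = WorkerGroup(min_workers=2, alive_workers=3)
print(should_terminate(group, idle_seconds=90))                        # False
print(should_terminate(group, idle_seconds=90, threshold_seconds=60))  # True
```

With the default period, a node that has been idle for 90 seconds stays up. Lowering the threshold to 60 seconds makes the same node eligible for termination.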
The feature is enabled for Ray 2.7+ and nightly images built after Aug 22, 2023.
Usage
In general, we recommend using a threshold greater than 1 minute.
In the Anyscale Console, adjustable down-scaling is configured entirely through the Advanced Configurations input box. Set the key as-feature-idle-terminate-second to the idle threshold in seconds, given as a string (for example, "60"):
AWS (EC2):
{
  "TagSpecifications": [
    {
      "Tags": [
        {
          "Key": "as-feature-idle-terminate-second",
          "Value": "60"
        }
      ],
      "ResourceType": "instance"
    }
  ]
}
GCP (GCE):
{
  "instance_properties": {
    "labels": {
      "as-feature-idle-terminate-second": "60"
    }
  }
}
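If you generate these payloads programmatically, a small helper along the following lines may be convenient. It is a minimal sketch that reproduces the two JSON shapes shown above using only the Python standard library. The function name `idle_terminate_config` and the `"aws"`/`"gcp"` provider strings are assumptions made for this example and are not part of any Anyscale API.

```python
import json

# Tag/label key used by the adjustable down-scaling feature in this guide.
IDLE_TERMINATE_KEY = "as-feature-idle-terminate-second"


def idle_terminate_config(provider: str, seconds: int) -> dict:
    """Build the Advanced Configurations payload for one provider.

    `provider` is "aws" or "gcp" (hypothetical identifiers for this sketch).
    """
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    value = str(seconds)  # both examples in this guide pass the value as a string
    if provider == "aws":
        return {
            "TagSpecifications": [
                {
                    "Tags": [{"Key": IDLE_TERMINATE_KEY, "Value": value}],
                    "ResourceType": "instance",
                }
            ]
        }
    if provider == "gcp":
        return {"instance_properties": {"labels": {IDLE_TERMINATE_KEY: value}}}
    raise ValueError(f"unsupported provider: {provider!r}")


# Print the AWS payload, ready to paste into the Advanced Configurations box.
print(json.dumps(idle_terminate_config("aws", 60), indent=2))
```

Passing `"gcp"` instead returns the label-based form. The threshold is converted to a string in both cases to match the examples above.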
If you prefer to use the Anyscale CLI or SDK, set the same key in the compute config YAML, as in the following examples:
AWS (EC2):
cloud: anyscale_v2_aws_useast2 # You may specify `cloud_id` instead
allowed_azs:
  - us-east-2a
  - us-east-2b
  - us-east-2c
head_node_type:
  name: head_node_type
  instance_type: m5.2xlarge
worker_node_types:
  - name: cpu_worker
    instance_type: c5.8xlarge
    min_workers: 2
    max_workers: 10
    use_spot: true
  - name: gpu_worker
    instance_type: g4dn.8xlarge
    min_workers: 0
    max_workers: 10
aws_advanced_configurations_json:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-idle-terminate-second
          Value: "60"
GCP (GCE):
cloud: anyscale_v2_gcp_uswest1 # You may specify `cloud_id` instead
allowed_azs:
  - us-west1-a
  - us-west1-b
  - us-west1-c
head_node_type:
  name: head_node_type
  instance_type: n2-standard-32
worker_node_types:
  - name: gpu_worker_1
    instance_type: n1-standard-16-nvidia-t4-16gb-1
    min_workers: 0
    max_workers: 10
    use_spot: true
    fallback_to_ondemand: true
  - name: gpu_worker_2
    instance_type: n1-standard-16-nvidia-t4-16gb-4
    min_workers: 0
    max_workers: 10
    use_spot: true
    fallback_to_ondemand: true
gcp_advanced_configurations_json:
  instance_properties:
    labels:
      as-feature-idle-terminate-second: "60"
FAQ
Question: What is the default down-scaling period?
By default, a node is terminated after it has been idle for 4 minutes.
Question: What is the minimum threshold recommended?
The minimum threshold that Anyscale recommends is 30 seconds.