Declarative compute configs

This page provides an overview of declarative compute configuration on Anyscale.

note

Declarative compute configs are in beta. Contact Anyscale support to request access.

Declarative compute configs let you specify CPU, memory, GPU, and accelerator requirements directly, rather than selecting from a list of predefined instance types. These configs are portable and work across all cloud resources and stacks.

For Anyscale cloud resources backed by Kubernetes, Anyscale provisions a pod that satisfies your requirements. This is especially useful for workloads that need arbitrary pod shapes, such as sharing GPUs across workloads. For cloud resources backed by VMs, Anyscale searches for available VM instance types and selects instances that match your requirements. See How does Anyscale select VM instances?.

You can configure declarative compute configs in the Anyscale console, CLI, or SDK. In the console, set Node type mode to Declarative when configuring a node. See Create or version a compute config. The following sections use YAML field names for the CLI and SDK. The console uses the same concepts with different labels. For example, the Required Resources section in the console maps to the required_resources field in YAML.

How declarative compute configs work

Rather than specifying instance types, you define CPU and accelerator requirements for your workloads.

  1. You specify the resources your workload needs using required_resources.
  2. Optionally, you target specific hardware using required_labels.
  3. Anyscale provisions nodes that satisfy those requirements.
  4. Autoscaling works as usual, adding nodes up to max_nodes as needed.
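The steps above can be sketched as a single worker group. This is a minimal illustration, not a complete compute config; the group name and resource values are placeholders:

```yaml
worker_nodes:
  - name: gpu-workers            # illustrative name
    required_resources:          # step 1: resources the workload needs
      CPU: 8
      memory: 32Gi
      GPU: 1
    required_labels:             # step 2 (optional): target specific hardware
      ray.io/accelerator-type: A10G
    min_nodes: 0
    max_nodes: 4                 # step 4: autoscaling adds nodes up to this bound
```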

For a given cloud resource to be eligible for scheduling, you must have sufficient quota in the specified region for your cloud provider account. Some Kubernetes offerings can autoprovision and autoscale node pools based on pod shape requirements, while others require that you register all resources as node pools.

important

Each node group must use either predefined instances or declarative syntax, not both. In YAML, this means using either instance_type or required_resources for a given node group. You can mix approaches in the same compute config, such as using a predefined instance for the head node and declarative syntax for worker groups.
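For example, a compute config might pin the head node to a predefined instance type while worker groups use declarative syntax. The instance type name and group name below are illustrative:

```yaml
compute_config:
  head_node:
    instance_type: m5.2xlarge     # predefined instance type (illustrative)
  worker_nodes:
    - name: declarative-workers   # declarative syntax for this group
      required_resources:
        CPU: 16
        memory: 64Gi
```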

Define resource requirements

Define resource requirements in your compute configuration using required_resources and, optionally, required_labels. In the console, these correspond to the Required Resources and Required labels sections. For full compute config syntax, see Compute Config API Reference.

important

Anyscale support for TPUs requires cloud resources backed by GKE. See Leverage Cloud TPUs on GKE.

required_resources

Use required_resources to specify CPU, memory, and accelerator requirements for a node group.

| Resource | Type | Description | Example |
| --- | --- | --- | --- |
| `CPU` | Integer | Number of CPU cores. | `CPU: 8` |
| `memory` | String | Memory with unit (Gi, Mi, G, M). | `memory: 16Gi` |
| `GPU` | Integer | Number of GPUs. | `GPU: 1` |
| `TPU` | Integer | Number of TPU chips. | `TPU: 4` |
| `tpu_hosts` | Integer | Number of TPU hosts. Required for TPU workloads. | `tpu_hosts: 1` |
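Taken together, a node group's requirements form a single `required_resources` block. The values below are illustrative:

```yaml
required_resources:
  CPU: 8        # CPU cores
  memory: 16Gi  # memory with unit
  GPU: 1        # number of GPUs
```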

required_labels

Use required_labels to target specific hardware such as GPU types or TPU topologies.

| Label | Description | Example values |
| --- | --- | --- |
| `ray.io/accelerator-type` | GPU or accelerator type. | `T4`, `A10G`, `A100`, `L4`, `H100`, `TPU-V4`, `TPU-V5E`, `TPU-V6E` |
| `ray.io/tpu-topology` | TPU topology. Required for TPU workloads. | `2x2`, `2x4`, `4x4` |
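For example, a TPU worker group combines `required_resources` with both labels. The group name and values are illustrative:

```yaml
worker_nodes:
  - name: tpu-workers             # illustrative name
    required_resources:
      TPU: 4
      tpu_hosts: 1                # required for TPU workloads
    required_labels:
      ray.io/accelerator-type: TPU-V5E
      ray.io/tpu-topology: 2x2    # required for TPU workloads
```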

Other worker group settings

Use the following fields to control upper and lower bounds for autoscaling events and to specify whether to use spot or on-demand instances.

| Field | Description |
| --- | --- |
| `min_nodes` | The minimum number of nodes to keep running for this worker group. |
| `max_nodes` | The maximum number of nodes that the cluster can acquire during autoscaling for this worker group. |
| `market_type` | The type of instances to use. One of the following: `PREFER_SPOT`, `SPOT`, or `ON_DEMAND`. |
note

Use PREFER_SPOT to try to deploy to spot instances but fall back to on-demand if they become unavailable.

Examples

The following examples demonstrate declarative compute configs for common workloads.

Request CPU and memory resources for a compute workload.

```yaml
compute_config:
  head_node:
    required_resources:
      CPU: 4
      memory: 8Gi

  worker_nodes:
    - name: cpu-workers
      required_resources:
        CPU: 8
        memory: 16Gi
      min_nodes: 1
      max_nodes: 10
      market_type: PREFER_SPOT
```
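The following example requests GPU workers pinned to a specific accelerator type. The group name and values are illustrative:

```yaml
compute_config:
  head_node:
    required_resources:
      CPU: 8
      memory: 16Gi

  worker_nodes:
    - name: gpu-workers           # illustrative name
      required_resources:
        CPU: 16
        memory: 64Gi
        GPU: 1
      required_labels:
        ray.io/accelerator-type: L4
      min_nodes: 0
      max_nodes: 4
      market_type: ON_DEMAND
```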

How does Anyscale select VM instances?

Anyscale chooses VM instance types that satisfy the requirements you set in the required_resources and required_labels fields. Anyscale uses the following criteria to identify candidate instance types.

1. Match requirements

For each cloud resource available to your workload, Anyscale filters instances in your cloud and region that match what you requested. Anyscale uses fuzzy matching for CPU and memory requirements, but exact matching for GPU requirements, as described in the following table:

| Configuration | Description |
| --- | --- |
| CPU count | Seeks instances that have approximately this many CPUs. Anyscale considers instance types with up to double the specified CPU count. |
| Memory | Seeks instances that have approximately this much memory. Anyscale considers instance types with up to double the specified memory. |
| GPU count | Anyscale only considers instances that have exactly the number of GPUs you specify. |
| GPU type (accelerator type) | Anyscale only considers instances that have the exact GPU type you specify. |

2. Rank by cost efficiency

Anyscale ranks instances using a cost-efficiency score tuned to find the best resource fit at lowest cost.

3. Select primary and fallback options

For the head node, Anyscale selects the single best-matching instance type.

note

Some single-node development workloads may benefit from a GPU-enabled head node. Standard Anyscale clusters with worker nodes don't schedule Ray tasks and actors on the head node, so the head node doesn't need GPUs.

Anyscale recommends always using on-demand instances for your head node for workload stability.

For worker nodes, Anyscale selects the most cost-effective instance type by default. Anyscale can select up to the top three matches for a worker group to improve availability when one instance type is capacity-constrained. For workers with multiple instance types, the system creates instance groups that share the same logical worker resource, respecting min and max thresholds set for the worker group and the cluster. Anyscale launches lower-cost options first.

4. Spot consideration

If you specify the PREFER_SPOT option for market type for a worker group, Anyscale includes both spot and on-demand instances in the ranking. Anyscale launches spot instances first when available and falls back to on-demand.

How does Anyscale select pod shapes on Kubernetes?

On Kubernetes, declarative compute configs bypass the instance types defined in the Helm chart for the Anyscale operator. You don't need to define instance types in the Helm chart for worker groups that use required_resources. Anyscale dynamically defines pod shapes from your requirements, so you don't need to patch and redeploy the Helm chart.

Kubernetes schedules pods onto nodes that can satisfy the requested resources. Ensure your Kubernetes cluster has nodes with sufficient capacity and the required accelerators available.

For general Kubernetes compute config options such as tolerations, node selectors, and volume mounts, see Compute configuration options for Kubernetes.

Best practices

  • Profile workloads to understand actual resource needs.
  • Leave headroom for Ray overhead (CPU and memory).
  • Use meaningful worker group names for observability.
  • Test autoscaling with small minimums first.
  • Monitor real usage and refine resource specifications over time.

Troubleshooting

Kubernetes pods stuck in Pending

  • Ensure cluster nodes can satisfy requested resources.
  • Verify required labels match node labels.
  • Confirm requests don't exceed single-node capacity.

Nodes not launching (VM-based clouds)

  • Ensure at least one instance type in your cloud and region satisfies your required_resources. See How does Anyscale select VM instances?.
  • Verify you have sufficient quota for the instance types in your cloud provider account.

Pods or nodes not using expected hardware

  • Verify accelerator labels on nodes.
  • Check label spelling and capitalization.
  • Verify required_resources and required_labels.
  • Confirm the instance type exists in your region and that you have quota.

TPU pods not starting

  • Confirm you've set tpu_hosts and ray.io/tpu-topology.
  • Ensure TPU count matches topology.
  • Verify TPU quota and node pool configuration.