Tutorial: Create and use an Anyscale-managed machine pool

This tutorial provides an overview of creating an Anyscale-managed machine pool and using it to run jobs and launch workspaces.

You can also configure customer-managed machine pools.

Requirements

Configuring and deploying machine pools is a highly privileged task. Completing the steps in this tutorial requires the following:

  • An Anyscale organization with one or more Anyscale clouds.
  • You must have the following permissions:
    • Organization owner for your Anyscale organization.
    • Cloud collaborator or cloud owner for your Anyscale cloud.
  • You must authenticate the Anyscale CLI to your Anyscale organization using these credentials. See CLI configuration.

Step 1: Create a machine pool

Run the following command to create an empty machine pool named capacity-reservations:

anyscale machine-pool create --name capacity-reservations

Step 2: Attach the machine pool to clouds

Machine pools are scoped to your Anyscale organization. You must attach an Anyscale cloud to your machine pool to use the machine pool in your workloads.

Use the following syntax to attach an Anyscale cloud to a machine pool:

anyscale machine-pool attach --name <pool-name> --cloud <cloud-name>

For example, the following commands attach a machine pool named capacity-reservations to clouds named dev-cloud and prod-cloud:

anyscale machine-pool attach --name capacity-reservations --cloud dev-cloud
anyscale machine-pool attach --name capacity-reservations --cloud prod-cloud

Step 3: Configure the machine pool

The machine pool configuration file specifies instance types, capacity reservation information, priorities, and quotas.

Use your preferred file editor to create a file named machine-pool-capacity-reservations.yaml in your current directory and copy the following definition into the file:

# machine-pool-capacity-reservations.yaml
kind: ANYSCALE_MANAGED
machine_types:
  - machine_type: RES-8CPU-32GB
    # Recycle policy configures instance rotation for cloud deployments on VMs.
    # For cloud deployments on Kubernetes, the pods rotate after each workload.
    recycle_policy:
      # Rotate instances after 10 workloads.
      max_workloads: 10
      # Rotate instances that have been running for
      # more than 60 minutes (note: rotation only
      # happens when workloads are completed).
      rotation_interval: 60m
      # Shut down idle instances after 10 minutes.
      max_idle_duration: 10m
    launch_templates:
      - instance_type: "m5.2xlarge"
        market_type: ON_DEMAND
        advanced_instance_config:
          CapacityReservationSpecification:
            CapacityReservationTarget:
              CapacityReservationId: <your-capacity-reservation-id>
        # 'zones' is optional. If not specified, the scheduler
        # will attempt to provision instances in any zone in the
        # cloud that is requesting this machine pool instance.
        zones:
          - us-west-2a
          - us-west-2b
    partitions:
      - name: job
        # The number of instances that this partition should contain.
        size: 6
        # Each rule can define a selector, priority, and optional quota.
        # Rules are evaluated in order, and the first rule that matches
        # will be used.
        #
        # Selectors can be used to select workloads based on labels
        # (these are Kubernetes-style selectors; see below for details).
        #
        # Priority is an integer that defines the priority of the rule.
        # The default priority is 0.
        #
        # Quotas are optional, and define a limit on the number of machines
        # that can be allocated to workloads that fall under this rule.
        rules:
          - selector: workload-type in (job)
            priority: 20
          - selector: workload-type in (workspace)
            priority: 10
            quota: 3
      - name: workspace
        size: 4
        rules:
          - selector: workload-type in (workspace)
            priority: 20
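
The comments in the configuration state that rules are evaluated in order and that the first matching rule is used. The following Python sketch illustrates that first-match behavior using the rules from the job partition. It is an illustration only, not Anyscale's scheduler code: the helper names parse_selector and first_matching_rule are hypothetical, and the parser handles only the "key in (value, ...)" selector form that appears in this tutorial.

```python
import re

# Rules copied from the "job" partition in this tutorial's configuration.
JOB_PARTITION_RULES = [
    {"selector": "workload-type in (job)", "priority": 20},
    {"selector": "workload-type in (workspace)", "priority": 10, "quota": 3},
]

# Matches only the simple "key in (a, b)" form; other selector syntax is unsupported.
_SELECTOR = re.compile(r"^\s*([\w.\-/]+)\s+in\s+\(([^)]*)\)\s*$")


def parse_selector(selector):
    """Parse 'key in (a, b)' into (key, {values}). Hypothetical helper."""
    match = _SELECTOR.match(selector)
    if match is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    key = match.group(1)
    values = {v.strip() for v in match.group(2).split(",") if v.strip()}
    return key, values


def first_matching_rule(rules, labels):
    """Return the first rule whose selector matches the labels, else None."""
    for rule in rules:
        key, values = parse_selector(rule["selector"])
        if labels.get(key) in values:
            return rule
    return None


rule = first_matching_rule(JOB_PARTITION_RULES, {"workload-type": "workspace"})
# The configuration comments give 0 as the default priority; quota is optional.
print(rule.get("priority", 0), rule.get("quota"))
```

In this sketch, a workload labeled workload-type=workspace skips the first rule and matches the second, so it receives priority 10 and a quota of 3 within the job partition.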

Step 4: Apply the configuration

Run the following command to update the machine pool named capacity-reservations to the configurations specified in the machine-pool-capacity-reservations.yaml file:

anyscale machine-pool update --name capacity-reservations --spec-file machine-pool-capacity-reservations.yaml

Note: Running this command initializes resources in your AWS account. To avoid unexpected AWS charges, scale down the pool when you finish the tutorial, as described in Step 8.

Step 5: Verify the machine pool

Run the following command to confirm that the machine pool has initialized with the specified capacity:

anyscale machine-pool describe --name capacity-reservations

If you see an error indicating that the scheduler cache is not ready, the scheduler is still initializing. Wait for the duration specified in the error message, then run the command again before you proceed.
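
If you script this verification step, a retry loop can handle the initialization delay. The following Python sketch is one possible approach, under several assumptions: it treats any output containing the phrase "scheduler cache is not ready" as the retryable error, and it waits a fixed interval instead of parsing the duration from the error message, because the exact message format isn't shown here. The function names and the attempts and wait_seconds parameters are hypothetical.

```python
import subprocess
import time

# The describe command from this step, as an argument list for subprocess.
DESCRIBE_CMD = ["anyscale", "machine-pool", "describe", "--name", "capacity-reservations"]


def run_describe():
    """Run the describe command and return (exit_code, combined_output)."""
    result = subprocess.run(DESCRIBE_CMD, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


def describe_with_retry(run=run_describe, attempts=5, wait_seconds=30, sleep=time.sleep):
    """Retry while the output mentions that the scheduler cache is not ready."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    code, output = 1, ""
    for attempt in range(1, attempts + 1):
        code, output = run()
        # Assumption: the error text contains this phrase (checked case-insensitively).
        if "scheduler cache is not ready" not in output.lower():
            return code, output
        if attempt < attempts:
            sleep(wait_seconds)
    # All attempts hit the not-ready error; return the last result to the caller.
    return code, output
```

Calling describe_with_retry() with no arguments runs the real CLI command. The run and sleep parameters exist so you can substitute stand-ins when testing the loop without the CLI installed.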

Step 6: Create a compute configuration and submit a job

In this step, you define a compute configuration and a simple script for a Ray job, then submit the job to your machine pool.

Create a file named ten-node-config.yaml and copy the following definition into the file:

# ten-node-config.yaml
cloud: CLOUD_NAME
head_node:
  instance_type: "m5.2xlarge"
worker_nodes:
  - name: reserved-cpu
    instance_type: RES-8CPU-32GB
    min_nodes: 10
    max_nodes: 10
    flags:
      cloud_deployment:
        machine_pool: capacity-reservations
flags:
  # If the cluster doesn't satisfy min_nodes for > 5 minutes while
  # in a RUNNING state, then terminate the workload (and re-queue
  # it if it's a job).
  workload_recovering_timeout: 5m
  # If the cluster doesn't satisfy min_nodes for > 24h while in
  # a STARTING state, then terminate the workload.
  workload_starting_timeout: 24h

Run the following command to create a compute configuration using this file in your Anyscale cloud:

anyscale compute-config create --name ten-node-config -f ten-node-config.yaml
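
The worker group in this compute configuration requests 10 nodes, which equals the combined partition sizes (6 + 4) in the machine pool configuration from Step 3. The following Python sketch performs that arithmetic check on plain dictionaries that mirror the two files. The function names are hypothetical, and the check assumes that the sum of partition sizes is the number of machines of a given type the pool can provide, which is a reading of the configuration comments, not documented scheduler behavior.

```python
# Values mirror machine-pool-capacity-reservations.yaml and ten-node-config.yaml.
POOL_SPEC = {
    "machine_types": [
        {
            "machine_type": "RES-8CPU-32GB",
            "partitions": [
                {"name": "job", "size": 6},
                {"name": "workspace", "size": 4},
            ],
        }
    ]
}
WORKER_GROUP = {"instance_type": "RES-8CPU-32GB", "min_nodes": 10, "max_nodes": 10}


def total_partition_size(pool_spec, machine_type):
    """Sum partition sizes for one machine type (0 if the type is absent)."""
    return sum(
        partition["size"]
        for entry in pool_spec.get("machine_types", [])
        if entry["machine_type"] == machine_type
        for partition in entry.get("partitions", [])
    )


def fits_pool(pool_spec, worker_group):
    """True if the group's min_nodes does not exceed the summed partition sizes."""
    capacity = total_partition_size(pool_spec, worker_group["instance_type"])
    return worker_group["min_nodes"] <= capacity


print(total_partition_size(POOL_SPEC, "RES-8CPU-32GB"))  # 6 + 4
print(fits_pool(POOL_SPEC, WORKER_GROUP))
```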

Create a file named loop.py and copy the following Ray code into the file:

# loop.py
import ray
import time

ray.init()

@ray.remote
def foo(i):
    return i

for i in range(6000):
    time.sleep(1)
    print(f'Iteration {ray.get(foo.remote(i))}')

Run the following command to submit the job:

anyscale job submit --name ten-node-job --compute-config ten-node-config --working-dir=. -- python loop.py

Step 7: Create a workspace

In this step, you configure and create a workspace that requires four machines.

Create a file named four-node-config.yaml and copy the following definition into the file:

# four-node-config.yaml
cloud: CLOUD_NAME
head_node:
  instance_type: "m5.2xlarge"
worker_nodes:
  - name: reserved-cpu
    instance_type: RES-8CPU-32GB
    min_nodes: 4
    max_nodes: 4
    flags:
      cloud_deployment:
        machine_pool: capacity-reservations

Create the compute configuration:

anyscale compute-config create --name four-node-config -f four-node-config.yaml

Creating a workspace with this configuration evicts four of the ten machines assigned to the job. To confirm the currently assigned resources before you create the workspace, run the following command:

anyscale machine-pool describe --name capacity-reservations

Run the following command to create the workspace:

anyscale workspace_v2 create --name four-node-workspace --compute-config four-node-config

Run the following command to describe the pool again and observe the scheduler behavior:

anyscale machine-pool describe --name capacity-reservations

The output shows the allocation of machines from the pool to current workloads.

Step 8: Scale down and delete the machine pool

To clean up the demo, scale down the machine pool, detach it from your Anyscale cloud, and then delete it.

To scale down, create a file named machine-pool-cleanup.yaml and copy the following definition into the file:

kind: ANYSCALE_MANAGED
machine_types: [] # Set the machines to an empty list

Run the following command to update the machine pool in your Anyscale organization:

anyscale machine-pool update --name capacity-reservations --spec-file machine-pool-cleanup.yaml

Detach the machine pool from your Anyscale clouds using the anyscale machine-pool detach command. The following example detaches the pool from two clouds named dev-cloud and prod-cloud:

anyscale machine-pool detach --name capacity-reservations --cloud dev-cloud
anyscale machine-pool detach --name capacity-reservations --cloud prod-cloud

Run the following command to delete the machine pool:

anyscale machine-pool delete --name capacity-reservations
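
The cleanup order in this step is: scale down, detach from each cloud, then delete. The following Python sketch assembles those commands in that order for a list of clouds, using only the commands and flags shown in this tutorial. The function name cleanup_commands and its parameters are hypothetical, and the sketch prints the commands without running them.

```python
import shlex

POOL = "capacity-reservations"
CLOUDS = ["dev-cloud", "prod-cloud"]


def cleanup_commands(pool, clouds, cleanup_spec="machine-pool-cleanup.yaml"):
    """Return the cleanup commands in order: scale down, detach, delete."""
    commands = [
        ["anyscale", "machine-pool", "update", "--name", pool, "--spec-file", cleanup_spec]
    ]
    for cloud in clouds:
        commands.append(["anyscale", "machine-pool", "detach", "--name", pool, "--cloud", cloud])
    commands.append(["anyscale", "machine-pool", "delete", "--name", pool])
    return commands


for command in cleanup_commands(POOL, CLOUDS):
    # Print only. To execute, pass each list to subprocess.run(command, check=True).
    print(shlex.join(command))
```

Using check=True when executing would stop the sequence at the first failing command, so the pool isn't deleted if the scale-down or a detach fails.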