Configuring Ray Address

Connection options

To connect to Anyscale, set the Ray Address to a string starting with anyscale://. Ray Address can be specified as an argument to ray.init, or by setting the RAY_ADDRESS environment variable.

Ray address as an environment variable

Set the RAY_ADDRESS environment variable to an Anyscale address.

export RAY_ADDRESS=anyscale://example_cluster

Now when you run your script, it will automatically connect to Anyscale.

python script.py
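
As a rough illustration of the convention above, a script could check that RAY_ADDRESS holds an Anyscale address before connecting. This is only a sketch: the helper name `anyscale_address_from_env` is hypothetical and is not part of Ray or Anyscale.

```python
import os

ANYSCALE_PREFIX = "anyscale://"


def anyscale_address_from_env(environ=os.environ):
    """Return RAY_ADDRESS if it is an Anyscale address, else None.

    Hypothetical helper for illustration; not part of Ray or Anyscale.
    """
    address = environ.get("RAY_ADDRESS")
    if address is not None and address.startswith(ANYSCALE_PREFIX):
        return address
    return None


# A plain dict stands in for the real environment here.
print(anyscale_address_from_env({"RAY_ADDRESS": "anyscale://example_cluster"}))
```

Passing the environment as an argument keeps the check easy to exercise without modifying the real process environment.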

Ray address embedded in code

Warning: Specifying the address in code can lead to unexpected behavior with Ray Jobs or Anyscale Jobs and is not recommended.

The URL can also be provided explicitly in code, e.g.:

script.py
import ray
ray.init("anyscale://")

python script.py

Specifying cluster properties

Connection string

The Anyscale format for the Ray address is anyscale://<project-name>/<cluster-name>. For example, use anyscale://my-project/my-cluster to connect to the cluster named "my-cluster" in the project named "my-project".

project-name is optional. If no project name is provided, the ANYSCALE_PROJECT_NAME environment variable is used if it is set; otherwise, the default project of the organization is used.

cluster-name is optional. If no cluster name is provided, Anyscale will create a new cluster with an auto-generated name.

Examples:

anyscale:// -> Create a new cluster with an autogenerated name in the default project
anyscale://my-cluster -> Create a new cluster named "my-cluster" in the default project
anyscale://my-project/ -> Create a new cluster with an autogenerated name in the "my-project" project
anyscale://my-project/my-cluster -> Create a new cluster named "my-cluster" in the "my-project" project
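
To make the four cases above concrete, the following sketch splits a connection string into its optional project and cluster parts. The function `parse_anyscale_address` is a hypothetical helper written for this page, not Anyscale's own parser, and it assumes that a string with no slash names a cluster, as in the second example.

```python
PREFIX = "anyscale://"


def parse_anyscale_address(address):
    """Split an Anyscale address into (project, cluster); missing parts are None.

    Hypothetical helper for illustration; not Anyscale's own parser.
    """
    if not address.startswith(PREFIX):
        raise ValueError(f"not an Anyscale address: {address!r}")
    # Drop the scheme, then ignore any query parameters after "?".
    path, _, _query = address[len(PREFIX):].partition("?")
    if "/" in path:
        project, _, cluster = path.partition("/")
    else:
        # A single segment with no slash is treated as the cluster name.
        project, cluster = "", path
    return (project or None, cluster or None)


for example in ("anyscale://", "anyscale://my-cluster",
                "anyscale://my-project/", "anyscale://my-project/my-cluster"):
    print(example, parse_anyscale_address(example))
```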

Additionally, you can add the following optional properties as query parameters:

  • update: Whether to update cluster configurations when connecting to an existing cluster. Note that setting this parameter generally triggers a cluster restart. By default, update is set to False, and Anyscale will not restart a cluster to pick up configuration changes.
  • cluster_env: The name (and optionally revision) of the cluster environment or a dictionary to build a new cluster environment. If no revision is specified, the latest revision is used.
  • cluster_compute: The name of the cluster compute to use.
  • autosuspend: An integer representing the number of minutes before the cluster autosuspends if idle. By default, autosuspend is set to 120 minutes. It can be disabled by setting the value to -1.

A fully formed Ray address looks as follows:

ray.init("anyscale://my-project/my-cluster?update=True&cluster_env=my-cluster-env:1&cluster_compute=my-cluster-compute&autosuspend=60")
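
If the address is assembled programmatically, a small helper can join the project, cluster, and query parameters. The sketch below uses only the Python standard library. `build_anyscale_address` is a hypothetical name, and keeping ":" unescaped in the query string is an assumption made so that the output matches the example above.

```python
from urllib.parse import urlencode


def build_anyscale_address(project=None, cluster=None, **params):
    """Assemble an Anyscale address from optional parts and query parameters.

    Hypothetical helper for illustration; not part of Ray or Anyscale.
    """
    path = cluster or ""
    if project:
        path = f"{project}/{path}"
    # safe=":" leaves the revision separator in "name:revision" unescaped.
    query = urlencode(params, safe=":")
    return f"anyscale://{path}" + (f"?{query}" if query else "")


address = build_anyscale_address(
    project="my-project",
    cluster="my-cluster",
    update=True,
    cluster_env="my-cluster-env:1",
    cluster_compute="my-cluster-compute",
    autosuspend=60,
)
print(address)
```

The resulting string can then be exported as RAY_ADDRESS or passed to ray.init.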

Keyword arguments

When setting the Ray address in code, you can configure your cluster using keyword arguments in your call to ray.init, e.g.:

ray.init("anyscale://my-cluster", job_name="production_job", cloud="aws_test_cloud")

The following keyword arguments are supported:

| Keyword argument | Description | Example values |
|---|---|---|
| update (bool) | Whether to update cluster configurations when connecting to an existing cluster. Note that this may restart the Ray runtime. By default, update is set to False. | True |
| cluster_compute (str or dict) | The name of the cluster compute to use or a dictionary to build a new cluster compute. | "my_cluster_compute", {"cloud_id": "1234", ... } |
| cluster_env (str or dict) | The name (and optionally revision) of the cluster environment or a dictionary to build a new cluster environment. If no revision is specified, the latest revision is used. | "prev_created_cluster_env:2", {"base_image": "anyscale/ray-ml:1.1.0-gpu"} |
| runtime_env (dict) | A Python dictionary with runtime environment specifications. | {"pip": "./requirements.txt"}, {"conda": "conda.yaml"}, {"working_dir": "/tmp/test", "pip": ["chess"]} |
| cloud (str) | The name of the cloud to start the cluster in. | "aws_test_account" |
| project_dir (str) | Deprecated: Use working_dir in runtime_env to sync a directory. Path to the project code directory. If not specified, the project directory is autodetected based on the current working directory. If no Anyscale project is found, no project is used. In general, the project directory is synced to all nodes in the cluster at /tmp/ray/session_latest/runtime_resources/_ray_pkg_<hash>. Ray workers start in this directory, so when working with files you may use relative file paths in your remote task and actor definitions, and your code works the same way whether running locally or on your cluster. By default, ".git", "__pycache__", and "venv" are excluded from being uploaded, along with the contents of any ".gitignore" file in the directory. Note: if working_dir is specified in runtime_env, that directory is synced instead of project_dir. | "~/my-proj-dir" |
| project_name (str) | Deprecated: Specify the project name in the connection string or in the ANYSCALE_PROJECT_NAME environment variable. Optional name to use if the project for the directory specified by `project_dir` doesn't already exist. Can only be specified if project_dir is provided. | "my_project" |
| request_cpus (int) | The initial number of CPUs to scale to. The cluster immediately scales to accommodate the requested number of CPUs, bypassing normal upscaling speed constraints. The requested CPUs are pinned and exempt from downscaling. | 8 |
| request_gpus (int) | The initial number of GPUs to scale to. The cluster immediately scales to accommodate the requested number of GPUs, bypassing normal upscaling speed constraints. The requested GPUs are pinned and exempt from downscaling. | 2 |
| request_bundles (list) | A list of initial resource bundles to scale to. Each bundle is a dictionary mapping a resource name to a quantity that can be allocated on a single machine. The cluster immediately scales to accommodate the requested resources, bypassing normal upscaling speed constraints. The requested resources are pinned and exempt from downscaling. | [{"CPU": 8}, {"GPU": 8}, {"GPU": 1}] |
| autosuspend (int or str) | Either an integer representing the number of minutes before the cluster autosuspends (after being idle), or a string ending in "h" or "m" representing the number of hours or minutes, respectively, before the cluster autosuspends. By default, autosuspend is set to 120 minutes. It can be disabled by setting the value to -1. | 90 (90 minutes), "120m" (120 minutes), "4h" (4 hours), -1 (disabled) |
| job_name (str) | The name of the job, which is shown in the UI. This name is only used for display purposes. | "production_job" |
| namespace (str) | The name of the namespace to run this cluster in. | "training_namespace" |
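
The accepted autosuspend formats can be easier to follow when normalized to minutes. The sketch below interprets the documented forms (an integer, a string ending in "m" or "h", and -1 for disabled). `autosuspend_minutes` is a hypothetical helper for illustration and does not reflect how Anyscale parses the value internally.

```python
def autosuspend_minutes(value):
    """Convert an autosuspend value to minutes, or None if disabled (-1).

    Hypothetical helper for illustration; not part of Ray or Anyscale.
    """
    if isinstance(value, int):
        minutes = value
    elif isinstance(value, str) and value[-1:] in ("h", "m"):
        amount = int(value[:-1])
        # "h" means hours, "m" means minutes.
        minutes = amount * 60 if value.endswith("h") else amount
    else:
        raise ValueError(f"unsupported autosuspend value: {value!r}")
    return None if minutes == -1 else minutes


for example in (90, "120m", "4h", -1):
    print(repr(example), autosuspend_minutes(example))
```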

ClientContext

Calling ray.init("anyscale://...") returns a ClientContext object. This object provides information about the cluster (Python version, Ray version, and a URL to the Ray dashboard). The ClientContext also provides a disconnect() method that will disconnect from the current cluster. Finally, you may also use ClientContext as a context manager, which will automatically call disconnect() for you after the with block is complete.

import ray

context = ray.init("anyscale://cluster1")
...  # Computation in Cluster 1
context.disconnect()  # Manually disconnect

# Start client using a `with` statement
with ray.init("anyscale://cluster2") as c2:
    ...  # do things with c2

# Automatically disconnect from cluster 2 after exiting the `with` block

For more information on ClientContext, see the Ray documentation for Ray Client.