Anyscale Environments
There are two types of environments supported by Anyscale:
- Cluster environments
- Runtime environments
Cluster environments
Cluster environments are purpose-built Docker images that work efficiently with Ray inside Anyscale. The following diagram illustrates the principles behind cluster environments:
Builds are individual Docker images, plus associated metadata, for an environment at a point in time. Environment builds are immutable and never deleted, so you can always reproduce or roll back to an earlier state.
As your application evolves, you can create new environment builds to incrementally update its code and dependencies.
Definition
- Anyscale Cluster Environments
- Custom Cluster Environments
The following is a fully-specified example of a cluster environment definition:
```yaml
base_image: anyscale/ray-ml:1.9.0-py38-gpu
env_vars:
  MY_VAR: '123'
debian_packages:
  - vim
python:
  pip_packages:
    - pkutils
  conda_packages:
    - pkutils
post_build_cmds:
  - ls
```
Note that you can specify the following in your cluster environments:
- (Required) A base Docker image, which must be some version of `anyscale/ray` or `anyscale/ray-ml` (refer to the respective DockerHub pages for details).
- (Optional) Environment variables, which are available during the cluster environment build process and at runtime when the container starts.
- (Optional) Debian, pip, and Conda packages, which are installed on top of the packages included in the base image.
- (Optional) Post-build commands, which run at the end of the build process for the cluster environment.
Alternatively, you can manually specify a Docker image to use. See this section for more details on building and storing Anyscale-compatible images.

```yaml
docker_image: my-registry/my-image:tag
ray_version: 2.0.0  # Replace this with the version of Ray in your image
env_vars:  # Optionally, specify environment variables
  MY_VAR: value
```
Note that you can specify the following when using a custom Docker image:
- (Required) A Docker image URI, for example `<account-id>.dkr.ecr.<region>.amazonaws.com/image:tag`. Anyscale can access any publicly available image, as well as private Amazon ECR repositories if permissions are set up.
- (Required) The version of Ray in your image. You can check this with `docker run --entrypoint ray <your-image> --version`.
- (Optional) Environment variables that will be set when the cluster is launched.
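Before registering such a configuration, it can help to sanity-check it locally. The sketch below is purely illustrative; the `validate_custom_image_config` helper is hypothetical and not part of the Anyscale SDK:

```python
def validate_custom_image_config(config: dict) -> list:
    """Return a list of problems found in a custom Docker image config.

    Hypothetical helper for illustration; not part of the Anyscale SDK.
    """
    problems = []
    if not config.get("docker_image"):
        problems.append("docker_image is required, e.g. my-registry/my-image:tag")
    if not config.get("ray_version"):
        problems.append("ray_version is required and must match the Ray version in the image")
    if not isinstance(config.get("env_vars", {}), dict):
        problems.append("env_vars must be a mapping of name -> value")
    return problems

config = {
    "docker_image": "my-registry/my-image:tag",
    "ray_version": "2.0.0",
    "env_vars": {"MY_VAR": "value"},
}
print(validate_custom_image_config(config))  # [] -> no problems found
```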
Creating a cluster environment
You can create a cluster environment using the CLI, the Python SDK, or the HTTP API:
- CLI
- Python SDK
```shell
anyscale cluster-env build my_cluster_env.yaml --name my-cluster-env
```
```python
import yaml

from anyscale import AnyscaleSDK
from anyscale.sdk.anyscale_client.models import CreateClusterEnvironment

sdk = AnyscaleSDK()

with open("my_cluster_env.yaml") as f:
    cluster_env = yaml.safe_load(f)

sdk.build_cluster_environment(CreateClusterEnvironment(
    name="my-cluster-env",
    config_json=cluster_env,
))
```
For more information, see the full references for the CLI, the Python SDK, and the HTTP API.
You can also create a cluster environment in the Web UI by navigating to "Configurations > Cluster environments > Create a new config". To update an existing cluster environment, select "Create a new version", which creates a new build for your environment.
Referencing a cluster environment
When starting a production job or a cluster, you can specify the cluster environment that you want to use. You can also use cluster environments when creating clusters on the fly for workspaces by specifying `cluster_env_name:version` as the cluster environment.
You can do this by setting the `RAY_ADDRESS` environment variable:

```shell
export RAY_ADDRESS=anyscale://my-cluster?cluster_env=my-cluster-env:1
python my_script.py
```
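The `anyscale://` address is a regular URL, so the cluster name and the `cluster_env=name:version` query parameter can be picked apart with the standard library. A small sketch of how the pieces decompose:

```python
from urllib.parse import parse_qs, urlparse

address = "anyscale://my-cluster?cluster_env=my-cluster-env:1"
parsed = urlparse(address)

cluster_name = parsed.netloc                      # "my-cluster"
params = parse_qs(parsed.query)                   # {"cluster_env": ["my-cluster-env:1"]}
env_name, env_version = params["cluster_env"][0].split(":")

print(cluster_name, env_name, env_version)  # my-cluster my-cluster-env 1
```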
Alternatively, you can hardcode the cluster environment within your script:

```python
ray.init(address="anyscale://my-cluster", cluster_env="my-cluster-env:1")
```
Runtime environments
Basic usage
Runtime environments specify the Python environment that you want the driver, tasks, and actors of your job to run in. These are specified at runtime rather than being pre-built like cluster environments. Runtime environments are nested within cluster environments. You may have different runtime environments for different jobs, tasks, or actors on the same cluster.
A runtime environment is a JSON-serializable dictionary that can contain a number of different options, such as pip dependencies or files. Please see the Ray documentation for more details and the full list of options.
The example runtime environment below syncs the local directory and installs a few pip dependencies:
```python
ray.init(address="anyscale://my-cluster", runtime_env={
    "working_dir": ".",
    "pip": ["requests", "torch==0.9.0"],
})
```
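Since a runtime environment must be a JSON-serializable dictionary, you can verify one before passing it to `ray.init`. A minimal check using only the standard library:

```python
import json

runtime_env = {
    "working_dir": ".",
    "pip": ["requests", "torch==0.9.0"],
}

# json.dumps raises TypeError if any key or value is not JSON-serializable,
# so a successful round-trip confirms the dictionary is a valid candidate.
serialized = json.dumps(runtime_env)
print(serialized)
```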
Working directory
The working directory (sometimes called the "project directory") is also part of the runtime environment; code from the working directory is uploaded to Anyscale when you establish a connection. By default, the working directory is the current directory and is also associated with an Anyscale project. You can, however, configure the working directory. Here it is set to `ray_code`, a relative path:

```python
ray.init("anyscale://my-cluster", runtime_env={"working_dir": "ray_code"})
```
You can also use schemes such as `s3://` and `gs://` to specify code located elsewhere.
Your working directory might contain testing artifacts or other data that is either large or that you don't need on the Ray cluster. If the directory is larger than 100MB, Anyscale will not upload it. To prune large files and directories from your runtime environment, use the `excludes` key in its configuration. `excludes` takes an array of strings, which can include `.gitignore`-style wildcards:

```python
ray.init("anyscale://my-cluster", runtime_env={
    "working_dir": ".",
    "excludes": ["large_file1.csv", "tests/*.dat"],
})
```
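The effect of such wildcard patterns can be approximated with `fnmatch` from the standard library. This is only an illustration of the pattern semantics; Ray's actual exclusion logic may differ in edge cases:

```python
from fnmatch import fnmatch

excludes = ["large_file1.csv", "tests/*.dat"]
files = ["train.py", "large_file1.csv", "tests/sample.dat", "tests/test_app.py"]

# Keep only files that match none of the exclude patterns.
uploaded = [
    f for f in files
    if not any(fnmatch(f, pattern) for pattern in excludes)
]
print(uploaded)  # ['train.py', 'tests/test_app.py']
```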
If you need a large working directory, consider putting it in the cloud ahead of time and referencing it from there:

```python
ray.init("anyscale://my-cluster", runtime_env={
    "working_dir": "s3://my-code-bucket",
    "excludes": ["large_file1.csv", "tests/*.dat"],
})
```
The runtime code from your project is uploaded to a temporary location on each node in the Ray cluster. You can find it under `/tmp/ray/session_latest`.
Using multiple runtime environments
By default, all tasks and actors in a job will run in the job-level runtime environment, but you can also specify per-task and per-actor runtime environments. This can be useful when you have conflicting dependencies within the same application (e.g., you need to use two different versions of TensorFlow).
You can specify the runtime environment in the `@ray.remote` decorator or using `.options`:
```python
@ray.remote(runtime_env=my_runtime_env1)
def task():
    pass

# Task runs in env1.
task.remote()

# Alternative API:
task.options(runtime_env=my_runtime_env1).remote()


@ray.remote(runtime_env=my_runtime_env2)
class Actor:
    pass

# Actor runs in env2.
Actor.remote()

# Alternative API:
Actor.options(runtime_env=my_runtime_env2).remote()
```
If the runtime environment specifies pip or Conda dependencies, these will be installed in an isolated Conda environment. In particular, pip and Conda dependencies from the cluster environment will not be inherited by the runtime environment.
For Anyscale Production Jobs and Anyscale Services, the usage of `runtime_env` differs slightly; see Runtime Environments in Anyscale Production Jobs and Services for details.