Dependency management

Check your docs version

These docs are for the new Anyscale design. If you started using Anyscale before April 2024, use Version 1.0.0 of the docs. If you're transitioning to Anyscale Preview, see the guide for how to migrate.

Anyscale's approach to dependency management mirrors the development-to-production flow. For fast iteration cycles, runtime dependencies allow dynamic configuration of an Anyscale Workspace environment. For deploying applications to production, custom container images ensure performance and reliability across cluster nodes.

Manage dependencies in development

You may already manage your environment by running pip install <PACKAGE_NAME> or pip install -r requirements.txt. Running these commands in an Anyscale Workspaces terminal updates the head node's environment, and the workspace tracks and persists the changes as you work. When a Ray application runs, the workspace distributes the dependencies to all nodes in the cluster, ensuring consistent execution.

To inspect the dependencies, select a workspace and navigate to the Dependencies tab. From this tab, you can add and remove packages, which is equivalent to running pip install and pip uninstall.

Note: Anyscale Workspaces supports a subset of pip options:

  • -r, --requirement
  • --dry-run
  • -q, --quiet
  • -U, --upgrade

Anyscale doesn't support other options across a cluster. Reach out to preview-help@anyscale.com for advanced options.
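
As a rough illustration of the list above, the following sketch checks whether a set of pip arguments stays within the supported subset before you run the command in a workspace. The helper name unsupported_options is hypothetical and isn't part of Anyscale or pip. The sketch also assumes each flag is passed separately, so it doesn't handle combined short flags such as -qU.

```python
# Hypothetical helper for illustration; not part of Anyscale or pip.
# The set below copies the options listed in the note above.
SUPPORTED_PIP_OPTIONS = {
    "-r", "--requirement",
    "--dry-run",
    "-q", "--quiet",
    "-U", "--upgrade",
}


def unsupported_options(args):
    """Return the option flags in `args` that are outside the supported subset."""
    unsupported = []
    for arg in args:
        if not arg.startswith("-"):
            continue  # package names and file paths are not options
        flag = arg.split("=", 1)[0]  # handle forms like --requirement=requirements.txt
        if flag not in SUPPORTED_PIP_OPTIONS:
            unsupported.append(flag)
    return unsupported


if __name__ == "__main__":
    # Arguments as they would follow `pip install` on the command line.
    print(unsupported_options(["--no-deps", "emoji", "-r", "requirements.txt"]))
```

An empty result means every option in the command appears in the supported list.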

Editable packages

Editable packages are common during development, but Anyscale doesn't track them automatically. To propagate these packages to all worker nodes, use the py_modules field of the Ray runtime environment. Here is a simple example:

import ray
import my_local_package # this package is installed on the head node with `pip install -e my_local_package`

# adding my_local_package here makes it available to all worker nodes
ray.init(runtime_env={"py_modules": [my_local_package]})

...

Disabling runtime dependency tracking

To skip tracking dependencies, use one of the following options:

  1. ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING environment variable: Set this variable to skip tracking for a single command. For example, use ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING=1 pip install emoji. This option is useful for excluding specific packages from tracking or for temporarily disabling the feature.

  2. Disable tracking for the entire workspace: You can disable tracking at any point by going to the Dependencies tab and switching off the enable toggle. After you switch it off, subsequent runs don't include any of the previously tracked runtime dependencies.
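
If you install packages from a Python script rather than directly in the shell, the first option can be sketched as follows. The helper name untracked_pip_install is hypothetical. The sketch assumes that a pip process launched with this variable in its environment is treated the same way as the shell form shown above, which the source doesn't confirm.

```python
import os


def untracked_pip_install(package, base_env=None):
    """Build the command and environment for a pip install that skips tracking.

    Hypothetical helper: it only prepares the call and doesn't run pip itself.
    """
    # Copy the environment so the caller's os.environ stays unchanged.
    env = dict(os.environ if base_env is None else base_env)
    env["ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING"] = "1"
    cmd = ["pip", "install", package]
    return cmd, env


if __name__ == "__main__":
    cmd, env = untracked_pip_install("emoji")
    print(cmd, env["ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING"])
    # To run the install:
    #   import subprocess
    #   subprocess.run(cmd, env=env, check=True)
```

Setting the variable only in the child process's environment mirrors the one-command scope of the shell example, so later installs in the same session are still tracked.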

Environment variables

Within the Dependencies tab, you can add environment variables to the workspace and remove them. After making changes, either open a new terminal or run source ~/.bashrc to reload the environment variables into the current session.
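
To confirm that a session has picked up the variables, a short check like the following can help. It's a minimal sketch: the helper name missing_variables and the variable name MY_API_KEY are hypothetical placeholders, not names that Anyscale defines.

```python
import os


def missing_variables(required, environ=None):
    """Return the names in `required` that are not set in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in required if name not in environ]


if __name__ == "__main__":
    # MY_API_KEY is a placeholder; substitute the variables you added in the tab.
    missing = missing_variables(["MY_API_KEY"])
    if missing:
        print("Not set in this session:", ", ".join(missing))
    else:
        print("All expected variables are set.")
```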

Transition to a production environment

Anyscale installs runtime dependencies on each node as it spins up. Though this method enables flexibility and quick iteration during development, it introduces some challenges for production workloads.

  1. Version variability - Different nodes started at different times might end up with varying versions depending on how strictly defined the requirements are.
  2. Latency issues - Each node must install all dependencies and download data or model weights before performing work, which can lead to significant slowdowns.
  3. Risk of production failure - If an incident occurs with a package source like PyPI or GitHub, the job or service can fail.
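
The first challenge depends on how strictly the requirements are pinned. As a minimal sketch, the following hypothetical helper flags requirement lines that aren't pinned to one exact version with ==. It's a simplification: it treats everything after a # as a comment, skips pip option lines, and doesn't parse URLs or the full requirement specifier grammar. The package versions shown are arbitrary examples.

```python
def unpinned_requirements(text):
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as "-r other.txt"
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        # "==" pins a version unless it ends in a wildcard such as "1.26.*".
        if "==" not in requirement or requirement.endswith(".*"):
            unpinned.append(requirement)
    return unpinned


if __name__ == "__main__":
    example = "\n".join([
        "torch==2.3.0",
        "pandas>=2.0  # lower bound only",
        "emoji",
        "numpy==1.26.*",
    ])
    print(unpinned_requirements(example))
```

Any requirement the helper reports can resolve to a different version on a node that starts later, which is the variability described in the list above.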

This section shows how to solidify dynamic dependencies into stable, optimized images.

Build an optimized image

Container images are the standard for production workloads. These isolated environments encapsulate all necessary elements for software execution, including code, runtime, system tools, libraries, and settings, ensuring consistent operation across different computing environments.

Workspaces start with a default, Anyscale-built image that works with most Ray workloads out of the box. Default images are available for various versions of Ray, Python, and CUDA, and include popular machine learning packages like torch and pandas. When you modify runtime dependencies, you're using a default image as the base.

To build a custom container image, select a workspace and navigate to the Dependencies tab. Click Build an optimized image to access a Dockerfile-like interface where you can specify the base image. See Anyscale base image options. Only images with Ray 2.8.1 or later are supported. This interface supports the following Dockerfile instructions:

  • FROM: Create a new build from an Anyscale base image.
  • ENV: Set environment variables.
  • WORKDIR: Change working directory.
  • COPY: Copy files and directories only from a previous build stage, not from local storage.
  • RUN: Execute build commands.
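
Before pasting a spec into the interface, you could check it against the instruction list above. The following is a minimal sketch rather than an Anyscale tool: the helper name unsupported_instructions is hypothetical, <ANYSCALE_BASE_IMAGE> is a placeholder for a real base image name, and the ENV value and package version are arbitrary examples.

```python
# The set below copies the instructions listed above.
SUPPORTED_INSTRUCTIONS = {"FROM", "ENV", "WORKDIR", "COPY", "RUN"}


def unsupported_instructions(spec):
    """Return (line_number, instruction) pairs for instructions outside the supported set."""
    problems = []
    continued = False  # True while inside an instruction continued with a trailing "\"
    for number, raw in enumerate(spec.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no instruction
        if not continued:
            instruction = line.split(None, 1)[0].upper()
            if instruction not in SUPPORTED_INSTRUCTIONS:
                problems.append((number, instruction))
        continued = line.endswith("\\")
    return problems


if __name__ == "__main__":
    spec = "\n".join([
        "FROM <ANYSCALE_BASE_IMAGE>",  # placeholder, not a real image name
        "ENV MODEL_DIR=/mnt/models",
        "WORKDIR /app",
        "RUN pip install \\",
        "    emoji==2.12.1",
        "EXPOSE 8000",  # not in the supported list
    ])
    print(unsupported_instructions(spec))
```

The check looks only at instruction names. It doesn't verify that FROM references a supported Anyscale base image or that COPY reads from a previous build stage.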

After building an image, switch the workspace image, restart the workspace, and submit a job or deploy a service. This enables a faster start-up time and a fixed environment for production workloads.

For the full specification for building Anyscale-compatible images, see Image requirements.