Version: Canary 🐤

Manage dependencies in development

We have already covered how to use and build container images in Anyscale. Container images are a widely adopted technology for packaging production applications and their dependencies. However, during development, when iteration speed is paramount, rebuilding a container image for every dependency change can slow down the entire workflow.

Anyscale Workspaces bridge this gap in three ways: they track the packages you install during development, they distribute those packages to worker nodes quickly without requiring an image rebuild, and they offer an easy way to package the workspace environment into a container image when the application is ready to move to a production job or service.

For example, running pip install emoji on an Anyscale workspace automatically tracks the emoji package. To inspect the tracked dependencies, you can use the Dependencies tab. You can also use this interface to add and remove packages, equivalent to pip install and pip uninstall. The next time you restart your workspace or run a Ray job that requires adding worker nodes, those dependencies are automatically installed on top of the base container image.

Note: Anyscale Workspaces support a subset of pip options:

  • -r, --requirement
  • --dry-run
  • -q, --quiet
  • -U, --upgrade

Anyscale doesn't support other options or packaging systems across a cluster yet. Reach out to preview-help@anyscale.com if you have any questions.
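As a rough illustration of the option list above, the following sketch flags options in a pip command line that fall outside the supported subset. The helper name unsupported_flags is hypothetical and not part of any Anyscale tooling. The check is deliberately simple: it assumes each option is written as a separate token and does not handle combined short options such as -qU.

```python
import shlex

# Options listed as supported in the note above.
SUPPORTED_FLAGS = {"-r", "--requirement", "--dry-run", "-q", "--quiet", "-U", "--upgrade"}


def unsupported_flags(command: str) -> list[str]:
    """Return the options in a pip command that are not in the supported subset.

    Hypothetical helper for illustration only; it does not call pip.
    """
    found = []
    for token in shlex.split(command):
        if token.startswith("-"):
            # Treat "--flag=value" the same as "--flag".
            name = token.split("=", 1)[0]
            if name not in SUPPORTED_FLAGS:
                found.append(name)
    return found


print(unsupported_flags("pip install -U emoji"))
print(unsupported_flags("pip install --no-deps emoji"))
```

An empty result means every option in the command appears in the list above. It does not guarantee that the command succeeds.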

[Image: Dependencies tab]

How are packages replicated in worker nodes?

Anyscale uses Ray's Runtime environment pip feature to distribute packages to the worker nodes. Every Ray application run in the workspace automatically gets the tracked packages injected into the Ray runtime environment of its job. If your application already uses a runtime environment, the Anyscale workspace merges your existing pip configuration with the tracked dependencies.
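To make the merge concrete, here is a minimal sketch of what combining an existing runtime environment with tracked packages could look like. It is a conceptual model only: the function merge_tracked_pip is hypothetical, and the ordering and duplicate handling shown here are assumptions, since the actual merge rules aren't described on this page. The sketch also assumes the pip field is a plain list of requirement strings.

```python
from typing import Any


def merge_tracked_pip(runtime_env: dict[str, Any], tracked: list[str]) -> dict[str, Any]:
    """Return a copy of runtime_env whose "pip" list also includes tracked packages.

    Hypothetical helper: assumes "pip" is a list of requirement strings.
    """
    merged = dict(runtime_env)
    requirements = list(merged.get("pip", []))
    # Assumption: user entries come first; tracked entries are appended if absent.
    for requirement in tracked:
        if requirement not in requirements:
            requirements.append(requirement)
    merged["pip"] = requirements
    return merged


# Example values are illustrative.
user_env = {"pip": ["requests==2.31.0"], "env_vars": {"MODE": "dev"}}
tracked_packages = ["emoji", "requests==2.31.0"]

merged_env = merge_tracked_pip(user_env, tracked_packages)
print(merged_env["pip"])
```

In a workspace you don't need to perform this step yourself. The sketch only shows why your own pip entries and the tracked packages can coexist in one runtime environment.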

Warning: Uninstalling pip packages that come from the base image is not replicated to the worker nodes. The runtime environment feature can only add packages or modify packages that are installed in the base image.

Editable packages

Editable packages are commonly used during development; however, these packages are not automatically tracked. To propagate these packages across all worker nodes, we recommend using the Runtime environment py_modules feature. Here is a simple example of how to do that:

import ray
# this package is installed on the head node with `pip install -e my_local_package`
import my_local_package

# adding my_local_package here makes it available to all worker nodes
ray.init(runtime_env={"py_modules": [my_local_package]})

...

Disabling runtime dependency tracking

If you want to skip tracking dependencies, you have a couple of options:

ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING environment variable: Set this variable to skip tracking. For example, use ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING=1 pip install emoji. This option is useful for excluding specific packages from tracking or for temporarily disabling the feature.

Disable tracking for the entire workspace: You can disable tracking at any point by going to the Dependencies tab and switching off the enable toggle. Once the toggle is off, subsequent runs no longer include any of the previously tracked runtime dependencies.

[Image: Disable runtime dependency tracking]
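If you install packages from a Python script rather than a shell, the environment-variable option above can be applied to a single subprocess call. The sketch below assumes that tracking honors the variable when pip is launched this way. The helper names untracked_env and pip_install_untracked are hypothetical.

```python
import os
import subprocess

SKIP_VAR = "ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING"


def untracked_env(base=None):
    """Copy an environment mapping and set the skip-tracking variable to "1"."""
    env = dict(os.environ if base is None else base)
    env[SKIP_VAR] = "1"
    return env


def pip_install_untracked(package):
    """Run `pip install <package>` with the skip variable set for this command only."""
    return subprocess.run(["pip", "install", package], env=untracked_env(), check=True)
```

Calling pip_install_untracked("emoji") mirrors the shell example ANYSCALE_SKIP_PYTHON_DEPENDENCY_TRACKING=1 pip install emoji. Because the variable is set only on the copied environment, later installs in the same session are still tracked.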

Environment variables

Within the Dependencies tab, you can add environment variables to the workspace and remove existing ones. After a change, open a new terminal or run source ~/.bashrc to reload the environment variables into the current session.
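A quick way to confirm that a change is visible to your code is to check the process environment from Python. This is a minimal sketch: the helper missing_env_vars and the variable names in the usage line are hypothetical placeholders.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are not set in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if name not in env]


# Placeholder names; replace them with the variables you added in the Dependencies tab.
print(missing_env_vars(["MY_API_URL", "MY_MODE"]))
```

A process started before the change doesn't see new values, so run this from a new terminal or after reloading the shell configuration.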

Transition to a production environment

Anyscale installs runtime dependencies on each node as it spins up. Though this method enables flexibility and quick iteration during development, it introduces some challenges for production workloads.

  1. Version variability: Nodes started at different times might end up with different package versions, depending on how strictly the requirements are pinned.
  2. Latency issues: Each node must install all dependencies and download data or model weights before performing work, which can lead to significant slowdowns.
  3. Risk of production failure: If an incident occurs with a package source like pip or GitHub, the job or service could fail.

The following section shows how to solidify dynamic dependencies into stable, optimized images.
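The version variability risk in item 1 depends on how strictly requirements are specified. As a rough aid, the sketch below lists requirement lines that are not pinned to an exact version with ==. The helper unpinned_requirements is hypothetical and uses a simplified pattern that ignores environment markers, URLs, and other advanced requirement syntax.

```python
import re

# Simplified pattern: name, optional extras, then an exact "==" version.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(lines):
    """Return requirement strings that are not pinned with an exact version."""
    result = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace.
        requirement = line.split("#", 1)[0].strip()
        if requirement and not PINNED.match(requirement):
            result.append(requirement)
    return result


# Illustrative requirement lines.
example = ["emoji==2.8.0", "requests>=2.0", "numpy", "# a comment", "torch[cuda]==2.1.0"]
print(unpinned_requirements(example))
```

Requirements reported by this check can resolve to different versions on nodes that start at different times.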

Build an optimized image

To build an optimized container image that includes all of the modifications applied to your workspace, select the workspace and navigate to the Dependencies tab. Click Build an optimized image to open a view that shows the base image of your workspace, the tracked dependencies, and the environment variables, automatically generated in a Dockerfile-like format.
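To give a feel for what a Dockerfile-like summary of these three pieces might contain, here is a small sketch that renders one from a base image, a package list, and environment variables. The exact format Anyscale generates may differ. The function render_dockerfile, the image name, and the sample values are all hypothetical.

```python
import shlex


def render_dockerfile(base_image, pip_packages, env_vars):
    """Render a Dockerfile-like text from a base image, packages, and variables."""
    lines = [f"FROM {base_image}"]
    # Sort variables so the output is stable across runs.
    for key, value in sorted(env_vars.items()):
        lines.append(f'ENV {key}="{value}"')
    if pip_packages:
        quoted = " ".join(shlex.quote(package) for package in pip_packages)
        lines.append(f"RUN pip install {quoted}")
    return "\n".join(lines) + "\n"


# Placeholder inputs for illustration only.
dockerfile_text = render_dockerfile(
    "example/base-image:tag",
    ["emoji==2.8.0", "requests==2.31.0"],
    {"MODE": "prod"},
)
print(dockerfile_text)
```

Baking packages into the image this way means nodes no longer install them at startup, which addresses the latency and package-source risks described earlier.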