Configuration overview
All jobs, services, and workspaces you launch on Anyscale run on a Ray cluster. To define a cluster, you need three configurations:
- Container image
- Compute configuration
- Storage
Together, these components specify what a cluster should look like: where its instances run, how much memory each instance has, which environment variables are set, and so on.
Conceptually, the container image defines the software on the cluster and the compute config defines the hardware. This section provides a general introduction to these two configurations. It also touches upon storage and secret management.
As applications move through their lifecycle, managing these configurations helps deployments go smoothly and scale well on the Anyscale platform.
Container image
The container image defines the environment for clusters. You can think of the container image as the file where you specify the software of your cluster. It contains the following:
- A base image
- Environment variables
- Package dependencies
- The specific versions for Ray, Python, and other essential libraries
All instances within a cluster use the same container image. You can use a prebuilt Anyscale image or a custom one with your configuration built in.
Anyscale’s base images provide a great starting point for many ML workloads. If you use the same packages across different clusters, you may want to build a custom image that pre-installs those packages. To learn more about building your own image, see this page.
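To make the four ingredients listed above concrete, here is a minimal sketch that models an image as a small Python object and renders it as Dockerfile text. The `ImageSpec` class, the `render_dockerfile` helper, the base image name, and the package pins are all hypothetical illustrations, not Anyscale's schema or tooling; only the `FROM`, `ENV`, and `RUN` instructions are standard Dockerfile syntax.

```python
from dataclasses import dataclass, field


@dataclass
class ImageSpec:
    """Hypothetical description of a container image (not an Anyscale schema)."""
    base_image: str
    env_vars: dict[str, str] = field(default_factory=dict)
    pip_packages: list[str] = field(default_factory=list)


def render_dockerfile(spec: ImageSpec) -> str:
    """Render the spec as Dockerfile text using standard instructions."""
    lines = [f"FROM {spec.base_image}"]
    # Sort the variables so the output is deterministic.
    for name, value in sorted(spec.env_vars.items()):
        lines.append(f'ENV {name}="{value}"')
    if spec.pip_packages:
        lines.append("RUN pip install --no-cache-dir " + " ".join(spec.pip_packages))
    return "\n".join(lines)


spec = ImageSpec(
    base_image="example-registry/ray-base:placeholder",  # placeholder, not a real image
    env_vars={"LOG_LEVEL": "info"},
    pip_packages=["numpy==1.26.4", "pandas==2.2.2"],  # illustrative pins
)
dockerfile = render_dockerfile(spec)
print(dockerfile)
```

Pinning exact package versions in the image, as in the sketch, is what lets every instance in a cluster start from the same software.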
Compute config
Once you have a container image, you can start up a cluster. Use a compute config to shape this cluster.
The compute config defines which instance types Anyscale can provision and manage on your behalf. You can think of the compute config as the file where you specify the hardware of your cluster.
It defines parameters such as cloud preferences, instance types, and scaling options. For example, you can configure a small head node that is only used for orchestration, and between five and ten worker nodes for the actual workloads. The compute config is also where you choose whether to use spot instances and where you define network settings.
As with container images, you can reuse compute configs across different jobs, services, and workspaces. To learn more about compute configs, see this page.
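The head-node-plus-workers example can be sketched in plain Python as follows. The class names, field names, and instance type strings are assumptions made for illustration and do not reflect Anyscale's actual compute config format; the point is only to show how a minimum and maximum worker count bound the size of the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerGroup:
    """Hypothetical worker group: an instance type plus scaling bounds."""
    instance_type: str
    min_nodes: int
    max_nodes: int
    use_spot: bool = False

    def __post_init__(self):
        if not 0 <= self.min_nodes <= self.max_nodes:
            raise ValueError("expected 0 <= min_nodes <= max_nodes")

    def clamp(self, requested: int) -> int:
        """Return the node count an autoscaler could grant for a request."""
        return max(self.min_nodes, min(requested, self.max_nodes))


@dataclass(frozen=True)
class ComputeConfigSketch:
    """Hypothetical compute config: one head node and one worker group."""
    head_instance_type: str
    workers: WorkerGroup


config = ComputeConfigSketch(
    head_instance_type="m5.large",  # small head node, orchestration only
    workers=WorkerGroup(
        instance_type="m5.4xlarge",  # illustrative worker instance type
        min_nodes=5,
        max_nodes=10,
        use_spot=True,
    ),
)

for requested in (3, 8, 50):
    print(requested, "->", config.workers.clamp(requested))
```

Requests below the minimum are raised to five nodes and requests above the maximum are capped at ten, which matches the "between five and ten worker nodes" shape described above.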
Storage
The shape of the cluster also determines what storage is available within the cluster. Every cluster has three distinct types of storage:
- Local storage for a node
- Object storage
- Storage shared across nodes
To learn more about these different types of storage and how to configure them, see this page.
Secret management
Although not strictly required, secret management is a fourth configuration to consider when defining a cluster. There are multiple ways to ensure that workloads can access securely stored values such as tokens and keys.
To learn more about secret management in clusters, see this page.
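As one generic illustration, a workload can read a token from an environment variable at runtime instead of hard-coding it. This is a common pattern in general and not necessarily the mechanism Anyscale recommends; the variable name `EXAMPLE_API_TOKEN` and the helpers `read_secret` and `mask` are hypothetical.

```python
import os


def read_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"secret {name!r} is not set in the environment")
    return value


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters so the value is safe to log."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# For demonstration only: a real deployment would inject the value from a
# secret store rather than set it in code.
os.environ.setdefault("EXAMPLE_API_TOKEN", "demo-token-1234")

token = read_secret("EXAMPLE_API_TOKEN")
print("loaded token:", mask(token))
```

Failing at startup when a secret is missing, and masking values before logging them, keeps configuration mistakes visible without leaking the secret itself.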