Overview of Anyscale Clusters
This version of the Anyscale docs is deprecated. Go to the latest version for up to date information.
Anyscale clusters are managed Ray clusters and form the backbone of several Anyscale features. Workspaces, Jobs, and Services all use clusters under the hood.
Anyscale clusters offer the following advantages over open-source Ray clusters:
- Managed cluster lifecycle. You can start, edit, and terminate a cluster using the UI, CLI, or SDK. If enabled, a cluster can also be terminated automatically after it has been idle for a period of time, which saves cost.
- Fast startup time. A cluster takes 1-2 minutes to start.
- Better spot/pre-emptible instance support. You can choose to use spot/pre-emptible instances only or, if you prefer, fall back to on-demand instances.
- Enhanced observability tools. Logs, Ray Dashboard, and Grafana are seamlessly integrated with Anyscale clusters (and exposed to Workspaces, Jobs, and Services).
In most cases, you don't need to use Anyscale clusters directly. Use Workspaces for interactive development, and use Jobs and Services when you move your applications to production.
Consult your support team if a particular use case requires you to use clusters directly.
Configurations of a cluster
Anyscale clusters accept the following configurations:
- Name: The name of the cluster. Multiple clusters can use the same name.
- Project: The Project to start the cluster in.
- Cluster Environment: Docker images, packages, and other dependencies for the cluster.
- Compute Config: Configurations of the cluster shape (for example, the instance types), autoscaling, and underlying virtual machines (VMs).
- Idle Termination: If enabled, the cluster is terminated automatically once its idle time reaches the configured threshold.
Create and manage clusters
As noted above, in most cases you don't need to create and manage clusters directly. If you do, you can use the Console UI, CLI, or SDK.
You can also use the Web Terminal and Jupyter to interact with a cluster directly.
If you need to use the Ray Jobs API on Anyscale, you can submit Ray Jobs directly to Anyscale clusters. Contact the support team for instructions and constraints.
Idle Termination
If Idle Termination is enabled for a Cluster, Anyscale will automatically terminate the Cluster when it has been idle for a given amount of time. An idle Cluster is a Cluster with no activities. Idle time begins counting immediately after the last activity in a Cluster.
The following actions count as activities:
- A running Ray Job is treated as ongoing activity for its entire duration.
  - Note that the Ray Job must attach a Ray Driver (by calling ray.init()) to count as activity.
  - Attaching a Ray Driver, even if it does nothing, is a good trick for keeping a Cluster active when you need it for something that does not otherwise count as activity.
- Starting or updating the Cluster.
  - Updating a Cluster refers to when a Cluster restarts in order to fulfill new configuration options.
- Completing or canceling a command submitted through the Web Terminal of the Anyscale Cluster page or the Anyscale API/SDK. Note that a command does not count as activity when it is submitted, only when it finishes.
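The activity rules above can be pictured as a small timer. The following is an illustrative model only, not Anyscale's implementation: the `IdleTracker` class, its method names, and the choice to terminate once idle time is greater than or equal to the threshold are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IdleTracker:
    """Toy model of idle termination (hypothetical, for illustration only)."""

    threshold: timedelta       # configured idle termination threshold
    last_activity: datetime    # time of the most recent activity
    attached_drivers: int = 0  # Ray Jobs that currently have a Ray Driver attached

    def record_activity(self, at: datetime) -> None:
        # One-off activities: starting or updating the cluster,
        # or a submitted command completing or being canceled.
        self.last_activity = max(self.last_activity, at)

    def driver_attached(self, at: datetime) -> None:
        # A Ray Job that called ray.init() counts as ongoing activity.
        self.attached_drivers += 1
        self.record_activity(at)

    def driver_detached(self, at: datetime) -> None:
        # Idle time starts counting from the moment the job ends.
        self.attached_drivers -= 1
        self.record_activity(at)

    def should_terminate(self, now: datetime) -> bool:
        if self.attached_drivers > 0:
            return False  # ongoing activity: the cluster is never idle
        # Assumption: terminate once idle time reaches the threshold.
        return now - self.last_activity >= self.threshold
```

In this model, a cluster with a 30-minute threshold and no attached driver becomes eligible for termination 30 minutes after its last recorded activity, while a cluster with an attached driver is never eligible, however long the job runs.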
The following actions DO NOT count as activities (provided they do not attach a Ray Driver as a side effect):
- Submitting a Ray Job through the Ray Jobs API if the submitted Job script never calls ray.init().
- Running a command in a terminal in Visual Studio Code or Jupyter.
- Running a JupyterLab notebook.
Note that a Ray Job started in a JupyterLab notebook via ray.init() will continue to run indefinitely (and keep the Cluster active) unless it is explicitly stopped via ray.shutdown(). This is because the Python kernel in a Jupyter notebook does not trigger the typical atexit hook for ray.shutdown() when a block of code finishes.
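One way to avoid leaving a driver attached from a notebook is to wrap the work in a context manager that always calls ray.shutdown(). This is a minimal sketch: `ray_driver` is a hypothetical helper name, not part of Ray or Anyscale, and the only Ray calls it relies on are `ray.init()` and `ray.shutdown()`.

```python
from contextlib import contextmanager


@contextmanager
def ray_driver(**init_kwargs):
    """Attach a Ray Driver for the duration of a `with` block (hypothetical helper)."""
    import ray  # imported lazily so the helper can be defined before Ray is needed

    ray.init(**init_kwargs)
    try:
        yield ray
    finally:
        # Runs even if the cell raises, so the driver is not left attached by accident.
        ray.shutdown()


# Usage in a notebook cell (requires Ray to be installed):
#
#   with ray_driver():
#       ...  # submit tasks here
#   # The driver is detached at this point, so idle time can start counting again.
```

The same helper can be used deliberately in the other direction: keeping the `with` block open keeps a driver attached, which keeps the Cluster active.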
Monitoring and troubleshooting
We recommend using Workspaces for interactive development and using Jobs and Services when you move your applications to production. As a result, you normally don't need to monitor and troubleshoot your clusters.
However, you may still need to troubleshoot your clusters directly sometimes. Here are the tools you can use:
- Events
- Logs
- Ray Dashboard
- Grafana
Events
Cluster events capture the critical state transitions of your cluster.
Logs
Depending on the version of the Anyscale CLI used to create the cloud, logs emitted by Ray are persisted within a storage bucket associated with the cloud.
Logs persisted for Anyscale clusters are exposed to other Anyscale features that leverage clusters under the hood.
- Click here for how to view logs of Anyscale Workspaces
- Click here for how to view logs of Anyscale Jobs
- Click here for how to view logs of Anyscale Services
Active clusters
View the Ray Dashboard for logs.
Terminated clusters
You can view and download logs after your cluster has been terminated using the anyscale logs CLI commands. You can also configure your in-house logging provider to read logs from remote storage and ingest them into your own system.
Anyscale clouds are automatically set up with log persistence. Logs are automatically uploaded from clusters to a bucket.
- If you use Anyscale Managed Cloud Setup, logs are stored in a bucket called anyscale-production-data-{cloud_id}.
- If you use Anyscale Cloud Register with your own resources, logs are stored in a bucket specified by you.
Use anyscale logs to view and download logs:
# View logs for a particular cluster
anyscale logs cluster --id <cluster-id> <glob-filter | filename>
# Download all logs for a particular cluster
anyscale logs cluster --id <cluster-id> --download
# More help
anyscale logs cluster --help
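If you script log retrieval, the commands above can be assembled programmatically. The sketch below reproduces only the two forms shown above (a filter, or `--download`) and rejects the combination, since this page does not document it. The helper names `build_logs_command` and `fetch_logs` are hypothetical, and running the command assumes the `anyscale` CLI is installed and authenticated.

```python
import subprocess
from typing import List, Optional


def build_logs_command(cluster_id: str,
                       log_filter: Optional[str] = None,
                       download: bool = False) -> List[str]:
    """Build an `anyscale logs cluster` invocation (hypothetical helper)."""
    if not cluster_id:
        raise ValueError("cluster_id must not be empty")
    if download and log_filter:
        # Assumption: only the forms documented on this page are generated.
        raise ValueError("use either log_filter or download, not both")
    cmd = ["anyscale", "logs", "cluster", "--id", cluster_id]
    if download:
        cmd.append("--download")
    if log_filter:
        # Glob filter or filename, passed as its own argument (no shell quoting needed).
        cmd.append(log_filter)
    return cmd


def fetch_logs(cluster_id: str, **kwargs) -> None:
    # Raises subprocess.CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(build_logs_command(cluster_id, **kwargs), check=True)
```

Building the command as a list rather than a single string avoids shell interpretation of glob characters in the filter.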
Cost
You are charged standard storage pricing (AWS S3 or Google Cloud Storage) based on the amount of log data that your clusters produce.
Configuring lifecycle policy
You may configure a lifecycle policy on the objects in this bucket directly through AWS.
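As one possible illustration, the sketch below builds an S3 lifecycle configuration that expires objects under a given prefix after a number of days, and applies it with boto3. It assumes an AWS cloud created with Anyscale Managed Cloud Setup (so the bucket name follows the anyscale-production-data-{cloud_id} pattern), and that `boto3` is installed with credentials allowed to modify the bucket. The helper names, the rule ID, and the retention period are hypothetical choices. The prefix under which logs are stored is not stated on this page, so it is a required argument that you need to determine by inspecting your bucket.

```python
from typing import Any, Dict


def managed_log_bucket(cloud_id: str) -> str:
    # Bucket naming pattern for Anyscale Managed Cloud Setup.
    return f"anyscale-production-data-{cloud_id}"


def expiration_policy(prefix: str, days: int) -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},       # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_policy(cloud_id: str, prefix: str, days: int) -> None:
    import boto3  # imported lazily; only needed when actually applying the policy

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=managed_log_bucket(cloud_id),
        LifecycleConfiguration=expiration_policy(prefix, days),
    )
```

Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already set on the bucket, so merge existing rules into the `Rules` list before applying if the bucket already has some.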
Ray Dashboard
Click the Dashboard button to display the metrics of your Ray cluster. See the Ray documentation to learn how to use the Ray Dashboard.
Grafana
Click the Grafana button to display the metrics of your Ray cluster.