Anyscale cluster dashboard

This page describes how to use the Anyscale cluster dashboard to monitor the Ray cluster nodes in your Anyscale jobs and workspaces.

important

This feature is in beta release.

The Anyscale cluster dashboard persists node details beyond the lifetime of the cluster for offline debugging. It provides a scoped view of metrics and Ray entities per node, which allows you to observe how your Ray workload uses cluster resources. Other Anyscale dashboards link to the cluster dashboard to provide details about a specific node.

Access the cluster dashboard

You can view the cluster dashboard in the Anyscale console for any job or workspace.

Complete the following steps to access the Anyscale cluster dashboard:

  1. Log in to the Anyscale console.
  2. Click Workspaces or Jobs.
  3. Click the name of a workspace or job.
  4. Click Ray Workloads.
  5. Click Cluster. The cluster dashboard appears.

note

By default, the cluster dashboard displays the current running session or most recently terminated session. You can view up to the last ten Ray sessions using the dropdown in the top right corner.
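The session rule above can be read as a small selection function. The following is a minimal sketch only, not Anyscale's implementation: `RaySession`, `selectable_sessions`, and `default_session` are hypothetical names, and the sketch assumes a running session is one with no recorded end time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# The dropdown lists up to the last ten Ray sessions.
MAX_SESSIONS = 10


@dataclass
class RaySession:
    """Hypothetical session record; the field names are assumptions."""
    session_id: str
    start_time: datetime
    end_time: Optional[datetime] = None  # None is assumed to mean "still running"


def selectable_sessions(sessions: List[RaySession]) -> List[RaySession]:
    """Return the most recent sessions, newest first, capped at MAX_SESSIONS."""
    ordered = sorted(sessions, key=lambda s: s.start_time, reverse=True)
    return ordered[:MAX_SESSIONS]


def default_session(sessions: List[RaySession]) -> Optional[RaySession]:
    """Prefer the running session; otherwise pick the most recently terminated one."""
    if not sessions:
        return None
    running = [s for s in sessions if s.end_time is None]
    if running:
        return max(running, key=lambda s: s.start_time)
    return max(sessions, key=lambda s: s.end_time)
```

With twelve terminated sessions, `selectable_sessions` returns the ten newest and `default_session` returns the one that ended last. Adding a session with no end time makes that session the default.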

Cluster overview

The Overview tab contains a list of all nodes in a cluster, including dead nodes. You can expand each node row using the expand button on the left to view the live workers running on that node.

The following table describes the fields in the node overview:

Field | Description
Node | The name of the node in the format Instance type (truncated_node_id).
State | The state of the node: ALIVE (the node is alive), DRAINING (the node is alive but has started shutting down before it terminates), or DEAD (the node is dead).
IP | The IP address of the node.
ID | The ID of the node.
CPU | The live CPU utilization of the node.
Memory | The live memory utilization of the node.
GPU | The live GPU utilization of the node.
GRAM | The live GPU memory utilization of the node.
Object store memory | The live object store memory utilization of the node.
Disk | The live disk usage of the node.
Network sent | The live network bytes sent per second for the node.
Network received | The live network bytes received per second for the node.
Duration | The total duration of the node. In the case of an ungraceful shutdown, Anyscale may not have captured the end time of the node and this field may be missing.
Start time | The start time of the node.
End time | The end time of the node. In the case of an ungraceful shutdown, Anyscale may not have captured the end time of the node and this field may be missing.
Resources | The Ray resources configured for the node.
Labels | The labels configured for the node.
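To make the Node, Duration, and End time fields concrete, here is a minimal sketch of how such values could be derived from a node record. It is illustrative only: `NodeRecord`, `node_label`, and `node_duration` are hypothetical names, the 8-character truncation length is an assumption, and the source does not describe how Anyscale actually computes these fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class NodeRecord:
    """Hypothetical node record; the field names are assumptions."""
    instance_type: str
    node_id: str
    state: str  # "ALIVE", "DRAINING", or "DEAD"
    start_time: datetime
    end_time: Optional[datetime] = None  # may be missing after an ungraceful shutdown


def node_label(node: NodeRecord, id_chars: int = 8) -> str:
    """Build a label in the form 'Instance type (truncated_node_id)'.

    The truncation length is an assumption chosen for illustration.
    """
    return f"{node.instance_type} ({node.node_id[:id_chars]})"


def node_duration(node: NodeRecord, now: datetime) -> Optional[timedelta]:
    """Return the node's duration, or None when it cannot be determined."""
    if node.end_time is not None:
        return node.end_time - node.start_time
    if node.state in ("ALIVE", "DRAINING"):
        # The node is still up, so measure against the current time.
        return now - node.start_time
    # DEAD with no recorded end time: treat the duration as missing.
    return None
```

In this sketch, a dead node without an end time yields `None`, which corresponds to the missing Duration field described in the table.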

Worker details

Click the chevron next to a node in the overview to see live workers on the node. The following details are available for workers on live nodes:

Field | Description
Workers | The command line or process the Ray worker is running.
State | The state of the worker: ALIVE (the worker is alive) or DEAD (the worker is dead).
PID | The process ID of the Ray worker process.
ID | The ID of the worker.
CPU | The live CPU utilization of the worker.
Memory | The live memory utilization of the worker.
GPU | The live GPU utilization of the worker.
GRAM | The live GPU memory utilization of the worker.

Node detail

Click the node name to see detailed information for a single node in the cluster.

The Overview tab repeats the details from the main cluster dashboard list.

The Metrics tab provides a filtered view of the Ray Core metrics dashboard for the node. See New metrics dashboard experience in Ray 2.50.0 and above.

The Workers tab contains the list of live workers running on the node. This tab is only available for live nodes.

The Tasks tab contains a list of all tasks run on the node.

important

The Tasks tab provides a filtered view of the task dashboard. See Anyscale task dashboard.

The task dashboard requires a background system cluster, which has an associated cost. See Task dashboard system cluster.

All limitations and requirements for the task dashboard apply to the Tasks tab. See Requirements and limitations.