# Data Dashboard

The Data dashboard is a centralized tool for debugging Ray Data workloads. It's a proprietary replacement for the open-source Data dashboard that enables users to monitor and debug Ray Data workloads more efficiently.

## Accessing the Data dashboard

To access the Data dashboard, click the **Ray Workloads** tab on the Jobs or Workspaces page, then select the **Data** tab.

**Note:** The dashboard is available only for Ray 2.44 and higher. On lower versions of Ray, the dashboard is empty.
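If you want to check the version requirement programmatically, a minimal sketch follows. The helper name `supports_data_dashboard` is hypothetical and not part of Ray; it only compares the leading `major.minor` part of a version string against 2.44.

```python
import re

MIN_RAY_VERSION = (2, 44)  # minimum version stated in the note above


def supports_data_dashboard(version: str) -> bool:
    """Return True if a Ray version string is 2.44 or higher.

    Hypothetical helper for illustration; not part of Ray.
    """
    # Use the leading "major.minor" digits; suffixes such as ".dev0" are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_RAY_VERSION
```

In a Ray environment you could call it as `supports_data_dashboard(ray.__version__)`.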

![Data Dashboard](/assets/images/data-dashboard-overview-6cf4a5cb74c15de932c8d773381ea77a.png)

The dashboard offers a high-level overview of your entire workload pipeline and displays key debugging metrics such as CPU and GPU usage, the number of output and queued rows, and throughput. Operators are listed in reverse order: the first row represents the final output of the dataset and the last row represents the input. This ordering lets the UI render the pipeline as a tree view, which makes complex datasets easier to visualize.

For more detailed analysis, click an individual operator to view operator-specific metrics, such as estimated remaining runtime and peak memory usage.

![Data Operator View](/assets/images/data-dashboard-operator-view-f8edf29b506603e84a27d3ec6d270ce2.png)

## Tree view

For pipelines with n-ary operators such as join, union, or zip, the dashboard visualizes the pipeline as a tree. The tree structure makes it easier to see which upstream branches feed each operator in a complex data workflow.

![Data Tree View](/assets/images/data-dashboard-tree-view-f88cd47c0426733231b854b6c389bbdf.png)
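To make the output-first ordering concrete, here is a small standalone sketch that prints a pipeline as an indented tree. It is not the dashboard's implementation: the `Operator` class, the `render_tree` function, and the operator names are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Operator:
    """A hypothetical pipeline node: a name plus the operators feeding it."""
    name: str
    inputs: list[Operator] = field(default_factory=list)


def render_tree(op: Operator, depth: int = 0) -> list[str]:
    """List operators output-first, indenting each input under its consumer."""
    lines = ["  " * depth + op.name]
    for upstream in op.inputs:
        lines.extend(render_tree(upstream, depth + 1))
    return lines


# Two read branches, one of them mapped, combined by a zip and then written out.
left = Operator("MapBatches", [Operator("ReadParquet")])
right = Operator("ReadCSV")
pipeline = Operator("Write", [Operator("Zip", [left, right])])

print("\n".join(render_tree(pipeline)))
```

The final output operator appears on the first line, and each n-ary operator (here the zip) has one indented branch per input.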

## Log view

The Data dashboard includes a detailed log view for each dataset, capturing logs from `/tmp/ray/{SESSION_NAME}/logs/ray-data/ray-data-{DATASET_ID}.log`. By default, Ray Data generates a variety of useful logs, including details about backpressure, health checks, and failure events such as out-of-memory (OOM) errors.

![Data Log View](/assets/images/data-dashboard-log-view-0f38c84aae1cf3253859362cdf7441f5.png)
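If you have filesystem access to the machine where these logs are written, the same files can be listed directly. The sketch below assumes the path layout given above; the function name `ray_data_log_files` and its `root` parameter are hypothetical conveniences, not part of Ray.

```python
from pathlib import Path


def ray_data_log_files(session_name: str, root: str = "/tmp/ray") -> list[Path]:
    """Return the Ray Data log files for one session, sorted by name.

    Hypothetical helper; follows the layout
    {root}/{SESSION_NAME}/logs/ray-data/ray-data-{DATASET_ID}.log
    """
    log_dir = Path(root) / session_name / "logs" / "ray-data"
    # One file per dataset: ray-data-{DATASET_ID}.log
    return sorted(log_dir.glob("ray-data-*.log"))
```

Each returned path corresponds to one dataset, so you can open or tail the file for the dataset you are debugging.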

You can also use this feature to emit logs from your own functions using the Ray Data logger. For example, the snippet below demonstrates how to log messages within a custom map function to selectively inspect dataset contents:

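```python
import logging

# Messages sent to the Ray Data logger appear in the dashboard's log view.
logger = logging.getLogger("ray.data")

def my_map_function(batch):
    # is_interesting is a placeholder for your own filtering predicate.
    if is_interesting(batch):
        logger.info(f"Processing {batch}")
    ...
```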
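For a version you can run as-is, the sketch below fills in the placeholder with a concrete predicate. It assumes each batch is a dictionary that maps column names to sequences of values; the `value` column and the negative-value rule are hypothetical and stand in for whatever condition matters in your workload.

```python
import logging

logger = logging.getLogger("ray.data")


def is_interesting(batch: dict) -> bool:
    """Hypothetical predicate: flag batches that contain a negative value."""
    return any(value < 0 for value in batch["value"])


def my_map_function(batch: dict) -> dict:
    if is_interesting(batch):
        # Log a short summary instead of the whole batch to keep the log readable.
        logger.info(
            "Interesting batch: %d rows, min=%s",
            len(batch["value"]),
            min(batch["value"]),
        )
    return batch
```

You would typically pass such a function to `ds.map_batches(my_map_function)`. Logging a summary rather than the full batch keeps the log view readable when batches are large.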

## Persistence

The Data dashboard remains available even after you terminate a job or workspace. By selecting a past session from the **Session** dropdown in the top-right corner, you can access historical data and continue debugging as needed.

![Persistent Data Dashboard](/assets/images/data-dashboard-persistent-1f47b4109be7f772fa482602ba0201e5.png)