Debug in workspaces
Debugging is a critical part of the development process. This document provides guidance on how to debug your Ray apps using Anyscale Workspaces.
Logs
The most common and versatile tool for debugging is looking at logs produced by your Ray app or by the Ray libraries. Logs can provide valuable information about the state of your app, the execution of your tasks, and any errors that occur.
While the workspace is active, you can view logs in the Ray Dashboard.
Anyscale also automatically persists logs in the cloud associated with your workspace, and you can download them with the Anyscale CLI.
If log ingestion is enabled, you can also view logs in the Anyscale Console and use features such as filtering by text, labels, and time range.
To read more, see Accessing logs.
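As a rough illustration of app-produced logs, the sketch below uses Python's standard logging module inside a function that could serve as the body of a Ray task. The names process_batch and my_app are hypothetical. Configuring logging inside the function is an assumption, made because worker processes may not share the driver's logging setup.

```python
import logging


def process_batch(batch):
    """Double every item in the batch, logging progress along the way."""
    # basicConfig is a no-op if the root logger already has handlers,
    # so calling it on every invocation is harmless.
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("my_app")  # hypothetical logger name

    logger.info("processing batch of %d items", len(batch))
    result = [item * 2 for item in batch]
    logger.info("finished batch, first result: %s", result[:1])
    return result
```

Messages emitted this way go to stderr, so when the function runs as a Ray task they should appear alongside the task's other output in the logs described above.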
Ray Dashboard
The Ray Dashboard contains several useful tools for debugging your Ray applications. See the Ray documentation to learn how to use the Ray Dashboard.
View the Ray Dashboard by clicking the "Ray Dashboard" tab in the workspace detail page.
Metrics
Sometimes metrics can be useful for debugging as well. Metrics can provide insight into the performance of your application, the resource usage of your cluster, and the behavior of your tasks.
You can view metrics in the Metrics tab of your workspace in the Anyscale Console. Metrics are also available in Grafana, which offers a more advanced UI where you can create custom dashboards to visualize your metrics, including custom metrics.
To access Grafana, click the "View in Grafana" button within the Metrics tab.
Distributed debugger
Workspaces support interactive distributed debugging with Visual Studio Code for Ray 2.9.0 and later. This feature allows you to break into any remote task running anywhere in your Ray cluster and perform postmortem debugging when tasks hit an exception.
Configuration
To enable debugging, set the RAY_DEBUG environment variable in runtime_env inside ray.init.
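import ray

# Requires Ray 2.9.0 or later. Compare version components as integers, because
# comparing version strings directly would rank "2.10.0" below "2.9.0".
# assert tuple(int(p) for p in ray.__version__.split(".")[:2]) >= (2, 9)
ray.init(runtime_env={
    "env_vars": {"RAY_DEBUG": "1"},
})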
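The version requirement is easy to check incorrectly, since version strings compare character by character. The following is a minimal sketch of a numeric comparison. The helper names version_tuple and supports_debugger are hypothetical and aren't part of Ray or Anyscale.

```python
def version_tuple(version):
    """Turn '2.10.0' into (2, 10, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def supports_debugger(version, minimum="2.9.0"):
    """Return True if `version` is at least `minimum`, compared numerically."""
    return version_tuple(version) >= version_tuple(minimum)
```

In a workspace, you could call supports_debugger(ray.__version__) before enabling RAY_DEBUG.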
Breaking into tasks with breakpoint()
You can insert a breakpoint directly into your Ray tasks using the standard Python breakpoint() function. When your code hits the breakpoint, it pauses execution, allowing you to connect with VS Code and inspect or modify the state of the program.
To set breakpoints, complete the following:
- Ensure your environment is correctly set up by enabling the RAY_DEBUG variable and having a supported Ray version.
- Insert breakpoint() at the desired point in your Ray task.
- Run your Ray application.
- When your program hits the breakpoint, the task enters a paused state. The terminal displays output similar to the following:
(test pid=3459) Ray debugger is listening on 10.0.0.241:48101
(test pid=3459) Waiting for debugger to attach...
- Open the Ray extension in the sidebar of VS Code, which lists paused tasks. Select the paused task you want to attach to.
- Use the VS Code debugger like you do when developing locally.
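The steps above can be sketched as follows. The function names normalize and run_on_cluster are hypothetical, and the driver is wrapped in a function so that nothing connects to a cluster until you call it from a workspace where Ray is installed.

```python
def normalize(values):
    """Scale values so they sum to 1 (hypothetical example task body)."""
    total = sum(values)
    # Execution pauses here. With RAY_DEBUG enabled, the task waits for the
    # VS Code debugger to attach, and `values` and `total` can be inspected.
    breakpoint()
    return [v / total for v in values]


def run_on_cluster():
    """Driver sketch: run normalize as a Ray task with debugging enabled."""
    import ray  # imported lazily so normalize stays importable without Ray

    ray.init(runtime_env={"env_vars": {"RAY_DEBUG": "1"}})
    remote_normalize = ray.remote(normalize)  # wrap the function as a Ray task
    return ray.get(remote_normalize.remote([1, 1, 2]))
```

Calling run_on_cluster() should leave the task paused at the breakpoint() line until you attach from the Ray extension.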
Postmortem debugging
Postmortem debugging is crucial for diagnosing issues that cause uncaught exceptions in your tasks. When a task raises an exception, Ray freezes its state and waits for a debugger to attach, allowing you to inspect the state of the program at the time of the error.
To use postmortem debugging, complete the following:
- Ensure your environment is correctly set up by enabling the RAY_DEBUG variable and having a supported Ray version.
- Run your Ray application.
- When a task throws an unhandled exception, it enters a paused state. The terminal displays output similar to the following:
(test pid=3459) Ray debugger is listening on 10.0.0.241:48101
(test pid=3459) Waiting for debugger to attach...
- Open the Ray extension in the sidebar of VS Code, which lists paused tasks. Select the paused task you want to attach to.
- Use the VS Code debugger like you do when developing locally.
Ensure that the "unhandled exception" option is enabled in VS Code's debugger settings. This setting allows VS Code to break execution at the point where the exception occurred, enabling you to perform postmortem analysis.
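To try the postmortem flow, you need a task that fails. The following is a minimal sketch in which the function names average and run_on_cluster are hypothetical. The driver is wrapped in a function so that nothing connects to a cluster until you call it from a workspace where Ray is installed.

```python
def average(values):
    """Return the mean of values; raises ZeroDivisionError for an empty list."""
    return sum(values) / len(values)


def run_on_cluster():
    """Driver sketch: trigger an uncaught exception inside a Ray task."""
    import ray  # imported lazily so average stays importable without Ray

    ray.init(runtime_env={"env_vars": {"RAY_DEBUG": "1"}})
    remote_average = ray.remote(average)  # wrap the function as a Ray task
    # The empty list makes the task raise. With RAY_DEBUG enabled, the task
    # should pause and wait for a debugger instead of failing immediately.
    return ray.get(remote_average.remote([]))
```

After attaching, you can inspect values in the failing frame to see why the division failed.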
More info
- To learn more details about the Ray Dashboard, see the Ray Dashboard documentation
- To learn more about Grafana and how to use it, see the official Grafana documentation