Troubleshooting
Workspaces provide several troubleshooting tools, all accessible from the Workspace UI:
- Events
- Logs
- Ray Dashboard
- Grafana
Events
The Events table captures critical state transitions of your Workspace, such as starting, running, and terminating.
Logs
Workspaces give you easy access to your application logs, Ray logs, and other logs. You can use them to troubleshoot issues from the node level to the application code level.
Command history
Command history shows the driver log, which is the output from the driver of your Ray application (for example, main.py).
- When you run commands in the Web Terminal, each command and its output are persisted in the Command History tab of the Workspace UI.
- When you run commands in the JupyterLab terminal or the VS Code terminal, the output is not persisted after you close the IDE.
- By default, worker logs (the logs of Ray tasks and actors from Ray's worker processes) are redirected to the driver log.
Ray logs
Ray logs are the logs of Ray components such as the GCS and the dashboard. Learn more in the Ray documentation.
- While the Workspace is active, you can use the Ray Dashboard to view Ray logs.
- After the Workspace is terminated, you can download all the Ray logs by following the instructions on the Workspace UI.
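Once the Ray logs are downloaded, a quick scan for error lines can narrow down where to look. The following is a minimal sketch, not part of the Workspace tooling: the `scan_logs` helper, the keyword list, and the assumption that the download unpacks into a local directory of `.log`, `.out`, and `.err` files are all illustrative.

```python
from pathlib import Path

# Substrings treated as signs of trouble (an assumption; adjust for your application).
KEYWORDS = ("ERROR", "Traceback")

# File extensions assumed to hold log output in the downloaded directory.
LOG_SUFFIXES = {".log", ".out", ".err"}


def scan_logs(log_dir, keywords=KEYWORDS):
    """Return (file name, line number, line text) for every line containing a keyword."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file() or path.suffix not in LOG_SUFFIXES:
            continue
        # errors="replace" keeps the scan going if a log contains invalid bytes.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if any(keyword in line for keyword in keywords):
                    hits.append((path.name, lineno, line.rstrip()))
    return hits
```

Calling `scan_logs("ray_logs")`, where `ray_logs` is a hypothetical name for the directory holding the downloaded logs, returns a list of matches that you can print or filter by file name.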
Other application logs
Ray AIR logs its results and stores them. To view TensorBoard logs, consider using the TensorBoard extension for VS Code or JupyterLab.
Ray Dashboard
See the Ray documentation to learn how to use the Ray Dashboard.
Grafana
Click the Grafana button under "Tools" to display the performance metrics of the underlying Ray cluster.