Troubleshooting

Workspaces provide several troubleshooting tools, all accessible from the Workspace UI:

  • Events
  • Logs
  • Ray Dashboard
  • Grafana

Events

The Events table captures critical state transitions of your Workspace, such as starting, running, and terminating.

Logs

Workspaces give you easy access to your application logs, Ray logs, and other logs. You can use them to troubleshoot issues from the node level to the application code level.

Command history

Command history shows the Driver log, which is the output from the Driver of your Ray application (for example, main.py).

  • When you run commands in the Web Terminal, each command and its output are persisted in the Command History tab of the Workspace UI.
  • When you run commands in the JupyterLab terminal or the VS Code terminal, the output is not persisted after you close the IDE.
  • By default, worker logs (the logs of Ray tasks and actors, emitted by Ray's worker processes) are redirected to the Driver log.
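Because worker output is mixed into the Driver log, it can help to separate it by worker process when reading a long Command History. The following is a minimal sketch, not part of the Workspace tooling. It assumes redirected worker lines carry a prefix such as `(train_step pid=4242)` or `(pid=77, ip=10.0.0.5)`; the exact prefix format depends on your Ray version, and the helper names `split_worker_line` and `group_by_pid` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed prefix format for worker lines redirected to the driver, e.g.
#   "(train_step pid=4242) loss=0.31"
#   "(pid=77, ip=10.0.0.5) ready"
# Adjust the pattern if your Ray version formats the prefix differently.
WORKER_PREFIX = re.compile(
    r"^\((?:(?P<name>[^()]*?)\s+)?pid=(?P<pid>\d+)"
    r"(?:,\s*ip=(?P<ip>[^)]+))?\)\s?(?P<msg>.*)$"
)


def split_worker_line(line):
    """Return the parsed prefix fields, or None for a plain driver line."""
    match = WORKER_PREFIX.match(line.rstrip("\n"))
    if match is None:
        return None
    return {
        "name": match["name"],      # task or actor name, if present
        "pid": int(match["pid"]),   # worker process ID
        "ip": match["ip"],          # node IP, if present
        "message": match["msg"],    # the original log message
    }


def group_by_pid(lines):
    """Group worker messages by worker PID, skipping driver-only lines."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = split_worker_line(line)
        if parsed is not None:
            grouped[parsed["pid"]].append(parsed["message"])
    return dict(grouped)
```

For example, you could paste lines copied from the Command History tab into a list and pass them to `group_by_pid` to read each worker's output in isolation.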

Ray logs

Ray logs are the logs of Ray components such as the GCS and the dashboard. Learn more in the Ray documentation.

  • When the Workspace is active, you can use the Ray Dashboard to view Ray logs.
  • When the Workspace is terminated, you can download all the Ray logs by following the instructions on the Workspace UI.
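Once the Ray logs are downloaded, a quick scan for error lines can narrow down which component to inspect first. The sketch below uses only the Python standard library and assumes the logs have been extracted into a local directory of plain-text files; the directory layout, the `find_error_lines` helper, and the keywords it searches for are illustrative assumptions, not part of the Workspace tooling.

```python
import re
from pathlib import Path

# Keywords that commonly mark a problem in a log file (an assumption;
# extend the pattern to match whatever your application emits).
ERROR_PATTERN = re.compile(r"\b(ERROR|CRITICAL|Traceback)\b")


def find_error_lines(log_dir, pattern=ERROR_PATTERN):
    """Return (relative path, line number, line) for every matching line."""
    root = Path(log_dir)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file has odd bytes.
        with path.open(errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if pattern.search(line):
                    hits.append((str(path.relative_to(root)), lineno, line.rstrip()))
    return hits
```

A typical use would be to call `find_error_lines` on the directory where you extracted the download and print each hit as `path:line: text`.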

Other application logs

Ray AIR logs and stores its results. To view TensorBoard logs, consider using the TensorBoard extension for VS Code or JupyterLab.

Ray Dashboard

See the Ray documentation to learn how to use the Ray Dashboard.

Grafana

Click the Grafana button under "Tools" to display performance metrics for the underlying Ray cluster.