Monitor and troubleshoot
Jobs provide several tools for monitoring and troubleshooting:
- Events
- Logs
- Email notifications
- Ray Dashboard
Events capture the critical state transitions of your Jobs. View them on the Console UI.
For both running and terminated Jobs, you can view the logs of a Job on the Console.
For a running Job
For a running Job, you can follow its logs using the CLI or the Python SDK.
- CLI
anyscale job logs --job-id <JOB_ID> --follow
- Python SDK
from anyscale import AnyscaleSDK

# Replace with the ID of your Job.
JOB_ID = "<JOB_ID>"

sdk = AnyscaleSDK()
job_logs = sdk.get_production_job_logs(JOB_ID)
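If you want to follow logs from a script rather than a terminal, one option is to wrap the CLI command shown above. The following is a minimal sketch, not an official Anyscale helper: `build_follow_command` and `follow_job_logs` are hypothetical names, and it assumes the `anyscale` CLI is installed, authenticated, and on your PATH.

```python
import shlex
import subprocess
from typing import Iterator, List


def build_follow_command(job_id: str) -> List[str]:
    """Return the argv for `anyscale job logs --job-id <JOB_ID> --follow`."""
    if not job_id:
        raise ValueError("job_id must be a non-empty string")
    return ["anyscale", "job", "logs", "--job-id", job_id, "--follow"]


def follow_job_logs(job_id: str) -> Iterator[str]:
    """Yield log lines as the CLI prints them.

    Assumes the anyscale CLI is available on PATH (hypothetical helper).
    """
    cmd = build_follow_command(job_id)
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # "<JOB_ID>" is a placeholder; substitute a real Job ID before running.
    print(shlex.join(build_follow_command("<JOB_ID>")))
```

Iterating over `follow_job_logs(job_id)` lets you filter or forward each line as it arrives, for example to stop as soon as a known error string appears.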
For a terminated or completed Job
- Downloading logs only works for clouds on AWS. Support for downloading logs on GCP is coming soon.
- Only application and Ray system logs are persisted and can be downloaded. Downloading logs from ray_results is not supported yet.
For a terminated or completed Job, you can download the log files from the Cluster that ran the Job.
No additional setup is required. By default, logs are uploaded from clusters to the cloud object storage bucket associated with the Anyscale Cloud.
View and Download Logs
Use anyscale logs to view and download logs. See the "Ray Logs" tab on the Console UI for more instructions.
# View logs for a particular cluster. To find the cluster ID, go to the Job page and look up the cluster that ran the Job.
# The ID looks something like ses_8kVvPt6pNkR7xJlEE2zfQQXW.
anyscale logs cluster --id <cluster-id> <glob-filter | filename>
# Download all logs for a particular cluster
anyscale logs cluster --id <cluster-id> --download
# More help
anyscale logs cluster --help
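To script the commands above, for example to archive the logs of several clusters, you could assemble the arguments programmatically. This is a minimal sketch under stated assumptions: `build_cluster_logs_command` is a hypothetical helper, it only builds the two invocation forms shown above (a glob filter or filename, or `--download`), and it assumes the `anyscale` CLI is on your PATH. Whether a filter can be combined with `--download` is not covered here, so the sketch rejects that combination; check `anyscale logs cluster --help` before relaxing it.

```python
import subprocess
from typing import List, Optional


def build_cluster_logs_command(
    cluster_id: str,
    log_filter: Optional[str] = None,
    download: bool = False,
) -> List[str]:
    """Build argv for `anyscale logs cluster` in one of the two documented forms."""
    if not cluster_id:
        raise ValueError("cluster_id must be a non-empty string")
    if download and log_filter:
        # Assumption: only the forms shown in this guide are generated.
        raise ValueError("this sketch does not combine a filter with --download")
    cmd = ["anyscale", "logs", "cluster", "--id", cluster_id]
    if download:
        cmd.append("--download")
    elif log_filter:
        cmd.append(log_filter)
    return cmd


def run_cluster_logs(cluster_id: str, **kwargs) -> int:
    """Run the command and return its exit code (requires the anyscale CLI)."""
    cmd = build_cluster_logs_command(cluster_id, **kwargs)
    # Passing a list (no shell) keeps a glob such as "*.err" from being
    # expanded by the local shell before the CLI sees it.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # "<cluster-id>" is a placeholder; substitute a real cluster ID.
    print(build_cluster_logs_command("<cluster-id>", download=True))
```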
Anyscale sends you email notifications when any of the following events happens on a Job that you created:
- Job completes successfully
- Job fails after exceeding configured retries
- Job fails due to system failures
These emails are sent from Anyscale Alerts (do not reply) <email@example.com>, so please ensure that this address is not blocked or marked as spam.
By default, you are subscribed to all notification emails. To manage your subscription, click the Unsubscribe link in the footer of any notification email. The link opens a subscription preferences page where you can subscribe to or unsubscribe from individual notification topics. After checking or unchecking topics, click the Update button at the bottom of the page to save your preferences. You stop receiving emails for unchecked topics once the update is saved. To subscribe again, check the topic on the subscription preferences page and click Update.
While the Job is running, click the "Dashboard" button to view the Ray Dashboard for the underlying cluster. See the Ray Dashboard documentation to learn how to use it.