Get started with Anyscale Jobs

note

Use of Anyscale Jobs requires Ray 2.0+.

info

Anyscale Jobs, or production Jobs, are related to but distinct from Ray Jobs. Ray Jobs are better suited for interactive development. You can learn more about how to develop Ray Jobs on Workspaces.

Anyscale Jobs are the recommended way to move your workload to production.

Key features of Anyscale Jobs:

  1. Automated failure handling
  2. Automated email alerting
  3. Recording and persisting outputs such as logs

Follow the instructions below to set up your environment and submit Jobs.

Set up your environment

Install the latest versions of Ray and Anyscale:

pip install ray
pip install -U anyscale

Clone the Anyscale Examples Repo.

git clone https://github.com/anyscale/docs_examples.git anyscale_doc_examples
cd anyscale_doc_examples

Set up a bucket in your cloud provider account. Your laptop needs write access to the bucket, and Anyscale needs read access to the files in it. Configure permissions accordingly. See docs for AWS or GCP.

info

The bucket hosts a zip file containing the files for your Job, uploaded from your local directory. If you haven't configured a bucket, you can follow along with the first part of this tutorial using the example zip file stored on GitHub.

Any zip file in the expected format is accepted. Importantly, the zip file must contain exactly one top-level directory. See Remote URIs for more details about Ray runtime environments, and see the Anyscale documentation for more detail about Anyscale-specific runtime_env configuration.
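The single top-level directory requirement can be checked locally before you upload or reference a zip file. The following is a minimal sketch that uses only the Python standard library. The function name has_single_top_level_dir is hypothetical and is not part of Ray or Anyscale; it only approximates the rule described above and does not replace the validation that Ray performs.

```python
import zipfile
from pathlib import PurePosixPath


def has_single_top_level_dir(zip_path):
    """Return True if the archive contains exactly one top-level directory.

    Hypothetical helper for illustration; not part of Ray or Anyscale.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Ignore empty names; zip entries always use "/" as the separator.
        names = [name for name in archive.namelist() if name.strip("/")]

    # Collect the first path component of every entry.
    roots = {PurePosixPath(name).parts[0] for name in names}
    if len(roots) != 1:
        return False

    # The single root must be a directory, not a lone file at the top level.
    root = next(iter(roots))
    return any(name.startswith(root + "/") for name in names)
```

For example, an archive whose entries all start with one shared directory name passes this check, while an archive with a file sitting next to a directory at the top level does not.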

Submit your first Anyscale Job

When you submit a Job YAML, Anyscale runs your Job on a managed cluster and monitors it for completion or failure. Anyscale will retry your Job on transient failures.

anyscale job submit -f job.yaml --follow

(anyscale +2.2s) View the job in the UI at https://console.anyscale-staging.com/jobs/prodjob_5zhGUUdFLhZR1CcCYRSCwhZ8.
⠏ Waiting for a job run, current state is PENDING...
Hello World!

It may take a few minutes for your Job to finish. With the --follow argument, the CLI follows the progress of the Job and prints its log output to the terminal.

Using a working directory

The runtime_env.working_dir field in the Job YAML supports any remote zip URL. You can also specify a local working directory if you additionally specify the bucket you configured during setup as the upload path. Your laptop must have write permission to the bucket, and your cluster must have read permission. For more information, view docs for access to an S3 bucket or access to a GCS bucket.

Make sure you are in the anyscale_doc_examples directory (the clone of the docs_examples repo) before running the following example.

info

For Anyscale Jobs, a local directory is supported as working_dir only when upload_path is also provided.

  entrypoint: python hello_world.py
  compute_config: config_with_bucket_permission
  runtime_env:
-   working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"
+   working_dir: .
+   upload_path: "s3://insert-your-anyscale-bucket"
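The rule that a local working directory needs an upload path can be expressed as a small local pre-check. The sketch below is an illustration under stated assumptions: build_runtime_env and REMOTE_SCHEMES are hypothetical names, the list of URL schemes treated as remote is a guess rather than the list Anyscale accepts, and upload_path is assumed to sit under runtime_env next to working_dir. It mirrors the two variants of the Job YAML shown above and is not how the Anyscale CLI validates a Job.

```python
from urllib.parse import urlparse

# Assumption: schemes treated as "remote" for this sketch only.
REMOTE_SCHEMES = {"http", "https", "s3", "gs"}


def build_runtime_env(working_dir, upload_path=None):
    """Build a runtime_env mapping for a Job YAML (hypothetical helper).

    A remote zip URL needs no upload path. A local directory is accepted
    only when an upload path (the bucket) is supplied as well.
    """
    if urlparse(working_dir).scheme in REMOTE_SCHEMES:
        # Remote zip URL: the cluster fetches it directly.
        return {"working_dir": working_dir}

    if upload_path is None:
        raise ValueError(
            "A local working_dir requires upload_path, for example an s3:// bucket URI."
        )

    # Local directory: include the bucket it should be uploaded to.
    return {"working_dir": working_dir, "upload_path": upload_path}
```

Calling build_runtime_env(".", "s3://insert-your-anyscale-bucket") returns a mapping with both keys, matching the lines added in the diff, while build_runtime_env(".") raises ValueError.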