
Get started with Anyscale Jobs

Check your docs version

This version of the Anyscale docs is deprecated. Go to the latest version for up-to-date information.

note

Use of Anyscale Jobs requires Ray 2.0+.

info

Anyscale Jobs runs discrete workloads (jobs) with first-class support for fast scaling, fault tolerance, and observability.

Anyscale Jobs (also called production Jobs) is the recommended way to run your batch workloads in production.

Key features of Anyscale Jobs:

  1. Scalability: fast scaling to thousands of Cloud VMs.
  2. Reliability and fault tolerance: failed Ray Tasks and Actors are retried automatically. If a job crashes (for example, due to an out-of-memory error), Anyscale automatically reschedules the whole job onto a new cluster.
  3. Observability: access to the Ray dashboard and Grafana lets you observe, manage, and troubleshoot jobs in real time.

Follow the instructions below to set up your environment for submitting Anyscale Jobs.

Set up your environment

1. Prepare your Anyscale Cloud

Follow this tutorial, and make sure you create your first (default) Compute Config and Cluster Environment.

2. Install the latest Ray and Anyscale CLI

pip install ray
pip install -U anyscale
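
After installing, it can be useful to confirm that the local Ray version satisfies the Ray 2.0+ requirement noted above. The following is a minimal sketch using only the Python standard library; the helper names (`parse_version`, `meets_minimum`, `installed_ray_ok`) are hypothetical and not part of Ray or the Anyscale CLI.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Return the leading numeric components, e.g. '2.9.3' -> (2, 9, 3)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version: str, minimum: tuple = (2, 0)) -> bool:
    """True if `version` is at least `minimum` (this guide requires Ray 2.0+)."""
    parsed = parse_version(version)
    # Pad short versions such as '2' so they compare as (2, 0).
    parsed += (0,) * (len(minimum) - len(parsed))
    return parsed >= minimum


def installed_ray_ok() -> bool:
    """Check the locally installed Ray; return False if it is not installed."""
    try:
        return meets_minimum(metadata.version("ray"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("Ray 2.0+ installed:", installed_ray_ok())
```

Pre-release suffixes (for example `rc1`) are ignored by this sketch, so treat the result as a rough check rather than a strict version comparison.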

3. Clone the Anyscale Examples Repo.

git clone https://github.com/anyscale/docs_examples.git anyscale_doc_examples
cd anyscale_doc_examples

4. Submit Job

When a job is submitted to run on the Anyscale Platform, it is:

  1. Scheduled to run on a dedicated, managed cluster.
  2. Monitored, with its state automatically updated and reflected on the job's page.
  3. Retried on failure (if any).

anyscale job submit -f job.yaml --follow

(anyscale +2.2s) View the job in the UI at https://console.anyscale.com/jobs/prodjob_...
⠏ Waiting for a job run, current state is PENDING...
Hello World!

The --follow argument in the command above follows the job's progress while it runs and streams logs from the job's driver (hello_world.py in this case).
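
If you submit jobs from a script rather than typing the command by hand, the invocation can be assembled programmatically. This is a minimal sketch, assuming the `anyscale` CLI is on your PATH; `build_submit_command` is a hypothetical helper, and only the `-f` and `--follow` arguments shown in this guide are used.

```python
import shlex


def build_submit_command(config_path: str, follow: bool = True) -> list:
    """Assemble the `anyscale job submit` argument list used in this guide."""
    command = ["anyscale", "job", "submit", "-f", config_path]
    if follow:
        # Stream the driver logs while the job runs.
        command.append("--follow")
    return command


command = build_submit_command("job.yaml")
print(shlex.join(command))

# To actually submit, pass the list to subprocess (not executed here):
#   import subprocess
#   subprocess.run(command, check=True)
```

Building the command as a list avoids shell-quoting problems when the config path contains spaces.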

Submitting a Job from your current working directory

For greater flexibility during development, Anyscale Jobs can be submitted straight from your current working directory, using its contents directly.

Under the hood, the Anyscale CLI packages the contents of your working directory into a zip file and uploads it to your configured Cloud storage bucket. It then automatically sets up the submitted job to use that archive as its working directory via the working_dir option of Ray's Runtime Environment.
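
To make the packaging step concrete, here is a rough sketch of zipping a directory so that every entry sits under a single top-level directory. It is an illustration only and not the Anyscale CLI's actual implementation; `package_working_dir` is a hypothetical helper, and the upload to Cloud storage is left out.

```python
import zipfile
from pathlib import Path


def package_working_dir(working_dir: str, archive_path: str) -> list:
    """Zip `working_dir` so every entry sits under one top-level directory.

    Returns the archive member names that were written.
    """
    root = Path(working_dir).resolve()
    archive = Path(archive_path).resolve()
    written = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives inside root.
            if not path.is_file() or path.resolve() == archive:
                continue
            # Prefix each member with the directory's own name.
            arcname = (Path(root.name) / path.relative_to(root)).as_posix()
            zf.write(path, arcname)
            written.append(arcname)
    return written
```

For example, `package_working_dir(".", "/tmp/working_dir.zip")` would archive the current directory with all files nested under the directory's name.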

1. Configure Cloud storage bucket

note

A Cloud storage bucket is required to hold the artifacts (zip files) containing local working directories.

Alternatively, any zip file in the expected format is accepted: the zip file must contain exactly one top-level directory.

See Remote URIs for more details about Ray Runtime Environments, and here for more details about Anyscale-specific runtime_env configuration.
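
If you build the zip file yourself, you can check the single-top-level-directory requirement before uploading it. The following is a minimal sketch using the standard library; `has_single_top_level_dir` is a hypothetical helper, and the exact validation performed by Ray or Anyscale may differ.

```python
import zipfile


def has_single_top_level_dir(archive_path: str) -> bool:
    """True if every entry lives under exactly one top-level directory."""
    with zipfile.ZipFile(archive_path) as zf:
        names = [name for name in zf.namelist() if name]
    # First path component of each member, e.g. 'project' for 'project/a.py'.
    tops = {name.split("/", 1)[0] for name in names}
    # A lone top-level file does not count: entries must be nested.
    return len(tops) == 1 and all("/" in name for name in names)
```

An archive with entries such as `project/hello_world.py` and `project/pkg/util.py` passes this check, while one with files at the root or with two sibling directories does not.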

To use this feature, first set up a Cloud storage bucket in your Cloud provider account. Configure the bucket to allow the following access patterns:

  • The bucket must permit write access from the machine holding the working directory, so that the Anyscale CLI can use your Cloud provider credentials to upload it.
  • Clusters launched in your Cloud provider account must have read access to the bucket so that they can download the uploaded artifacts.

See the docs for AWS or GCP for examples and more details on configuring Cloud storage buckets appropriately.

2. Configure Job to use working directory

A Job can be configured to use a working directory by specifying runtime_env.working_dir in the YAML configuration in one of the following ways.

To upload your local working directory (to a Cloud storage bucket):

runtime_env:
  # Local (relative) path to your target working dir
  working_dir: "."
  # Target upload path in your Cloud storage bucket
  upload_path: "s3://your-cloud-storage-bucket"

note

To upload a local working directory when using Anyscale Jobs, you must provide the upload_path setting.

To use a remotely accessible working directory (in the expected format):

entrypoint: python hello_world.py
runtime_env:
  # Already packaged and uploaded zip file
  working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"
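
The two configurations above differ in one rule: a local working_dir needs an upload_path, while a remote URI does not. The sketch below encodes that rule as a pre-submission check on a runtime_env dictionary. The helper names are hypothetical, and treating any value with a URL scheme and host as remote is an assumption of this sketch, not documented Anyscale behavior.

```python
from urllib.parse import urlparse


def is_remote_uri(working_dir: str) -> bool:
    """Assume anything with a scheme and host (s3://, https://, ...) is remote."""
    parsed = urlparse(working_dir)
    return bool(parsed.scheme) and bool(parsed.netloc)


def check_runtime_env(runtime_env: dict) -> None:
    """Raise ValueError if a local working_dir is set without upload_path."""
    working_dir = runtime_env.get("working_dir")
    if working_dir is None or is_remote_uri(working_dir):
        return
    if not runtime_env.get("upload_path"):
        raise ValueError(
            f"working_dir {working_dir!r} is local, so upload_path is required"
        )


# Both configurations from this guide pass the check.
check_runtime_env({"working_dir": ".", "upload_path": "s3://your-cloud-storage-bucket"})
check_runtime_env(
    {"working_dir": "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"}
)
```

Running a check like this before `anyscale job submit` surfaces a missing upload_path locally instead of at submission time.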

3. Submit Anyscale Job

Submit the Anyscale Job as usual with:

anyscale job submit -f job.yaml --follow

note

Make sure you are in the docs_examples repo (where job.yaml is located) before running the following example.

Using the aforementioned example from the Anyscale docs_examples repo:

entrypoint: python hello_world.py
runtime_env:
  working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"

This should produce the following:

anyscale job submit -f job.yaml --follow

(anyscale +2.2s) View the job in the UI at https://console.anyscale.com/jobs/prodjob_...
⠏ Waiting for a job run, current state is PENDING...
Hello World!

where hello_world.py looks like the following:

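import ray

# Declare hello_world as a Ray remote task.
@ray.remote
def hello_world():
    return "Hello World!"

# Launch the task on the cluster and block until its result is ready.
result = ray.get(hello_world.remote())
print(result)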