
Production Jobs

note

Production Jobs require Ray 1.9 or later.

info

Production Jobs, also called Anyscale Jobs, are related to but distinct from Ray Jobs. You can still use Ray Jobs on Anyscale.

Ray Jobs are better suited for development. See Ray Jobs to learn more.

Production Jobs are the recommended way to move your workload to production.

Key features of Production Jobs:

  1. Automated failure handling
  2. Automated email alerting
  3. Output recording

This walkthrough will help you run your Jobs on Anyscale.

Set up your environment

Install the latest versions of Ray and the Anyscale CLI:

pip install -U ray
pip install -U anyscale

Clone the Anyscale examples repository:

git clone https://github.com/anyscale/docs_examples.git anyscale_doc_examples
cd anyscale_doc_examples

Set up a bucket in your cloud account. Your laptop needs to write to the bucket, and Anyscale needs to read files from it. Configure permissions accordingly. See the docs for AWS or GCP.

info

We need a bucket because we will package a local directory as a zip file containing the files for our Job and host that zip in the specified bucket. If you haven't configured a bucket, you can follow along with the first part of this tutorial using the example zip file stored on GitHub.

If you already have a hosted zip file with your code, that will work too. Any zip file that meets the format is accepted. Importantly, the zip file must contain only a single top-level directory. See Remote URIs for more details about Ray Runtime Environments, and see the Jobs Reference for more detail about Anyscale-specific runtime env configuration.
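
The single top-level directory requirement can be checked locally before hosting a zip. The following is a minimal sketch using only the Python standard library; `has_single_top_level_dir` is a hypothetical helper name, not part of Ray or Anyscale, and it only inspects the archive layout.

```python
import zipfile


def has_single_top_level_dir(zip_path):
    """Return True if every archive member sits under one top-level directory."""
    with zipfile.ZipFile(zip_path) as archive:
        # Member names inside a zip always use "/" as the separator.
        names = [name for name in archive.namelist() if name]
    roots = {name.split("/", 1)[0] for name in names}
    if len(roots) != 1:
        return False
    root = next(iter(roots))
    # A lone top-level file does not count as a directory: each member must
    # be the directory entry itself ("root/") or a path beneath it.
    return all(name.startswith(root + "/") for name in names)
```

For example, an archive containing `project/hello_world.py` and `project/utils/io.py` passes, while one that also has a `README.md` at the top level does not.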

Submit your first Production Job

When you submit a Job YAML, Anyscale runs your Job on a managed cluster and monitors it for completion or failure. Anyscale will retry your Job on transient failures.

anyscale job submit -f job.yaml

(anyscale +2.2s) View the job in the UI at https://console.anyscale-staging.com/jobs/prodjob_5zhGUUdFLhZR1CcCYRSCwhZ8.
⠏ Waiting for a job run, current state is PENDING...
Hello World!

It may take a few minutes for your job to start, after which you should see the message printed to the terminal.
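
The contents of job.yaml are not reproduced in this part of the walkthrough. As a rough sketch, assuming the file resembles the hosted-zip configuration shown in the next section (same field names; the compute config name is a placeholder and may differ in your account), it could be written and submitted from Python as follows. `write_job_yaml` and `submit_job` are hypothetical helper names, not part of the Anyscale SDK; the only Anyscale interface used is the CLI command shown above.

```python
import subprocess
from pathlib import Path

# Assumption: these fields mirror the YAML in the "Using a local directory"
# section, before the local-directory changes are applied.
JOB_YAML = """\
entrypoint: python hello_world.py
compute_config: config_with_bucket_permission
runtime_env:
  working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"
"""


def write_job_yaml(path="job.yaml", contents=JOB_YAML):
    """Write the Job definition to disk and return its path."""
    target = Path(path)
    target.write_text(contents)
    return target


def submit_job(path="job.yaml"):
    """Run `anyscale job submit -f <path>`.

    Requires the anyscale CLI to be installed and authenticated;
    raises CalledProcessError if the command exits with an error.
    """
    return subprocess.run(["anyscale", "job", "submit", "-f", str(path)], check=True)
```

Calling `submit_job(write_job_yaml())` is then equivalent to running the CLI command by hand.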

Using a local directory

The runtime_env.working_dir field in the Job YAML supports any remote zip URL. You can also specify a local directory if you additionally specify the bucket you configured during setup. Your laptop must have write permission to the bucket, and your cluster must have read permission. For more information, see the permission guide or the Job reference.

Make sure you are in the anyscale_doc_examples directory (the cloned examples repo) before running the following example.

  entrypoint: python hello_world.py
  compute_config: config_with_bucket_permission
  runtime_env:
-   working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"
+   working_dir: .
+   upload_path: "s3://insert-your-anyscale-bucket"
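
To make the packaging step more concrete, here is a minimal sketch of zipping a local directory so that the archive contains a single top-level directory, as the format requires. This is an illustration only: Anyscale's actual packaging and upload logic may differ, and `zip_working_dir` is a hypothetical helper name. The upload to the bucket is not shown.

```python
import zipfile
from pathlib import Path


def zip_working_dir(source_dir, zip_path, root_name=None):
    """Zip source_dir so that all files sit under one top-level directory."""
    source = Path(source_dir).resolve()
    zip_path = Path(zip_path).resolve()
    # Use the directory's own name as the archive root unless told otherwise.
    root = root_name or source.name or "working_dir"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it is
            # being written inside the source directory.
            if file_path.is_file() and file_path != zip_path:
                arcname = Path(root) / file_path.relative_to(source)
                archive.write(file_path, arcname.as_posix())
    return zip_path
```

For example, `zip_working_dir(".", "/tmp/job_files.zip")` run from the repo directory would produce an archive whose entries all start with that directory's name.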

Using Job Outputs

caution

Job Outputs is in Alpha, and is supported in Anyscale CLI >= 0.5.32. Please contact support for any feedback on this feature.

Job Outputs are stored in Anyscale's cloud account. We do not recommend storing sensitive data in your Job output.

note

This example builds on the previous one. If you haven't completed the previous example, you can still use Job Outputs and publish your changes to a hosted zip file.

Using Job Outputs requires Anyscale CLI version >= 0.5.32 in your cluster environment. Update your cluster environment to include the following line in the post_build_commands.

pip install -U anyscale

Then update hello_world.py to record its result as a Job Output instead of printing it:

  import ray
+ import anyscale.job

  @ray.remote
  def hello_world():
      return "Hello World!"

  result = ray.get(hello_world.remote())
- print(result)
+ anyscale.job.output({
+     "result": result
+ })

You should now see the output in the UI. For more information about Job Outputs, see the reference.
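
If the output does not appear, one thing worth ruling out is an outdated Anyscale CLI in the cluster environment. The sketch below checks the >= 0.5.32 requirement using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simple (numeric dotted components only), so treat it as a starting point and not as a full version parser.

```python
from importlib import metadata

MIN_VERSION = (0, 5, 32)  # minimum Anyscale CLI version stated above


def parse_version(text):
    """Turn '0.5.32' into (0, 5, 32), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version_text, minimum=MIN_VERSION):
    """Return True if version_text is at least the minimum version."""
    return parse_version(version_text) >= minimum


def installed_anyscale_version():
    """Return the installed anyscale package version, or None if not installed."""
    try:
        return metadata.version("anyscale")
    except metadata.PackageNotFoundError:
        return None
```

Running `installed_anyscale_version()` inside the cluster environment and passing the result to `meets_minimum` indicates whether the post_build_commands upgrade took effect.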