Production Services


Production Services require Ray 2.0 or later.

Production Services are the recommended way to move your Ray Serve application to production.

Key features of Production Services:

  1. Automatic failure recovery
  2. Automated email alerting
  3. Persistent host name
  4. [coming soon] High availability and zero-downtime upgrades

This walkthrough shows how to run your Ray Serve application as a Production Service on Anyscale.

If you're still in the development phase, see this tutorial for an example of how to develop Ray services interactively.

Set up your environment

Install the latest versions of Ray and the Anyscale CLI:

pip install -U ray
pip install -U anyscale

Clone the Anyscale examples repository:

git clone anyscale_doc_examples
cd anyscale_doc_examples

Set up a bucket in your cloud account. Your laptop needs write access to the bucket, and Anyscale needs read access to the files in it. Configure permissions accordingly; see the docs for AWS or GCP.


We need a bucket because we will package a local directory as a zip file containing the files for our Service, and host that zip in the specified bucket. If you haven't configured a bucket, you can follow along with the first part of this tutorial using the example zip file stored on GitHub.
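To make the packaging step concrete, here is a minimal sketch of zipping a local directory so that the archive holds one top-level directory. It only illustrates the archive layout and is not how the Anyscale CLI packages files internally; the function name package_directory and the paths in the usage note are hypothetical. Uploading the zip to your bucket is not shown.

```python
import zipfile
from pathlib import Path


def package_directory(src_dir: str, zip_path: str) -> str:
    """Zip src_dir so every entry sits under one top-level directory.

    Hypothetical helper for illustration only.
    """
    src = Path(src_dir).resolve()
    out = Path(zip_path).resolve()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself in case it
            # is being written inside src_dir.
            if not path.is_file() or path.resolve() == out:
                continue
            # Prefix each entry with the directory name so the archive
            # has exactly one top-level directory.
            arcname = Path(src.name) / path.relative_to(src)
            zf.write(path, arcname.as_posix())
    return str(out)
```

For example, package_directory("my_service", "my_service.zip") would produce entries such as my_service/app.py, all under the single my_service/ directory.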

If you already have a hosted zip file with your code, that will work too. Any zip file in the expected format is accepted. Importantly, the zip file must contain only a single top-level directory. See Remote URIs for more details about Ray runtime environments, and see the Services Reference for more detail about Anyscale-specific runtime_env configuration.
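If you want to check an existing zip before hosting it, the single-top-level-directory rule can be tested locally. The following is a rough sketch using only the Python standard library; the helper name has_single_top_level_dir is hypothetical, and it checks only this one layout rule, not everything Ray or Anyscale may require of the archive.

```python
import zipfile


def has_single_top_level_dir(zip_path: str) -> bool:
    """Return True if every entry lives under one top-level directory."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    # The first path component of each entry is its top-level name.
    roots = {name.split("/", 1)[0] for name in names}
    # Exactly one root, and no loose files at the top level
    # (an entry inside a directory always contains a "/").
    return len(roots) == 1 and all("/" in name for name in names)
```

A zip with entries like my_service/app.py passes this check, while a zip with app.py at the top level, or with two sibling directories, does not.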

Deploy your first Production Service

When you submit a Service YAML, Anyscale runs your Service on a managed cluster and monitors it. If the Service fails, Anyscale alerts you and restarts it.


Unlike Jobs, Service names must be unique within a project. This enables referencing services by name. If you run deploy with a YAML whose service name already exists, that service will be updated.

Let's run the example.

anyscale service deploy service.yaml

(anyscale +3.3s) Query the status of the service with `anyscale service list --service-id service_sc2JFXVupTCLy6CKqcYq5rmN`.
(anyscale +3.3s) View the service in the UI at

Querying the Service

You can find the address of the Service in the Anyscale UI by clicking the "Query" button in the top-right corner.

You should find a command that looks something like this:

Query With Curl
curl -H "Authorization: Bearer SECRET_TOKEN"

"Hello world!"

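The same query can be made from Python instead of curl. This is a minimal sketch using only the standard library: it sends the bearer token in the Authorization header, as the curl command does. The service URL is whatever the "Query" button shows for your Service; the function names and the placeholder URL in the usage note are hypothetical.

```python
import urllib.request


def build_query(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the Service's bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )


def query_service(url: str, token: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_query(url, token), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, query_service("https://your-service-address/", "SECRET_TOKEN") would return the response body, with the address and token replaced by the values from the Anyscale UI.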
Upgrade the Service

Let's upgrade the service to show a different message when we query it.

In this case, we only need to edit the entrypoint in our YAML so that it sets the SERVE_RESPONSE_MESSAGE environment variable, as in the diff below.

To see the change, redeploy the service.

name: service-demo
- entrypoint: python
+ entrypoint: SERVE_RESPONSE_MESSAGE="hello from new service" python
healthcheck_url: "/healthcheck"
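Setting the variable in the entrypoint only has an effect if the application reads it. The example application's code is not shown here, so the following is just a sketch of one way an application could pick up the message, under the assumption that it falls back to a default when the variable is unset. The names DEFAULT_MESSAGE and response_message are hypothetical.

```python
import os
from typing import Mapping

# Assumed default, matching the response seen before the upgrade.
DEFAULT_MESSAGE = "Hello world!"


def response_message(environ: Mapping[str, str] = os.environ) -> str:
    """Return the message to serve, preferring SERVE_RESPONSE_MESSAGE."""
    return environ.get("SERVE_RESPONSE_MESSAGE", DEFAULT_MESSAGE)
```

Because the variable is read from the process environment, changing it in the entrypoint requires no code change, only a redeploy.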

Now when you query the service, you should see the new message!

Query With Curl
curl -H "Authorization: Bearer SECRET_TOKEN"

"hello from new service"

Making a Code Change

The runtime_env.working_dir field in the Service YAML also supports specifying a local directory, provided you also specify the bucket you configured during setup. Your laptop must have write permission to the bucket, and your cluster must have read permission. For more information, see the permission guide or the Job reference.

name: service-demo
entrypoint: SERVE_RESPONSE_MESSAGE="hello from new service" python
+ compute_config: config_with_bucket_permission
- working_dir:
+ working_dir: .
+ upload_path: "s3://your-bucket/path"
healthcheck_url: "/healthcheck"

After you edit the code and redeploy, query the service again. The response should reflect the code change.

Query With Curl
curl -H "Authorization: Bearer SECRET_TOKEN"

"hello from new service, and a code change"


When you are done, terminate the service using the CLI:

anyscale service terminate -n service-demo


In this tutorial, we created, updated, and queried a Production Service. For more information about Services, including configuration details and monitoring, see the Reference.