Multi-Application Services

Note: Support for multi-application services requires Ray 2.4+ and Anyscale CLI version 0.5.106+.

This tutorial walks you through how to deploy multiple Ray Serve applications in a single service.

Recall that an application is the unit of upgrade in Serve, and it consists of one or more deployments tied together in a directed acyclic graph. An application can be called via HTTP at the specified route prefix. If you want to have many independent models each behind separate endpoints on the same cluster, each model should be deployed as a separate application.

For more information on deploying multiple applications, see the Ray Serve multi-application docs.

Deploy the first application

First, let's deploy a sentiment analysis application. We will use the following YAML configuration file. Notice the format of the Serve config: under the top-level field applications, there is an application entry named sentiment. This will be deployed from sentiment_analysis/app.py in the Anyscale Examples repo.

name: multi-app-service
cluster_env: default_cluster_env_ml_2.9.0_py310
ray_serve_config:
  applications:
    - name: sentiment
      route_prefix: /sentiment
      import_path: sentiment_analysis.app:model
      runtime_env:
        working_dir: https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip
      deployments:
        - name: LanguageModel

Info: Check out the Serve Multi-App Config Schema for more details on the new config format.
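
Because every entry under applications needs its own name and route prefix, it can help to check a config for clashes before deploying. The following is a minimal sketch, not part of Ray Serve or the Anyscale CLI: find_config_conflicts is a hypothetical helper, the config is assumed to be already loaded into a Python dict, and the sentiment_v2 entry is invented purely to show a clash.

```python
from collections import Counter


def find_config_conflicts(serve_config: dict) -> list[str]:
    """Return human-readable problems found in a multi-app Serve config dict.

    Hypothetical helper for illustration; it only inspects the dict locally.
    """
    apps = serve_config.get("applications", [])
    problems = []

    # Application names should be unique within one service.
    name_counts = Counter(app.get("name") for app in apps)
    for name, count in name_counts.items():
        if count > 1:
            problems.append(f"duplicate application name: {name}")

    # Route prefixes should be unique and start with "/".
    prefix_counts = Counter(app.get("route_prefix") for app in apps)
    for prefix, count in prefix_counts.items():
        if prefix is None:
            continue  # entry without an HTTP route; nothing to compare
        if count > 1:
            problems.append(f"duplicate route_prefix: {prefix}")
        if not prefix.startswith("/"):
            problems.append(f"route_prefix must start with '/': {prefix}")

    return problems


# "sentiment_v2" is an invented entry that reuses the /sentiment prefix.
serve_config = {
    "applications": [
        {"name": "sentiment", "route_prefix": "/sentiment",
         "import_path": "sentiment_analysis.app:model"},
        {"name": "sentiment_v2", "route_prefix": "/sentiment",
         "import_path": "sentiment_analysis.app:model"},
    ]
}
print(find_config_conflicts(serve_config))
# prints ['duplicate route_prefix: /sentiment']
```

An empty list means the sketch found no clashes; it does not guarantee that the service will deploy successfully.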

Deploy the service with the sentiment analysis application using the Anyscale CLI.

anyscale service rollout -f config.yaml

Once the service transitions to Running, you can query it.

>>> import requests
>>> resp = requests.get(
...     "https://multi-app-service-jrvwy.cld-6zjdsbn3kbphvk3l.s.anyscaleuserdata-staging.com/sentiment/predict",
...     headers={"Authorization": "Bearer EZqpBnbCkEHnbgF2LvRF0iWHDf7takDB-sCeUk_wQSg"},
...     params={"text": "Anyscale workspaces are great"},
... )
>>> resp.json()
[{'label': 'POSITIVE', 'score': 0.9998007416725159}]
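
The request URL is the service's base URL, followed by the application's route prefix, followed by the endpoint path, with the bearer token sent in the Authorization header. A small sketch of how those pieces fit together is shown here; build_query is a hypothetical helper, and the base URL and token are placeholders rather than real values.

```python
from urllib.parse import urlencode


def build_query(base_url: str, route_prefix: str, endpoint: str,
                token: str, params: dict) -> tuple[str, dict]:
    """Return (url, headers) for a GET request to one application.

    Hypothetical helper: it only assembles strings and sends nothing.
    """
    # Join the route prefix and endpoint with exactly one "/" between them.
    path = "/".join(part.strip("/") for part in (route_prefix, endpoint))
    url = f"{base_url.rstrip('/')}/{path}?{urlencode(params)}"
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers


url, headers = build_query(
    "https://example-service.invalid",  # placeholder base URL
    "/sentiment",                       # route_prefix from the Serve config
    "predict",                          # endpoint exposed by the application
    "<your-token>",                     # placeholder bearer token
    {"text": "Anyscale workspaces are great"},
)
print(url)
# prints https://example-service.invalid/sentiment/predict?text=Anyscale+workspaces+are+great
```

The returned url and headers could then be passed to requests.get(url, headers=headers). Because the query string is already encoded into the URL, params would not be passed again.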

Add an application

Next, let's deploy a second application, one that runs a Stable Diffusion model. To do this, add a second entry under applications in the Ray Serve config. We name the application stable_diffusion, and it will be deployed from stable-diffusion/app.py in the Anyscale Examples repo.

name: multi-app-service
cluster_env: default_cluster_env_ml_2.9.0_py310
ray_serve_config:
  applications:
    - name: sentiment
      route_prefix: /sentiment
      import_path: sentiment_analysis.app:model
      runtime_env:
        working_dir: https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip
      deployments:
        - name: LanguageModel

    - name: stable_diffusion
      route_prefix: /diffusion
      import_path: stable-diffusion.app:entrypoint
      runtime_env:
        working_dir: https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip
        pip:
          - accelerate==0.14.0
          - diffusers @ git+https://github.com/huggingface/diffusers.git@25f11424f62d8d9bef8a721b806926399a1557f2
          - Pillow==9.3.0
          - scipy==1.9.3
          - torch==1.13.0
          - torchvision==0.14.0
          - transformers==4.24.0
      deployments:
        - name: APIIngress
        - name: StableDiffusionV2

While we normally caution against in-place upgrades because we cannot provide any downtime guarantees, multi-application services are one case where they can be very useful. Here, we are adding a second Serve application to the service without changing the first, so the upgrade deploys the new application while the existing one continues to run. Other risks of an in-place upgrade still apply, such as the new application failing to deploy and causing the service to enter an unhealthy state, so a rollout is always the safest option. For the purposes of this tutorial, however, we will use an in-place upgrade to demonstrate that an application can be added to an already-running service cluster.

Redeploy the service using in-place upgrade.

anyscale service rollout -f config.yaml --rollout-strategy=IN_PLACE

Once the service transitions to Running, you can query either of the two running applications.

>>> import requests
>>> # Query the sentiment analysis model
>>> resp = requests.get(
...     "https://multi-app-service-jrvwy.cld-6zjdsbn3kbphvk3l.s.anyscaleuserdata-staging.com/sentiment/predict",
...     headers={"Authorization": "Bearer EZqpBnbCkEHnbgF2LvRF0iWHDf7takDB-sCeUk_wQSg"},
...     params={"text": "Anyscale workspaces are great"},
... )
>>> resp.json()
[{'label': 'POSITIVE', 'score': 0.9998007416725159}]
>>> # Query the stable diffusion model
>>> from PIL import Image
>>> import io
>>> resp = requests.get(
...     "https://multi-app-service-jrvwy.cld-6zjdsbn3kbphvk3l.s.anyscaleuserdata-staging.com/diffusion/imagine",
...     headers={"Authorization": "Bearer EZqpBnbCkEHnbgF2LvRF0iWHDf7takDB-sCeUk_wQSg"},
...     params={"prompt": "cat dancing on the grass"},
... )
>>> image = Image.open(io.BytesIO(resp.content))
>>> image.show()

Updating and Deleting Applications

To update a running application, change the corresponding application entry in the Serve config and redeploy the service. Similarly, to delete an application, remove the corresponding entry and redeploy the service.
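
Before redeploying, it can be useful to see which application entries differ between the config that is currently deployed and the edited one. The sketch below compares two config dicts by application name. diff_applications is a hypothetical helper; it only compares the entries locally and makes no claim about how Serve itself decides what to redeploy.

```python
def diff_applications(old_config: dict, new_config: dict) -> dict:
    """Classify applications as added, removed, changed, or unchanged.

    Hypothetical helper: entries are matched by their "name" field and
    compared as plain dicts.
    """
    old = {app["name"]: app for app in old_config.get("applications", [])}
    new = {app["name"]: app for app in new_config.get("applications", [])}
    common = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in common if old[n] != new[n]),
        "unchanged": sorted(n for n in common if old[n] == new[n]),
    }


sentiment = {"name": "sentiment", "route_prefix": "/sentiment",
             "import_path": "sentiment_analysis.app:model"}
diffusion = {"name": "stable_diffusion", "route_prefix": "/diffusion",
             "import_path": "stable-diffusion.app:entrypoint"}

before = {"applications": [sentiment, diffusion]}
# Deleting an application means dropping its entry from the config.
after = {"applications": [sentiment]}
print(diff_applications(before, after))
# prints {'added': [], 'removed': ['stable_diffusion'], 'changed': [], 'unchanged': ['sentiment']}
```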

Word of caution against shared working directories

Recall that you can set the working_dir to a local directory, and Anyscale will upload your directory contents to the specified remote storage system (for example, an S3 bucket) and auto-populate the runtime environment when deploying the service. When executing an in-place upgrade, a change in the runtime environment triggers a redeployment of all affected applications. This ensures the most up-to-date code is being fetched and run in your Service when an in-place upgrade happens.

However, if you are running multiple applications in one service and all of them share a local working directory (for example, the current directory .), then any change in that shared directory will cause an in-place upgrade to redeploy all of the applications. To avoid this, it is recommended to keep the working directories for your applications separate.
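
One way to spot this situation ahead of time is to group the applications in a config by their working_dir and flag any directory used by more than one of them. This is a rough sketch under stated assumptions: shared_working_dirs is a hypothetical helper, the config is already loaded as a dict, directories are compared as exact strings, and the local paths in the example are invented.

```python
from collections import defaultdict


def shared_working_dirs(serve_config: dict) -> dict[str, list[str]]:
    """Map each working_dir used by two or more applications to their names.

    Hypothetical helper: directories are compared as exact strings, so
    "." and "./" would be treated as different.
    """
    groups = defaultdict(list)
    for app in serve_config.get("applications", []):
        working_dir = app.get("runtime_env", {}).get("working_dir")
        if working_dir is not None:
            groups[working_dir].append(app["name"])
    return {d: names for d, names in groups.items() if len(names) > 1}


# Both applications point at the current directory: a change there would
# affect both of them on an in-place upgrade.
shared = {"applications": [
    {"name": "sentiment", "runtime_env": {"working_dir": "."}},
    {"name": "stable_diffusion", "runtime_env": {"working_dir": "."}},
]}
print(shared_working_dirs(shared))
# prints {'.': ['sentiment', 'stable_diffusion']}

# Separate (invented) directories per application: nothing is flagged.
separate = {"applications": [
    {"name": "sentiment",
     "runtime_env": {"working_dir": "./sentiment_analysis"}},
    {"name": "stable_diffusion",
     "runtime_env": {"working_dir": "./stable-diffusion"}},
]}
print(shared_working_dirs(separate))
# prints {}
```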