Production Services

info

Production services is a new feature that's currently under development.

Anyscale supports production services that are submitted as a standalone package containing a Ray Serve application. The platform fully manages the service and provides fault tolerance and declarative upgrades.

Each service consists of a single, long-running Ray cluster with a persistent DNS name. The cluster may be restarted over time.

If you're still in the development phase, you can refer to this tutorial for an example of how to develop Ray services interactively.

Using production services

Define

To create a production service, you must provide the following:

  • (Required in the SDK, optional in the CLI) A compute config for the cluster the service will run on. In the SDK, this can only be specified as compute_config_id. In the CLI, you may instead specify compute_config (the name of the compute config) or cloud (the name of an Anyscale cloud) for convenience; if you specify none of these, the CLI uses a default. The SDK example below shows how to resolve a missing value, a compute_config, or a cloud into a compute_config_id.
  • (Required in the SDK, optional in the CLI) A cluster environment for the cluster the service will run on. In the SDK, this can only be specified as build_id. In the CLI, you may instead specify cluster_env (the name and version of the cluster environment, colon-separated; if you don't specify a version, the latest is used); if you specify neither, the CLI sets a default based on your local Python and Ray versions. The SDK example below shows how to resolve a missing value or a cluster_env into a build_id.
  • (Optional) A runtime environment containing your application code and dependencies.
  • (Required) The Ray Serve deployments that will be created on the cluster. Alternatively, you can provide a deployment script that creates Ray Serve deployments manually, but that is not the recommended workflow.
  • (Required) A health check that the platform will use to determine if your service is healthy. This should be an HTTP GET endpoint that returns status 200 if the service is healthy.
note

The working_dir option of the runtime environment must be a remote URL to a zip file stored in an AWS S3 or Google Cloud Storage bucket, or a zip file accessible over HTTP/S (e.g., a GitHub download URL). The cluster running the service must have permission to download from that URL.

These options can be specified in a configuration file:

my_production_service.yaml
compute_config: my-cluster-compute # You may specify `compute_config_id` or `cloud` instead
cluster_env: my-cluster-env:5 # You may specify `build_id` instead
runtime_env:
  working_dir: "s3://my_bucket/my_service_files.zip"
entrypoint: "python deploy.py"
healthcheck_url: "/healthcheck"
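
Before deploying, the working_dir rule from the note above can be checked locally. The following is a minimal sketch, not part of the Anyscale CLI or SDK: check_working_dir is a hypothetical helper, the gs:// scheme for Google Cloud Storage is an assumption, and the .zip suffix test is only a heuristic. It does not verify that the cluster has permission to download the file.

```python
from typing import List
from urllib.parse import urlparse

# Schemes assumed to correspond to S3, Google Cloud Storage, and HTTP/S.
ALLOWED_SCHEMES = {"s3", "gs", "http", "https"}


def check_working_dir(url: str) -> List[str]:
    """Return a list of problems found with a working_dir URL (empty if none)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        problems.append(f"unsupported scheme: {parsed.scheme or '(none)'}")
    if not parsed.netloc:
        problems.append("missing bucket or host name")
    if not parsed.path.lower().endswith(".zip"):
        problems.append("path does not end in .zip")
    return problems


if __name__ == "__main__":
    # A local path is rejected because working_dir must be a remote URL.
    for candidate in ("s3://my_bucket/my_service_files.zip", "./my_service"):
        print(candidate, "->", check_working_dir(candidate) or "ok")
```

Returning a list of problems rather than raising on the first one lets a script report every issue in a single pass.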

Deploy

Your service can be deployed to Anyscale with the CLI or the Python SDK:

anyscale service deploy my_production_service.yaml \
--name my-production-service \
--description "A production service running on Anyscale."

Monitor

You can check the status of a service on the Web UI, or query it using the CLI or the Python SDK:

anyscale service list --service-id [SERVICE_ID]

Additionally, for each deployment, Anyscale will generate a dashboard in which you can monitor the health and activity of your deployment:

Service monitoring dashboard

The following graphs are automatically generated for you:

  • CPU utilization (cluster-wide)
  • Memory utilization (cluster-wide)
  • P95 latency (for queries)
  • Exceptions (exceptions per second)
  • QPS (queries per second)
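
To make the latency and rate metrics concrete, here is a small sketch of how a P95 latency and a per-second rate can be computed from raw samples. It illustrates the definitions only; it is not how Anyscale computes the dashboard graphs, and the nearest-rank percentile method and the function names are assumptions chosen for the example.

```python
from typing import List


def p95_latency(latencies_ms: List[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample with at least
    95% of all samples at or below it."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def rate_per_second(event_count: int, window_seconds: float) -> float:
    """Events per second over a window; applies to queries (QPS) or exceptions."""
    if window_seconds <= 0:
        raise ValueError("window must be positive")
    return event_count / window_seconds


if __name__ == "__main__":
    samples = [float(ms) for ms in range(1, 101)]  # 1 ms .. 100 ms
    print("P95 latency (ms):", p95_latency(samples))
    print("QPS:", rate_per_second(300, 60.0))
```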

Terminate

You can terminate a service using the CLI or the Python SDK:

anyscale service terminate --service-id [SERVICE_ID]

Example

You can try out production services using a public GitHub repo. The following serve_hello.py example is located in Anyscale's docs_examples repo:

serve_hello.py
from fastapi import FastAPI
from ray import serve

# Start a detached Serve instance so deployments outlive this script.
serve.start(detached=True)

app = FastAPI()


@serve.deployment(route_prefix="/")
@serve.ingress(app)
class HelloWorld:
    @app.get("/")
    def hello(self):
        return "Hello world!"

    @app.get("/healthcheck")
    def healthcheck(self):
        # An empty response still returns HTTP 200, which marks the service healthy.
        return


HelloWorld.deploy()
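
To see what a health check against the /healthcheck endpoint above would observe, you can probe it yourself. This is a minimal sketch using only the Python standard library. The is_healthy helper and the base URL are hypothetical, and it assumes the endpoint is reachable without extra authentication from wherever the probe runs.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, healthcheck_path: str = "/healthcheck",
               timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET to the health check endpoint returns status 200."""
    url = base_url.rstrip("/") + healthcheck_path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection failures, timeouts, and error responses such as 404 or 500
        # all count as unhealthy.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with the address of your own service.
    print(is_healthy("http://localhost:8000"))
```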

You can define the following service:

serve_hello.yaml
runtime_env:
  working_dir: "https://github.com/anyscale/docs_examples/archive/refs/heads/main.zip"
entrypoint: "python serve_hello.py"
healthcheck_url: "/healthcheck"

You can run, monitor and terminate the service as follows:

# Deploy the service defined above using the CLI
anyscale service deploy serve_hello.yaml --name serve_hello
No cluster environment provided, setting default based on local Python and Ray versions.
No cloud or compute config specified, using the default: cpt_js2dPghEYdJHafmTdUx9FsNf.
Service service_FZ87DmBwMnwdX6M14jzB8xms has been deployed. Current state of service: PENDING.
Query the status of the service with `anyscale service list --service-id service_FZ87DmBwMnwdX6M14jzB8xms`.
View the service in the UI at https://console.anyscale-staging.com/services/service_FZ87DmBwMnwdX6M14jzB8xms.

# Query the status of the service using the ID returned above
anyscale service list --service-id service_FZ87DmBwMnwdX6M14jzB8xms
View your services in the UI at https://console.anyscale-staging.com/services
Services:
NAME                                ID                                COST  PROJECT NAME  CLUSTER NAME  CURRENT STATE  CREATOR  ENTRYPOINT
cli-job-2021-12-05T18:57:17.203774  service_FZ87DmBwMnwdX6M14jzB8xms  0     my-project                  RUNNING        user     python serve_hello.py

# Terminate the service
anyscale service terminate --service-id service_FZ87DmBwMnwdX6M14jzB8xms
Service service_FZ87DmBwMnwdX6M14jzB8xms has begun terminating...
Current state of service: RUNNING. Goal state of service: TERMINATED
Query the status of the service with `anyscale service list --service-id service_FZ87DmBwMnwdX6M14jzB8xms`.

# Get the status again
anyscale service list --service-id service_FZ87DmBwMnwdX6M14jzB8xms
View your services in the UI at https://console.anyscale-staging.com/services
Services:
NAME                                ID                                COST  PROJECT NAME  CLUSTER NAME  CURRENT STATE  CREATOR  ENTRYPOINT
cli-job-2021-12-05T18:57:17.203774  service_FZ87DmBwMnwdX6M14jzB8xms  0     my-project                  TERMINATED     user     python serve_hello.py
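
When scripting the workflow above, the service ID has to be carried from the deploy step to the later list and terminate steps. The following is a minimal sketch that extracts the ID and state from the deploy output. The pattern is modeled only on the "has been deployed" line in the transcript above, which is an assumption about the CLI's output format and may not hold across versions. parse_deploy_output is a hypothetical helper, not an Anyscale API.

```python
import re
from typing import Optional, Tuple

# Modeled on the "has been deployed" line shown in the transcript above.
DEPLOYED_LINE = re.compile(
    r"Service (service_\w+) has been deployed\. Current state of service: (\w+)\."
)


def parse_deploy_output(output: str) -> Optional[Tuple[str, str]]:
    """Return (service_id, state) from deploy output, or None if no line matches."""
    match = DEPLOYED_LINE.search(output)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        "Service service_FZ87DmBwMnwdX6M14jzB8xms has been deployed. "
        "Current state of service: PENDING."
    )
    print(parse_deploy_output(sample))
```

Returning None on a mismatch lets the calling script fail loudly if the output format changes, rather than continuing with a wrong ID.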