Ray in an application lifecycle context (CI/CD)
This page demonstrates an application architecture that situates a Ray workload within a broader Python application. We start with a basic application and then take it through the recommended CI/CD process using Anyscale.
The documentation below refers to the project in this repository.
Example Application
- The application is designed to use a web endpoint to execute and return results from Ray Jobs.
- The application can be tested and run locally. It can be configured to use local Ray or Anyscale for interactive runs and tests (see the sketch after this list).
- A configuration for testing and deploying the application with GitHub Actions CI/CD is included.
- You can leverage package repositories to deploy the package to various environments.
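As a minimal sketch of how the local/Anyscale switch could be wired (this is illustrative, not the repository's actual code: it assumes the Ray 1.x-era Anyscale connect address, and it keys off the `ANYSCALE_ENVIRONMENT` variable that the CI configuration later on this page sets):

```python
import os

import ray


def init_ray() -> None:
    """Connect to local Ray or to Anyscale, depending on configuration."""
    if os.environ.get("ANYSCALE_ENVIRONMENT"):
        # Ray 1.x-era Anyscale connect address; the project and cluster
        # names here are placeholders for your own.
        ray.init("anyscale://my-project/my-cluster")
    else:
        # Start (or attach to) a local Ray instance for fast iteration.
        ray.init()
```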
Application Structure
Here are the parts of the application relevant to how it works, how it's tested, and how it's deployed:
```
├── .github/
│   └── workflows
│       └── ci_cd.yaml
├── app
│   ├── __init__.py
│   ├── app.py
│   ├── driver.py
│   └── ray_impl
│       ├── __init__.py
│       ├── script.py
│       └── requirements.txt
├── setup.py
└── tests
    ├── __init__.py
    ├── conftest.py
    ├── remote_tests.py
    └── test_app.py
```
Here is an overview of the application flow. The application is contained within the `app` directory, and all of the code which executes remotely on Ray is sequestered into the `ray_impl` subdirectory.
The purpose of this application is to show how Ray code might sit within a larger application. A typical codebase will have code that:
- Runs outside of Ray. This is the code that drives the computation and interacts with users or the world in some well-defined way. In the sample application, `app.py` contains an entry point for a small web application that launches tasks in Ray and reports back on them.
- Serves as an entry point to the remote compute. `driver.py` is the single entry point for leveraging remote, distributed compute. It contains a class that:
  - initializes a Ray connection as needed
  - executes Ray Jobs and maintains their state
  - checks the status of Ray Jobs running in Anyscale
  - returns the results of the jobs
- Runs in the Ray container. `script.py` contains both a Ray task and a Ray actor that execute code within the Ray container. This code can be deployed to run in a Ray container running locally, or on a remote Ray cluster managed by Anyscale. The driver code submits Jobs that execute this file as an entry point; a sketch follows this list.
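To make that last point concrete, here is a rough sketch of what `script.py` could look like. The task and actor here are illustrative stand-ins, not the repository's actual code:

```python
import ray


@ray.remote
def compute_chunk(chunk_id: int) -> int:
    """A Ray task: runs on any worker in the cluster."""
    return chunk_id * 2


@ray.remote
class Accumulator:
    """A Ray actor: holds state across remote method calls."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, value: int) -> int:
        self.total += value
        return self.total


if __name__ == "__main__":
    # The driver submits a Job whose entry point is this file.
    ray.init()
    acc = Accumulator.remote()
    results = ray.get([compute_chunk.remote(i) for i in range(4)])
    print(ray.get(acc.add.remote(sum(results))))
```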
Ray Jobs execute in an existing Ray container. This behavior is different from Anyscale Jobs, which are fully managed and start their own Ray clusters. For more info on Ray Jobs, see the Ray documentation for Jobs.
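One way a driver can submit such an entry point as a Ray Job is through Ray's Job Submission API. This is a minimal sketch, assuming a reachable Ray cluster dashboard; the address, entry point, and working directory are placeholders:

```python
from ray.job_submission import JobSubmissionClient, JobStatus

# Point the client at a running Ray cluster (local or Anyscale-managed).
client = JobSubmissionClient("http://127.0.0.1:8265")

job_id = client.submit_job(
    # The Job executes script.py inside the existing Ray container.
    entrypoint="python script.py",
    runtime_env={"working_dir": "./app/ray_impl"},
)

# The driver can poll the Job's status and fetch its logs.
status: JobStatus = client.get_job_status(job_id)
print(job_id, status, client.get_job_logs(job_id))
```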
The `tests` directory contains both tests that exercise the remote code directly in Ray and a module that tests the application endpoints.
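A common shape for this, sketched here rather than copied from the repo: `conftest.py` provides a fixture that starts local Ray for the test session, and the remote tests call the Ray code through it. The fixture name, scope, and imported task are assumptions for illustration:

```python
# conftest.py (sketch)
import pytest
import ray


@pytest.fixture(scope="session")
def ray_session():
    """Start a small local Ray instance shared by the test session."""
    ray.init(num_cpus=2)
    yield
    ray.shutdown()


# remote_tests.py (sketch)
def test_compute_chunk(ray_session):
    # compute_chunk is the hypothetical task sketched earlier on this page.
    from app.ray_impl.script import compute_chunk

    assert ray.get(compute_chunk.remote(3)) == 6
```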
The `.github` directory contains the configuration for invoking GitHub Actions.
Out of Scope
- This application has no logging, no parameter passing, and no integrations or machine learning libraries. Its purpose is solely to demonstrate how to model an application structure in order to leverage continuous integration systems.
- Production entry point. Locally, you can run this application by executing `app.py` as a module: `python -m app.app`. There are several ways to package and deploy an application to production, which are not addressed in this project at this time.
- The sample repo also contains supporting code for some other architectural documentation. Those files are not listed above.
CI/CD Environment
The CI/CD flow
A CI/CD process automates the testing and build of an application, typically triggered by every request to integrate code. In this diagram, each blue oval is a working copy of the codebase. Code makes its way from left to right as it goes from development to testing and integration, and then to deployment in production. Code at all stages can leverage Anyscale to run distributed Ray code, as shown in the lower right.
- "Local Working Copy" Development starts locally, and devs who want to iterate quickly on small data can leverage a "local Ray" server to develop and test distributed applications.
- When code is pushed to a pull request in GitHub, GitHub's CI system (GitHub Actions) invokes the pipeline configured under the `.github` directory. The system creates an environment in which to install the code ("CI Checkout of Code") and runs the codebase's tests.
- If the tests are successful, then the code is ready for deployment. This document doesn't yet specify a particular deployment method.
- In each case, the codebase is configured to leverage Ray and/or Anyscale.
Secrets
All CI systems have a method for managing secrets. For Anyscale, you need a CLI token in order to access and run scripts on Anyscale. This CLI token should be treated as a secret: enter it into your repository settings as a secret. In the example project it's called `AUTOMATION_CLI_TOKEN`.
You can specify environment variables in your GitHub Actions configuration. In the example repo, we pass the credentials from GitHub secrets to the CI process by setting the token environment variable:
```yaml
env:
  ANYSCALE_CLI_TOKEN: ${{ secrets.AUTOMATION_CLI_TOKEN }}
  ANYSCALE_ENVIRONMENT: "CI"
```
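In context, that env block sits inside a workflow job. A minimal sketch of a `ci_cd.yaml` along these lines follows; the step contents are illustrative, not the repository's exact file:

```yaml
name: CI
on: [pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    env:
      # Passed from GitHub repository secrets, as described above.
      ANYSCALE_CLI_TOKEN: ${{ secrets.AUTOMATION_CLI_TOKEN }}
      ANYSCALE_ENVIRONMENT: "CI"
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      # Install the application and the Ray-side requirements, then test.
      - run: pip install -e . -r app/ray_impl/requirements.txt
      - run: pytest tests/
```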
Environment
This demo CI/CD system leverages Anyscale. In the root of the project you'll find `.anyscale.yaml`. This file provides the project id under which the project will execute. If you are cloning and using this repository, you need to replace this file with one that points to a project in your account. As of Ray 1.8.0, this project file is still required, but it will not be in future releases of Anyscale and Ray.
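The file itself is small; a sketch of its shape, with a placeholder id:

```yaml
# .anyscale.yaml (sketch; replace the id with your own project's id)
project_id: prj_AbC123
```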