Detailed network flows for Anyscale
This page provides a detailed description of how network traffic flows through the Anyscale platform.
Anyscale networking definitions
Definitions for the purposes of this page:
- Client device: An end user's laptop or other internet-enabled device.
- Control plane: The endpoints and infrastructure supporting the Anyscale platform.
- Ray cluster: A Ray cluster operating within a customer environment.
There are five primary routes within the Anyscale architecture:
- Client to control plane: Enables interaction with the control plane through the console, SDK, or CLI.
- Control plane to infrastructure orchestrator: Facilitates the creation, management, and termination of resources within the customer environment.
- Ray cluster to control plane: Reports system health and telemetry.
- Ray cluster to other resources: Any network interaction required as part of the Ray workload. For example, reading from object storage, querying external APIs, or accessing other network-accessible resources.
- Client to Ray cluster: Provides access to the Ray dashboard, Ray job submission, or command-line interface.
Summary of important domains
Domain | Impacted routes | Purpose |
---|---|---|
console.anyscale.com | Client to control plane | [Required] Primary endpoint for Anyscale users to interact with Anyscale through the console, CLI, or SDK. |
console.anyscale.com | Ray cluster to control plane | [Required for Kubernetes deployments, recommended for all deployments] Registers the Anyscale Operator to the control plane or queries Anyscale APIs as part of Ray workloads. |
*.i.anyscaleuserdata.com<br>vscode-*.i.anyscaleuserdata.com | Client to Ray cluster | [Required] Routes user traffic to the head node of the Ray cluster (for example, to access the Ray dashboard). |
*.s.anyscaleuserdata.com | Client to Ray cluster | [Required] Routes client requests to Anyscale services. |
*.anyscale-cloud.dev<br><cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required] Destination for traffic originating from Ray clusters to the control plane for health checks and other operational reporting. |
grafana-*.anyscale-cloud.dev<br>grafana-<cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required] Destination for metrics originating from Ray clusters to the control plane. |
grafana-*.anyscale-cloud.dev<br>grafana-<cloud-id>.anyscale-cloud.dev | Client to control plane | [Required] Hosts managed Grafana for long-term metric retention and dashboard reconstruction. |
registry-*.anyscale-cloud.dev<br>registry-<cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required for Kubernetes deployments, recommended for all deployments] Provides images for Ray clusters. |
machine-pool.anyscale-cloud.dev | Ray cluster to control plane | [Recommended for all deployments] Enables communication from nodes managed by the Global Resource Scheduler. |
support-access.anyscale-cloud.dev | Ray cluster to control plane | [Strongly recommended] Connects Ray clusters to the Anyscale support network. |
derp*.tailscale.com | Ray cluster to control plane | [Strongly recommended] Connects Ray clusters to the Anyscale support network. |
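To confirm that your environment can reach these endpoints, a quick egress probe can help. The following is a minimal sketch in Python, assuming outbound access on port 443; the cloud ID in the hostnames is a placeholder you'd replace with your own, and the list is illustrative rather than exhaustive.

```python
import socket
import ssl

# Illustrative egress check: verify that a node can open a TLS connection
# to the control-plane domains listed above. The cloud ID below is a
# placeholder; substitute your own and extend the list as needed.
DOMAINS = [
    "console.anyscale.com",
    "cld-example123.anyscale-cloud.dev",          # placeholder cloud ID
    "grafana-cld-example123.anyscale-cloud.dev",  # placeholder cloud ID
]

for host in DOMAINS:
    try:
        with socket.create_connection((host, 443), timeout=5) as sock:
            with ssl.create_default_context().wrap_socket(sock, server_hostname=host) as tls:
                print(f"{host}: reachable, TLS {tls.version()}")
    except OSError as exc:  # covers DNS failures, timeouts, and TLS errors
        print(f"{host}: blocked or unresolvable ({exc})")
```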
Anyscale has recently deprecated the following domains from the platform architecture:
- *.anyscale-test-production.com
- *.anyscale-production.com
Client to control plane
The Anyscale control plane consists of the components and infrastructure required to manage Ray clusters across all customers.
Anyscale fully manages the control plane, including all API endpoints that power the console, CLI, and SDK.
Client devices create connections to the control plane using the primary domains console.anyscale.com and *.anyscale-cloud.dev. Anyscale encrypts these connections with industry-standard TLS, using certificates that Anyscale manages.
Interactions between the client and the control plane undergo authentication and authorization checks based on the organization's settings for SSO, as well as the user's permissions for Anyscale.
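As an illustration of the TLS behavior described above, the following Python sketch connects to console.anyscale.com and prints the certificate the control plane presents. It demonstrates transport encryption only and performs no Anyscale authentication.

```python
import socket
import ssl

# Inspect the TLS certificate presented by the control plane. This only
# shows that the connection is TLS-encrypted with a managed certificate;
# authentication and authorization happen at the application layer.
hostname = "console.anyscale.com"
context = ssl.create_default_context()

with socket.create_connection((hostname, 443), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        cert = tls.getpeercert()
        print("TLS version:", tls.version())
        print("Subject:", dict(x[0] for x in cert["subject"]))
        print("Expires:", cert["notAfter"])
```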
Control plane to infrastructure orchestrator
This section describes how the control plane launches a Ray cluster in your cloud provider account or Kubernetes cluster.
The control plane takes this action when you issue a command to deploy a Ray cluster.
Anyscale on AWS EC2 or Google Cloud GCE resources
The Anyscale control plane assumes a cross-account principal that directly accesses AWS or Google Cloud APIs to launch the virtual machines in your cloud provider account. You configure these permissions when adding cloud resources. See Manage AWS IAM roles for Anyscale clusters or Manage Google Cloud service accounts for Anyscale clusters.
Communication over the internet uses TLS encryption.
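The following Python sketch illustrates the cross-account pattern with boto3 on AWS. The role ARN and external ID are placeholders, not Anyscale's actual values; the real permissions are the ones you configure when adding cloud resources.

```python
import boto3

# Illustrative sketch of the cross-account pattern: assume an IAM role in
# the target account, then call EC2 APIs with the temporary credentials.
# The role ARN and external ID below are placeholders.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/anyscale-cross-account-role",  # placeholder
    RoleSessionName="anyscale-control-plane",
    ExternalId="example-external-id",  # placeholder
)["Credentials"]

ec2 = boto3.client(
    "ec2",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
# With the assumed role, the caller can launch and terminate instances.
print(ec2.describe_instances(MaxResults=5)["Reservations"])
```

The Google Cloud flow is analogous, with a cross-project service account standing in for the IAM role.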
Kubernetes resources
Registering the Anyscale operator is a two-step process that requires an API call to the control plane (console.anyscale.com) and a Helm deployment to the target Kubernetes cluster. You must have direct access to the target Kubernetes cluster to deploy the Helm chart.
During deployment, the Anyscale operator registers itself with the control plane using an endpoint on console.anyscale.com.
A running Anyscale operator periodically polls the Anyscale control plane for instructions through an endpoint on the domain <cloud-id>.anyscale-cloud.dev. The operator pulls Ray cluster images from registry-<cloud-id>.anyscale-cloud.dev.
TLS encrypts communication between the control plane and the cloud providers. A cloud-provided identity authenticates this communication, or, if you register the operator using an Anyscale CLI token, a token review process authenticates it.
You must use Anyscale operator version 0.6.2 or later to use the *.anyscale-cloud.dev set of domains.
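To make the polling pattern concrete, here's a minimal, hypothetical sketch of an operator-style loop in Python. Only the domain pattern comes from this page; the URL path, response shape, and polling interval are assumptions for illustration.

```python
import time

import requests

# Hedged sketch of the operator's egress-only polling pattern: every
# connection originates inside the cluster and goes out to the control
# plane. The path and response shape are hypothetical.
CONTROL_PLANE = "https://cld-example123.anyscale-cloud.dev"  # placeholder cloud ID

while True:
    resp = requests.get(f"{CONTROL_PLANE}/instructions", timeout=30)  # hypothetical path
    resp.raise_for_status()
    for instruction in resp.json():
        ...  # apply the instruction (for example, scale pods or pull an image)
    time.sleep(15)  # hypothetical polling interval
```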
Ray cluster to control plane
During Ray cluster deployment, the control plane registers each node with a unique identifier used to track node identity and cluster membership.
All communication originates from the Ray cluster and flows outbound to the control plane, enabling an egress-only pattern.
The cluster encrypts communication to the control plane using HTTPS endpoints hosted within the following domains:
- <cloud-id>.anyscale-cloud.dev
- grafana-<cloud-id>.anyscale-cloud.dev
Kubernetes deployments also use the following domains:
- registry-<cloud-id>.anyscale-cloud.dev
- console.anyscale.com
Machine pools use the following additional domain:
- machine-pool.anyscale-cloud.dev
When you enable Support Access, Anyscale uses the following domains to establish VPN connectivity from Ray clusters to secure, Anyscale-managed laptops:
- support-access.anyscale-cloud.dev
- derp*.tailscale.com
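The egress-only pattern means a node never accepts inbound connections from the control plane; it initiates every report itself. The following hedged sketch shows the shape of such a heartbeat; the URL path and payload are hypothetical, and only the domain pattern comes from this page.

```python
import socket
import time

import requests

# Hedged sketch of egress-only health reporting: the node opens the
# connection; the control plane never connects in.
ENDPOINT = "https://cld-example123.anyscale-cloud.dev/health"  # hypothetical path

def report_health(node_id: str) -> None:
    payload = {"node_id": node_id, "hostname": socket.gethostname(), "ts": time.time()}
    requests.post(ENDPOINT, json=payload, timeout=10).raise_for_status()

report_health("node-abc123")  # placeholder node identifier
```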
Anyscale has recently deprecated the following domains from the platform architecture:
- *.anyscale-test-production.com
- *.anyscale-production.com
Client to Ray cluster
You can connect directly from your client devices to your Ray clusters. Traffic between the client and the cluster doesn't traverse the Anyscale control plane.
The network path varies based on the compute stack, the networking mode, and whether you send requests to the cluster head node or to an Anyscale service.
Public networking for head node access on EC2 and GCE
For each cluster, Anyscale provides a DNS address that routes to port 443 on the public IP of the head node. The address follows the pattern https://session-123xyc.i.anyscaleuserdata.com, where the session ID is unique per cluster. For VS Code access, the address pattern is vscode-session-123xyc.i.anyscaleuserdata.com.
TLS terminates on the head node using Anyscale-managed certificates, which Anyscale rotates automatically at least every 3 months.
When you initialize a connection to the head node (for example, accessing the Ray dashboard), the client resolves the DNS address that Anyscale manages to the public IP address of the head node.
On EC2 and GCE, Anyscale supports only Layer 4 network connectivity. Layer 7 routing (for example, through Application Load Balancers) isn't supported because the head node terminates TLS directly rather than relying on an upstream proxy or load balancer.
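The following Python sketch traces this path end to end: resolve the Anyscale-managed DNS name, open a TCP connection to port 443, and complete a TLS handshake directly with the head node. The session hostname is the example pattern from above.

```python
import socket
import ssl

# Layer 4 path to the head node: DNS resolves to the head node's public IP,
# and the head node itself terminates TLS (no intermediate Layer 7 proxy).
host = "session-123xyc.i.anyscaleuserdata.com"  # example session address

ip = socket.gethostbyname(host)
print(f"{host} resolves to {ip}")

ctx = ssl.create_default_context()
with socket.create_connection((ip, 443), timeout=10) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        print("TLS terminated by head node, version:", tls.version())
```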
Private networking for head node access on EC2 and GCE
Anyscale manages DNS and TLS the same way as in public networking mode; however, the DNS address points to the private IP of the head node.
It's your responsibility to ensure that the client device can resolve the DNS address and reach the private IP address of the head node through a VPN or other private network setup.
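A quick way to verify that your VPN or private DNS route is working is to check that the cluster's DNS name resolves to a private address, as in this sketch (the hostname is the example pattern from above):

```python
import ipaddress
import socket

# In private networking mode, the Anyscale-managed DNS name should resolve
# to an address in private IP space, reachable over your VPN or peering.
host = "session-123xyc.i.anyscaleuserdata.com"  # example session address

addr = ipaddress.ip_address(socket.gethostbyname(host))
print(f"{host} -> {addr} (private: {addr.is_private})")
```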
Head node access on Kubernetes
When registering an Anyscale cloud resource for Kubernetes, the Anyscale operator creates a Kubernetes Ingress, monitors the .status.loadBalancer.ingress field to determine the host address or IP, and automatically sets Anyscale DNS records accordingly. You can also override this behavior in the Helm chart by specifying a fixed address.
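For reference, the following sketch uses the official Kubernetes Python client to read the same .status.loadBalancer.ingress field that the operator monitors. The ingress name and namespace are placeholders.

```python
from kubernetes import client, config

# Read an Ingress and extract the load balancer address that Anyscale DNS
# records would point at. Name and namespace below are placeholders.
config.load_kube_config()
ing = client.NetworkingV1Api().read_namespaced_ingress(
    name="anyscale-ingress",        # placeholder name
    namespace="anyscale-operator",  # placeholder namespace
)
for entry in ing.status.load_balancer.ingress or []:
    # Cloud load balancers report either a hostname or a raw IP.
    print(entry.hostname or entry.ip)
```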
Anyscale creates a secret for the certificate (named anyscale-<cldsrc-id>-certificate) within the operator namespace to terminate TLS on the ingress. Anyscale manages and automatically rotates the certificates at least every 3 months.
When you launch a cluster, the Anyscale Operator applies an ingress spec that routes traffic from the Anyscale-provided DNS address of the cluster (for example, session-123xyc.i.anyscaleuserdata.com) to the head pod.
The client resolves the Anyscale-provided DNS address to the address of the ingress. Then, the ingress controller terminates TLS and forwards the traffic to port 80 of the head pod.
It's your responsibility to ensure connectivity between the client device and the Kubernetes ingress.
The ingress controller must be able to read the certificate secret, which may reside in a different namespace than the ingress controller. Some ingress controllers don't allow cross-namespace secret access by default.
Anyscale doesn't support custom domains or Layer 7 routing to the ingress address. For example, you need to configure Layer 4 load balancers (such as AWS Network Load Balancers) instead of Layer 7 (such as AWS Application Load Balancers). Because Anyscale manages DNS and certificates while the Anyscale Operator manages ingress routes, any updates to Layer 7 routers fall outside the permission boundary of the Anyscale Operator.
Authentication and routing within the head node
Once a request reaches the head node, an Anyscale-managed ingress controller first routes it to an auth gateway hosted on the head pod. The auth gateway inspects the request for a session token or API key and verifies it against the control plane, which authenticates the token and authorizes the request based on the user's permission to access the cluster.
If the request passes both authentication and authorization, the controller forwards it to the appropriate service on the head pod (for example, the Ray dashboard, VS Code, Jupyter, or the web terminal).
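As a purely hypothetical illustration of this flow, the sketch below attaches a credential to a request bound for the head node. The Authorization header and token format are assumptions for illustration, not Anyscale's documented token mechanism.

```python
import requests

# Hypothetical illustration of the auth-gateway flow: the client presents a
# credential, the gateway verifies it with the control plane, and the
# request is forwarded only if it passes both checks. The header name and
# token format below are assumptions, not Anyscale's documented API.
url = "https://session-123xyc.i.anyscaleuserdata.com/"  # example session address
token = "REPLACE_WITH_YOUR_TOKEN"                       # placeholder credential

resp = requests.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=10)
print(resp.status_code)  # 200 if authenticated and authorized; 401/403 otherwise
```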
Public networking for services access on EC2 and GCE
Anyscale services for EC2 and GCE use application load balancers to route traffic to service versions and to shift traffic during version updates. The load balancer also serves as the primary entry point that receives and proxies requests from clients consuming the service's APIs.
Anyscale deploys the load balancers as externally facing load balancers with a publicly addressable DNS address from the cloud provider, mapping them as CNAME records within Anyscale's DNS system.
When deployed, each service gets a single DNS record (for example, https://service-name.cloud-id.s.anyscaleuserdata.com) that maps to a routing rule on the load balancer.
Client devices resolve the Anyscale-provided DNS address to the load balancer over the internet. The load balancer terminates TLS using an Anyscale-managed certificate. If configured, the load balancer also verifies the service Bearer token.
The load balancer then forwards traffic to the Ray cluster hosting the service version.
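A client call to a deployed service, then, looks like an ordinary HTTPS request with the service bearer token attached. The sketch below uses the example address pattern from above; the request path and payload depend entirely on your application.

```python
import requests

# Call a deployed service through its load balancer. The URL follows the
# example pattern from this page; the payload is application-specific.
url = "https://service-name.cloud-id.s.anyscaleuserdata.com/"  # example service address
token = "REPLACE_WITH_SERVICE_TOKEN"                           # placeholder token

resp = requests.post(
    url,
    json={"prompt": "hello"},  # placeholder payload; depends on your service
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```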
Private networking for services access on EC2 and GCE
Services with private networking use the same pattern as public networking services. However, Anyscale configures load balancers as internal load balancers and uses the private DNS address of the load balancer in the Anyscale-managed CNAME record.
It's your responsibility to ensure that the client device can resolve the private DNS address of the load balancer and reach it through a VPN or other private network setup.
Services access on Kubernetes
For Kubernetes-based cloud resources, the ingress controller routes traffic to the relevant services. Anyscale uses the ingress defined by the cloud resource to create a CNAME record within its DNS system, routing service addresses (*.s.anyscaleuserdata.com) to the ingress address.
It's your responsibility to ensure connectivity between the client device and the Kubernetes ingress.
When you launch a service for the first time, Anyscale creates a secret for the certificate to terminate TLS at the ingress, with a name similar to anyscale-svc-cldrsrc-123xyz-certificate. Anyscale manages and automatically rotates the certificate at least every 3 months.
Upon service launch, the Anyscale Operator applies an ingress spec to route traffic from the ingress to the relevant pods associated with the service cluster.
The client resolves the Anyscale-provided DNS address to the address provided as part of the cloud resource definition. Then, the ingress controller terminates TLS and forwards the traffic to the Ray cluster pods.
The ingress controller must be able to read the certificate secret, which may reside in a different namespace than the ingress controller. Some ingress controllers don't allow cross-namespace secret access by default.
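To check for this cross-namespace issue, you can confirm that the certificate secret is readable from the relevant namespace, as in this sketch with the official Kubernetes Python client (the namespace value is a placeholder):

```python
from kubernetes import client, config

# Confirm the certificate secret exists and is readable. The secret name
# follows the example pattern from this page; the namespace is a placeholder.
config.load_kube_config()
secret = client.CoreV1Api().read_namespaced_secret(
    name="anyscale-svc-cldrsrc-123xyz-certificate",  # example name from this page
    namespace="anyscale-operator",                   # placeholder namespace
)
print("Secret type:", secret.type)  # typically kubernetes.io/tls for certificates
print("Keys:", list(secret.data.keys()))
```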
Anyscale doesn't support custom domains or Layer 7 routing to the ingress address. For example, you need to configure Layer 4 load balancers (such as AWS Network Load Balancers) instead of Layer 7 (such as AWS Application Load Balancers). Because Anyscale manages DNS and certificates while the Anyscale Operator manages ingress routes, any updates to Layer 7 routers fall outside the permission boundary of the Anyscale Operator.
Ray cluster to other resources
A Ray workload can access any network resources that the application defines. You're responsible for ensuring that your workload accesses these resources securely, using appropriate protocols.
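For example, a task running on a worker node can make its own outbound HTTPS call, as in this minimal Ray sketch (the target URL is a placeholder for whatever your workload reaches):

```python
import ray
import requests

ray.init()

# A task on a worker node makes its own outbound call to an external API
# over HTTPS; this traffic never touches the Anyscale control plane.
@ray.remote
def fetch(url: str) -> int:
    return requests.get(url, timeout=10).status_code

print(ray.get(fetch.remote("https://example.com")))  # placeholder endpoint
```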