Detailed network flows for Anyscale

This page provides a detailed description of how network traffic flows through the Anyscale platform.

Anyscale networking definitions

Definitions for the purposes of this page:

  • Client device: An end user's laptop or other internet-enabled device.
  • Control plane: The endpoints and infrastructure supporting the Anyscale platform.
  • Ray cluster: A Ray cluster operating within a customer environment.

There are five primary routes within the Anyscale architecture:

  • Client to control plane: Enables interaction with the control plane through the console, SDK, or CLI.
  • Control plane to infrastructure orchestrator: Facilitates the creation, management, and termination of resources within the customer environment.
  • Ray cluster to control plane: Reports system health and telemetry.
  • Ray cluster to other resources: Any network interaction required as part of the Ray workload. For example, reading from object storage, querying external APIs, or accessing other network-accessible resources.
  • Client to Ray cluster: Provides access to the Ray dashboard, Ray job submission, or command-line interface.

Summary of important domains

| Domain | Impacted routes | Purposes |
|---|---|---|
| console.anyscale.com | Client to control plane | [Required] Primary endpoint for Anyscale users to interact with Anyscale through the console, CLI, or SDK. |
| console.anyscale.com | Ray cluster to control plane | [Required for Kubernetes deployments, recommended for all deployments] Registers the Anyscale Operator to the control plane or queries Anyscale APIs as part of Ray workloads. |
| *.i.anyscaleuserdata.com, vscode-*.i.anyscaleuserdata.com | Client to Ray cluster | [Required] Routes user traffic to the head node of the Ray cluster (for example, to access the Ray dashboard). |
| *.s.anyscaleuserdata.com | Client to Ray cluster | [Required] Routes client requests to Anyscale services. |
| *.anyscale-cloud.dev, <cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required] Destination for traffic originating from Ray clusters to the control plane for health checks and other operational reporting. |
| grafana-*.anyscale-cloud.dev, grafana-<your-cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required] Destination for metrics originating from Ray clusters to the control plane. |
| grafana-*.anyscale-cloud.dev, grafana-<your-cloud-id>.anyscale-cloud.dev | Client to control plane | [Required] Hosts managed Grafana for long-term metric retention and dashboard reconstruction. |
| registry-*.anyscale-cloud.dev, registry-<cloud-id>.anyscale-cloud.dev | Ray cluster to control plane | [Required for Kubernetes deployments, recommended for all deployments] Provides images for Ray clusters. |
| machine-pool.anyscale-cloud.dev | Ray cluster to control plane | [Recommended for all deployments] Enables communication from nodes managed by the Global Resource Scheduler. |
| support-access.anyscale-cloud.dev | Ray cluster to control plane | [Strongly recommended] Connects Ray clusters to Anyscale support network. |
| derp*.tailscale.com | Ray cluster to control plane | [Strongly recommended] Connects Ray clusters to Anyscale support network. |

note

Anyscale has recently deprecated the following domains from the platform architecture:

  • *.anyscale-test-production.com
  • *.anyscale-production.com
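
When auditing firewall or proxy logs, it can help to check whether a hostname falls under one of the domain patterns listed above. The following is a minimal sketch using only the Python standard library. The function name `classify_host` and the three labels it returns are illustrative assumptions, not part of any Anyscale tooling; the patterns themselves are copied from the summary table and the deprecation note.

```python
from fnmatch import fnmatchcase

# Patterns copied from the summary table above.
CURRENT_PATTERNS = [
    "console.anyscale.com",
    "*.i.anyscaleuserdata.com",
    "*.s.anyscaleuserdata.com",
    "*.anyscale-cloud.dev",
    "derp*.tailscale.com",
]

# Patterns copied from the deprecation note above.
DEPRECATED_PATTERNS = [
    "*.anyscale-test-production.com",
    "*.anyscale-production.com",
]


def classify_host(hostname: str) -> str:
    """Return 'current', 'deprecated', or 'unknown' for a hostname (hypothetical helper)."""
    # Normalize case and drop a trailing dot from fully qualified names.
    host = hostname.strip().lower().rstrip(".")
    if any(fnmatchcase(host, pattern) for pattern in CURRENT_PATTERNS):
        return "current"
    if any(fnmatchcase(host, pattern) for pattern in DEPRECATED_PATTERNS):
        return "deprecated"
    return "unknown"
```

Note that `fnmatchcase` lets `*` match across dots, so `*.anyscale-cloud.dev` also covers multi-label names. A stricter matcher may be preferable if your firewall treats wildcards as single-label only.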

Client to control plane

The Anyscale control plane consists of the components and infrastructure required to manage Ray clusters across all customers.

Anyscale fully manages the control plane, including all API endpoints that power the console, CLI, and SDK.

Client devices create connections to the control plane using the primary domains of console.anyscale.com and *.anyscale-cloud.dev. Anyscale encrypts these connections with industry-standard TLS, using certificates that Anyscale manages.
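
To confirm from a client device that a TLS connection to the control plane can be established and validated, a small standard-library check can be enough. This is a sketch under stated assumptions: the helper names are hypothetical, port 443 is assumed, and the code relies on the system trust store rather than anything specific to Anyscale.

```python
import socket
import ssl
from typing import Optional


def build_tls_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and hostnames."""
    # create_default_context() enables certificate verification and
    # hostname checking against the system trust store.
    return ssl.create_default_context()


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Connect to host:port and return the negotiated TLS version string."""
    context = build_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake fails with ssl.SSLCertVerificationError if the
        # certificate does not validate for this hostname.
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

Calling `negotiated_tls_version("console.anyscale.com")` from a machine with outbound internet access returns the protocol version the two sides agreed on, or raises if the connection or certificate validation fails.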

Interactions between the client and the control plane undergo authentication and authorization checks based on the organization's settings for SSO, as well as the user's permissions for Anyscale.

Control plane to infrastructure orchestrator

This section describes how the control plane launches a Ray cluster in your cloud provider account or Kubernetes cluster.

The control plane takes this action in response to you issuing a command to deploy a Ray cluster.

Anyscale on AWS EC2 or Google Cloud GCE resources

The Anyscale control plane assumes a cross-account principal that directly accesses AWS or Google Cloud APIs to launch the virtual machines in your cloud provider account. You configure these permissions when adding cloud resources. See Manage AWS IAM roles for Anyscale clusters or Manage Google Cloud service accounts for Anyscale clusters.

Communication over the internet uses TLS encryption.

Kubernetes resources

Registering the Anyscale operator is a two-step process that requires an API call to the control plane (console.anyscale.com) and a Helm deployment to the target Kubernetes cluster. You must have direct access to the target Kubernetes cluster to deploy the Helm chart.

During deployment, the Anyscale operator registers itself to the control plane using an endpoint on console.anyscale.com.

A running Anyscale operator periodically polls the Anyscale control plane for instructions through an endpoint using the domain <cloud-id>.anyscale-cloud.dev. The operator pulls Ray cluster images from registry-<cloud-id>.anyscale-cloud.dev.

TLS encrypts communication between the control plane and the cloud providers. A cloud-provided identity authenticates this communication; if you register the operator using an Anyscale CLI token, a token review process authenticates it instead.

important

You must use an Anyscale operator version 1.0.0 or later to use the *.anyscale-cloud.dev set of domains.

Ray cluster to control plane

During Ray cluster deployment, the control plane registers each node with a unique identifier to track node identity and cluster membership.

note

All communication originates from the Ray cluster to the control plane to enable an egress-only pattern.

The cluster encrypts communication to the control plane using HTTPS endpoints hosted within the following domains:

  • <your-cloud-id>.anyscale-cloud.dev
  • grafana-<your-cloud-id>.anyscale-cloud.dev

Kubernetes deployments also use the following domains:

  • registry-<your-cloud-id>.anyscale-cloud.dev
  • console.anyscale.com

Machine pools use the following additional domain:

  • machine-pool.anyscale-cloud.dev

When you enable Support Access, Anyscale uses the following domains to establish VPN connectivity from Ray clusters to secure, Anyscale-managed laptops:

  • support-access.anyscale-cloud.dev
  • derp*.tailscale.com
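
The domain lists in this section can be combined into an egress allowlist for a given deployment. The following is a minimal sketch: the function name `egress_domains`, its keyword flags, and the placeholder cloud ID are hypothetical and exist only for illustration; the domain patterns come from the lists above.

```python
from typing import List


def egress_domains(
    cloud_id: str,
    *,
    kubernetes: bool = False,
    machine_pools: bool = False,
    support_access: bool = False,
) -> List[str]:
    """Build the Ray-cluster-to-control-plane domain list (hypothetical helper)."""
    # Domains every Ray cluster uses to reach the control plane.
    domains = [
        f"{cloud_id}.anyscale-cloud.dev",
        f"grafana-{cloud_id}.anyscale-cloud.dev",
    ]
    if kubernetes:
        # Additional domains for Kubernetes deployments.
        domains += [f"registry-{cloud_id}.anyscale-cloud.dev", "console.anyscale.com"]
    if machine_pools:
        domains.append("machine-pool.anyscale-cloud.dev")
    if support_access:
        # derp*.tailscale.com is a wildcard pattern, not a single hostname.
        domains += ["support-access.anyscale-cloud.dev", "derp*.tailscale.com"]
    return domains
```

For example, `egress_domains("my-cloud-id", kubernetes=True)` returns the two base domains followed by the registry domain and console.anyscale.com, where `my-cloud-id` stands in for your actual cloud ID.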

Client to Ray cluster

You can connect directly from your client devices to your Ray clusters. Traffic between the client and the cluster doesn't traverse the Anyscale control plane.

The network path varies based on the compute stack, networking mode, and whether you send requests to the cluster head node or as an Anyscale service request.

Public networking for head node access on EC2 and GCE

For each cluster, Anyscale provides a DNS address that routes to port 443 on the public IP of the head node. The address follows the pattern of https://session-123xyc.i.anyscaleuserdata.com where the session ID is unique per cluster. For VS Code access, the address pattern is vscode-session-123xyc.i.anyscaleuserdata.com.

The head node terminates TLS using Anyscale-managed certificates, which Anyscale rotates automatically at least every 3 months.
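
If you want to observe certificate rotation from the client side, one option is to read the expiry date of the certificate the head node presents and track how many days remain. This is a sketch, not an Anyscale-provided check: the helper names are hypothetical, and it assumes the certificate validates against the system trust store.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a certificate 'notAfter' string into days remaining."""
    # cert_time_to_seconds parses the "Mon DD HH:MM:SS YYYY GMT" format
    # that getpeercert() returns.
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def head_node_cert_days_left(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Fetch the served certificate and report the days until it expires."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            cert = tls_sock.getpeercert()
    return days_until_expiry(cert["notAfter"])
```

A value that stays well above zero over time is consistent with automatic rotation; the exact lifetime of each certificate is not stated on this page, so avoid hard-coding a threshold based on it.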

When you initialize a connection to the head node (for example, accessing the Ray dashboard), the client resolves the DNS address that Anyscale manages to the public IP address of the head node.

note

On EC2 and GCE, Anyscale only supports Layer 4 network connectivity. Layer 7 routing (for example, through Application Load Balancers) isn't supported, because the head node terminates TLS directly rather than using an upstream proxy or load balancer.

Private networking for head node access on EC2 and GCE

Anyscale manages DNS and TLS the same way as in public networking mode; however, the DNS address points to the private IP of the head node.

note

It's your responsibility to ensure that the client device has a network route to the private IP address of the head node, for example through a VPN or another private network setup.
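
To see which networking mode a head node address reflects, you can resolve the Anyscale-provided DNS name and check whether the result is a private or public IP address. The sketch below uses only the standard library. The helper names are hypothetical, and the host label `session-123xyc` is the example from this page rather than a real cluster.

```python
import ipaddress
import socket
from typing import Dict, List


def head_node_hostnames(session_label: str) -> Dict[str, str]:
    """Build the head node and VS Code hostnames from a label such as 'session-123xyc'."""
    return {
        "head": f"{session_label}.i.anyscaleuserdata.com",
        "vscode": f"vscode-{session_label}.i.anyscaleuserdata.com",
    }


def networking_mode(ip: str) -> str:
    """Classify an IP address as 'private' or 'public'."""
    # is_private also covers loopback and link-local ranges, which is
    # acceptable for a quick diagnostic but not a strict RFC 1918 test.
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

If `networking_mode` reports `private` for the resolved address, the client still needs its own route to that address before a connection can succeed.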

Head node access on Kubernetes

Anyscale creates a secret named anyscale-<cldrsrc-id>-certificate in the operator namespace for TLS termination. Anyscale manages and automatically rotates the certificate at least every 3 months. When you launch a cluster, the Anyscale operator routes traffic from the Anyscale-provided DNS address (for example, session-123xyc.i.anyscaleuserdata.com) to the head pod, terminating TLS and forwarding to port 80.

Configure the Gateway load balancer address explicitly in the Helm chart (networking.gateway.hostname or networking.gateway.ip). Anyscale sets DNS records to this address. When you launch a cluster, the Anyscale operator applies an HTTPRoute to route traffic to the head pod. Envoy Gateway terminates TLS.

note

It's your responsibility to ensure connectivity between the client device and the Envoy Gateway load balancer.

Authentication and routing within the head node

Once a request reaches the head node, an Anyscale-managed ingress controller first routes the request to an auth gateway hosted on the head pod. The auth gateway inspects the request for a session token or API key and verifies it against the control plane. This verification confirms that the token is authentic and authorizes the request based on the user's permission to access the cluster.

If the request passes both authentication and authorization, the controller forwards it to the appropriate service on the head pod (for example, the Ray dashboard, VS Code, Jupyter, or the web terminal).
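
The order of checks described above can be sketched as a small decision function. Everything in this sketch is an illustrative assumption rather than Anyscale's implementation: the `Authorization` header, the HTTP status codes, the path prefixes, the service names, and the `verify` callback that stands in for the call to the control plane.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Verdict:
    """Result of the (stubbed) verification against the control plane."""
    authenticated: bool
    authorized: bool


# Hypothetical mapping from path prefix to a service on the head pod.
SERVICES = {
    "/dashboard": "ray-dashboard",
    "/vscode": "vscode",
    "/jupyter": "jupyter",
    "/terminal": "web-terminal",
}


def handle_request(
    path: str,
    headers: Mapping[str, str],
    verify: Callable[[str], Verdict],
) -> Tuple[int, Optional[str]]:
    """Return (status, service) after authentication, authorization, and routing."""
    credential = headers.get("Authorization")
    if not credential:
        return 401, None  # no session token or API key present
    verdict = verify(credential)
    if not verdict.authenticated:
        return 401, None  # credential is not authentic
    if not verdict.authorized:
        return 403, None  # user lacks permission for this cluster
    for prefix, service in SERVICES.items():
        if path.startswith(prefix):
            return 200, service  # forward to the matching service
    return 404, None
```

The point of the sketch is the ordering: a request is forwarded to a service only after both the authentication and the authorization checks succeed.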

Public networking for services access on EC2 and GCE

Anyscale services for EC2 and GCE leverage application load balancers to route traffic to service versions and to direct traffic during version updates. The load balancer also serves as the primary way to receive and proxy requests for clients that consume the service's APIs.

Anyscale deploys the load balancers as externally facing load balancers with a publicly addressable DNS address from the cloud provider, mapping them as CNAME records within Anyscale's DNS system.

When deployed, each service gets a single DNS record (for example, https://service-name.cloud-id.s.anyscaleuserdata.com) that maps to a routing rule on the load balancer.

Client devices resolve the Anyscale-provided DNS address to the load balancer over the internet. The load balancer terminates TLS using an Anyscale-managed certificate. If configured, the load balancer also verifies the service Bearer token.

The load balancer then forwards traffic to the Ray cluster hosting the service version.
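
From the client's point of view, calling a service means sending an HTTPS request to the service's DNS address, with a Bearer token when one is configured. The following sketch builds such a request with the standard library. The helper names, the default path, and the example values are hypothetical; the hostname pattern follows the example given above.

```python
import urllib.request
from typing import Optional


def service_url(service_name: str, cloud_id: str, path: str = "/") -> str:
    """Build a service URL following the pattern shown on this page."""
    return f"https://{service_name}.{cloud_id}.s.anyscaleuserdata.com{path}"


def build_service_request(
    service_name: str,
    cloud_id: str,
    bearer_token: Optional[str] = None,
    path: str = "/",
) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for an Anyscale service."""
    headers = {}
    if bearer_token:
        # Standard HTTP Bearer scheme; only needed when the service
        # is configured to require a token.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        service_url(service_name, cloud_id, path),
        headers=headers,
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the load balancer terminates TLS and, if configured, checks the token before forwarding traffic to the Ray cluster.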

Private networking for services access on EC2 and GCE

Services for private networking leverage the same pattern as public networking services. However, Anyscale configures load balancers as internal load balancers and uses the private DNS address of the load balancer in the Anyscale-managed CNAME record.

It's your responsibility to ensure that the client device can resolve the private DNS address of the load balancer and has a network route to it, for example through a VPN or another private network setup.

Services access on Kubernetes

Anyscale creates a CNAME record routing service addresses (*.s.anyscaleuserdata.com) to the networking endpoint. When you launch a service for the first time, Anyscale creates a secret named anyscale-svc-<cldrsrc-id>-certificate for TLS termination. Anyscale manages and automatically rotates the certificate at least every 3 months.

The Anyscale operator creates HTTPRoute resources to route traffic to relevant services, using the Gateway address for the CNAME record. Upon service launch, the operator applies an HTTPRoute to route traffic to the relevant service pods. Envoy Gateway terminates TLS and forwards traffic to the Ray cluster pods.

note

It's your responsibility to ensure connectivity between the client device and the Envoy Gateway load balancer.

Ray cluster to other resources

A Ray workload can access arbitrary resources as the application defines. You must ensure your workload accesses these assets in a secure manner using the appropriate protocols.