# Anyscale agent skills
Anyscale agent skills are agent instructions for Claude Code and Cursor. Each skill guides the agent through requirements gathering, code generation from validated templates, and deployment through the Anyscale CLI.
## When to use skills
Use skills when you're building or deploying a workload and want the agent to handle the end-to-end workflow:
- You need a complete deployment configuration. Skills generate configurations from validated templates that match current Anyscale APIs.
- You're making workload-specific decisions. Skills handle tradeoffs such as GPU selection, tensor parallelism settings, and autoscaling configuration based on your requirements.
- A workload is failing. Skills query logs through the Anyscale API and apply platform-specific diagnostic steps instead of offering generic suggestions.
## Setup

Before using skills, install the Anyscale CLI and run `anyscale skills install`. See Install Anyscale agent skills for prerequisites and installation steps. For permissions, blocked commands, and credential handling, see Anyscale agent skills security.
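A minimal setup sketch. It assumes the CLI is distributed on PyPI under the package name `anyscale` and that you authenticate with `anyscale login`; check the install guide for the exact steps in your environment:

```shell
# Install the Anyscale CLI (assumes the PyPI package name `anyscale`).
pip install -U anyscale

# Authenticate with your Anyscale account (may prompt for a token from the console).
anyscale login

# Install the agent skills for Claude Code and Cursor.
anyscale skills install
```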
## Skill routing
The following table maps common tasks to the skill that handles them. Claude Code and Cursor automatically route your prompt to the correct skill, but you can also invoke a skill directly with its command name.
| Task | Skill | Launches workloads |
|---|---|---|
| Deploy an LLM as a REST endpoint | /anyscale-workload-llm-serving | No |
| Serve a non-LLM model as a REST endpoint | /anyscale-workload-ray-serve | No |
| Run batch inference on a dataset | /anyscale-workload-ray-data | No |
| Generate embeddings at scale | /anyscale-workload-batch-embedding | No |
| Distribute training or fine-tuning across GPUs | /anyscale-workload-ray-train | No |
| Answer a question about Ray or Anyscale | /anyscale-platform-ask | No |
| Launch an Anyscale workspace, job, or service | /anyscale-platform-run | Yes |
| Diagnose a failing workspace, job, or service | /anyscale-platform-inspect | No |
| Fix a failing workspace, job, or service end-to-end | /anyscale-platform-fix | Yes |
| Set up Anyscale on Kubernetes (EKS, GKE, or AKS) | /anyscale-infra-kubernetes | No |
| Set up Anyscale on AWS VMs | /anyscale-infra-aws-vm | No |
| Set up Anyscale on Google Cloud VMs | /anyscale-infra-gcp-vm | No |
Skills that launch workloads incur real cloud costs. Review what the agent proposes before confirming.
## Workload skills
Workload skills generate code and configuration for AI and ML workloads. They read and write code and execute shell commands, but don't launch workloads. After a workload skill generates artifacts, the agent offers to validate them by invoking /anyscale-platform-run. See Run.
### LLM serving

/anyscale-workload-llm-serving: Deploy and configure LLM serving on Anyscale with Ray Serve and vLLM. Handles model sizing, GPU selection, tensor parallelism, autoscaling, multi-LoRA, and OpenAI-compatible endpoints.

```
/anyscale-workload-llm-serving deploy Qwen2.5-72B-Instruct on H100 GPUs
```
### Ray Serve

/anyscale-workload-ray-serve: Deploy non-LLM models as scalable REST endpoints. Covers scikit-learn, XGBoost, PyTorch, TensorFlow, embedding models, computer vision, multi-model composition, and async inference.

```
/anyscale-workload-ray-serve deploy an object detection service with retinanet_resnet50_fpn_v2 model using Anyscale service
```
### Batch inference

/anyscale-workload-ray-data: Design offline batch inference pipelines using Ray Data. Supports LLM batch inference through ray.data.llm with vLLM and custom map_batches pipelines for CV, NLP, audio, and tabular workloads.

```
/anyscale-workload-ray-data run batch inference with Qwen2.5-7B-Instruct using the PDF data, summarize each page using the LLM
```
### Batch embedding

/anyscale-workload-batch-embedding: Generate embeddings at scale using Ray Data. Supports text, image, and multimodal embeddings with both custom actor pipelines and vLLM-backed processors, with vector database integration.

```
/anyscale-workload-batch-embedding generate embeddings for 50M documents in parquet files using the model sentence-transformers/all-MiniLM-L6-v2
```
### Distributed training

/anyscale-workload-ray-train: Scale training and fine-tuning across multiple GPUs and nodes with Ray Train. Supports PyTorch, Lightning, Hugging Face Transformers, TensorFlow, XGBoost, LightGBM, DeepSpeed, and FSDP.

This skill doesn't support third-party training frameworks such as LlamaFactory or SkyRL.

```
/anyscale-workload-ray-train fine-tune ResNet152 using the image data from S3 on T4 GPUs
```
## Platform skills
Platform skills operate on live Anyscale workloads. /anyscale-platform-run and /anyscale-platform-fix launch workloads and incur real costs. /anyscale-platform-ask and /anyscale-platform-inspect are read-only and don't write code or launch workloads.
### Ask

/anyscale-platform-ask: Answer technical questions about Ray and Anyscale. Searches official documentation, source code, and local references to provide source-backed answers with architecture recommendations. Read-only: doesn't write code, modify files, or launch workloads.

```
/anyscale-platform-ask recommend an architecture for building a RAG pipeline with Ray
```
### Run

/anyscale-platform-run: Generate Anyscale configurations and execute workloads through the CLI. Handles workspace management, service deployment, job submission, configuration generation, image selection, compute setup, and environment variables.

This skill launches workloads on your Anyscale cloud, and those workloads incur real costs. Review what the agent proposes before confirming.

```
/anyscale-platform-run deploy ./my_service_code/ as an Anyscale service
```
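Behind the prompt, this skill drives the Anyscale CLI. A hedged sketch of the kinds of commands it may run on your behalf; the filenames are illustrative and the exact argument forms vary by CLI version, so verify with `anyscale --help` before relying on them:

```shell
# Deploy a service from a generated service config (filename is illustrative).
anyscale service deploy service.yaml

# Submit a batch job from a generated job config.
anyscale job submit job.yaml
```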
### Inspect

/anyscale-platform-inspect: Diagnose live Anyscale workloads, including workspaces, jobs, and services. Queries logs, events, and metrics through the Anyscale API and validates configurations against Ray source code. Outputs a bug analysis report with code suggestions. Read-only: doesn't write code or launch workloads. For full execution permissions, use /anyscale-platform-fix.

```
/anyscale-platform-inspect debug this issue in the job [insert job url here]
```
### Fix

/anyscale-platform-fix: Run the full debug-fix-validate loop for Anyscale workloads, including workspaces, jobs, and services. Uses /anyscale-platform-inspect for diagnosis, applies code fixes, and invokes /anyscale-platform-run to redeploy and verify.

This skill reads and writes code and launches workloads on your Anyscale cloud to validate fixes end-to-end. For read-only diagnosis, use /anyscale-platform-inspect.

```
/anyscale-platform-fix fix this issue in the service [insert service url here]
```
## Infrastructure skills
Infrastructure skills generate cloud configurations and walk through the setup process for the infrastructure that Anyscale runs on. They read and write code and execute shell commands, but don't provision cloud resources directly. Review all generated infrastructure files before applying them.
### Kubernetes

/anyscale-infra-kubernetes: Deploy Anyscale on Kubernetes with EKS, GKE, or AKS. Supports both the anyscale cloud setup CLI and Terraform. Covers cluster creation, Helm operator installation, networking, GPU configuration, and Karpenter.

```
/anyscale-infra-kubernetes deploy Anyscale on a new EKS cluster with GPU support
```
### AWS VMs

/anyscale-infra-aws-vm: Deploy Anyscale on AWS using virtual machines. Supports both anyscale cloud setup and Terraform with anyscale cloud register. Walks through VPC, IAM, S3, security groups, EFS, and MemoryDB configuration.

```
/anyscale-infra-aws-vm deploy Anyscale on AWS with a new VPC in us-west-2
```
### Google Cloud VMs

/anyscale-infra-gcp-vm: Deploy Anyscale on Google Cloud using VMs. Supports both anyscale cloud setup and Terraform with anyscale cloud register. Walks through VPC, IAM, GCS, firewall, Memorystore, and Workload Identity Federation configuration.

```
/anyscale-infra-gcp-vm deploy Anyscale on Google Cloud in us-central1
```
## Combine skills
Reference multiple skills in a single prompt. The agent executes them in sequence, passing context between them.
```
Use /anyscale-workload-ray-data to create a batch pipeline that extracts text from PDF files
and generates a one-paragraph summary per page using Qwen3-8B. Then use /anyscale-platform-run
to launch it in an Anyscale workspace and verify it runs end-to-end.
The PDF files are located in s3://my-bucket/docs/.
```