# Anyscale agent skills
Anyscale agent skills are agent instructions for Claude Code and Cursor. Each skill guides the agent through requirements gathering, code generation from validated templates, and deployment through the Anyscale CLI.
## When to use skills
Use skills when you're building or deploying a workload and want the agent to handle the end-to-end workflow:
- You need a complete deployment configuration. Skills generate configurations from validated templates that match current Anyscale APIs.
- You're making workload-specific decisions. Skills handle tradeoffs such as GPU selection, tensor parallelism settings, and autoscaling configuration based on your requirements.
- A workload is failing. Skills query logs through the Anyscale API and apply platform-specific diagnostic steps instead of offering generic suggestions.
## Setup

Before using skills, install the Anyscale CLI and run `anyscale skills install`. See Install Anyscale agent skills for prerequisites and installation steps. For permissions, blocked commands, and credential handling, see Anyscale agent skills security.
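A minimal setup sketch. It assumes the CLI is distributed on PyPI under the package name `anyscale` and that you authenticate with `anyscale login`; check the install guide for the exact steps in your environment:

```shell
# Install the Anyscale CLI (assumes the PyPI package name `anyscale`).
pip install -U anyscale

# Authenticate with your Anyscale account (may prompt for a token from the console).
anyscale login

# Install the agent skills for Claude Code and Cursor.
anyscale skills install
```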
## Skill routing
The following table maps common tasks to the skill that handles them. Claude Code and Cursor automatically route your prompt to the correct skill, but you can also invoke a skill directly with its command name.
| Task | Skill | Launches workloads |
|---|---|---|
| Deploy an LLM as a REST endpoint | /anyscale-workload-llm-serving | No |
| Serve a non-LLM model as a REST endpoint | /anyscale-workload-ray-serve | No |
| Run batch inference on a dataset | /anyscale-workload-ray-data | No |
| Generate embeddings at scale | /anyscale-workload-batch-embedding | No |
| Distribute training or fine-tuning across GPUs | /anyscale-workload-ray-train | No |
| Answer a question about Ray or Anyscale | /anyscale-platform-ask | No |
| Launch an Anyscale workspace, job, or service | /anyscale-platform-run | Yes |
| Diagnose a failing workspace, job, or service | /anyscale-platform-inspect | No |
| Fix a failing workspace, job, or service end-to-end | /anyscale-platform-fix | Yes |
| Set up Anyscale on Kubernetes (EKS, GKE, or AKS) | /anyscale-infra-kubernetes | No |
| Set up Anyscale on AWS VMs | /anyscale-infra-aws-vm | No |
| Set up Anyscale on Google Cloud VMs | /anyscale-infra-gcp-vm | No |
Skills that launch workloads incur real cloud costs. Review what the agent proposes before confirming.
## Workload skills
Workload skills generate code and configuration for AI and ML workloads. They read and write code and execute shell commands, but don't launch workloads. After a workload skill generates artifacts, the agent offers to validate them by invoking /anyscale-platform-run. See Run.
### LLM serving

/anyscale-workload-llm-serving: Deploy and configure LLM serving on Anyscale with Ray Serve and vLLM. Handles model sizing, GPU selection, tensor parallelism, autoscaling, multi-LoRA, and OpenAI-compatible endpoints.

```
/anyscale-workload-llm-serving deploy Qwen2.5-72B-Instruct on H100 GPUs
```
### Ray Serve

/anyscale-workload-ray-serve: Deploy non-LLM models as scalable REST endpoints. Covers scikit-learn, XGBoost, PyTorch, TensorFlow, embedding models, computer vision, multi-model composition, and async inference.

```
/anyscale-workload-ray-serve deploy an object detection service with retinanet_resnet50_fpn_v2 model using Anyscale service
```
### Batch inference

/anyscale-workload-ray-data: Design offline batch inference pipelines using Ray Data. Supports LLM batch inference through ray.data.llm with vLLM and custom map_batches pipelines for CV, NLP, audio, and tabular workloads.

```
/anyscale-workload-ray-data run batch inference with Qwen2.5-7B-Instruct using the PDF data, summarize each page using the LLM
```
### Batch embedding

/anyscale-workload-batch-embedding: Generate embeddings at scale using Ray Data. Supports text, image, and multimodal embeddings with both custom actor pipelines and vLLM-backed processors, with vector database integration.

```
/anyscale-workload-batch-embedding generate embeddings for 50M documents in parquet files using the model sentence-transformers/all-MiniLM-L6-v2
```
### Distributed training

/anyscale-workload-ray-train: Scale training and fine-tuning across multiple GPUs and nodes with Ray Train. Supports PyTorch, Lightning, Hugging Face Transformers, TensorFlow, XGBoost, LightGBM, DeepSpeed, and FSDP.

This skill doesn't support third-party training frameworks such as LlamaFactory or SkyRL.

```
/anyscale-workload-ray-train fine-tune ResNet152 using the image data from S3 on T4 GPUs
```
## Platform skills
Platform skills operate on live Anyscale workloads. /anyscale-platform-run and /anyscale-platform-fix launch workloads and incur real costs. /anyscale-platform-ask and /anyscale-platform-inspect are read-only and don't write code or launch workloads.
### Ask

/anyscale-platform-ask: Answer technical questions about Ray and Anyscale. Searches official documentation, source code, and local references to provide source-backed answers with architecture recommendations. Read-only: doesn't write code, modify files, or launch workloads.

```
/anyscale-platform-ask recommend an architecture for building a RAG pipeline with Ray
```
### Run

/anyscale-platform-run: Generate Anyscale configurations and execute workloads through the CLI. Handles workspace management, service deployment, job submission, configuration generation, image selection, compute setup, and environment variables.

This skill launches workloads on your Anyscale cloud, and those workloads incur real costs. Review what the agent proposes before confirming.

```
/anyscale-platform-run deploy ./my_service_code/ as an Anyscale service
```
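Behind the prompt, this skill drives the Anyscale CLI. A hedged sketch of the kinds of commands it may run on your behalf; the filenames are illustrative and the exact argument forms vary by CLI version, so verify with `anyscale --help` before relying on them:

```shell
# Deploy a service from a generated service config (filename is illustrative).
anyscale service deploy service.yaml

# Submit a batch job from a generated job config.
anyscale job submit job.yaml
```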
### Inspect

/anyscale-platform-inspect: Diagnose live Anyscale workloads, including workspaces, jobs, and services. Queries logs, events, and metrics through the Anyscale API and validates configurations against Ray source code. Outputs a bug analysis report with code suggestions. Read-only: doesn't write code or launch workloads. For full execution permissions, use /anyscale-platform-fix.

```
/anyscale-platform-inspect debug this issue in the job [insert job url here]
```
### Fix

/anyscale-platform-fix: Run the full debug-fix-validate loop for Anyscale workloads, including workspaces, jobs, and services. Uses /anyscale-platform-inspect for diagnosis, applies code fixes, and invokes /anyscale-platform-run to redeploy and verify.

This skill reads and writes code and launches workloads on your Anyscale cloud to validate fixes end-to-end. For read-only diagnosis, use /anyscale-platform-inspect.

```
/anyscale-platform-fix fix this issue in the service [insert service url here]
```
## Infrastructure skills
Infrastructure skills generate cloud configurations and walk through the setup process for the infrastructure that Anyscale runs on. They read and write code and execute shell commands, but don't provision cloud resources directly. Review all generated infrastructure files before applying them.
### Kubernetes

/anyscale-infra-kubernetes: Deploy Anyscale on Kubernetes with EKS, GKE, or AKS. Supports both the anyscale cloud setup CLI and Terraform. Covers cluster creation, Helm operator installation, networking, GPU configuration, and Karpenter.

```
/anyscale-infra-kubernetes deploy Anyscale on a new EKS cluster with GPU support
```
### AWS VMs

/anyscale-infra-aws-vm: Deploy Anyscale on AWS using virtual machines. Supports both anyscale cloud setup and Terraform with anyscale cloud register. Walks through VPC, IAM, S3, security groups, EFS, and MemoryDB configuration.

```
/anyscale-infra-aws-vm deploy Anyscale on AWS with a new VPC in us-west-2
```
### Google Cloud VMs

/anyscale-infra-gcp-vm: Deploy Anyscale on Google Cloud using VMs. Supports both anyscale cloud setup and Terraform with anyscale cloud register. Walks through VPC, IAM, GCS, firewall, Memorystore, and Workload Identity Federation configuration.

```
/anyscale-infra-gcp-vm deploy Anyscale on Google Cloud in us-central1
```
## Combine skills
Reference multiple skills in a single prompt. The agent executes them in sequence, passing context between them.
```
Use /anyscale-workload-ray-data to create a batch pipeline that extracts text from PDF files
and generates a one-paragraph summary per page using Qwen3-8B. Then use /anyscale-platform-run
to launch it in an Anyscale workspace and verify it runs end-to-end.
The PDF files are located in s3://my-bucket/docs/.
```