Choose a framework for LLM post-training

This page helps you choose the right framework for post-training large language models (LLMs) on Anyscale.

Anyscale supports post-training techniques including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), Reinforcement Learning from Verifiable Rewards (RLVR), and agentic tuning. For more details, see Post-training for LLMs on Anyscale. The following table compares three frameworks for LLM post-training and agentic tuning on Anyscale:

| Framework | Use cases and key features | Setup effort | Integration with Anyscale and Ray |
| --- | --- | --- | --- |
| LLaMA-Factory | • SFT and RLHF (DPO, KTO, PPO) for open-source LLMs; continued pretraining; multimodal recipes.<br>• Config-driven YAML runs; minimal code with provided configs (see the config sketch after this table).<br>• Guides for dataset prep, checkpointing, and experiment tracking in the Anyscale docs.<br>• Supports FSDP and DeepSpeed. | Easy: official guides and examples; minimal code when using the provided configs. | Uses Ray Train for orchestration; official Anyscale guides; runs on Ray clusters on Anyscale for scale-out training. |
| SkyRL | • RLVR and agentic tuning with real environment interactions; long-horizon tasks (for example, SWE-Bench, Text2SQL); see the environment sketch after this table.<br>• Modular, full-stack RLVR algorithms (PPO, GRPO, DAPO).<br>• Async rollouts, custom environments, and flexible model placement and colocation.<br>• Supports FSDP, DeepSpeed, and Megatron. | Medium: powerful and flexible, but requires environment setup and familiarity with Ray clusters. | Uses Ray for orchestration; deployable on Anyscale Ray clusters. |
| Ray Train | • Custom LLM and reward-model training and other architectures; low-level control with PyTorch, Lightning, and Hugging Face Accelerate (see the training-loop sketch after this table).<br>• Scalable distributed training across nodes.<br>• GPU and accelerator targeting, checkpointing, and fault tolerance.<br>• Rich Hugging Face integrations. | Medium: you write the training code, but many end-to-end examples exist. | Core Ray training library with first-class support on Anyscale; Ray Train scales natively across nodes on Anyscale Ray clusters. |
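
To make "config-driven YAML runs" concrete, the following sketch generates a minimal LLaMA-Factory SFT config. The keys follow LLaMA-Factory's documented YAML schema, but the model, dataset, and hyperparameter values are illustrative placeholders, not a tuned recipe:

```python
# A minimal sketch of a config-driven LLaMA-Factory SFT run, assuming LoRA
# fine-tuning of a Llama 3 8B model on the demo Alpaca dataset bundled with
# LLaMA-Factory. All values below are illustrative placeholders.
from pathlib import Path

import yaml  # pip install pyyaml

sft_config = {
    "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model
    "stage": "sft",                   # supervised fine-tuning
    "do_train": True,
    "finetuning_type": "lora",        # LoRA keeps GPU memory requirements modest
    "dataset": "alpaca_en_demo",      # demo dataset shipped with LLaMA-Factory
    "template": "llama3",             # chat template matching the base model
    "cutoff_len": 2048,
    "output_dir": "saves/llama3-8b/lora/sft",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "learning_rate": 1.0e-4,
    "num_train_epochs": 3.0,
}

# Write the config, then launch training with LLaMA-Factory's CLI:
#   llamafactory-cli train llama3_lora_sft.yaml
Path("llama3_lora_sft.yaml").write_text(yaml.safe_dump(sft_config))
```

In practice you keep such configs under version control and only edit values, which is why the setup effort stays low.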
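SkyRL-style RLVR training follows the familiar agent-environment loop: the policy emits text actions, and the environment returns observations plus a programmatically verifiable reward, with no human labeling. The sketch below illustrates that loop for a toy Text2SQL-style task. The class and method names are invented for illustration and are not SkyRL's actual API; consult the SkyRL documentation for the real interfaces:

```python
# Hypothetical sketch of a verifiable-reward text environment, illustrating
# the RLVR loop that SkyRL-style frameworks train against. Names are invented
# for illustration; this is not SkyRL's API.
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str   # feedback shown to the policy on the next turn
    reward: float      # programmatically verified, no human labeling
    done: bool


class Text2SQLEnv:
    """Toy environment: reward 1.0 if the emitted SQL matches the reference."""

    def __init__(self, question: str, reference_sql: str):
        self.question = question
        self.reference_sql = reference_sql

    def reset(self) -> str:
        return f"Write a SQL query: {self.question}"

    def step(self, action: str) -> StepResult:
        # A real environment would execute both queries against a database and
        # compare result sets; whitespace-normalized string comparison keeps
        # this example self-contained.
        normalize = lambda sql: " ".join(sql.lower().split())
        correct = normalize(action) == normalize(self.reference_sql)
        return StepResult(
            observation="correct" if correct else "incorrect, try again",
            reward=1.0 if correct else 0.0,
            done=correct,
        )


# One rollout step with a hard-coded "policy" output:
env = Text2SQLEnv("count users", "SELECT COUNT(*) FROM users")
print(env.reset())
print(env.step("select count(*) from users"))
```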
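The following sketch shows the kind of low-level control Ray Train gives you: you write an ordinary PyTorch training loop and hand it to TorchTrainer, which launches it across workers. ScalingConfig, prepare_model, get_device, and train.report are Ray Train's public APIs; the GPT-2 model and the toy in-memory batch are placeholder assumptions to keep the example self-contained:

```python
# A minimal sketch of scaling a Hugging Face causal-LM training loop with
# Ray Train's TorchTrainer. Model and data are toy placeholders.
import torch
from ray import train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer, get_device, prepare_model


def train_loop_per_worker(config):
    # Each Ray worker runs this function; Ray Train sets up the process
    # group, so standard PyTorch DDP-style code works unchanged.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
    model = prepare_model(AutoModelForCausalLM.from_pretrained("gpt2"))
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])

    # Toy in-memory batch; real jobs stream data with Ray Data or a DataLoader.
    batch = tokenizer(["Hello from Ray Train!"] * 4, return_tensors="pt", padding=True)
    batch = {k: v.to(get_device()) for k, v in batch.items()}

    for epoch in range(config["epochs"]):
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        train.report({"epoch": epoch, "loss": loss.item()})


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"lr": 1e-5, "epochs": 2},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=True),  # False for CPU
)
result = trainer.fit()
```

On Anyscale the same script runs unchanged; increasing num_workers in the ScalingConfig is what scales training across nodes.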

Other LLM post-training and RL frameworks

While the preceding recommendations cover most practical needs, many other open-source RL libraries built on Ray, such as verl and NeMo-RL, also run well on Anyscale.

Because most of these libraries are built on Ray, they can run on any Ray cluster, including the ones Anyscale provides. For an example of running verl on Anyscale, see Train LLMs with reinforcement learning using verl. For a detailed overview and comparison of additional libraries, see the Open Source RL Libraries for LLMs blog post.

Note

This comparison reflects current practices and may evolve with new releases. For the latest information, consult the Anyscale documentation or the linked resources.