Fine-tuning Llama-2 models with DeepSpeed, Accelerate, and Ray Train

💻 Try it out

Run this example in the Anyscale Console or view it on GitHub.

This template shows you how to fine-tune Llama-2 models with DeepSpeed, Accelerate, and Ray Train.

Step 1: Fine-tune the Llama-2-7B model

The --as-test flag is for testing purposes: it runs only one forward and backward pass and checkpoints the model. A test run takes ~7 minutes. Without this flag, a full run takes ~42 minutes (3 epochs for optimal model quality).

Model checkpoints are stored under {user's first name}/ft_llms_with_deepspeed/ in the cloud storage bucket created for your Anyscale account. The full path is printed in the output after training completes.

python train.py --size=7b --as-test
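
Under the hood, train.py wires Ray Train, Accelerate, and DeepSpeed together. The sketch below only illustrates that pattern and is not the template's actual code; the model name, hyperparameters, and worker count are placeholder assumptions.

# Hypothetical sketch of the Ray Train + Accelerate + DeepSpeed pattern.
# This is NOT the template's train.py; names and values are illustrative.
import torch
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer
from transformers import AutoModelForCausalLM, AutoTokenizer


def train_loop_per_worker(config):
    # Each Ray Train worker runs this function; Accelerate hands the heavy
    # lifting (ZeRO sharding, optimizer state partitioning) to DeepSpeed.
    accelerator = Accelerator(
        deepspeed_plugin=DeepSpeedPlugin(zero_stage=3, gradient_accumulation_steps=1)
    )
    tokenizer = AutoTokenizer.from_pretrained(config["model_name"])
    model = AutoModelForCausalLM.from_pretrained(config["model_name"])
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model, optimizer = accelerator.prepare(model, optimizer)

    # Toy single batch; the real template iterates over the tokenized JSONL data.
    batch = tokenizer(["Hello, Llama!"], return_tensors="pt").to(accelerator.device)
    loss = model(**batch, labels=batch["input_ids"]).loss
    accelerator.backward(loss)
    optimizer.step()


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"model_name": "meta-llama/Llama-2-7b-hf"},
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),  # one worker per GPU
)
trainer.fit()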

(Optional) Step 2: Switch to a different model size

  • Change the model size (7B, 13B, or 70B) with the --size option in the previous command.
  • Change the worker node type accordingly to fit the model (edit the worker nodes in your workspace configuration):
    • use g5.12xlarge for 13B
    • use g5.48xlarge for 70B
  • Run the command below to kick off fine-tuning with the new model size and worker nodes.
    • 13B: ~9 minutes for a test run, ~60 minutes for a full run (3 epochs)
    • 70B: ~35 minutes for a test run, ~400 minutes for a full run (3 epochs)
python train.py --size=13b --as-test
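
For the 70B model, the command is the same with the size option changed:

python train.py --size=70b --as-test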

Step 3: Use your custom data

  • Replace the contents of ./data/train.jsonl with your own training data.
  • (Optional) Replace the contents of ./data/test.jsonl with your own test data, if any.
  • (Optional) Add special tokens in ./data/tokens.json, if any.

Use the same command to train with your own data.
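
If you generate the JSONL files programmatically, a minimal sketch looks like the following. The "text" field and prompt layout here are assumptions; mirror whatever schema the template's existing ./data/train.jsonl uses.

# Hypothetical helper for writing ./data/train.jsonl.
# The "text" field and prompt format are assumptions, not the template's schema.
import json

examples = [
    {"text": "### Instruction:\nSummarize Ray Train.\n\n### Response:\nRay Train scales model training across a cluster."},
    {"text": "### Instruction:\nWhat is DeepSpeed ZeRO?\n\n### Response:\nA set of memory optimizations for large-model training."},
]

with open("./data/train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")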


What's next?

You have fine-tuned your own Llama-2 models. Want to go further? See the advanced tutorials below:

  • Walkthrough of this template: navigate to tutorials/walkthrough.md
  • Fine-tune Llama-2 with LoRA adapters: navigate to tutorials/lora.md
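
As a preview of the LoRA tutorial, a typical adapter setup with the Hugging Face peft library looks roughly like this; the rank, scaling factor, and target modules are illustrative assumptions, not the tutorial's exact settings.

# Illustrative LoRA setup with peft; values are examples only.
# See tutorials/lora.md for the template's actual configuration.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable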