Fine-tuning Llama-2 models with DeepSpeed, Accelerate, and Ray Train
💻 Try it out
Run this example in the Anyscale Console or view it on GitHub.
This template shows you how to fine-tune Llama-2 models.
Step 1: Fine-tune the Llama-2-7B model
- Run the command below to kick off fine-tuning with sample data: the Grade School Math 8K (GSM8K) dataset. The `--as-test` flag is for testing purposes: it runs only one forward and backward pass and then checkpoints the model, which takes ~7 minutes. Without this flag, a full run (3 epochs, for optimal model quality) takes ~42 minutes.
Model checkpoints are stored under `{user's first name}/ft_llms_with_deepspeed/` in the cloud storage bucket created for your Anyscale account. The full path is printed in the output after training completes. A rough sketch of the Ray Train and Accelerate pattern behind this command follows it below.
python train.py --size=7b --as-test
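For orientation, here is a rough sketch of how Ray Train, Accelerate, and DeepSpeed are typically combined in a script like `train.py`. It is not the template's actual implementation: the helper `build_model_and_data()`, the worker count, and the config values are illustrative assumptions.

```python
from accelerate import Accelerator
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Accelerate handles device placement and, when configured, the DeepSpeed plugin.
    accelerator = Accelerator()
    model, optimizer, train_dataloader = build_model_and_data(config)  # hypothetical helper
    model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

    for _ in range(config["num_epochs"]):
        for batch in train_dataloader:
            outputs = model(**batch)
            accelerator.backward(outputs.loss)
            optimizer.step()
            optimizer.zero_grad()


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"model_size": "7b", "num_epochs": 3},
    scaling_config=ScalingConfig(num_workers=16, use_gpu=True),
)
trainer.fit()
```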
(Optional) Step 2: Switch to a different model size
- Change the model size (7B, 13B, or 70B) with the `--size` option in the previous command.
- Change the worker node type accordingly to fit the model:
  - use g5.12xlarge for 13B
  - use g5.48xlarge for 70B
- Run the command below to kick off fine-tuning with the new model size and worker nodes. A sketch of how the node choice maps to a Ray Train scaling configuration follows the command. Approximate run times:
  - 13B: ~9 minutes for a test run; ~60 minutes for a full run (3 epochs)
  - 70B: ~35 minutes for a test run; ~400 minutes for a full run (3 epochs)
python train.py --size=13b --as-test
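To make the resource switch concrete, the mapping from model size to worker nodes could be expressed as Ray Train `ScalingConfig` objects, as in the sketch below. The worker and node counts are assumptions for illustration; the template derives its resource requirements from the `--size` flag.

```python
from ray.train import ScalingConfig

# g5.12xlarge has 4 A10G GPUs per node; g5.48xlarge has 8 A10G GPUs per node.
SCALING_BY_SIZE = {
    "13b": ScalingConfig(num_workers=16, use_gpu=True),  # e.g. 4 x g5.12xlarge nodes (assumption)
    "70b": ScalingConfig(num_workers=32, use_gpu=True),  # e.g. 4 x g5.48xlarge nodes (assumption)
}
```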
Step 3: Use your custom data
- Replace the contents of `./data/train.jsonl` with your own training data.
- (Optional) Replace the contents of `./data/test.jsonl` with your own test data, if any.
- (Optional) Add special tokens to `./data/tokens.json`, if any.
Then use the same command as in Step 1 to train with your own data. A sketch of writing the JSONL file is shown below.
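As a minimal sketch of preparing the training file, the snippet below writes GSM8K-style examples to `./data/train.jsonl`. The field names `input` and `output` are assumptions made for illustration; match the schema already used in the template's existing `./data/train.jsonl`.

```python
import json

# Illustrative GSM8K-style example; replace with your own data.
examples = [
    {
        "input": "Natalia sold clips to 48 of her friends in April, and then half as many in May. How many clips did she sell altogether?",
        "output": "She sold 48 / 2 = 24 clips in May, so 48 + 24 = 72 clips in total. The answer is 72.",
    },
]

with open("./data/train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```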
What's next?
You have now fine-tuned your own Llama-2 model. Want to go further? See the advanced tutorials below:
- Walkthrough of this template: navigate to `tutorials/walkthrough.md`
- Fine-tune Llama-2 with LoRA adapters: navigate to `tutorials/lora.md`