Supported models for fine-tuning
The following table lists the models available for fine-tuning. Each model has a context length: the maximum number of tokens the model can process in a single input. The context length depends on whether you're running inference on the base model, fine-tuning the base model, or running inference on the fine-tuned model.
Model | Size | Base context length | Fine-tuning context length | Inference context length |
---|---|---|---|---|
meta-llama/Meta-Llama-3-8B-Instruct | 8B | 8192 tokens | Up to 32768 tokens | The greater of the base context length and fine-tuning context length used. |
meta-llama/Meta-Llama-3-70B-Instruct | 70B | 8192 tokens | Up to 16384 tokens | The greater of the base context length and fine-tuning context length used. |
mistralai/Mistral-7B-Instruct-v0.1 | 7B | 8192 tokens | Up to 8192 tokens | The greater of the base context length and fine-tuning context length used. |
mistralai/Mixtral-8x7B-Instruct-v0.1 | 8x7B | 32768 tokens | Up to 32768 tokens | The greater of the base context length and fine-tuning context length used. |
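Before fine-tuning or running inference, you can check whether an example fits within a model's context length by counting its tokens. The following is a minimal sketch using the Hugging Face `transformers` tokenizer; it assumes you have the `transformers` library installed and access to the gated model repository on Hugging Face.

```python
# Sketch: count tokens in a prompt and compare against the base context length
# of meta-llama/Meta-Llama-3-8B-Instruct (8192 tokens, per the table above).
from transformers import AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
BASE_CONTEXT_LENGTH = 8192  # from the table above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

prompt = "Summarize the following support ticket: ..."  # placeholder input
num_tokens = len(tokenizer.encode(prompt))

if num_tokens > BASE_CONTEXT_LENGTH:
    print(f"Prompt is {num_tokens} tokens and exceeds the {BASE_CONTEXT_LENGTH}-token context length.")
else:
    print(f"Prompt fits: {num_tokens}/{BASE_CONTEXT_LENGTH} tokens.")
```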
Fine-tuning a model with a context length longer than its base context length can increase training time and reduce model quality.
meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-13b-chat-hf, and meta-llama/Llama-2-70b-chat-hf have moved to the legacy models list. To fine-tune a similar or better model, use meta-llama/Meta-Llama-3-8B-Instruct or meta-llama/Meta-Llama-3-70B-Instruct instead.
Serving support for fine-tuned Llama-2 models expires on June 15, 2024.
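To migrate an existing workflow, swap the legacy Llama-2 model name for a Llama 3 model when you create the fine-tuning job. The following sketch uses the OpenAI Python client and assumes an OpenAI-compatible fine-tuning endpoint; the base URL, training file ID, and API key shown here are placeholders, so check the Anyscale fine-tuning docs for the authoritative workflow and parameters.

```python
# Sketch: create a fine-tuning job with a Llama 3 model in place of a legacy Llama 2 model.
import openai

client = openai.OpenAI(
    base_url="https://api.endpoints.anyscale.com/v1",  # assumed endpoint; verify in the Anyscale docs
    api_key="YOUR_ANYSCALE_API_KEY",                   # placeholder credential
)

# Previously: model="meta-llama/Llama-2-7b-chat-hf" (now on the legacy models list).
job = client.fine_tuning.jobs.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    training_file="file-abc123",  # placeholder ID of an already-uploaded training file
)
print(job.id)
```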