Supported models for fine-tuning

Changes to Anyscale Endpoints API

Effective August 1, 2024, the Anyscale Endpoints API will be available exclusively through the fully Hosted Anyscale Platform. Multi-tenant access to LLM models will be removed.

With the Hosted Anyscale Platform, you can access the latest GPUs, billed by the second, and deploy models on your own dedicated instances. Enjoy full customization to build your end-to-end applications with Anyscale. Get started today.

The following table lists the models available for fine-tuning. Each model has a context length: the maximum number of tokens the model can process in a single input. The effective context length depends on whether you're running inference on the base model, fine-tuning the base model, or running inference on the fine-tuned model.

| Model | Size | Base context length | Fine-tuning context length | Inference context length |
| --- | --- | --- | --- | --- |
| meta-llama/Meta-Llama-3-8B-Instruct | 8B | 8192 tokens | Up to 32768 tokens | The greater of the base context length and the fine-tuning context length used. |
| meta-llama/Meta-Llama-3-70B-Instruct | 70B | 8192 tokens | Up to 16384 tokens | The greater of the base context length and the fine-tuning context length used. |
| mistralai/Mistral-7B-Instruct-v0.1 | 7B | 8192 tokens | Up to 8192 tokens | The greater of the base context length and the fine-tuning context length used. |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 8x7B | 32768 tokens | Up to 32768 tokens | The greater of the base context length and the fine-tuning context length used. |
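
For orientation, here is a minimal sketch of launching a fine-tuning job for one of these models with the OpenAI-compatible Python client. The base URL, environment variable names, and file ID below are illustrative assumptions, not documented values; check your Anyscale deployment for the exact endpoint and credentials.

```python
import os

from openai import OpenAI  # OpenAI-compatible client

# Assumption: an Anyscale deployment that exposes an OpenAI-compatible
# fine-tuning API. The base URL and env vars are placeholders.
client = OpenAI(
    base_url=os.environ["ANYSCALE_BASE_URL"],  # hypothetical endpoint URL
    api_key=os.environ["ANYSCALE_API_KEY"],    # hypothetical credential
)

# "file-abc123" is a placeholder ID for a training file already
# uploaded to the service.
job = client.fine_tuning.jobs.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    training_file="file-abc123",
)
print(job.id, job.status)
```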
Context length considerations

Fine-tuning a model with a context length longer than its base context length may increase training time and result in reduced model quality.
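
Before raising the fine-tuning context length, it can help to measure how long your training examples actually are. A minimal sketch using a Hugging Face tokenizer, assuming you have access to the model's tokenizer (Meta Llama 3 repositories are gated on the Hugging Face Hub) and a JSONL file of chat-formatted examples named train.jsonl:

```python
import json

from transformers import AutoTokenizer

# Assumption: local access to the tokenizer; Meta Llama 3 repos are
# gated, so this requires an approved Hugging Face account.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

max_tokens = 0
with open("train.jsonl") as f:  # placeholder path to your training data
    for line in f:
        example = json.loads(line)
        # Render the chat messages roughly the way a fine-tuning job would.
        ids = tokenizer.apply_chat_template(example["messages"])
        max_tokens = max(max_tokens, len(ids))

print(f"Longest example: {max_tokens} tokens")
```

If the longest example fits in the base context length (8192 tokens for the Llama 3 models), there is no need to extend the fine-tuning context length at all.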

Deprecation of Llama-2 models

meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-13b-chat-hf, and meta-llama/Llama-2-70b-chat-hf have transitioned to the legacy models list. Moving forward, fine-tune meta-llama/Meta-Llama-3-8B-Instruct or meta-llama/Meta-Llama-3-70B-Instruct instead for similar or better quality; migrating is typically a one-line change, as shown in the sketch below.

Serving support for fine-tuned Llama-2 models expires on June 15, 2024.
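
Migrating an existing fine-tuning workflow usually amounts to swapping the model identifier. A sketch, reusing the hypothetical client from the earlier example:

```python
# Before (deprecated): model="meta-llama/Llama-2-7b-chat-hf"
# After: point the same job at the Llama 3 replacement.
job = client.fine_tuning.jobs.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # replaces Llama-2-7b-chat-hf
    training_file="file-abc123",  # placeholder file ID from the earlier example
)
```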