Model deprecations

Check your docs version

These docs are for the new Anyscale design. If you started using Anyscale before April 2024, use Version 1.0.0 of the docs. If you're transitioning to Anyscale Preview, see the guide for how to migrate.

Changes to Anyscale Endpoints API

Effective August 1, 2024, the Anyscale Endpoints API will be available exclusively through the fully Hosted Anyscale Platform. Multi-tenant access to LLMs will be removed.

With the Hosted Anyscale Platform, you can access the latest GPUs billed by the second, and deploy models on your own dedicated instances. Enjoy full customization to build your end-to-end applications with Anyscale. Get started today.


Deprecated models remain accessible on the Anyscale Platform. For assistance, contact

  • Support discontinued for CodeLlama-70B as of May 28, 2024 and fine-tuned Llama-2 models from June 15, 2024.

  • Support discontinued for Llama-2-7B, Llama-2-13B, and Llama-2-70B as of May 7, 2024: Transition to Llama-3 models recommended. Traffic to Llama-2-7B and Llama-2-13B redirects automatically to Llama-3-8B, and traffic to Llama-2-70B redirects to Llama-3-70B.

  • Support discontinued for SafetyLlama, Zephyr-7B, and Mistral-7B-OpenOrca as of February 22, 2024: Transition to Mistral-7B recommended. Traffic to Zephyr-7B and Mistral-7B-OpenOrca redirects automatically to Mistral-7B.

  • Latency degradation in Llama 13B from February 22, 2024: Transition to Mistral 7B, Mixtral 70B, or Llama 70B advised for enhanced performance.

  • Traffic rerouted from Code Llama 34B to Code Llama 70B as of February 15, 2024: Requests to Code Llama 34B are now served by Code Llama 70B, which gives improved quality at no additional cost. Transition to Code Llama 70B recommended.
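
The automatic redirects listed above can be mirrored on the client side, so that logs and configuration name the model that actually serves a request. The following is a minimal sketch, not part of any Anyscale SDK: the `REDIRECTS` table and the `resolve_model` helper are hypothetical, and the model names are copied from the notices above in hyphenated form, so the identifiers your API calls use may differ.

```python
# Hypothetical client-side table of the redirects described in the notices.
# Names follow the prose above; actual API model identifiers may differ.
REDIRECTS = {
    "Llama-2-7B": "Llama-3-8B",
    "Llama-2-13B": "Llama-3-8B",
    "Llama-2-70B": "Llama-3-70B",
    "Zephyr-7B": "Mistral-7B",
    "Mistral-7B-OpenOrca": "Mistral-7B",
    "CodeLlama-34B": "CodeLlama-70B",
}


def resolve_model(name: str) -> str:
    """Follow redirects until reaching a model with no further redirect."""
    seen = set()
    # The `seen` set guards against an accidental cycle in the table.
    while name in REDIRECTS and name not in seen:
        seen.add(name)
        name = REDIRECTS[name]
    return name


if __name__ == "__main__":
    for requested in ("Llama-2-13B", "Zephyr-7B", "Mistral-7B"):
        print(f"{requested} -> {resolve_model(requested)}")
```

Models with no listed redirect, such as SafetyLlama, are returned unchanged. The table records only routing, not support status: CodeLlama-70B appears as a redirect target even though the first notice lists it as discontinued as of May 28, 2024.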