Get started

Fine-tuning involves taking a pre-trained model and further refining it on a domain-specific dataset. The decision to fine-tune has subtle considerations: it's most useful when techniques such as prompt engineering, document retrieval, and tool use can't achieve the desired style, format, or tone. For niche tasks with smaller open source models, fine-tuning shows promising cost-saving potential without sacrificing quality.

Follow this guide for running fine-tuning on your data.


This guide outlines how to initiate fine-tuning with Anyscale Private Endpoints. For foundational fine-tuning information, consult the OpenAI documentation, as the formats and core steps remain the same.

Step 0: Prerequisites

Set up your account

Sign up for Anyscale Private Endpoints to receive an invite code. Then, create an account or sign in through the Anyscale Console.

Anyscale authentication

Install the Anyscale command-line tool and authenticate your account.

Satisfy cloud prerequisites

Deploy your Anyscale Cloud. Then, ensure that your cloud has sufficient quota to fine-tune your LLMs.

Step 1: Select your cloud and model

In the New fine-tuning job page, choose the following:

  • The cloud to run the fine-tuning job on.
  • The model to use as a base.

Step 2: Prepare training data

To guide the base model, prepare a diverse set of conversations in a training dataset for fine-tuning and an optional validation dataset to provide a check against over-fitting. Both files should be in .jsonl (JSON Lines) format, the same format that OpenAI's chat completions API uses.
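As a sketch of one way to produce both files, the snippet below shuffles a single .jsonl dataset and holds out a fraction of the records for validation. The file paths, the 10% split, and the helper names are assumptions for illustration, not requirements of the service:

```python
import json
import random


def split_dataset(path, valid_fraction=0.1, seed=0):
    """Shuffle a .jsonl dataset and hold out a fraction for validation."""
    with open(path, "r", encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    random.Random(seed).shuffle(records)
    n_valid = max(1, int(len(records) * valid_fraction))
    # Return (training records, validation records).
    return records[n_valid:], records[:n_valid]


def write_jsonl(path, records):
    """Write records back out, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

For example, `train, valid = split_dataset("dataset.jsonl")` followed by `write_jsonl("train.jsonl", train)` and `write_jsonl("valid.jsonl", valid)` yields the two files this step expects.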

Example .jsonl format

Each line is a JSON object with a messages array that represents one conversation. Each conversation uses three roles:

  • system - Description of the behavior or personality of the model.
  • user - User message, or "prompt," that provides a request for the model to respond to.
  • assistant - Stores previous responses but can also contain examples of intended responses.
{"messages": [{"role": "system", "content": "Answer in fewer than 25 words."}, {"role": "user", "content": "What's the difference between yogurt and labneh?"}, {"role": "assistant", "content": "Labneh is strained yogurt, thicker in texture with a tangier taste, often used as a dip or spread."}]}
{"messages": [{"role": "system", "content": "Answer in fewer than 25 words."}, {"role": "user", "content": "What is a floating point operation?"}, {"role": "assistant", "content": "A floating point operation is a calculation involving real numbers with decimal points, allowing for a wide range of values with fractional precision."}]}

Conversations can consist of a single query and reply or contain multiple "query and reply" exchanges.
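For instance, a multi-turn conversation is still a single line in the file: the messages array simply alternates user and assistant turns after the system message. The sketch below builds such a record (the content is illustrative):

```python
import json

# One multi-turn conversation: the assistant responds twice in the same record.
conversation = {
    "messages": [
        {"role": "system", "content": "Answer in fewer than 25 words."},
        {"role": "user", "content": "What's the difference between yogurt and labneh?"},
        {"role": "assistant", "content": "Labneh is strained yogurt, thicker and tangier."},
        {"role": "user", "content": "Can I make labneh at home?"},
        {"role": "assistant", "content": "Yes: strain plain yogurt through cheesecloth for 12 to 24 hours."},
    ]
}

# Each record occupies exactly one line in the .jsonl file.
line = json.dumps(conversation)
```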

Check data formatting

To verify your data formatting before running any fine-tuning job, run the following Python script:

import json

DATA_PATH = "train.jsonl"  # Replace with the path to your training data.


# Load the dataset
with open(DATA_PATH, "r", encoding="utf-8") as f:
    items = [json.loads(line) for line in f]


class DataFormatError(Exception):
    pass


def check_data_for_format_errors(items: list):
    for line_num, batch in enumerate(items):
        prefix = f"Error in line #{line_num + 1}: "
        if not isinstance(batch, dict):
            raise DataFormatError(
                f"{prefix}Each line in the provided data should be a dictionary."
            )
        if "messages" not in batch:
            raise DataFormatError(
                f"{prefix}Each line in the provided data should have a 'messages' key."
            )
        if not isinstance(batch["messages"], list):
            raise DataFormatError(
                f"{prefix}Each line in the provided data should have a 'messages' key with a list of messages."
            )
        messages = batch["messages"]
        if not any(message.get("role", None) == "assistant" for message in messages):
            raise DataFormatError(
                f"{prefix}Each message list should have at least one message with role 'assistant'."
            )
        for message_num, message in enumerate(messages):
            prefix = f"Error in line #{line_num + 1}, message #{message_num + 1}: "
            if "role" not in message or "content" not in message:
                raise DataFormatError(
                    f"{prefix}Each message should have a 'role' and 'content' key."
                )
            if any(k not in ("role", "content", "name") for k in message):
                raise DataFormatError(
                    f"{prefix}Each message should only have 'role', 'content', and 'name' keys, any other key is not allowed."
                )
            if message.get("role", None) not in ("system", "user", "assistant"):
                raise DataFormatError(
                    f"{prefix}Each message should have a valid role (system, user, or assistant)."
                )


try:
    check_data_for_format_errors(items)
    print("Data format is valid!")
except DataFormatError as e:
    print("Data format is NOT valid!")
    print(e)
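Beyond format validation, it can help to eyeball basic statistics before submitting a job. The following sketch (the helper name is an assumption; pass it the items list loaded above) counts conversations and messages per role:

```python
from collections import Counter


def dataset_stats(items):
    """Summarize a chat dataset: conversation count, per-role message counts."""
    role_counts = Counter()
    turns = []
    for batch in items:
        messages = batch.get("messages", [])
        turns.append(len(messages))
        role_counts.update(m.get("role") for m in messages)
    return {
        "conversations": len(items),
        "roles": dict(role_counts),
        "max_messages": max(turns, default=0),
    }
```

Calling `dataset_stats(items)` gives a quick sense of dataset size and balance, for example whether every conversation actually contains an assistant turn.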

Step 3: Run the fine-tuning job

After preparing your training dataset and optional validation dataset, initiate the fine-tuning job and upload files to your cloud with the following Anyscale command-line command:

anyscale fine-tune submit BASE_MODEL_NAME --train-file TRAIN_FILE_JSONL --valid-file VALIDATION_FILE_JSONL --cloud-id CLOUD_ID

Remember to replace these placeholder variables with your values:

  • BASE_MODEL_NAME - The name of the base pre-trained model to fine-tune.
  • TRAIN_FILE_JSONL - The filename for the training data in a .jsonl format.
  • VALIDATION_FILE_JSONL - The filename for the validation data in a .jsonl format.
  • CLOUD_ID - The ID of the Anyscale cloud you're using.

Datasets, models, and other files save to the default storage bucket of your Anyscale cloud. Navigate to your cloud service provider console to manage them.

Step 4: Try the fine-tuned model

Once the fine-tuning job completes, Anyscale sends an email containing the results of the run. To send queries, deploy the fine-tuned model as you would a pre-trained model. Retrieve the FINE_TUNED_MODEL_NAME from the Fine-tuning jobs tab or from the list shown when creating a New endpoint.

Remember to set up your Anyscale API base and key.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ["OPENAI_BASE_URL"],
)

# Replace FINE_TUNED_MODEL_NAME with the name from the Fine-tuning jobs tab.
response = client.chat.completions.create(
    model="FINE_TUNED_MODEL_NAME",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)