Supported tasks
LLMForge supports the following tasks out-of-the-box:
- Causal language modeling: The loss considers predictions for all tokens.
- Instruction tuning: Only "assistant" tokens are considered in the loss.
- Classification: The model predicts one of a user-defined set of labels based on the preceding tokens.
- Preference tuning: The contrast between chosen and rejected messages is used to improve the model.
The following hyperparameters enable this: task, classifier_config (specific to classification), and preference_tuning_config (specific to preference tuning). Note that task defaults to "causal_lm" unless you specify a task-specific config like classifier_config or preference_tuning_config, in which case the corresponding task is inferred.
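For example, the following illustrative config fragment omits task entirely; because classifier_config is present, the task is inferred as classification (complete configs are shown in the task examples below):

classifier_config: # task: classification is inferred
  ...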
Dataset format
The dataset format for all tasks is the OpenAI chat data format. This means that whether the task is continued pre-training on plain text (the causal_lm task) or classifying messages as safe/unsafe, you need to format the dataset as OpenAI-style chat messages. Given below are some example dataset and config snippets for each task.
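Concretely, each row is a JSON object with a messages list, and every message has a role ("system", "user", or "assistant") and a content string. A minimal illustrative row (the contents are placeholders) looks like this:

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"}
  ]
}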
Task examples
Causal Language Modeling
Example Config:
model_id: meta-llama/Meta-Llama-3-8B-Instruct # any HF model ID
task: "causal_lm"
generation_config:
  prompt_format: # does nothing but concatenation
    system: "{instruction}"
    user: "{instruction}"
    assistant: "{instruction}"
    system_in_user: False
...
Example Dataset:
{
  "messages": [
    {"role": "user", "content": "Once upon a time ..."}
  ]
}
Instruction Tuning
Here, only the "assistant" tokens are considered in the loss function. This masking is especially helpful when you have conversational data with large user/system messages (perhaps few-shot examples), allowing you to train only on the tokens that matter.
Example Config:
model_id: meta-llama/Meta-Llama-3-8B-Instruct # any HF model ID
task: instruction_tuning
...
Example Dataset:
{
  "messages": [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"}
  ]
}
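To make the masking described above concrete, here is an illustrative multi-turn row (placeholder content): only the tokens of the assistant message contribute to the loss, while the system and user tokens are masked out.

{
  "messages": [
    {"role": "system", "content": "You are a support agent. Here are some few-shot examples: ..."},
    {"role": "user", "content": "My order arrived damaged."},
    {"role": "assistant", "content": "I'm sorry to hear that! I can help you arrange a replacement."}
  ]
}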
Classification
Example Config:
model_id: meta-llama/Meta-Llama-3-8B-Instruct # any HF model ID
task: classification # this is optional - can be inferred when classifier_config is provided
classifier_config:
  label_tokens:
    - "[[1]]"
    - "[[2]]"
    - "[[3]]"
    - "[[4]]"
    - "[[5]]"
  eval_metrics:
    - "prauc"
    - "accuracy"
    - "f1"
...
In the classifier config, there are two configurable parameters:
- label_tokens: The set of strings representing the labels for classification.
- eval_metrics: A list of evaluation metrics to be used for the classifier. See the API reference for supported metrics.
Internally, each label is added as a special token to the tokenizer. Model training happens as usual with standard cross-entropy loss, but considers only the final assistant message (with the ground truth label).
Example Dataset:
{
  "messages": [
    {"role": "system", "content": "[Instruction]\nBased on the question provided below, predict the score an expert evaluator would give to an AI assistant's response......"},
    {"role": "user", "content": "[Question]\nI'll give you a review, can you......[Answer]...."},
    {"role": "assistant", "content": "[[3]]"}
  ]
}
For a detailed end-to-end example of fine-tuning an LLM to route responses, see the LLM router template.
Preference Tuning
LLMForge supports preference tuning for language models through Direct Preference Optimization (DPO).
Example Config:
model_id: meta-llama/Meta-Llama-3-8B-Instruct # any HF model ID
task: preference_tuning # this is optional, can be inferred when preference_tuning_config is provided
preference_tuning_config:
  # beta parameter in DPO, controlling the amount of regularization
  beta: 0.01
  logprob_processor_scaling_config:
    custom_resources:
      accelerator_type:A10G: 0.001 # custom resource per worker
    # Runs reference model logp calculation on 4 GPUs
    concurrency: 4
    # Batch size per worker
    batch_size: 2
...
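Unlike the other tasks, each preference-tuning row contrasts a chosen response with a rejected one. The snippet below is only an illustrative sketch of such a pair in the chat format; the field names and placeholder contents here are assumptions, not the exact schema, so refer to the preference tuning guide for the format LLMForge expects.

{
  "chosen": [
    {"role": "user", "content": "User prompt (placeholder)"},
    {"role": "assistant", "content": "Preferred response (placeholder)"}
  ],
  "rejected": [
    {"role": "user", "content": "User prompt (placeholder)"},
    {"role": "assistant", "content": "Less preferred response (placeholder)"}
  ]
}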
You can find more details in the dedicated preference tuning guide.