Classification

In classification, the model predicts one label from a user-defined set, conditioned on the preceding tokens.

Example config

model_id: meta-llama/Meta-Llama-3-8B-Instruct # Any HF model ID.
task: classification # Optional: LLMForge can infer the task when you provide `classifier_config`.
classifier_config:
  label_tokens:
    - "[[1]]"
    - "[[2]]"
    - "[[3]]"
    - "[[4]]"
    - "[[5]]"
  eval_metrics:
    - "prauc"
    - "accuracy"
    - "f1"
...

The classifier config has two configurable parameters:

  • label_tokens: The set of strings representing the classification labels.
  • eval_metrics: A list of evaluation metrics for LLMForge to compute for the classifier. See the API reference for supported metrics; the sketch after this list illustrates what they measure.
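
As a rough illustration of what those metrics measure, the scikit-learn snippet below computes them on dummy predictions for the five labels above. This is not LLMForge's implementation, and the macro averaging shown is an assumption; check the API reference for the exact definitions.

import numpy as np
from sklearn.metrics import accuracy_score, average_precision_score, f1_score
from sklearn.preprocessing import label_binarize

# Dummy predictions for a 5-way classifier.
y_true = np.array([0, 2, 4, 2, 1])            # ground-truth label indices
y_prob = np.random.dirichlet(np.ones(5), 5)   # per-label probabilities
y_pred = y_prob.argmax(axis=1)                # predicted label indices

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred, average="macro"))  # averaging strategy is an assumption
# PR-AUC (average precision), computed one-vs-rest over the label set:
y_true_bin = label_binarize(y_true, classes=range(5))
print("prauc:", average_precision_score(y_true_bin, y_prob, average="macro"))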

Internally, LLMForge adds each label as a special token to the tokenizer. Model training happens as usual with standard cross-entropy loss, but the loss is computed only on the final assistant message, which contains the ground-truth label.
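
A minimal sketch of that mechanism using the Hugging Face transformers API (this is not LLMForge's actual training code, and the final-token masking is a simplification):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Each label becomes a single new token with its own embedding row.
label_tokens = ["[[1]]", "[[2]]", "[[3]]", "[[4]]", "[[5]]"]
tokenizer.add_special_tokens({"additional_special_tokens": label_tokens})
model.resize_token_embeddings(len(tokenizer))

messages = [
    {"role": "user", "content": "[Question]\n..."},
    {"role": "assistant", "content": "[[3]]"},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Standard causal-LM cross-entropy, with every position except the label
# masked out (-100 is ignored by the loss). This assumes the label is the
# last token; real chat templates may append end-of-turn tokens after it.
labels = torch.full_like(input_ids, -100)
labels[:, -1] = input_ids[:, -1]
loss = model(input_ids=input_ids, labels=labels).loss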

Example dataset

{
  "messages": [
    {"role": "system", "content": "[Instruction]\nBased on the question provided below, predict the score an expert evaluator would give to an AI assistant's response......"},
    {"role": "user", "content": "[Question]\nI'll give you a review, can you......[Answer]...."},
    {"role": "assistant", "content": "[[3]]"}
  ]
}
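
If you assemble such a dataset yourself, a common way to store it is one JSON object per line (JSONL). The snippet below is a hypothetical example; train.jsonl is a placeholder path, and the expected file format for your deployment is an assumption to verify against the LLMForge docs.

import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "[Instruction]\n..."},
            {"role": "user", "content": "[Question]\n..."},
            {"role": "assistant", "content": "[[3]]"},  # ground-truth label
        ]
    },
]

# Write one JSON object per line.
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")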

For a detailed end-to-end example of fine-tuning an LLM to route responses, see the LLM router template.