LLMForge is being deprecated: The Ray Team is consolidating around open source fine-tuning solutions. Llama Factory and Axolotl provide enhanced functionality (quantization, advanced algorithms) and native Ray support for scaling. See the migration guide for transitioning your workflows.
Classification
In classification, the model predicts one label from a user-defined set of labels, conditioned on the preceding tokens.
Example config
```yaml
model_id: meta-llama/Meta-Llama-3-8B-Instruct # Any HF model ID.
task: classification # Optional: LLMForge can infer the task when you provide `classifier_config`.
classifier_config:
  label_tokens:
    - "[[1]]"
    - "[[2]]"
    - "[[3]]"
    - "[[4]]"
    - "[[5]]"
  eval_metrics:
    - "prauc"
    - "accuracy"
    - "f1"
...
```
The classifier config has two configurable parameters:

- `label_tokens`: The set of strings that represent the classification labels.
- `eval_metrics`: A list of evaluation metrics for LLMForge to compute for the classifier. See the API reference for supported metrics.
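To make the metrics concrete, the following sketch computes the three listed above with scikit-learn on made-up predictions. LLMForge computes them internally, so treat this purely as an illustration of what each metric measures, not as its implementation:

```python
import numpy as np
from sklearn.metrics import accuracy_score, average_precision_score, f1_score
from sklearn.preprocessing import label_binarize

n_classes = 5  # one class per label token, "[[1]]" through "[[5]]"
y_true = np.array([2, 0, 4, 3, 1])  # made-up ground-truth class indices
y_prob = np.array([                 # made-up per-class probabilities
    [0.05, 0.10, 0.60, 0.15, 0.10],
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.20, 0.60],
    [0.10, 0.20, 0.40, 0.20, 0.10],  # a misprediction: argmax is class 2, truth is 3
    [0.10, 0.55, 0.15, 0.10, 0.10],
])
y_pred = y_prob.argmax(axis=1)

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1 (macro):", f1_score(y_true, y_pred, average="macro"))

# PR-AUC as macro-averaged average precision, one-vs-rest per class.
y_true_1hot = label_binarize(y_true, classes=list(range(n_classes)))
print("prauc (macro):", average_precision_score(y_true_1hot, y_prob, average="macro"))
```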
Internally, LLMForge adds each label as a special token to the tokenizer. Training then proceeds as usual with the standard cross-entropy loss, but the loss covers only the final assistant message, which contains the ground-truth label.
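A minimal sketch of that mechanism with Hugging Face `transformers` might look like the following. The chat example and the masking code are assumptions for illustration, not LLMForge's actual implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
label_tokens = ["[[1]]", "[[2]]", "[[3]]", "[[4]]", "[[5]]"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Register each label as a single special token and resize the embedding
# matrix so that each label maps to exactly one new token ID.
tokenizer.add_special_tokens({"additional_special_tokens": label_tokens})
model.resize_token_embeddings(len(tokenizer))

# Tokenize a chat-formatted example whose final assistant turn is the label.
messages = [
    {"role": "user", "content": "Rate the following answer from 1 to 5: ..."},
    {"role": "assistant", "content": "[[3]]"},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Standard causal-LM cross-entropy, but mask every position that isn't a
# label token with -100, so only the final label contributes to the loss.
label_ids = torch.tensor(tokenizer.convert_tokens_to_ids(label_tokens))
labels = input_ids.clone()
labels[~torch.isin(labels, label_ids)] = -100
loss = model(input_ids=input_ids, labels=labels).loss
```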
Example dataset
```json
{
  "messages": [
    {"role": "system", "content": "[Instruction]\nBased on the question provided below, predict the score an expert evaluator would give to an AI assistant's response......"},
    {"role": "user", "content": "[Question]\nI'll give you a review, can you......[Answer]...."},
    {"role": "assistant", "content": "[[3]]"}
  ]
}
```
For a detailed end-to-end example, see the LLM router template, which shows how to fine-tune an LLM to route responses.