Deprecated

LLMForge is being deprecated: The Ray Team is consolidating around open source fine-tuning solutions. Llama Factory and Axolotl provide enhanced functionality (quantization, advanced algorithms) and native Ray support for scaling. See the migration guide for transitioning your workflows.

Instruction tuning

For instruction tuning, LLMForge considers only the "assistant" tokens in the loss function. This masking is especially helpful for conversational data with long user or system messages, such as few-shot examples, because it lets you train only on the tokens that matter.
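
As an illustration (not LLMForge's internal implementation), the following Python sketch shows the general technique: every token stays in the model input, but labels for non-assistant tokens are set to -100, the index that cross-entropy loss ignores. The build_inputs_and_labels helper is hypothetical, and a real chat template would also tokenize role headers and special tokens.

from transformers import AutoTokenizer

IGNORE_INDEX = -100  # label value that cross-entropy loss skips

def build_inputs_and_labels(messages, tokenizer):
    # Keep every token in the input, but only label assistant tokens.
    input_ids, labels = [], []
    for message in messages:
        tokens = tokenizer.encode(message["content"], add_special_tokens=False)
        input_ids.extend(tokens)
        if message["role"] == "assistant":
            labels.extend(tokens)  # these tokens contribute to the loss
        else:
            labels.extend([IGNORE_INDEX] * len(tokens))  # masked out
    return input_ids, labels

# Any tokenizer works for the illustration; the Llama 3 repo is gated on Hugging Face.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
input_ids, labels = build_inputs_and_labels(
    [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help you?"},
    ],
    tokenizer,
)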

Example config

model_id: meta-llama/Meta-Llama-3-8B-Instruct # Any HF model ID.
task: instruction_tuning
...

Example dataset

{
  "messages": [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"}
  ]
}
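
In this row, only the tokens of the assistant reply ("Hi! How can I help you?") contribute to the loss; the user turn is part of the model input but masked out.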