Instruction tuning

For instruction tuning, LLMForge considers only the "assistant" tokens in the loss function. This masking is especially helpful for conversational data with large user or system messages, such as few-shot examples in the prompt, because it lets you train only on the tokens that matter. The sketch below illustrates the idea.
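As a rough illustration of how assistant-only masking works, here is a minimal sketch in the style of common Hugging Face training code. The build_labels helper and the role-tag formatting are illustrative assumptions, not the LLMForge implementation; a real pipeline would apply the model's chat template.

import torch
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # torch.nn.CrossEntropyLoss ignores this label by default

def build_labels(messages, tokenizer):
    """Tokenize a conversation, masking every token outside assistant turns."""
    input_ids, labels = [], []
    for msg in messages:
        # Simplified role-tag formatting; a real pipeline would use the
        # model's chat template instead.
        text = f"<|{msg['role']}|>\n{msg['content']}\n"
        ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        input_ids += ids
        # Only assistant tokens keep their labels; everything else is
        # masked out of the loss.
        labels += ids if msg["role"] == "assistant" else [IGNORE_INDEX] * len(ids)
    return torch.tensor(input_ids), torch.tensor(labels)

# Usage with the example conversation below (any HF tokenizer works here;
# Meta-Llama-3-8B-Instruct is gated and requires access approval).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
ids, labels = build_labels(
    [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help you?"},
    ],
    tokenizer,
)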

Example config

model_id: meta-llama/Meta-Llama-3-8B-Instruct # Any HF model ID.
task: instruction_tuning
...

Example dataset

{
  "messages": [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"}
  ]
}
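
Datasets in this format are typically stored with one conversation per line (JSONL). Here is a minimal sketch for writing such a file, where train.jsonl is a hypothetical file name:

import json

conversations = [
    {
        "messages": [
            {"role": "user", "content": "Hello!"},
            {"role": "assistant", "content": "Hi! How can I help you?"},
        ]
    },
]

# One JSON object (one conversation) per line.
with open("train.jsonl", "w") as f:
    for conversation in conversations:
        f.write(json.dumps(conversation) + "\n")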