Finding good training hyperparameters for new LLMs is always difficult and time-consuming. With Zephyr Gemma 7B, Hugging Face seems to have found a good recipe for fine-tuning Gemma. They used a combination of distilled supervised fine-tuning and DPO, similar to what they did for…
Fine-tune Google Gemma with Unsloth and Distilled DPO on Your Computer