Prompt Weight Experiments for LLM Instruction Fine-Tuning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Abstract
We present a small study analyzing how prompt token classification loss weighting (PLW) affects the performance of 7B-parameter LLaMA models fine-tuned on instruction tasks. We recreated Stanford's Alpaca experiment with both LLaMA 1 and LLaMA 2 using multiple instruction datasets. We found that the performance of models fine-tuned on our short-completion dataset follows a negative quadratic relationship with PLW, while models fine-tuned on long-completion datasets were unaffected by PLW.
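The abstract does not spell out the paper's exact loss formulation, but the core idea of prompt loss weighting can be sketched as a weighted average of per-token negative log-likelihoods, where prompt tokens receive weight PLW and completion tokens receive weight 1. The function and variable names below are illustrative, not taken from the paper's code:

```python
import math

def weighted_nll(token_logprobs, is_prompt, plw):
    """Average negative log-likelihood with prompt tokens weighted
    by plw and completion tokens weighted 1.0 (illustrative sketch,
    not the paper's implementation).

    token_logprobs: log-probability assigned to each target token
    is_prompt: True where the token belongs to the prompt
    plw: prompt loss weight, typically in [0, 1]
    """
    weights = [plw if p else 1.0 for p in is_prompt]
    total = sum(-w * lp for w, lp in zip(weights, token_logprobs))
    denom = sum(weights)
    return total / denom if denom > 0 else 0.0

# Toy sequence: 2 prompt tokens followed by 2 completion tokens.
logps = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.5)]
mask = [True, True, False, False]

loss_masked = weighted_nll(logps, mask, 0.0)   # prompt tokens ignored
loss_uniform = weighted_nll(logps, mask, 1.0)  # standard LM loss
```

With PLW = 0 this reduces to the common practice of masking prompt tokens out of the loss entirely; with PLW = 1 it is the standard causal language-modeling loss over the full sequence.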
