ResearchTrend.AI

WildIFEval: Instruction Following in the Wild

9 March 2025
Gili Lior
Asaf Yehudai
Ariel Gera
Liat Ein-Dor
Abstract

Recent LLMs have shown remarkable success in following user instructions, yet handling instructions with multiple constraints remains a significant challenge. In this work, we introduce WildIFEval - a large-scale dataset of 12K real user instructions with diverse, multi-constraint conditions. Unlike prior datasets, our collection spans a broad lexical and topical spectrum of constraints in natural user prompts. We categorize these constraints into eight high-level classes to capture their distribution and dynamics in real-world scenarios. Leveraging WildIFEval, we conduct extensive experiments to benchmark the instruction-following capabilities of leading LLMs. Our findings reveal that all evaluated models experience performance degradation as the number of constraints increases, showing that all models have substantial room for improvement on such tasks. Moreover, we observe that the specific type of constraint plays a critical role in model performance. We release our dataset to promote further research on instruction-following under complex, realistic conditions.
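The degradation trend described above can be surfaced with a simple aggregation over per-example evaluation results. A minimal Python sketch, assuming hypothetical records that each carry a constraint count and a 0-1 fulfillment score (the record format and values are illustrative, not taken from the paper):

```python
from collections import defaultdict

# Hypothetical per-example evaluation records: how many constraints the
# prompt contained, and the model's 0-1 fulfillment score on that prompt.
# Values below are made up for illustration only.
records = [
    {"num_constraints": 1, "score": 0.9},
    {"num_constraints": 1, "score": 0.8},
    {"num_constraints": 3, "score": 0.6},
    {"num_constraints": 3, "score": 0.5},
    {"num_constraints": 5, "score": 0.3},
]

def mean_score_by_constraint_count(records):
    """Average fulfillment score, grouped by number of constraints."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r["num_constraints"]].append(r["score"])
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

print(mean_score_by_constraint_count(records))
```

A monotonic drop in the per-bucket mean as the constraint count grows is the kind of pattern the paper reports across all evaluated models.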

View on arXiv
@article{lior2025_2503.06573,
  title={WildIFEval: Instruction Following in the Wild},
  author={Gili Lior and Asaf Yehudai and Ariel Gera and Liat Ein-Dor},
  journal={arXiv preprint arXiv:2503.06573},
  year={2025}
}