D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model

18 April 2025
Grace Byun
Jinho D. Choi
Abstract

Evaluating generative models with open-ended generation is challenging due to inconsistencies in response formats. Multiple-choice (MC) evaluation mitigates this issue, but generating high-quality distractors is time-consuming and labor-intensive. We introduce D-GEN, the first open-source distractor generator model that transforms open-ended data into an MC format. To evaluate distractor quality, we propose two novel methods: (1) ranking alignment, which checks that generated distractors retain the discriminatory power of ground-truth distractors, and (2) entropy analysis, which compares model confidence distributions. Our results show that D-GEN preserves ranking consistency (Spearman's ρ = 0.99, Kendall's τ = 0.94) and closely matches the entropy distribution of ground-truth distractors. Human evaluation further confirms the fluency, coherence, distractiveness, and incorrectness of the generated distractors. Our work advances robust and efficient distractor generation with automated evaluation, setting a new standard for MC evaluation.
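The two evaluation ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the accuracy values are hypothetical, the helper names (`ranks`, `spearman_rho`, `kendall_tau`, `entropy`) are made up for illustration, and the rank formulas assume no tied scores. Ranking alignment is approximated here as the rank correlation between model accuracies measured with ground-truth distractors and with generated distractors; entropy analysis as the Shannon entropy of a model's probability distribution over the answer options.

```python
import math
from itertools import combinations


def ranks(scores):
    """Rank items 1..n by descending score (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (valid without ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


def entropy(probs):
    """Shannon entropy in bits of a distribution over answer options."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical accuracies of five models on the same questions,
# once with ground-truth distractors and once with generated ones.
acc_ground_truth = [0.82, 0.74, 0.69, 0.55, 0.41]
acc_generated = [0.80, 0.75, 0.62, 0.64, 0.40]

print(spearman_rho(acc_ground_truth, acc_generated))
print(kendall_tau(acc_ground_truth, acc_generated))

# Hypothetical option probabilities for one four-option question.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # maximally uncertain model
print(entropy([0.97, 0.01, 0.01, 0.01]))  # confident model
```

High rank correlations would indicate that the generated distractors order the models the same way the ground-truth distractors do, and similar entropy values across the two distractor sets would suggest comparable model uncertainty.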

@article{byun2025_2504.13439,
  title={D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model},
  author={Grace Byun and Jinho D. Choi},
  journal={arXiv preprint arXiv:2504.13439},
  year={2025}
}