ResearchTrend.AI
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks

29 July 2024
Marco AF Pimentel
Clément Christophe
Tathagata Raha
Prateek Munjal
Praveen K Kanithi
Shadab Khan
    ELM

Papers citing "Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks"

1 / 1 papers shown
Title
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018