Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages

12 May 2022
Kabir Ahuja
Sandipan Dandapat
Sunayana Sitaram
Monojit Choudhury
    LRM

Papers citing "Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages"

10 / 10 papers shown
Uncovering inequalities in new knowledge learning by large language models across different languages
Chenglong Wang
Haoyu Tang
Xiyuan Yang
Yueqi Xie
Jina Suh
...
Junming Huang
Yu Xie
Zhaoya Gong
Xing Xie
Fangzhao Wu
291
2
0
06 Mar 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Saurabh Kumar Pandey
S. Vashistha
Debrup Das
Somak Aditya
Monojit Choudhury
AAML
390
0
0
10 Feb 2025
PARIKSHA : A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data
Ishaan Watts
Varun Gumma
Aditya Yadavalli
Vivek Seshadri
Manohar Swaminathan
Sunayana Sitaram
ELM
275
24
0
21 Jun 2024
METAL: Towards Multilingual Meta-Evaluation
Rishav Hada
Varun Gumma
Mohamed Ahmed
Kalika Bali
Sunayana Sitaram
ELM
220
8
0
02 Apr 2024
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Findings (Findings), 2023
Rishav Hada
Varun Gumma
Adrian de Wynter
Harshita Diddee
Mohamed Ahmed
Monojit Choudhury
Kalika Bali
Sunayana Sitaram
ALM
LM&MA
ELM
306
87
0
14 Sep 2023
MEGA: Multilingual Evaluation of Generative AI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kabir Ahuja
Harshita Diddee
Rishav Hada
Millicent Ochieng
Krithika Ramesh
...
T. Ganu
Sameer Segal
Maxamed Axmed
Kalika Bali
Sunayana Sitaram
LM&MA
LRM
ELM
548
348
0
22 Mar 2023
On the Calibration of Massively Multilingual Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kabir Ahuja
Sunayana Sitaram
Sandipan Dandapat
Monojit Choudhury
217
21
0
21 Oct 2022
i-Code: An Integrative and Composable Multimodal Learning Framework
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ziyi Yang
Yuwei Fang
Chenguang Zhu
Reid Pryzant
DongDong Chen
...
Bin Xiao
Yuanxun Lu
Takuya Yoshioka
Michael Zeng
Xuedong Huang
284
53
0
03 May 2022
Multilingual CheckList: Generation and Evaluation
Karthikeyan K
Shaily Bhatt
Pankaj Singh
Somak Aditya
Sandipan Dandapat
Sunayana Sitaram
Monojit Choudhury
ELM
309
2
0
24 Mar 2022
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
Transactions of the Association for Computational Linguistics (TACL), 2020
J. Clark
Eunsol Choi
Michael Collins
Dan Garrette
Tom Kwiatkowski
Vitaly Nikolaev
J. Palomaki
544
688
0
10 Mar 2020