How not to Lie with a Benchmark: Rearranging NLP Leaderboards

2 December 2021
Tatiana Shavrina
Valentin Malykh
    ALM, ELM

Papers citing "How not to Lie with a Benchmark: Rearranging NLP Leaderboards"

7 / 7 papers shown
Beyond statistical significance: Quantifying uncertainty and statistical variability in multilingual and multitask NLP evaluation
Jonne Sälevä
Duygu Ataman
Constantine Lignos
116
0
0
26 Sep 2025
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Joel Niklaus
Veton Matoshi
Pooja Rani
Andrea Galassi
Matthias Sturmer
Ilias Chalkidis
ELM, AILaw
319
72
0
30 Jan 2023
Processing Long Legal Documents with Pre-trained Transformers: Modding LegalBERT and Longformer
Dimitris Mamakas
Petros Tsotsi
Ion Androutsopoulos
Ilias Chalkidis
VLM, AILaw
209
33
0
02 Nov 2022
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Mark Rofin
Vladislav Mikhailov
Mikhail Florinskiy
A. Kravchenko
E. Tutubalina
Tatiana Shavrina
Daniel Karabekyan
Ekaterina Artemova
278
15
0
11 Oct 2022
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Reid Pryzant
Ziyi Yang
Yichong Xu
Chenguang Zhu
Michael Zeng
243
10
0
18 May 2022
Slovene SuperGLUE Benchmark: Translation and Evaluation
International Conference on Language Resources and Evaluation (LREC), 2022
Aleš Žagar
Marko Robnik-Šikonja
147
12
0
10 Feb 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis
Abhik Jana
D. Hartung
M. Bommarito
Ion Androutsopoulos
Daniel Martin Katz
Nikolaos Aletras
AILaw, ELM
450
354
0
03 Oct 2021