2407.21530
Cited By
Data Contamination Report from the 2024 CONDA Shared Task
31 July 2024
Oscar Sainz
Iker García-Ferrero
Alon Jacovi
Jonas Hanselle
Yanai Elazar
Eneko Agirre
Yoav Goldberg
Wei-Lin Chen
Johannes Fürnkranz
Leshem Choshen
Luca D'Amico-Wong
Melissa Dell
Run-Ze Fan
Shahriar Golchin
Yucheng Li
Pengfei Liu
Bhavish Pahwa
Ameya Prabhu
Suryansh Sharma
Emily Silcock
Kateryna Solonko
David Stap
Mihai Surdeanu
Yu-Min Tseng
Vishaal Udandarao
Zengzhi Wang
Ruijie Xu
Jinglin Yang
Papers citing
"Data Contamination Report from the 2024 CONDA Shared Task"
10 / 10 papers shown
Title
The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input
Alon Jacovi
Andrew Wang
Chris Alberti
Connie Tao
Jon Lipovetz
...
Rachana Fellinger
Rui Wang
Zizhao Zhang
Sasha Goldshtein
Dipanjan Das
HILM
ALM
77
11
0
06 Jan 2025
Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?
Aaditya K. Singh
Muhammed Yusuf Kocyigit
Andrew Poulton
David Esiobu
Maria Lomeli
Gergely Szilvasy
Dieuwke Hupkes
22
7
0
06 Nov 2024
Contamination Report for Multilingual Benchmarks
Sanchit Ahuja
Varun Gumma
Sunayana Sitaram
18
0
0
21 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
90
3
2
10 Oct 2024
Paloma: A Benchmark for Evaluating Language Model Fit
Ian H. Magnusson
Akshita Bhagia
Valentin Hofmann
Luca Soldaini
A. Jha
...
Iz Beltagy
Hanna Hajishirzi
Noah A. Smith
Kyle Richardson
Jesse Dodge
132
21
0
16 Dec 2023
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
103
91
0
06 Oct 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Explainable Automated Fact-Checking for Public Health Claims
Neema Kotonya
Francesca Toni
210
247
0
19 Oct 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
398
2,576
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018