ResearchTrend.AI
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research

3 December 2021
Bernard Koch
Emily L. Denton
A. Hanna
J. Foster

Papers citing "Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research"

50 / 78 papers shown
Minimizing Risk Through Minimizing Model-Data Interaction: A Protocol For Relying on Proxy Tasks When Designing Child Sexual Abuse Imagery Detection Models
Thamiris Coelho
Leo S. F. Ribeiro
João Macedo
J. A. dos Santos
Sandra Avila
16
0
0
10 May 2025
We Need Improved Data Curation and Attribution in AI for Scientific Discovery
Mara Graziani
Antonio Foncubierta
Dimitrios Christofidellis
Irina Espejo Morales
Malina Molnar
Marvin Alberts
Matteo Manica
Jannis Born
43
0
0
03 Apr 2025
What do Large Language Models Say About Animals? Investigating Risks of Animal Harm in Generated Text
Arturs Kanepajs
Aditi Basu
Sankalpa Ghose
Constance Li
Akshat Mehta
Ronak Mehta
Samuel David Tucker-Davis
Eric Zhou
Bob Fischer
ALM
ELM
43
0
0
03 Mar 2025
AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction
Magnus Sesodia
Alina Petrova
John Armour
Thomas Lukasiewicz
Oana-Maria Camburu
P. Dokania
Philip H. S. Torr
Christian Schroeder de Witt
AILaw
ELM
41
1
0
28 Feb 2025
Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation
Maria Eriksson
Erasmo Purificato
Arman Noroozian
Joao Vinagre
Guillaume Chaslot
Emilia Gomez
David Fernandez Llorca
ELM
128
1
0
10 Feb 2025
Pricing and Competition for Generative AI
Rafid Mahmood
22
3
0
04 Nov 2024
A Systematic Review of NeurIPS Dataset Management Practices
Yiwei Wu
Leah Ajmani
Shayne Longpre
Hanlin Li
39
0
0
31 Oct 2024
Benchmark Data Repositories for Better Benchmarking
Rachel Longjohn
Markelle Kelly
Sameer Singh
Padhraic Smyth
38
0
0
31 Oct 2024
Scito2M: A 2 Million, 30-Year Cross-disciplinary Dataset for Temporal Scientometric Analysis
Yiqiao Jin
Yijia Xiao
Yiyang Wang
Jindong Wang
28
0
0
12 Oct 2024
Enhancing Data Quality through Simple De-duplication: Navigating Responsible Computational Social Science Research
Yida Mu
Mali Jin
Xingyi Song
Nikolaos Aletras
18
0
0
04 Oct 2024
Transforming Scholarly Landscapes: Influence of Large Language Models on Academic Fields beyond Computer Science
Aniket Pramanick
Yufang Hou
Saif M. Mohammad
Iryna Gurevych
31
1
0
29 Sep 2024
Building Better Datasets: Seven Recommendations for Responsible Design from Dataset Creators
Will Orr
Kate Crawford
30
3
0
30 Aug 2024
Benchmarks as Microscopes: A Call for Model Metrology
Michael Stephen Saxon
Ari Holtzman
Peter West
William Yang Wang
Naomi Saphra
29
10
0
22 Jul 2024
A Taxonomy of Challenges to Curating Fair Datasets
Dora Zhao
M. Scheuerman
Pooja Chitre
Jerone T. A. Andrews
Georgia Panagiotidou
Shawn Walker
Kathleen H. Pine
Alice Xiang
39
2
0
10 Jun 2024
Oil & Water? Diffusion of AI Within and Across Scientific Fields
Eamon Duede
William Dolan
André Bauer
Ian T. Foster
Karim Lakhani
AI4CE
21
4
0
24 May 2024
Adaptive Data Analysis for Growing Data
Neil G. Marchant
Benjamin I. P. Rubinstein
30
0
0
22 May 2024
Position: Why We Must Rethink Empirical Research in Machine Learning
Moritz Herrmann
F. J. D. Lange
Katharina Eggensperger
Giuseppe Casalicchio
Marcel Wever
Matthias Feurer
David Rügamer
Eyke Hüllermeier
A. Boulesteix
Bernd Bischl
44
6
0
03 May 2024
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
Guanhua Zhang
Moritz Hardt
42
7
0
02 May 2024
AI Competitions and Benchmarks: Dataset Development
Romain Egele
Julio C. S. Jacques Junior
Jan N. van Rijn
Isabelle M Guyon
Xavier Baró
Albert Clapés
Prasanna Balaprakash
Sergio Escalera
T. Moeslund
Jun Wan
42
0
0
15 Apr 2024
From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution
Bernard J. Koch
David Peterson
14
5
0
09 Apr 2024
A Decade's Battle on Dataset Bias: Are We There Yet?
Zhuang Liu
Kaiming He
37
28
0
13 Mar 2024
Better than classical? The subtle art of benchmarking quantum machine learning models
Joseph Bowles
Shahnawaz Ahmed
Maria Schuld
34
62
0
11 Mar 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
41
12
0
19 Feb 2024
Copycats: the many lives of a publicly available medical imaging dataset
Amelia Jiménez-Sánchez
Natalia-Rozalia Avlona
Dovile Juodelyte
Théo Sourget
Caroline Vang-Larsen
Anna Rogers
Hubert Dariusz Zając
V. Cheplygina
27
0
0
09 Feb 2024
[Citation needed] Data usage and citation practices in medical imaging conferences
Théo Sourget
Ahmet Akkoç
Stinna Winther
Christine Lyngbye Galsgaard
Amelia Jiménez-Sánchez
Dovile Juodelyte
Caroline Petitjean
V. Cheplygina
14
2
0
05 Feb 2024
Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on Hugging Face
Xinyu Yang
Weixin Liang
James Y. Zou
CVBM
18
16
0
24 Jan 2024
Challenge design roadmap
Hugo Jair Escalante
Isabelle M Guyon
Addison Howard
Walter Reade
Sébastien Treguer
AI4TS
13
0
0
15 Jan 2024
From Knowledge Representation to Knowledge Organization and Back
Fausto Giunchiglia
Mayukh Bagchi
8
3
0
12 Dec 2023
Socially Cognizant Robotics for a Technology Enhanced Society
Kristin J. Dana
Clinton Andrews
Kostas Bekris
Jacob Feldman
Matthew Stone
Pernille Hemmer
Aaron Mazzeo
Hal Salzman
Jingang Yi
13
0
0
27 Oct 2023
Eliciting Model Steering Interactions from Users via Data and Visual Design Probes
Anamaria Crisan
Maddie Shang
Eric Brochu
20
3
0
12 Oct 2023
The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices
Hancheng Cao
Jesse Dodge
Kyle Lo
Daniel A. McFarland
Lucy Lu Wang
AI4CE
22
5
0
04 Oct 2023
Can large language models provide useful feedback on research papers? A large-scale empirical analysis
Weixin Liang
Yuhui Zhang
Hancheng Cao
Binglu Wang
Daisy Ding
...
Siyu He
D. Smith
Yian Yin
Daniel A. McFarland
James Y. Zou
ALM
LM&MA
40
123
0
03 Oct 2023
RRR-Net: Reusing, Reducing, and Recycling a Deep Backbone Network
Haozhe Sun
Isabelle M Guyon
F. Mohr
Hedi Tabia
CVBM
17
2
0
02 Oct 2023
Berkeley Open Extended Reality Recordings 2023 (BOXRR-23): 4.7 Million Motion Capture Recordings from 105,852 Extended Reality Device Users
V. Nair
Wenbo Guo
Rui Wang
J. F. O'Brien
Louis B. Rosenberg
Dawn Song
13
7
0
30 Sep 2023
Inferring Capabilities from Task Performance with Bayesian Triangulation
John Burden
Konstantinos Voudouris
Ryan Burnell
Danaja Rutar
Lucy G. Cheke
José Hernández Orallo
16
7
0
21 Sep 2023
FACET: Fairness in Computer Vision Evaluation Benchmark
Laura Gustafson
Chloe Rolland
Nikhila Ravi
Quentin Duval
Aaron B. Adcock
Cheng-Yang Fu
Melissa Hall
Candace Ross
VLM
EGVM
16
36
0
31 Aug 2023
Towards Federated Foundation Models: Scalable Dataset Pipelines for Group-Structured Learning
Zachary B. Charles
Nicole Mitchell
Krishna Pillutla
Michael Reneer
Zachary Garrett
FedML
AI4CE
28
28
0
18 Jul 2023
Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT and GPT-4 for Mining Insights at Scale
Jonas Oppenlaender
Joonas Hamalainen
25
6
0
08 Jun 2023
A Diachronic Analysis of Paradigm Shifts in NLP Research: When, How, and Why?
Aniket Pramanick
Yufang Hou
Saif M. Mohammad
Iryna Gurevych
14
6
0
22 May 2023
Learning from data with structured missingness
R. Mitra
Sarah F. McGough
Tapabrata (Rohan) Chakraborty
Chris Holmes
Ryan Copping
...
M. Mackintosh
E. Andrinopoulou
A. Basiri
Chris Harbron
Ben D. MacArthur
CML
11
44
0
04 Apr 2023
A View From Somewhere: Human-Centric Face Representations
Jerone T. A. Andrews
Przemyslaw K. Joniak
Alice Xiang
CVBM
11
9
0
30 Mar 2023
Ecosystem Graphs: The Social Footprint of Foundation Models
Rishi Bommasani
Dilara Soylu
Thomas I. Liao
Kathleen A. Creel
Percy Liang
MLAU
27
32
0
28 Mar 2023
CoCon: A Data Set on Combined Contextualized Research Artifact Use
T. Saier
Youxiang Dong
Michael Färber
9
1
0
27 Mar 2023
Aligning benchmark datasets for table structure recognition
B. Smock
Rohith Pesala
Robin Abraham
LMTD
14
8
0
01 Mar 2023
Benchmarks for Automated Commonsense Reasoning: A Survey
E. Davis
ELM
LRM
19
57
0
09 Feb 2023
Ethical Considerations for Responsible Data Curation
Jerone T. A. Andrews
Dora Zhao
William Thong
Apostolos Modas
Orestis Papakyriakopoulos
Alice Xiang
17
19
0
07 Feb 2023
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Joel Niklaus
Veton Matoshi
Pooja Rani
Andrea Galassi
Matthias Sturmer
Ilias Chalkidis
ELM
AILaw
19
54
0
30 Jan 2023
Neural Architecture Search: Insights from 1000 Papers
Colin White
Mahmoud Safari
R. Sukthanker
Binxin Ru
T. Elsken
Arber Zela
Debadeepta Dey
Frank Hutter
3DV
AI4CE
32
128
0
20 Jan 2023
Evaluation for Change
Rishi Bommasani
ELM
35
0
0
20 Dec 2022
Graph Learning Indexer: A Contributor-Friendly and Metadata-Rich Platform for Graph Learning Benchmarks
Jiaqi Ma
Xingjian Zhang
Hezheng Fan
Jin Huang
Tianyue Li
Tinghong Li
Yiwen Tu
Chen Zhu
Qiaozhu Mei
35
5
0
08 Dec 2022