2011.02832
Pitfalls in Machine Learning Research: Reexamining the Development Cycle
4 November 2020
Stella Biderman
Walter J. Scheirer
Papers citing
"Pitfalls in Machine Learning Research: Reexamining the Development Cycle"
16 / 16 papers shown
Title
No Metric to Rule Them All: Toward Principled Evaluations of Graph-Learning Datasets
Corinna Coupette
Jeremy Wayland
Emily Simons
Bastian Alexander Rieck
76
1
0
04 Feb 2025
Benchmark Data Repositories for Better Benchmarking
Rachel Longjohn
Markelle Kelly
Sameer Singh
Padhraic Smyth
41
0
0
31 Oct 2024
Automated Repair of AI Code with Large Language Models and Formal Verification
Yiannis Charalambous
Edoardo Manino
Lucas C. Cordeiro
25
2
0
14 May 2024
NeuroCodeBench: a plain C neural network benchmark for software verification
Edoardo Manino
R. Menezes
F. Shmarov
Lucas C. Cordeiro
11
3
0
07 Sep 2023
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon
Lucile Saulnier
Léo Tronchon
Stas Bekman
Amanpreet Singh
...
Siddharth Karamcheti
Alexander M. Rush
Douwe Kiela
Matthieu Cord
Victor Sanh
25
230
0
21 Jun 2023
Towards machine learning guided by best practices
Anamaria Mojica-Hanke
8
0
0
29 Apr 2023
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoE
ALM
24
41
0
17 Apr 2023
What are the Machine Learning best practices reported by practitioners on Stack Exchange?
Anamaria Mojica-Hanke
A. Bayona
Mario Linares-Vásquez
Steffen Herbold
Fabio A. González
HAI
19
6
0
25 Jan 2023
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
Jason Alan Fries
Leon Weber
Natasha Seelam
Gabriel Altay
Debajyoti Datta
...
Minh Chien Vu
Trishala Neeraj
Jonas Golde
Albert Villanova del Moral
Benjamin Beilharz
LM&MA
93
46
0
30 Jun 2022
Leveraging Centric Data Federated Learning Using Blockchain For Integrity Assurance
Riadh Ben Chaabene
Darine Ameyed
M. Cheriet
FedML
14
0
0
09 Jun 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
61
797
0
14 Apr 2022
Fooling MOSS Detection with Pretrained Language Models
Stella Biderman
Edward Raff
DeLMO
17
35
0
19 Jan 2022
Life is not black and white -- Combining Semi-Supervised Learning with fuzzy labels
Lars Schmarje
Reinhard Koch
32
2
0
13 Oct 2021
Cut the CARP: Fishing for zero-shot story evaluation
Shahbuland Matiana
J. Smith
Ryan Teehan
Louis Castricato
Stella Biderman
Leo Gao
Spencer Frazier
47
16
0
06 Oct 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
20
265
0
22 Mar 2021
Baselines and a datasheet for the Cerema AWP dataset
Ismaïla Seck
Khouloud Dahmane
Pierre Duthon
Gaëlle Loosli
21
11
0
11 Jun 2018