ResearchTrend.AI
Accounting for Variance in Machine Learning Benchmarks

1 March 2021
Xavier Bouthillier
Pierre Delaunay
Mirko Bronzi
Assya Trofimov
Brennan Nichyporuk
Justin Szeto
Naz Sepah
Edward Raff
Kanika Madan
Vikram S. Voleti
Samira Ebrahimi Kahou
Vincent Michalski
Dmitriy Serdyuk
Tal Arbel
C. Pal
Gaël Varoquaux
Pascal Vincent
arXiv: 2103.03098

Papers citing "Accounting for Variance in Machine Learning Benchmarks"

26 / 26 papers shown
Pfungst and Clever Hans: Identifying the unintended cues in a widely used Alzheimer's disease MRI dataset using explainable deep learning
C. Tinauer
Maximilian Sackl
Rudolf Stollberger
Stefan Ropele
C. Langkammer
AAML
35
0
0
27 Jan 2025
Generalizability of experimental studies
Federico Matteucci
Vadim Arzamasov
Jose Cribeiro-Ramallo
Marco Heyden
Konstantin Ntounas
Klemens Bohm
42
0
0
25 Jun 2024
Learning from Uncertain Data: From Possible Worlds to Possible Models
Jiongli Zhu
Su Feng
Boris Glavic
Babak Salimi
19
0
0
28 May 2024
Reinforcing Language Agents via Policy Optimization with Action Decomposition
Muning Wen
Ziyu Wan
Weinan Zhang
Jun Wang
Ying Wen
38
7
0
23 May 2024
Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
Katherine Xu
Lingzhi Zhang
Jianbo Shi
41
12
0
23 May 2024
Target Variable Engineering
Jessica Clark
22
0
0
13 Oct 2023
A benchmark for computational analysis of animal behavior, using animal-borne tags
Benjamin Hoffman
M. Cusimano
V. Baglione
D. Canestrari
D. Chevallier
...
O. Vainio
A. Vehkaoja
Ken Yoda
Katie Zacarian
A. Friedlaender
25
7
0
18 May 2023
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
OOD
11
10
0
04 Apr 2023
How to select predictive models for causal inference?
M. Doutreligne
Gaël Varoquaux
ELM
CML
14
2
0
01 Feb 2023
Lempel-Ziv Networks
Rebecca Saul
Mohammad Mahmudul Alam
John Hurwitz
Edward Raff
Tim Oates
James Holt
13
2
0
23 Nov 2022
Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting
Berend Gort
Xiao-Yang Liu
Xinghang Sun
Jiechao Gao
Shuai Chen
Chris Wang
18
12
0
12 Sep 2022
Why do tree-based models still outperform deep learning on tabular data?
Léo Grinsztajn
Edouard Oyallon
Gaël Varoquaux
LMTD
22
355
0
18 Jul 2022
Robustness Evaluation of Deep Unsupervised Learning Algorithms for Intrusion Detection Systems
D'Jeff K. Nkashama
Ariana Soltani
Jean-Charles Verdier
Marc Frappier
Pierre-Martin Tardif
F. Kabanza
OOD
AAML
13
5
0
25 Jun 2022
deep-significance - Easy and Meaningful Statistical Significance Testing in the Age of Neural Networks
Dennis Ulmer
Christian Hardmeier
J. Frellsen
40
42
0
14 Apr 2022
A Siren Song of Open Source Reproducibility
Edward Raff
Andrew L. Farris
13
9
0
09 Apr 2022
Does the Market of Citations Reward Reproducible Work?
Edward Raff
HAI
CML
6
12
0
08 Apr 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
21
65
0
09 Dec 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
34
16
0
07 Oct 2021
A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels
R. Joyce
Edward Raff
Charles K. Nicholas
38
16
0
23 Sep 2021
Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision
David Picard
3DV
VLM
9
87
0
16 Sep 2021
HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO
Katharina Eggensperger
Philip Muller
Neeratyoy Mallik
Matthias Feurer
René Sass
Aaron Klein
Noor H. Awad
Marius Lindauer
Frank Hutter
30
100
0
14 Sep 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
P. S. Castro
Aaron Courville
Marc G. Bellemare
OffRL
25
630
0
30 Aug 2021
The Benchmark Lottery
Mostafa Dehghani
Yi Tay
A. Gritsenko
Zhe Zhao
N. Houlsby
Fernando Diaz
Donald Metzler
Oriol Vinyals
34
89
0
14 Jul 2021
Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang
Xingyao Zhang
S. Song
Sara Hooker
17
75
0
22 Jun 2021
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
183
1,027
0
06 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018