Do We Train on Test Data? Purging CIFAR of Near-Duplicates

1 February 2019
Björn Barz
Joachim Denzler
arXiv:1902.00423

Papers citing "Do We Train on Test Data? Purging CIFAR of Near-Duplicates"

42 / 42 papers shown
Impact of Data Duplication on Deep Neural Network-Based Image Classifiers: Robust vs. Standard Models
Alireza Aghabagherloo
Aydin Abadi
Sumanta Sarkar
Vishnu Asutosh Dasu
Bart Preneel
AAML
125
1
0
01 Apr 2025
The Vendiscope: An Algorithmic Microscope For Data Collections
Amey P. Pasarkar
Adji Bousso Dieng
90
2
0
15 Feb 2025
MBInception: A new Multi-Block Inception Model for Enhancing Image Processing Efficiency
Fatemeh Froughirad
Reza Bakhoda Eshtivani
Hamed Khajavi
Amir Rastgoo
79
0
0
18 Dec 2024
Label Errors in the Tobacco3482 Dataset
Gordon Lim
Stefan Larson
Kevin Leach
118
0
0
17 Dec 2024
Questionable practices in machine learning
Gavin Leech
Juan J. Vazquez
Misha Yagudin
Niclas Kupper
Laurence Aitchison
101
6
0
17 Jul 2024
Scaling Up Deep Clustering Methods Beyond ImageNet-1K
Nikolas Adaloglou
Félix D. P. Michels
Kaspar Senft
Diana Petrusheva
M. Kollmann
102
1
0
03 Jun 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
128
18
0
24 May 2024
Automated Program Repair: Emerging trends pose and expose problems for benchmarks
J. Renzullo
Pemma Reiter
Westley Weimer
Stephanie Forrest
84
3
0
08 May 2024
Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets
Lorenzo Brigato
Stavroula Mougiakakou
64
0
0
08 Mar 2024
Rethinking cluster-conditioned diffusion models
Nikolas Adaloglou
Tim Kaiser
Félix D. P. Michels
M. Kollmann
VLM
70
3
0
01 Mar 2024
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Myeongseob Ko
Feiyang Kang
Weiyan Shi
Ming Jin
Zhou Yu
Ruoxi Jia
TDI
72
4
0
14 Feb 2024
Benchmarking Pretrained Vision Embeddings for Near- and Duplicate Detection in Medical Images
Tuan Truong
Farnaz Khun Jush
Matthias Lenga
69
3
0
12 Dec 2023
Reproducibility in Multiple Instance Learning: A Case For Algorithmic Unit Tests
Edward Raff
James Holt
54
3
0
27 Oct 2023
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato
Stavroula Mougiakakou
72
5
0
04 Sep 2023
Data-Efficient Energy-Aware Participant Selection for UAV-Enabled Federated Learning
Youssra Cheriguene
Wael Jaafar
Kerrache Chaker Abdelaziz
H. Yanikomeroglu
Fatima Zohra Bousbaa
N. Lagraa
FedML
66
2
0
14 Aug 2023
Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg
Deepak Ravikumar
Kaushik Roy
TDI
65
13
0
11 Jul 2023
Integrating Curricula with Replays: Its Effects on Continual Learning
Ren Jie Tee
Mengmi Zhang
KELM, CLL
83
1
0
08 Jul 2023
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
96
3
0
21 Jun 2023
Image Classification with Small Datasets: Overview and Benchmark
Lorenzo Brigato
Björn Barz
Luca Iocchi
Joachim Denzler
VLM
64
20
0
23 Dec 2022
Reducing Training Sample Memorization in GANs by Training with Memorization Rejection
Andrew Bai
Cho-Jui Hsieh
Wendy Kan
Hsuan-Tien Lin
GAN
85
5
0
21 Oct 2022
A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences
Nataša Tagasovska
Nathan C. Frey
Andreas Loukas
I. Hotzel
J. Lafrance-Vanasse
...
A. Rajpal
Richard Bonneau
Kyunghyun Cho
Stephen Ra
Vladimir Gligorijević
88
11
0
19 Oct 2022
Bugs in the Data: How ImageNet Misrepresents Biodiversity
A. Luccioni
David Rolnick
78
46
0
24 Aug 2022
When does dough become a bagel? Analyzing the remaining mistakes on ImageNet
Vijay Vasudevan
Benjamin Caine
Raphael Gontijo-Lopes
Sara Fridovich-Keil
Rebecca Roelofs
VLM, UQCV
90
59
0
09 May 2022
A Siren Song of Open Source Reproducibility
Edward Raff
Andrew L. Farris
82
9
0
09 Apr 2022
Perfectly Accurate Membership Inference by a Dishonest Central Server in Federated Learning
Georg Pichler
Marco Romanelli
L. Rey Vega
Pablo Piantanida
FedML
56
11
0
30 Mar 2022
Datamodels: Predicting Predictions from Training Data
Andrew Ilyas
Sung Min Park
Logan Engstrom
Guillaume Leclerc
Aleksander Madry
TDI
135
143
0
01 Feb 2022
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Takashi Ishida
Ikko Yamane
Nontawat Charoenphakdee
Gang Niu
Masashi Sugiyama
BDL, UQCV
62
18
0
01 Feb 2022
A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels
R. Joyce
Edward Raff
Charles K. Nicholas
80
16
0
23 Sep 2021
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification
Lorenzo Brigato
Björn Barz
Luca Iocchi
Joachim Denzler
67
18
0
30 Aug 2021
A data-based comparative review and AI-driven symbolic model for longitudinal dispersion coefficient in natural streams
Yifeng Zhao
Zicheng Liu
Peiren Zhang
S. Galindo‐Torres
Stan Z. Li
13
0
0
17 Jun 2021
On Memorization in Probabilistic Deep Generative Models
G. V. D. Burg
Christopher K. I. Williams
TDI
95
63
0
06 Jun 2021
Rethinking Noisy Label Models: Labeler-Dependent Noise with Adversarial Awareness
Glenn Dawson
R. Polikar
NoLa
73
3
0
28 May 2021
PEng4NN: An Accurate Performance Estimation Engine for Efficient Automated Neural Network Architecture Search
A. Rorabaugh
Silvina Caíno-Lores
Michael R. Wyatt
Travis Johnston
M. Taufer
39
2
0
11 Jan 2021
An Analysis of Dataset Overlap on Winograd-Style Tasks
Ali Emami
Adam Trischler
Kaheer Suleman
Jackie C.K. Cheung
76
22
0
09 Nov 2020
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
121
467
0
01 Oct 2020
What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Vitaly Feldman
Chiyuan Zhang
TDI
245
472
0
09 Aug 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
134
406
0
12 Jun 2020
The Curious Case of Convex Neural Networks
S. Sivaprasad
Ankur Singh
Naresh Manwani
Vineet Gandhi
109
27
0
09 Jun 2020
Self-Distillation as Instance-Specific Label Smoothing
Zhilu Zhang
M. Sabuncu
76
119
0
09 Jun 2020
Identifying Mislabeled Data using the Area Under the Margin Ranking
Geoff Pleiss
Tianyi Zhang
Ethan R. Elenberg
Kilian Q. Weinberger
NoLa
119
274
0
28 Jan 2020
Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
J. Puigcerver
Jessica Yung
Sylvain Gelly
N. Houlsby
MQ
301
1,212
0
24 Dec 2019
Understanding Isomorphism Bias in Graph Data Sets
Sergei Ivanov
Sergei Sviridov
Evgeny Burnaev
FaML, AI4CE
118
38
0
26 Oct 2019