2206.10013
Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments
International Conference on Machine Learning (ICML), 2022
20 June 2022
Jinkun Lin
Anqi Zhang
Mathias Lécuyer
Jinyang Li
Aurojit Panda
S. Sen
TDI
FedML
Papers citing "Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments" (41 papers)
Onto-Epistemological Analysis of AI Explanations
Martina Mattioli
Eike Petersen
Aasa Feragen
Marcello Pelillo
Siavash Bigdeli
281
0
0
03 Oct 2025
DataMIL: Selecting Data for Robot Imitation Learning with Datamodels
Shivin Dass
Alaa Khaddaj
Logan Engstrom
Aleksander Madry
Andrew Ilyas
Roberto Martín-Martín
383
10
0
14 May 2025
A Causal Framework for Aligning Image Quality Metrics and Deep Neural Network Robustness
Nathan G. Drenkow
Mathias Unberath
OOD
409
0
0
04 Mar 2025
Data Overvaluation Attack and Truthful Data Valuation in Federated Learning
Shuyuan Zheng
Sudong Cai
Chuan Xiao
Yang Cao
Jianbin Qin
Masatoshi Yoshikawa
Makoto Onizuka
TDI
AAML
582
0
0
01 Feb 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
924
27
0
31 Dec 2024
LossVal: Efficient Data Valuation for Neural Networks
Tim Wibiral
Mohamed Karim Belaid
Maximilian Rabus
Ansgar Scherp
TDI
616
1
0
05 Dec 2024
One Sample Fits All: Approximating All Probabilistic Values Simultaneously and Efficiently
Neural Information Processing Systems (NeurIPS), 2024
Weida Li
Yaoliang Yu
241
8
0
31 Oct 2024
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
International Conference on Learning Representations (ICLR), 2024
Jinxu Lin
Linwei Tao
Minjing Dong
Chang Xu
TDI
476
14
0
24 Oct 2024
Adversarial Attacks on Data Attribution
International Conference on Learning Representations (ICLR), 2024
Xinhe Wang
Pingbang Hu
Junwei Deng
Jiaqi W. Ma
TDI
657
1
0
09 Sep 2024
Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection
Saachi Jain
Kimia Hamidieh
Kristian Georgiev
Andrew Ilyas
Marzyeh Ghassemi
Aleksander Madry
263
5
0
24 Jun 2024
CHG Shapley: Efficient Data Valuation and Selection towards Trustworthy Machine Learning
Huaiguang Cai
FedML
TDI
705
3
0
17 Jun 2024
Data Shapley in One Training Run
Jiachen T. Wang
Prateek Mittal
Kurt Thomas
R. Jia
TDI
634
55
0
16 Jun 2024
Causal Estimation of Memorisation Profiles
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Pietro Lesci
Clara Meister
Thomas Hofmann
Andreas Vlachos
Tiago Pimentel
301
13
0
06 Jun 2024
Is Data Valuation Learnable and Interpretable?
Ou Wu
Weiyao Zhu
Mengyang Li
TDI
365
2
0
03 Jun 2024
Data Valuation by Fusing Global and Local Statistical Information
Xiaoling Zhou
Ou Wu
Michael K. Ng
Hao Jiang
TDI
650
1
0
23 May 2024
Uncertainty-aware Evaluation of Auxiliary Anomalies with the Expected Anomaly Posterior
Lorenzo Perini
Maja R. Rudolph
Sabrina Schmedding
Chen Qiu
374
4
0
22 May 2024
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Jiachen T. Wang
Tianji Yang
James Zou
Yongchan Kwon
Ruoxi Jia
TDI
358
23
0
06 May 2024
Neural Dynamic Data Valuation: A Stochastic Optimal Control Approach
Zhangyong Liang
Ji Zhang
Pengfei Zhang
Zhao Li
TDI
498
1
0
30 Apr 2024
An Economic Solution to Copyright Challenges of Generative AI
Jiachen T. Wang
Zhun Deng
Hiroaki Chiba-Okabe
Boaz Barak
Weijie J. Su
394
29
0
22 Apr 2024
Improve Knowledge Distillation via Label Revision and Data Selection
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024
Weichao Lan
Yiu-ming Cheung
Qing Xu
Buhua Liu
Zhikai Hu
Mengke Li
Zhenghua Chen
323
7
0
03 Apr 2024
Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling
IEEE International Conference on Data Engineering (ICDE), 2024
Hussein Abdallah
Waleed Afandi
Panos Kalnis
Essam Mansour
247
6
0
09 Mar 2024
Efficient Data Shapley for Weighted Nearest Neighbor Algorithms
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Jiachen T. Wang
Prateek Mittal
Ruoxi Jia
TDI
356
13
0
20 Jan 2024
Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation
Tong Xie
Haoyu Li
Andrew Bai
Cho-Jui Hsieh
TDI
406
11
0
17 Jan 2024
Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective
Haoyi Xiong
Xuhong Li
Xiaofei Zhang
Jiamin Chen
Xinhao Sun
Yuchen Li
Zeyi Sun
Jundong Li
XAI
413
15
0
09 Jan 2024
The Journey, Not the Destination: How Data Guides Diffusion Models
Kristian Georgiev
Joshua Vendrow
Hadi Salman
Sung Min Park
Aleksander Madry
400
38
0
11 Dec 2023
Data Valuation and Detections in Federated Learning
Wenqian Li
Shuran Fu
Fengrui Zhang
Yan Pang
FedML
TDI
458
20
0
09 Nov 2023
Intriguing Properties of Data Attribution on Diffusion Models
International Conference on Learning Representations (ICLR), 2023
Xiaosen Zheng
Tianyu Pang
Chao Du
Jing Jiang
Min Lin
TDI
489
41
1
01 Nov 2023
Data Optimization in Deep Learning: A Survey
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Ou Wu
Rujing Yao
363
6
0
25 Oct 2023
Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning
Neural Information Processing Systems (NeurIPS), 2023
Yihua Zhang
Yimeng Zhang
Chenyi Zi
Jinghan Jia
Jiancheng Liu
Gaowen Liu
Min-Fong Hong
Shiyu Chang
Sijia Liu
AAML
421
16
0
13 Oct 2023
Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation
Jiachen T. Wang
Yuqing Zhu
Yu Wang
R. Jia
Prateek Mittal
TDI
392
24
0
30 Aug 2023
Rethinking Backdoor Attacks
International Conference on Machine Learning (ICML), 2023
Alaa Khaddaj
Guillaume Leclerc
Aleksandar Makelov
Kristian Georgiev
Hadi Salman
Andrew Ilyas
Aleksander Madry
SILM
273
41
0
19 Jul 2023
OpenDataVal: a Unified Benchmark for Data Valuation
Neural Information Processing Systems (NeurIPS), 2023
Kevin Jiang
Weixin Liang
James Zou
Yongchan Kwon
FedML
505
50
0
18 Jun 2023
2D-Shapley: A Framework for Fragmented Data Valuation
International Conference on Machine Learning (ICML), 2023
Zhihong Liu
H. Just
Xiangyu Chang
Xinyu Chen
R. Jia
TDI
240
13
0
18 Jun 2023
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value
International Conference on Machine Learning (ICML), 2023
Yongchan Kwon
James Zou
TDI
FedML
499
54
0
16 Apr 2023
A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"
Wei Ping
Yue Liu
TDI
301
42
0
09 Apr 2023
On the Variance of Neural Network Training with respect to Test Sets and Distributions
International Conference on Learning Representations (ICLR), 2023
Keller Jordan
OOD
423
22
0
04 Apr 2023
TRAK: Attributing Model Behavior at Scale
International Conference on Machine Learning (ICML), 2023
Sung Min Park
Kristian Georgiev
Andrew Ilyas
Guillaume Leclerc
Aleksander Madry
TDI
443
251
0
24 Mar 2023
Training Data Influence Analysis and Estimation: A Survey
Machine-mediated learning (ML), 2022
Zayd Hammoudeh
Daniel Lowd
TDI
600
162
0
09 Dec 2022
XInsight: eXplainable Data Analysis Through The Lens of Causality
Pingchuan Ma
Rui Ding
Shuai Wang
Shi Han
Dongmei Zhang
CML
472
27
0
26 Jul 2022
Data Banzhaf: A Robust Data Valuation Framework for Machine Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Jiachen T. Wang
R. Jia
FedML
TDI
843
151
0
30 May 2022
What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Neural Information Processing Systems (NeurIPS), 2020
Vitaly Feldman
Chiyuan Zhang
TDI
697
597
0
09 Aug 2020