Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments

International Conference on Machine Learning (ICML), 2022

20 June 2022

Papers citing "Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments"

41 / 41 papers shown

Onto-Epistemological Analysis of AI Explanations

281

03 Oct 2025

DataMIL: Selecting Data for Robot Imitation Learning with Datamodels

Roberto Martín-Martín

383

14 May 2025

A Causal Framework for Aligning Image Quality Metrics and Deep Neural Network Robustness

Nathan G. Drenkow

Mathias Unberath

OOD

409

04 Mar 2025

Data Overvaluation Attack and Truthful Data Valuation in Federated Learning

582

01 Feb 2025

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models

924

31 Dec 2024

LossVal: Efficient Data Valuation for Neural Networks

616

05 Dec 2024

One Sample Fits All: Approximating All Probabilistic Values Simultaneously and Efficiently

Neural Information Processing Systems (NeurIPS), 2024

Weida Li

Yaoliang Yu

241

31 Oct 2024

Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models

International Conference on Learning Representations (ICLR), 2024

476

24 Oct 2024

Adversarial Attacks on Data Attribution

International Conference on Learning Representations (ICLR), 2024

657

09 Sep 2024

Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection

Saachi Jain

Kimia Hamidieh

Kristian Georgiev

Andrew Ilyas

Marzyeh Ghassemi

Aleksander Madry

263

24 Jun 2024

CHG Shapley: Efficient Data Valuation and Selection towards Trustworthy Machine Learning

Huaiguang Cai

FedML TDI

705

17 Jun 2024

Data Shapley in One Training Run

634

16 Jun 2024

Causal Estimation of Memorisation Profiles

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

301

06 Jun 2024

Is Data Valuation Learnable and Interpretable?

365

03 Jun 2024

Data Valuation by Fusing Global and Local Statistical Information

650

23 May 2024

Uncertainty-aware Evaluation of Auxiliary Anomalies with the Expected Anomaly Posterior

374

22 May 2024

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

358

06 May 2024

Neural Dynamic Data Valuation: A Stochastic Optimal Control Approach

498

30 Apr 2024

An Economic Solution to Copyright Challenges of Generative AI

Weijie J. Su

394

22 Apr 2024

Improve Knowledge Distillation via Label Revision and Data Selection

IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024

323

03 Apr 2024

Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling

IEEE International Conference on Data Engineering (ICDE), 2024

247

09 Mar 2024

Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

356

20 Jan 2024

Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation

406

17 Jan 2024

Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective

Haoyi Xiong

413

09 Jan 2024

The Journey, Not the Destination: How Data Guides Diffusion Models

Kristian Georgiev

Joshua Vendrow

Hadi Salman

Sung Min Park

Aleksander Madry

400

11 Dec 2023

Data Valuation and Detections in Federated Learning

458

09 Nov 2023

Intriguing Properties of Data Attribution on Diffusion Models

International Conference on Learning Representations (ICLR), 2023

489

01 Nov 2023

Data Optimization in Deep Learning: A Survey

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023

Ou Wu

Rujing Yao

363

25 Oct 2023

Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning

Neural Information Processing Systems (NeurIPS), 2023

421

13 Oct 2023

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

392

30 Aug 2023

Rethinking Backdoor Attacks

International Conference on Machine Learning (ICML), 2023

Kristian Georgiev

273

19 Jul 2023

OpenDataVal: a Unified Benchmark for Data Valuation

Neural Information Processing Systems (NeurIPS), 2023

505

18 Jun 2023

2D-Shapley: A Framework for Fragmented Data Valuation

International Conference on Machine Learning (ICML), 2023

240

18 Jun 2023

Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value

International Conference on Machine Learning (ICML), 2023

Yongchan Kwon

James Zou

TDI FedML

499

16 Apr 2023

A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"

Wei Ping

Yue Liu

TDI

301

09 Apr 2023

On the Variance of Neural Network Training with respect to Test Sets and Distributions

International Conference on Learning Representations (ICLR), 2023

Keller Jordan

OOD

423

04 Apr 2023

TRAK: Attributing Model Behavior at Scale

International Conference on Machine Learning (ICML), 2023

Kristian Georgiev

443

251

24 Mar 2023

Training Data Influence Analysis and Estimation: A Survey

Machine-mediated learning (ML), 2022

Zayd Hammoudeh

Daniel Lowd

TDI

600

162

09 Dec 2022

XInsight: eXplainable Data Analysis Through The Lens of Causality

472

26 Jul 2022

Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

Jiachen T. Wang

R. Jia

FedML TDI

843

151

30 May 2022

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

Neural Information Processing Systems (NeurIPS), 2020

Vitaly Feldman

Chiyuan Zhang

TDI

697

597

09 Aug 2020