What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

9 August 2020
Vitaly Feldman
Chiyuan Zhang
    TDI

Papers citing "What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation"

50 / 85 papers shown
When Dynamic Data Selection Meets Data Augmentation
S. M. I. Simon X. Yang
Peng Ye
F. Shen
Dongzhan Zhou
24
0
0
02 May 2025
Geometric Median Matching for Robust k-Subset Selection from Noisy Data
Anish Acharya
Sujay Sanghavi
Alexandros G. Dimakis
Inderjit S Dhillon
AAML
57
0
0
01 Apr 2025
Impact of Data Duplication on Deep Neural Network-Based Image Classifiers: Robust vs. Standard Models
Alireza Aghabagherloo
Aydin Abadi
Sumanta Sarkar
Vishnu Asutosh Dasu
Bart Preneel
AAML
54
0
0
01 Apr 2025
Severing Spurious Correlations with Data Pruning
Varun Mulchandani
Jung-Eun Kim
141
0
0
24 Mar 2025
The Canary's Echo: Auditing Privacy Risks of LLM-Generated Synthetic Text
Matthieu Meeus
Lukas Wutschitz
Santiago Zanella Béguelin
Shruti Tople
Reza Shokri
75
0
0
24 Feb 2025
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min-Bin Lin
Ye Wang
DiffM
TDI
166
43
0
21 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
126
2
0
21 Feb 2025
Captured by Captions: On Memorization and its Mitigation in CLIP Models
Wenhao Wang
Adam Dziedzic
Grace C. Kim
Michael Backes
Franziska Boenisch
86
0
0
11 Feb 2025
Early Stopping Against Label Noise Without Validation Data
Suqin Yuan
Lei Feng
Tongliang Liu
NoLa
96
14
0
11 Feb 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
Kaixuan Huang
Jiacheng Guo
Zihao Li
X. Ji
Jiawei Ge
...
Yangsibo Huang
Chi Jin
Xinyun Chen
Chiyuan Zhang
Mengdi Wang
AAML
LRM
95
7
0
10 Feb 2025
Privacy-Preserving Dataset Combination
Keren Fuentes
Mimee Xu
Irene Chen
36
0
0
09 Feb 2025
The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations
Chenyu You
Haocheng Dai
Yifei Min
Jasjeet Sekhon
S. Joshi
James S. Duncan
60
2
0
01 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
12
0
31 Dec 2024
Data Pruning Can Do More: A Comprehensive Data Pruning Approach for Object Re-identification
Zi Yang
Haojin Yang
Soumajit Majumder
Jorge M. Cardoso
Guillermo Gallego
MoMe
VLM
93
1
0
13 Dec 2024
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin
Linwei Tao
Minjing Dong
Chang Xu
TDI
38
2
0
24 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
Adversarial Attacks on Data Attribution
Xinhe Wang
Pingbang Hu
Junwei Deng
Jiaqi W. Ma
TDI
53
0
0
09 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
52
1
0
05 Sep 2024
Fast Training Dataset Attribution via In-Context Learning
Milad Fotouhi
M. T. Bahadori
Oluwaseyi Feyisetan
P. Arabshahi
David Heckerman
31
0
0
14 Aug 2024
Range Membership Inference Attacks
Jiashu Tao
Reza Shokri
42
1
0
09 Aug 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
5
0
26 May 2024
Data Reconstruction: When You See It and When You Don't
Edith Cohen
Haim Kaplan
Yishay Mansour
Shay Moran
Kobbi Nissim
Uri Stemmer
Eliad Tsfadia
AAML
42
2
0
24 May 2024
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
Siyu Lou
Yuntian Chen
Xiaodan Liang
Liang Lin
Quanshi Zhang
32
2
0
20 May 2024
Membership Inference Attacks and Privacy in Topic Modeling
Nico Manzonelli
Wanrong Zhang
Salil P. Vadhan
37
1
0
07 Mar 2024
Effective pruning of web-scale datasets based on complexity of concept clusters
Amro Abbas
E. Rusak
Kushal Tirumala
Wieland Brendel
Kamalika Chaudhuri
Ari S. Morcos
VLM
CLIP
34
22
0
09 Jan 2024
Understanding (Un)Intended Memorization in Text-to-Image Generative Models
Ali Naseh
Jaechul Roh
Amir Houmansadr
DiffM
20
6
0
06 Dec 2023
Separating the Wheat from the Chaff with BREAD: An open-source benchmark and metrics to detect redundancy in text
Isaac Caswell
Lisa Wang
Isabel Papadimitriou
26
0
0
11 Nov 2023
There's no Data Like Better Data: Using QE Metrics for MT Data Filtering
Jan-Thorsten Peter
David Vilar
Daniel Deutsch
Mara Finkelstein
Juraj Juraska
Markus Freitag
9
16
0
09 Nov 2023
Intriguing Properties of Data Attribution on Diffusion Models
Xiaosen Zheng
Tianyu Pang
Chao Du
Jing Jiang
Min-Bin Lin
TDI
34
20
1
01 Nov 2023
On the Over-Memorization During Natural, Robust and Catastrophic Overfitting
Runqi Lin
Chaojian Yu
Bo Han
Tongliang Liu
22
7
0
13 Oct 2023
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
Yupei Du
Albert Gatt
Dong Nguyen
24
1
0
10 Oct 2023
Natural Example-Based Explainability: a Survey
Antonin Poché
Lucas Hervier
M. Bakkay
XAI
24
11
0
05 Sep 2023
Uncovering the Hidden Cost of Model Compression
Diganta Misra
Muawiz Chaudhary
Agam Goyal
Bharat Runwal
Pin-Yu Chen
VLM
30
0
0
29 Aug 2023
Samplable Anonymous Aggregation for Private Federated Data Analysis
Kunal Talwar
Shan Wang
Audra McMillan
Vojta Jina
Vitaly Feldman
...
Congzheng Song
Karl Tarbe
Sebastian Vogt
L. Winstrom
Shundong Zhou
FedML
30
13
0
27 Jul 2023
Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Zhexin Zhang
Jiaxin Wen
Minlie Huang
30
29
0
10 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
26
10
0
04 Jul 2023
Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD
Anvith Thudi
Hengrui Jia
Casey Meehan
Ilia Shumailov
Nicolas Papernot
20
3
0
01 Jul 2023
Understanding the Effect of the Long Tail on Neural Network Compression
Harvey Dam
Vinu Joseph
Aditya Bhaskara
G. Gopalakrishna
Saurav Muralidharan
M. Garland
21
2
0
09 Jun 2023
How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features
Simone Bombari
Marco Mondelli
AAML
19
4
0
20 May 2023
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value
Yongchan Kwon
James Y. Zou
TDI
FedML
31
35
0
16 Apr 2023
Do We Train on Test Data? The Impact of Near-Duplicates on License Plate Recognition
Rayson Laroca
Valter Estevam
A. Britto
Rodrigo Minetto
David Menotti
28
10
0
10 Apr 2023
Fairness Improves Learning from Noisily Labeled Long-Tailed Data
Jiaheng Wei
Zhaowei Zhu
Gang Niu
Tongliang Liu
Sijia Liu
Masashi Sugiyama
Yang Liu
26
6
0
22 Mar 2023
Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics
Nihal Murali
A. Puli
Ke Yu
Rajesh Ranganath
Kayhan Batmanghelich
AAML
37
7
0
18 Feb 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation
Noel Loo
Ramin Hasani
Mathias Lechner
Alexander Amini
Daniela Rus
DD
26
5
0
02 Feb 2023
Pathologies of Predictive Diversity in Deep Ensembles
Taiga Abe
E. Kelly Buchanan
Geoff Pleiss
John P. Cunningham
UQCV
38
13
0
01 Feb 2023
Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality
Verna Dankers
Ivan Titov
28
2
0
31 Jan 2023
Extracting Training Data from Diffusion Models
Nicholas Carlini
Jamie Hayes
Milad Nasr
Matthew Jagielski
Vikash Sehwag
Florian Tramèr
Borja Balle
Daphne Ippolito
Eric Wallace
DiffM
63
569
0
30 Jan 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
30
48
0
30 Jan 2023
Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh
Hanie Sedghi
Patrick Thiran
NoLa
TDI
30
3
0
08 Dec 2022
On Pitfalls of Measuring Occlusion Robustness through Data Distortion
Antonia Marcu
18
0
0
24 Nov 2022