Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.14186
Cited By
TRAK: Attributing Model Behavior at Scale
24 March 2023
Sung Min Park
Kristian Georgiev
Andrew Ilyas
Guillaume Leclerc
A. Madry
TDI
Re-assign community
ArXiv
PDF
HTML
Papers citing
"TRAK: Attributing Model Behavior at Scale"
29 / 29 papers shown
Title
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang
Jiahui Peng
Ren Ma
Y. Wang
Tianyi Bai
Xingjian Wei
Jiantao Qiu
Chi Zhang
Ying Qian
Conghui He
39
0
0
19 Apr 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
126
2
0
21 Feb 2025
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Shichang Zhang
Tessa Han
Usha Bhalla
Hima Lakkaraju
FAtt
147
0
0
17 Feb 2025
Most Influential Subset Selection: Challenges, Promises, and Beyond
Yuzheng Hu
Pingbang Hu
Han Zhao
Jiaqi W. Ma
TDI
136
2
0
10 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
12
0
31 Dec 2024
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin
Linwei Tao
Minjing Dong
Chang Xu
TDI
36
2
0
24 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
27
1
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
TDI
DiffM
75
4
0
17 Oct 2024
Data Quality Control in Federated Instruction-tuning of Large Language Models
Yaxin Du
Rui Ye
Fengting Yuchi
W. Zhao
Jingjing Qu
Y. Wang
Siheng Chen
ALM
FedML
45
0
0
15 Oct 2024
dattri
\texttt{dattri}
dattri
: A Library for Efficient Data Attribution
Junwei Deng
Ting-Wei Li
Shiyuan Zhang
Shixuan Liu
Yijun Pan
Hao Huang
Xinhe Wang
Pingbang Hu
Xingjian Zhang
Jiaqi W. Ma
TDI
28
3
0
06 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
43
1
0
04 Oct 2024
RandALO: Out-of-sample risk estimation in no time flat
Parth Nobel
Daniel LeJeune
Emmanuel J. Candès
32
3
0
15 Sep 2024
Adversarial Attacks on Data Attribution
Xinhe Wang
Pingbang Hu
Junwei Deng
Jiaqi W. Ma
TDI
53
0
0
09 Sep 2024
Fast Training Dataset Attribution via In-Context Learning
Milad Fotouhi
M. T. Bahadori
Oluwaseyi Feyisetan
P. Arabshahi
David Heckerman
31
0
0
14 Aug 2024
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
37
9
0
22 Jul 2024
Is poisoning a real threat to LLM alignment? Maybe more so than you think
Pankayaraj Pathmanathan
Souradip Chakraborty
Xiangyu Liu
Yongyuan Liang
Furong Huang
AAML
43
13
0
17 Jun 2024
Data Shapley in One Training Run
Jiachen T. Wang
Prateek Mittal
Dawn Song
Ruoxi Jia
TDI
27
7
0
16 Jun 2024
Describing Differences in Image Sets with Natural Language
Lisa Dunlap
Yuhui Zhang
Xiaohan Wang
Ruiqi Zhong
Trevor Darrell
Jacob Steinhardt
Joseph E. Gonzalez
Serena Yeung-Levy
CoGe
VLM
30
30
0
05 Dec 2023
Intriguing Properties of Data Attribution on Diffusion Models
Xiaosen Zheng
Tianyu Pang
Chao Du
Jing Jiang
Min-Bin Lin
TDI
34
20
1
01 Nov 2023
Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources
Feiyang Kang
H. Just
Anit Kumar Sahu
R. Jia
48
10
0
05 Jul 2023
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
68
60
0
11 Oct 2022
Understanding Influence Functions and Datamodels via Harmonic Analysis
Nikunj Saunshi
Arushi Gupta
M. Braverman
Sanjeev Arora
TDI
54
17
0
03 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma
Li-Zhen Guo
S. Fattahi
34
4
0
01 Oct 2022
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
237
590
0
14 Jul 2021
A linearized framework and a new benchmark for model selection for fine-tuning
Aditya Deshpande
Alessandro Achille
Avinash Ravichandran
Hao Li
L. Zancato
Charless C. Fowlkes
Rahul Bhotika
Stefano Soatto
Pietro Perona
ALM
107
46
0
29 Jan 2021
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
156
233
0
04 Mar 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktaschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
406
2,584
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
282
39,190
0
01 Sep 2014
1