2206.07137
Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
14 June 2022
Sören Mindermann
J. Brauner
Muhammed Razzak
Mrinank Sharma
Andreas Kirsch
Winnie Xu
Benedikt Höltgen
Aidan N. Gomez
Adrien Morisot
Sebastian Farquhar
Y. Gal
Papers citing
"Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt"
50 / 112 papers shown
Title
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
24
51
0
15 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
80
185
0
06 Feb 2024
DsDm: Model-Aware Dataset Selection with Datamodels
Logan Engstrom
Axel Feldmann
A. Madry
OODD
10
46
0
23 Jan 2024
Generative Deduplication For Socia Media Data Selection
Xianming Li
Jing Li
29
2
0
11 Jan 2024
On the Convergence of Loss and Uncertainty-based Active Learning Algorithms
Daniel Haimovich
Dima Karamshuk
Fridolin Linder
Niek Tax
Milan Vojnovic
10
0
0
21 Dec 2023
Mitigating Label Bias in Machine Learning: Fairness through Confident Learning
Yixuan Zhang
Boyu Li
Zenan Ling
Feng Zhou
FaML
11
3
0
14 Dec 2023
Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding
Talfan Evans
Shreya Pathak
Hamza Merzic
Jonathan Schwarz
Ryutaro Tanno
Olivier J. Hénaff
8
16
0
08 Dec 2023
REDUCR: Robust Data Downsampling Using Class Priority Reweighting
William Bankes
George Hughes
Ilija Bogunovic
Zi Wang
13
3
0
01 Dec 2023
Computing Approximate $\ell_p$ Sensitivities
Swati Padmanabhan
David P. Woodruff
Qiuyi Zhang
43
0
0
07 Nov 2023
AdaFlood: Adaptive Flood Regularization
Wonho Bae
Yi Ren
Mohamad Osama Ahmed
Frederick Tung
Danica J. Sutherland
Gabriel L. Oliveira
AI4CE
32
1
0
06 Nov 2023
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar
Tolga Bolukbasi
Sriram Ganapathy
Shikhar Vashishth
Sarath Chandar
Partha P. Talukdar
MILM
22
20
0
02 Nov 2023
Data Optimization in Deep Learning: A Survey
Ou Wu
Rujing Yao
28
1
0
25 Oct 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
M. Wahib
19
1
0
16 Oct 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Mengzhou Xia
Tianyu Gao
Zhiyuan Zeng
Danqi Chen
24
262
0
10 Oct 2023
What do larger image classifiers memorise?
Michal Lukasik
Vaishnavh Nagarajan
A. S. Rawat
A. Menon
Sanjiv Kumar
25
5
0
09 Oct 2023
GRASP: A Rehearsal Policy for Efficient Online Continual Learning
Md Yousuf Harun
Jhair Gallardo
Junyu Chen
Christopher Kanan
CLL
25
9
0
25 Aug 2023
D4: Improving LLM Pretraining via Document De-Duplication and Diversification
Kushal Tirumala
Daniel Simig
Armen Aghajanyan
Ari S. Morcos
SyDa
8
103
0
23 Aug 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Tianyi Zhou
Jing Xiao
38
168
0
23 Aug 2023
Towards Accelerated Model Training via Bayesian Data Selection
Zhijie Deng
Peng Cui
Jun Zhu
16
4
0
21 Aug 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
13
41
0
12 Jul 2023
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini
Sachin Goyal
Zachary Chase Lipton
J. Zico Kolter
Aditi Raghunathan
VLM
29
33
0
06 Jul 2023
Exploring Data Redundancy in Real-world Image Classification through Data Selection
Zhenyu Tang
Shaoting Zhang
Xiaosong Wang
8
2
0
25 Jun 2023
AdaSelection: Accelerating Deep Learning Training through Data Subsampling
Minghe Zhang
Chaosheng Dong
Jinmiao Fu
Tianchen Zhou
Jia Liang
...
Bo Liu
Michinari Momma
Bryan Wang
Yan Gao
Yi Sun
21
3
0
19 Jun 2023
Task-specific experimental design for treatment effect estimation
Beth D. Connolly
Kim Moore
Tobias Schwedes
Alexander Adam
Gary Willis
Ilya Feige
Christopher Frye
CML
17
3
0
08 Jun 2023
NLU on Data Diets: Dynamic Data Subset Selection for NLP Classification Tasks
Jean-Michel Attendu
Jean-Philippe Corbeil
17
15
0
05 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
22
33
0
02 Jun 2023
Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Patrik Okanovic
R. Waleffe
Vasilis Mageirakos
Konstantinos E. Nikolakakis
Amin Karbasi
Dionysis Kalogerias
Nezihe Merve Gürel
Theodoros Rekatsinas
DD
30
12
0
28 May 2023
In-Context Demonstration Selection with Cross Entropy Difference
Dan Iter
Reid Pryzant
Ruochen Xu
Shuohang Wang
Yang Liu
Yichong Xu
Chenguang Zhu
18
11
0
24 May 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zi-Han Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
17
19
0
23 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
25
172
0
17 May 2023
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Hassan Akbari
Dan Kondratyuk
Yin Cui
Rachel Hornung
H. Wang
Hartwig Adam
VLM
MoE
20
11
0
10 May 2023
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
Peng Cui
Dan Zhang
Zhijie Deng
Yinpeng Dong
Junyi Zhu
16
12
0
20 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
24
39
0
07 Apr 2023
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin
K. Wang
Zangwei Zheng
Jianyang Gu
Xiang Peng
...
Daquan Zhou
Lei Shang
Baigui Sun
Xuansong Xie
Yang You
116
46
0
08 Mar 2023
Curriculum Based Multi-Task Learning for Parkinson's Disease Detection
Nikhil J. Dhinagar
Conor Owens-Walton
Emily Laltoo
C. Boyle
Yao-Liang Chen
...
Chih-Chien Tsai
Jiun-Jie Wang
Yih-Ru Wu
Y. D. Werf
Paul M. Thompson
10
3
0
27 Feb 2023
Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least
S. Joshi
Baharan Mirzasoleiman
SSL
17
18
0
18 Feb 2023
Confidence-based Reliable Learning under Dual Noises
Peng Cui
Yang Yue
Zhijie Deng
Jun Zhu
NoLa
18
8
0
10 Feb 2023
Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Yen-Ting Lin
Alexandros Papangelis
Seokhwan Kim
Sungjin Lee
Devamanyu Hazarika
Mahdi Namazifar
Di Jin
Yang Liu
Dilek Z. Hakkani-Tür
16
35
0
10 Feb 2023
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
6
170
0
06 Feb 2023
Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping
Tom Goldstein
MoE
28
84
0
28 Dec 2022
Selective classification using a robust meta-learning approach
Nishant Jain
Karthikeyan Shanmugam
Pradeep Shenoy
OOD
21
2
0
12 Dec 2022
Instance-Conditional Timescales of Decay for Non-Stationary Learning
Nishant Jain
Pradeep Shenoy
25
3
0
12 Dec 2022
General Intelligence Requires Rethinking Exploration
Minqi Jiang
Tim Rocktäschel
Edward Grefenstette
LRM
22
17
0
15 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney
Jessica Zosa Forde
Jonathan Frankle
Ziliang Zong
Matthew L. Leavitt
VLM
22
4
0
01 Nov 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
24
47
0
13 Oct 2022
Robust Active Distillation
Cenk Baykal
Khoa Trinh
Fotis Iliopoulos
Gaurav Menghani
Erik Vee
23
10
0
03 Oct 2022
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
17
39
0
29 Sep 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
30
27
0
20 Sep 2022
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Shivakanth Sujit
Somjit Nath
Pedro H. M. Braga
Samira Ebrahimi Kahou
28
15
0
22 Aug 2022
Machine Learning with Confidential Computing: A Systematization of Knowledge
Fan Mo
Zahra Tarkhani
Hamed Haddadi
22
7
0
22 Aug 2022