An Empirical Investigation of the Role of Pre-training in Lifelong Learning
arXiv:2112.09153 · 16 December 2021
Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell
CLL

Papers citing "An Empirical Investigation of the Role of Pre-training in Lifelong Learning" (27 / 27 papers shown)

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li, Mohammadreza Armandpour, Iman Mirzadeh, Sachin Mehta, Vaishaal Shankar, ..., Samy Bengio, Oncel Tuzel, Mehrdad Farajtabar, Hadi Pouransari, Fartash Faghri
CLL, KELM
59 · 0 · 0 · 02 Apr 2025

Achieving Upper Bound Accuracy of Joint Training in Continual Learning
Saleh Momeni, Bing Liu
CLL
82 · 1 · 0 · 17 Feb 2025

OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar, Jaechul Roh, A. Naseh, Marzena Karpinska, Mohit Iyyer, Amir Houmansadr, Eugene Bagdasarian
LRM
60 · 12 · 0 · 04 Feb 2025

Generate to Discriminate: Expert Routing for Continual Learning
Yewon Byun, Sanket Vaibhav Mehta, Saurabh Garg, Emma Strubell, Michael Oberst, Bryan Wilder, Zachary Chase Lipton
78 · 0 · 0 · 31 Dec 2024

Buffer-based Gradient Projection for Continual Federated Learning
Shenghong Dai, Jy-yong Sohn, Yicong Chen, S. Alam, Ravikumar Balakrishnan, Suman Banerjee, N. Himayat, Kangwook Lee
FedML
75 · 2 · 0 · 03 Sep 2024

An Investigation of Warning Erroneous Chat Translations in Cross-lingual Communication
Yunmeng Li, Jun Suzuki, Makoto Morishita, Kaori Abe, Kentaro Inui
61 · 1 · 0 · 28 Aug 2024

HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning
Liyuan Wang, Jingyi Xie, Xingxing Zhang, Hang Su, Jun Zhu
CLL
47 · 4 · 0 · 07 Jul 2024

Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yıldız, Nishaanth Kanna Ravichandran, Prishruit Punia, Matthias Bethge, B. Ermiş
CLL, KELM, LRM
48 · 25 · 0 · 27 Feb 2024

Towards a General Framework for Continual Learning with Pre-training
Liyuan Wang, Jingyi Xie, Xingxing Zhang, Hang Su, Jun Zhu
CLL
29 · 3 · 0 · 21 Oct 2023

Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition
Xiaoshuai Song, Yutao Mou, Keqing He, Yueyan Qiu, Pei Wang, Weiran Xu
21 · 2 · 0 · 16 Oct 2023

Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido, Pranshu Malviya, A. Baratin, Sarath Chandar
AAML
40 · 1 · 0 · 31 Jul 2023

Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
Lingfeng Shen, Weiting Tan, Boyuan Zheng, Daniel Khashabi
VLM
36 · 6 · 0 · 18 May 2023

Lightweight Transformers for Clinical Natural Language Processing
Omid Rohanian, Mohammadmahdi Nouriborji, Hannah Jauncey, Samaneh Kouchaki, ISARIC Clinical Characterisation Group, Lei A. Clifton, L. Merson, David A. Clifton
MedIm, LM&MA
16 · 12 · 0 · 09 Feb 2023

DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, J. Rao, Marc Najork, Emma Strubell, Donald Metzler
CLL
30 · 39 · 0 · 19 Dec 2022

SQuAT: Sharpness- and Quantization-Aware Training for BERT
Zheng Wang, Juncheng Billy Li, Shuhui Qu, Florian Metze, Emma Strubell
MQ
13 · 7 · 0 · 13 Oct 2022

Schedule-Robust Online Continual Learning
Ruohan Wang, Marco Ciccone, Giulia Luise, A. Yapp, Massimiliano Pontil, C. Ciliberto
CLL
32 · 4 · 0 · 11 Oct 2022

Causes of Catastrophic Forgetting in Class-Incremental Semantic Segmentation
Tobias Kalb, Jürgen Beyerer
CLL
25 · 8 · 0 · 16 Sep 2022

Progressive Latent Replay for efficient Generative Rehearsal
Stanislaw Pawlak, Filip Szatkowski, Michal Bortkiewicz, Jan Dubiński, Tomasz Trzciński
14 · 2 · 0 · 04 Jul 2022

The Effect of Task Ordering in Continual Learning
Samuel J. Bell, Neil D. Lawrence
CLL
46 · 17 · 0 · 26 May 2022

Fine-tuned Language Models are Continual Learners
Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
CLL, LRM
145 · 116 · 0 · 24 May 2022

DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, ..., Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
CLL, VLM, VPVLM
28 · 455 · 0 · 10 Apr 2022

Adversarial Continual Learning
Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach
CLL, VLM
152 · 197 · 0 · 21 Mar 2020

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington
220 · 348 · 0 · 14 Jun 2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
297 · 6,950 · 0 · 20 Apr 2018

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn, Pieter Abbeel, Sergey Levine
OOD
311 · 11,681 · 0 · 09 Mar 2017

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL
278 · 2,888 · 0 · 15 Sep 2016

Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean
3DV
230 · 31,253 · 0 · 16 Jan 2013