2307.01163
Cited By
Improving Language Plasticity via Pretraining with Active Forgetting
3 July 2023
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM
AI4CE
CLL
Papers citing
"Improving Language Plasticity via Pretraining with Active Forgetting"
23 / 23 papers shown
Title
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Min Lin
86
1
0
18 Feb 2025
Facilitating large language model Russian adaptation with Learned Embedding Propagation
Mikhail Tikhomirov
D. Chernyshev
25
1
0
31 Dec 2024
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity
Baekrok Shin
Junsoo Oh
Hanseul Cho
Chulhee Yun
AI4CE
44
1
0
30 Oct 2024
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
Divyanshu Aggarwal
Sankarshan Damle
Navin Goyal
Satya Lokam
Sunayana Sitaram
CLL
18
0
0
21 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
55
0
0
07 Oct 2024
Co-occurrence is not Factual Association in Language Models
Xiao Zhang
Miao Li
Ji Wu
KELM
59
2
0
21 Sep 2024
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages
Carlos Mullov
Ngoc-Quan Pham
Alexander Waibel
16
1
0
05 Aug 2024
KPC-cF: Aspect-Based Sentiment Analysis via Implicit-Feature Alignment with Corpus Filtering
Kibeom Nam
14
0
0
29 Jun 2024
Understanding and Mitigating Tokenization Bias in Language Models
Buu Phan
Marton Havasi
Matthew Muckley
Karen Ullrich
39
2
0
24 Jun 2024
Zero-Shot Tokenizer Transfer
Benjamin Minixhofer
E. Ponti
Ivan Vulić
VLM
39
8
0
13 May 2024
The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments
Anton Schäfer
Shauli Ravfogel
Thomas Hofmann
Tiago Pimentel
Imanol Schlag
55
3
0
11 Apr 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
Chris Wendler
V. Veselovsky
Giovanni Monea
Robert West
56
95
0
16 Feb 2024
MaLA-500: Massive Language Adaptation of Large Language Models
Peiqin Lin
Shaoxiong Ji
Jörg Tiedemann
André F. T. Martins
Hinrich Schütze
ELM
23
15
0
24 Jan 2024
MAPLE: Multilingual Evaluation of Parameter Efficient Finetuning of Large Language Models
Divyanshu Aggarwal
Ashutosh Sathe
Ishaan Watts
Sunayana Sitaram
27
1
0
15 Jan 2024
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Yihong Liu
Peiqin Lin
Mingyang Wang
Hinrich Schütze
19
21
0
15 Nov 2023
Reset It and Forget It: Relearning Last-Layer Weights Improves Continual and Transfer Learning
Lapo Frati
Neil Traft
Jeff Clune
Nick Cheney
CLL
11
0
0
12 Oct 2023
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
David Ifeoluwa Adelani
Graham Neubig
Sebastian Ruder
Shruti Rijhwani
Michael Beukman
...
Idris Abdulmumin
Odunayo Ogundepo
Oreen Yousuf
Tatiana Moteu Ngoli
Dietrich Klakow
36
43
0
22 Oct 2022
The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin
Max Schwarzer
P. D'Oro
Pierre-Luc Bacon
Aaron C. Courville
OnRL
85
178
0
16 May 2022
Fast Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Chelsea Finn
Christopher D. Manning
KELM
219
341
0
21 Oct 2021
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
Abteen Ebrahimi
Manuel Mager
Arturo Oncevay
Vishrav Chaudhary
Luis Chiruzzo
...
Graham Neubig
Alexis Palmer
Rolando A. Coto Solano
Ngoc Thang Vu
Katharina Kann
99
71
0
18 Apr 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
MLQA: Evaluating Cross-lingual Extractive Question Answering
Patrick Lewis
Barlas Oğuz
Ruty Rinott
Sebastian Riedel
Holger Schwenk
ELM
242
489
0
16 Oct 2019
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
243
11,568
0
09 Mar 2017