ResearchTrend.AI

From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
18 February 2020
arXiv:2002.07306

Papers citing "From English To Foreign Languages: Transferring Pre-trained Language Models"

32 papers shown

TokAlign: Efficient Vocabulary Adaptation via Token Alignment
  Annual Meeting of the Association for Computational Linguistics (ACL), 2025
  Chong Li, Jiajun Zhang, Chengqing Zong
  04 Jun 2025

Token Distillation: Attention-aware Input Embeddings For New Tokens
  Konstantin Dobler, Desmond Elliott, Gerard de Melo
  26 May 2025

Cross-Lingual Optimization for Language Transfer in Large Language Models
  Annual Meeting of the Association for Computational Linguistics (ACL), 2025
  Jungseob Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim
  20 May 2025

Bielik v3 Small: Technical Report
  Krzysztof Ociepa, Łukasz Flis, Remigiusz Kinas, Krzysztof Wróbel, Adrian Gwoździej
  05 May 2025

HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
  Enes Özeren, Yihong Liu, Hinrich Schütze
  21 Apr 2025

SuperBPE: Space Travel for Language Models
  Alisa Liu, J. Hayase, Valentin Hofmann, Sewoong Oh, Noah A. Smith, Yejin Choi
  17 Mar 2025

Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
  International Conference on Learning Representations (ICLR), 2024
  HyoJung Han, Akiko Eriguchi, Haoran Xu, Hieu T. Hoang, Marine Carpuat, Huda Khayrallah
  12 Oct 2024

Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
  Alex Cloud, Jacob Goldman-Wetzler, Evžen Wybitul, Joseph Miller, Alexander Matt Turner
  06 Oct 2024

An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models
  Nandini Mundra, Aditya Nanda Kishore, Mary Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh Khapra
  08 Jul 2024

Zero-Shot Tokenizer Transfer
  Neural Information Processing Systems (NeurIPS), 2024
  Benjamin Minixhofer, Edoardo Ponti, Ivan Vulić
  13 May 2024

Bailong: Bilingual Transfer Learning based on QLoRA and Zip-tie Embedding
  Lung-Chuan Chen, Zong-Ru Li
  01 Apr 2024

Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models
  M. Alrefaie, Nour Eldin Morsy, Nada Samir
  17 Mar 2024

Transferring BERT Capabilities from High-Resource to Low-Resource Languages Using Vocabulary Matching
  Piotr Rybak
  22 Feb 2024

An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference
  Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras
  16 Feb 2024

OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
  Yihong Liu, Peiqin Lin, Mingyang Wang, Hinrich Schütze
  15 Nov 2023

Extrapolating Large Language Models to Non-English by Aligning Languages
  Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen, Lei Li
  09 Aug 2023

Distilling Efficient Language-Specific Models for Cross-Lingual Transfer
  Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  Alan Ansell, Edoardo Ponti, Anna Korhonen, Ivan Vulić
  02 Jun 2023

Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training
  Annual Meeting of the Association for Computational Linguistics (ACL), 2022
  Kelly Marchisio, Patrick Lewis, Yihong Chen, Mikel Artetxe
  20 Dec 2022

GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language Models at Almost No Cost
  International Joint Conference on Artificial Intelligence (IJCAI), 2022
  Qingcheng Zeng, Lucas Garay, Peilin Zhou, Dading Chong, Yining Hua, Jiageng Wu, Yi-Cheng Pan, Han Zhou, Rob Voigt, Jie Yang
  13 Nov 2022

Lifting the Curse of Multilinguality by Pre-training Modular Transformers
  North American Chapter of the Association for Computational Linguistics (NAACL), 2022
  Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe
  12 May 2022

Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models
  Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
  Terra Blevins, Luke Zettlemoyer
  17 Apr 2022

Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies
  Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
  Zhengxuan Wu, Alex Tamkin, Isabel Papadimitriou
  24 Feb 2022

WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
  Benjamin Minixhofer, Fabian Paischer, Navid Rekabsaz
  13 Dec 2021

On the Universality of Deep Contextual Language Models
  Shaily Bhatt, Poonam Goyal, Sandipan Dandapat, Monojit Choudhury, Sunayana Sitaram
  15 Sep 2021

Subword Mapping and Anchoring across Languages
  Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
  Giorgos Vernikos, Andrei Popescu-Belis
  09 Sep 2021

Cross-lingual Transferring of Pre-trained Contextualized Language Models
  Zuchao Li, Kevin Parnow, Hai Zhao, Zhuosheng Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita
  27 Jul 2021

Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
  Workshop on Representation Learning for NLP (RepL4NLP), 2021
  Zhengxuan Wu, Nelson F. Liu, Christopher Potts
  17 Apr 2021

Graph Convolutional Network for Swahili News Classification
  Alexandros Kastanos, Tyler Martin
  16 Mar 2021

What makes multilingual BERT multilingual?
  Chi-Liang Liu, Tsung-Yuan Hsu, Yung-Sung Chuang, Hung-yi Lee
  20 Oct 2020

A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT
  Chi-Liang Liu, Tsung-Yuan Hsu, Yung-Sung Chuang, Hung-yi Lee
  20 Apr 2020

A Primer in BERTology: What we know about how BERT works
  Transactions of the Association for Computational Linguistics (TACL), 2020
  Anna Rogers, Olga Kovaleva, Anna Rumshisky
  27 Feb 2020

On the Cross-lingual Transferability of Monolingual Representations
  Annual Meeting of the Association for Computational Linguistics (ACL), 2019
  Mikel Artetxe, Sebastian Ruder, Dani Yogatama
  25 Oct 2019