ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2112.06598
  4. Cited By
WECHSEL: Effective initialization of subword embeddings for
  cross-lingual transfer of monolingual language models

WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models

13 December 2021
Benjamin Minixhofer
Fabian Paischer
Navid Rekabsaz
ArXivPDFHTML

Papers citing "WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models"

50 / 51 papers shown
Title
Bielik v3 Small: Technical Report
Bielik v3 Small: Technical Report
Krzysztof Ociepa
Łukasz Flis
Remigiusz Kinas
Krzysztof Wróbel
Adrian Gwoździej
25
0
0
05 May 2025
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
Luca Moroni
Giovanni Puccetti
Pere-Lluís Huguet Cabot
Andrei Stefan Bejgu
Edoardo Barba
Alessio Miaschi
F. Dell’Orletta
Andrea Esuli
Roberto Navigli
22
0
0
23 Apr 2025
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
Enes Özeren
Yihong Liu
Hinrich Schütze
28
0
0
21 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
21
0
0
08 Apr 2025
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer
Ivan Vulić
E. Ponti
59
0
0
25 Mar 2025
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Fajri Koto
Rituraj Joshi
Nurdaulet Mukhituly
Y. Wang
Zhuohan Xie
...
Avraham Sheinin
Natalia Vassilieva
Neha Sengupta
Larry Murray
Preslav Nakov
ALM
KELM
38
0
0
03 Mar 2025
Efficient Continual Pre-training of LLMs for Low-resource Languages
Efficient Continual Pre-training of LLMs for Low-resource Languages
Arijit Nag
Soumen Chakrabarti
Animesh Mukherjee
Niloy Ganguly
67
0
0
13 Dec 2024
A Practical Guide to Fine-tuning Language Models with Limited Data
A Practical Guide to Fine-tuning Language Models with Limited Data
Márton Szép
Daniel Rueckert
Rüdiger von Eisenhart-Rothe
Florian Hinterwimmer
SyDa
ALM
42
2
0
14 Nov 2024
The Zeno's Paradox of `Low-Resource' Languages
The Zeno's Paradox of `Low-Resource' Languages
H. Nigatu
A. Tonja
Benjamin Rosman
Thamar Solorio
Monojit Choudhury
36
5
0
28 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
32
2
0
12 Oct 2024
Generative Model for Less-Resourced Language with 1 billion parameters
Generative Model for Less-Resourced Language with 1 billion parameters
Domen Vreš
Martin Božič
Aljaž Potočnik
Tomaž Martinčič
Marko Robnik-Šikonja
11
1
0
09 Oct 2024
Language Adaptation on a Tight Academic Compute Budget: Tokenizer
  Swapping Works and Pure bfloat16 Is Enough
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
Konstantin Dobler
Gerard de Melo
35
1
0
28 Aug 2024
Beyond English-Centric LLMs: What Language Do Multilingual Language
  Models Think in?
Beyond English-Centric LLMs: What Language Do Multilingual Language Models Think in?
Chengzhi Zhong
Fei Cheng
Qianying Liu
Junfeng Jiang
Zhen Wan
Chenhui Chu
Yugo Murawaki
Sadao Kurohashi
LRM
34
11
0
20 Aug 2024
Modular Sentence Encoders: Separating Language Specialization from
  Cross-Lingual Alignment
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang
Kexin Wang
Goran Glavavs
Iryna Gurevych
44
0
0
20 Jul 2024
On Initializing Transformers with Pre-trained Embeddings
On Initializing Transformers with Pre-trained Embeddings
Ha Young Kim
Niranjan Balasubramanian
Byungkon Kang
19
0
0
17 Jul 2024
Bilingual Adaptation of Monolingual Foundation Models
Bilingual Adaptation of Monolingual Foundation Models
Gurpreet Gosal
Yishi Xu
Gokul Ramakrishnan
Rituraj Joshi
Avraham Sheinin
...
Rahul Pal
Parvez Mullah
Soundar Doraiswamy
Mohamed El Karim Chami
Preslav Nakov
CLL
21
2
0
13 Jul 2024
An Empirical Comparison of Vocabulary Expansion and Initialization
  Approaches for Language Models
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models
Nandini Mundra
Aditya Nanda Kishore
Raj Dabre
Ratish Puduppully
Anoop Kunchukuttan
Mitesh Khapra
22
3
0
08 Jul 2024
Large Vocabulary Size Improves Large Language Models
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
24
3
0
24 Jun 2024
Exploring Design Choices for Building Language-Specific LLMs
Exploring Design Choices for Building Language-Specific LLMs
Atula Tejaswi
Nilesh Gupta
Eunsol Choi
22
3
0
20 Jun 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for
  Low-Resource Languages
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
Trinh Pham
Khoi M. Le
Luu Anh Tuan
16
1
0
14 Jun 2024
Targeted Multilingual Adaptation for Low-resource Language Families
Targeted Multilingual Adaptation for Low-resource Language Families
C.M. Downey
Terra Blevins
Dhwani Serai
Dwija Parikh
Shane Steinert-Threlkeld
21
0
0
20 May 2024
Zero-Shot Tokenizer Transfer
Zero-Shot Tokenizer Transfer
Benjamin Minixhofer
E. Ponti
Ivan Vulić
VLM
33
8
0
13 May 2024
Setting up the Data Printer with Improved English to Ukrainian Machine
  Translation
Setting up the Data Printer with Improved English to Ukrainian Machine Translation
Yurii Paniv
Dmytro Chaplynskyi
Nikita Trynus
Volodymyr Kyrylov
AI4CE
31
2
0
23 Apr 2024
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for
  Angolan Language Model
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica
David Ifeoluwa Adelani
16
0
0
03 Apr 2024
Comparing Explanation Faithfulness between Multilingual and Monolingual
  Fine-tuned Language Models
Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models
Zhixue Zhao
Nikolaos Aletras
18
3
0
19 Mar 2024
An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient
  Language Model Inference
An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference
Atsuki Yamaguchi
Aline Villavicencio
Nikolaos Aletras
13
6
0
16 Feb 2024
German Text Simplification: Finetuning Large Language Models with
  Semi-Synthetic Data
German Text Simplification: Finetuning Large Language Models with Semi-Synthetic Data
Lars Klöser
Mika Beele
Jan-Niklas Schagen
Bodo Kraft
17
1
0
16 Feb 2024
RomanSetu: Efficiently unlocking multilingual capabilities of Large
  Language Models via Romanization
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization
Jaavid Aktar Husain
Raj Dabre
Aswanth Kumar
Jay Gala
Thanmay Jayakumar
Ratish Puduppully
Anoop Kunchukuttan
14
11
0
25 Jan 2024
PHOENIX: Open-Source Language Adaption for Direct Preference
  Optimization
PHOENIX: Open-Source Language Adaption for Direct Preference Optimization
Matthias Uhlig
Sigurd Schacht
Sudarshan Kamath Barkur
ALM
6
1
0
19 Jan 2024
When a Language Question Is at Stake. A Revisited Approach to Label
  Sensitive Content
When a Language Question Is at Stake. A Revisited Approach to Label Sensitive Content
Stetsenko Daria
13
1
0
17 Nov 2023
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient
  Large-scale Multilingual Continued Pretraining
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Yihong Liu
Peiqin Lin
Mingyang Wang
Hinrich Schütze
11
21
0
15 Nov 2023
Architectural Sweet Spots for Modeling Human Label Variation by the
  Example of Argument Quality: It's Best to Relate Perspectives!
Architectural Sweet Spots for Modeling Human Label Variation by the Example of Argument Quality: It's Best to Relate Perspectives!
Philipp Heinisch
Matthias Orlikowski
Julia Romberg
Philipp Cimiano
8
2
0
06 Nov 2023
RedPenNet for Grammatical Error Correction: Outputs to Tokens,
  Attentions to Spans
RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans
Bohdan Didenko
Andrii Sameliuk
16
4
0
19 Sep 2023
Embedding structure matters: Comparing methods to adapt multilingual
  vocabularies to new languages
Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages
C.M. Downey
Terra Blevins
Nora Goldfine
Shane Steinert-Threlkeld
17
8
0
09 Sep 2023
Linear Alignment of Vision-language Models for Image Captioning
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP
VLM
35
0
0
10 Jul 2023
Distilling Efficient Language-Specific Models for Cross-Lingual Transfer
Distilling Efficient Language-Specific Models for Cross-Lingual Transfer
Alan Ansell
E. Ponti
Anna Korhonen
Ivan Vulić
17
4
0
02 Jun 2023
The Grammar and Syntax Based Corpus Analysis Tool For The Ukrainian
  Language
The Grammar and Syntax Based Corpus Analysis Tool For The Ukrainian Language
Daria Stetsenko
Inez Okulska
13
1
0
22 May 2023
Language Models for German Text Simplification: Overcoming Parallel Data
  Scarcity through Style-specific Pre-training
Language Models for German Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training
Miriam Anschütz
Joshua Oehms
Thomas Wimmer
Bartlomiej Jezierski
Georg Groh
11
21
0
22 May 2023
CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine
  Translation for Extremely Low-resource Languages
CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages
Kaushal Kumar Maurya
Rahul Kejriwal
M. Desarkar
Anoop Kunchukuttan
23
1
0
09 May 2023
Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models
Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models
Junmo Kang
Wei-ping Xu
Alan Ritter
30
15
0
02 May 2023
Transfer to a Low-Resource Language via Close Relatives: The Case Study
  on Faroese
Transfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese
Vésteinn Snaebjarnarson
A. Simonsen
Goran Glavavs
Ivan Vulić
21
19
0
18 Apr 2023
Team QUST at SemEval-2023 Task 3: A Comprehensive Study of Monolingual
  and Multilingual Approaches for Detecting Online News Genre, Framing and
  Persuasion Techniques
Team QUST at SemEval-2023 Task 3: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting Online News Genre, Framing and Persuasion Techniques
Ye Jiang
14
9
0
09 Apr 2023
Efficient Language Model Training through Cross-Lingual and Progressive
  Transfer Learning
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
Malte Ostendorff
Georg Rehm
CLIP
VLM
CLL
22
14
0
23 Jan 2023
GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language
  Models at Almost No Cost
GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language Models at Almost No Cost
Qingcheng Zeng
Lucas Garay
Peilin Zhou
Dading Chong
Yining Hua
Jiageng Wu
Yi-Cheng Pan
Han Zhou
Rob Voigt
Jie Yang
VLM
16
22
0
13 Nov 2022
Extractive Question Answering on Queries in Hindi and Tamil
Extractive Question Answering on Queries in Hindi and Tamil
Adhitya Thirumala
Elisa Ferracane
14
2
0
27 Sep 2022
Automatic Readability Assessment of German Sentences with Transformer
  Ensembles
Automatic Readability Assessment of German Sentences with Transformer Ensembles
Patrick Gustav Blaneck
Tobias Bornheim
Niklas Grieger
Stephan Bialonski
32
10
0
09 Sep 2022
History Compression via Language Models in Reinforcement Learning
History Compression via Language Models in Reinforcement Learning
Fabian Paischer
Thomas Adler
Vihang Patil
Angela Bitto-Nemling
Markus Holzleitner
Sebastian Lehner
Hamid Eghbalzadeh
Sepp Hochreiter
OffRL
AI4TS
11
42
0
24 May 2022
Cross-lingual Lifelong Learning
Cross-lingual Lifelong Learning
Meryem M'hamdi
Xiang Ren
Jonathan May
CLL
32
7
0
23 May 2022
How Good is Your Tokenizer? On the Monolingual Performance of
  Multilingual Language Models
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
69
235
0
31 Dec 2020
What the [MASK]? Making Sense of Language-Specific BERT Models
What the [MASK]? Making Sense of Language-Specific BERT Models
Debora Nozza
Federico Bianchi
Dirk Hovy
77
105
0
05 Mar 2020
12
Next