As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages

10 December 2020
Wietse de Vries
Malvina Nissim

Papers citing "As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages"

13 / 13 papers shown
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
Shaurya Sharthak
Vinayak Pahalwan
Adithya Kamath
Adarsh Shirawalmath
CLL
VLM
45
0
0
14 May 2025
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
Enes Özeren
Yihong Liu
Hinrich Schütze
31
0
0
21 Apr 2025
Facilitating large language model Russian adaptation with Learned Embedding Propagation
Mikhail Tikhomirov
D. Chernyshev
38
1
0
31 Dec 2024
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM
Zhewen Shen
Aditya Joshi
Ruey-Cheng Chen
CLL
49
2
0
17 Jun 2024
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
Malte Ostendorff
Georg Rehm
CLIP
VLM
CLL
41
23
0
23 Jan 2023
Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve
Giannis Daras
Negin Raoof
Zoi Gkalitsiou
A. Dimakis
21
2
0
20 Oct 2022
Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment
Tuan Dinh
Jy-yong Sohn
Shashank Rajput
Timothy Ossowski
Yifei Ming
Junjie Hu
Dimitris Papailiopoulos
Kangwook Lee
25
0
0
23 May 2022
WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
Benjamin Minixhofer
Fabian Paischer
Navid Rekabsaz
24
73
0
13 Dec 2021
Character-level HyperNetworks for Hate Speech Detection
Tomer Wullach
A. Adler
Einat Minkov
16
12
0
11 Nov 2021
Cross-lingual Transfer of Monolingual Models
Evangelia Gogoulou
Ariel Ekgren
T. Isbister
Magnus Sahlgren
29
17
0
15 Sep 2021
Subword Mapping and Anchoring across Languages
Giorgos Vernikos
Andrei Popescu-Belis
70
12
0
09 Sep 2021
What the [MASK]? Making Sense of Language-Specific BERT Models
Debora Nozza
Federico Bianchi
Dirk Hovy
84
105
0
05 Mar 2020
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
183
1,635
0
11 Oct 2017