ResearchTrend.AI
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages

11 June 2020
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot

Papers citing "A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages"

34 / 34 papers shown
Lazy But Effective: Collaborative Personalized Federated Learning with Heterogeneous Data
Ljubomir Rokvic
Panayiotis Danassis
Boi Faltings
FedML
35
0
0
05 May 2025
TigerLLM -- A Family of Bangla Large Language Models
Nishat Raihan
Marcos Zampieri
48
0
0
14 Mar 2025
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
Layba Fiaz
Munief Hassan Tahir
Sana Shams
Sarmad Hussain
49
0
0
24 Feb 2025
Exploring Translation Mechanism of Large Language Models
Hongbin Zhang
Kehai Chen
Xuefeng Bai
Xiucheng Li
Yang Xiang
Min Zhang
59
1
0
17 Feb 2025
Data Processing for the OpenGPT-X Model Family
Nicolo' Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
79
2
0
11 Oct 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models
Nandini Mundra
Aditya Nanda Kishore
Raj Dabre
Ratish Puduppully
Anoop Kunchukuttan
Mitesh Khapra
30
3
0
08 Jul 2024
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding
Ahmad Idrissi-Yaghir
Amin Dada
Henning Schafer
Kamyar Arzideh
Giulia Baldini
...
Peter A. Horn
Christin Seifert
F. Nensa
Jens Kleesiek
Christoph M. Friedrich
AI4MH
29
2
0
08 Apr 2024
Training a Bilingual Language Model by Mapping Tokens onto a Shared Character Space
Aviad Rom
Kfir Bar
24
1
0
25 Feb 2024
RoBERTurk: Adjusting RoBERTa for Turkish
Nuri Tas
17
1
0
07 Jan 2024
Unsupervised Paraphrasing of Multiword Expressions
Takashi Wada
Yuji Matsumoto
Timothy Baldwin
Jey Han Lau
24
0
0
02 Jun 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
27
13
0
22 May 2023
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Dũng Nguyễn Mạnh
Nam Le Hai
An Dau
A. Nguyen
Khanh N. Nghiem
Jingnan Guo
Nghi D. Q. Bui
26
13
0
09 May 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
28
40
0
07 Apr 2023
FairDistillation: Mitigating Stereotyping in Language Models
Pieter Delobelle
Bettina Berendt
20
8
0
10 Jul 2022
You Are What You Write: Preserving Privacy in the Era of Large Language Models
Richard Plant
V. Giuffrida
Dimitra Gkatzia
PILM
17
19
0
20 Apr 2022
Breaking Character: Are Subwords Good Enough for MRLs After All?
Omri Keren
Tal Avinari
Reut Tsarfaty
Omer Levy
28
15
0
10 Apr 2022
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus
Julien Abadji
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
CLL
34
153
0
17 Jan 2022
IndoNLI: A Natural Language Inference Dataset for Indonesian
Rahmad Mahendra
Alham Fikri Aji
Samuel Louvan
Fahrurrozi Rahman
Clara Vania
24
29
0
27 Oct 2021
MFAQ: a Multilingual FAQ Dataset
Maxime De Bruyn
Ehsan Lotfi
Jeska Buhmann
Walter Daelemans
RALM
42
21
0
27 Sep 2021
ParaShoot: A Hebrew Question Answering Dataset
Omri Keren
Omer Levy
29
17
0
23 Sep 2021
Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models
C. Carrino
Jordi Armengol-Estapé
Ona de Gibert Bonet
Asier Gutiérrez-Fandiño
Aitor Gonzalez-Agirre
Martin Krallinger
Marta Villegas
8
20
0
16 Sep 2021
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Haoran Xu
Benjamin Van Durme
Kenton W. Murray
42
57
0
09 Sep 2021
PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining
Machel Reid
Mikel Artetxe
VLM
42
26
0
04 Aug 2021
Machine Translation into Low-resource Language Varieties
Sachin Kumar
Antonios Anastasopoulos
S. Wintner
Yulia Tsvetkov
11
29
0
12 Jun 2021
Bertinho: Galician BERT Representations
David Vilares
Marcos Garcia
Carlos Gómez-Rodríguez
57
22
0
25 Mar 2021
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability
Wei-Tsung Kao
Hung-yi Lee
16
16
0
12 Mar 2021
The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models
Go Inoue
Bashar Alhafni
Nurpeiis Baimukan
Houda Bouamor
Nizar Habash
35
223
0
11 Mar 2021
Pre-Training BERT on Arabic Tweets: Practical Considerations
Ahmed Abdelali
Sabit Hassan
Hamdy Mubarak
Kareem Darwish
Younes Samih
20
96
0
21 Feb 2021
AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Wissam Antoun
Fady Baly
Hazem M. Hajj
VLM
19
103
0
31 Dec 2020
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
17
102
0
31 Dec 2020
Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages
Kushal Kumar Jain
Adwait Deshpande
Kumar Shridhar
F. Laumann
Ayushman Dash
43
51
0
04 Nov 2020
Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of claims using transformer-based models
Evan Williams
Paul Rodrigues
Valerie Novak
34
42
0
05 Sep 2020
KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media
Ali Safaya
Moutasem Abdullatif
Deniz Yuret
31
314
0
26 Jul 2020
CoVoST 2 and Massively Multilingual Speech-to-Text Translation
Changhan Wang
Anne Wu
J. Pino
SLR
19
71
0
20 Jul 2020