ResearchTrend.AI
CroissantLLM: A Truly Bilingual French-English Language Model


1 February 2024
Manuel Faysse
Patrick Fernandes
Nuno M. Guerreiro
António Loison
Duarte M. Alves
Caio Corro
Nicolas Boizard
Joao Alves
Ricardo Rei
Pedro H. Martins
Antoni Bigata Casademunt
François Yvon
André F.T. Martins
Gautier Viaud
Céline Hudelot
Pierre Colombo

Papers citing "CroissantLLM: A Truly Bilingual French-English Language Model"

31 / 31 papers shown
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes
Raúl Vázquez
Timothee Mickus
Elaine Zosa
Teemu Vahtola
Jörg Tiedemann
...
Liane Guillou
Ona de Gibert
Jaione Bengoetxea
Joseph Attieh
Marianna Apidianaki
HILM
VLM
LRM
74
0
0
16 Apr 2025
MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation
Weihao Xuan
Rui Yang
Heli Qi
Qingcheng Zeng
Yunze Xiao
...
Edison Marrese-Taylor
Shijian Lu
Yusuke Iwasawa
Yutaka Matsuo
Irene Z Li
ELM
54
3
0
13 Mar 2025
Multilingual Language Model Pretraining using Machine-translated Data
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
David Ifeoluwa Adelani
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
65
2
0
20 Feb 2025
Chain-of-MetaWriting: Linguistic and Textual Analysis of How Small Language Models Write Young Students Texts
Ioana Buhnila
Georgeta Cislaru
Amalia Todirascu
80
1
0
19 Dec 2024
Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto
Maartje ter Hoeve
He Bai
Natalie Schluter
David Grangier
71
0
0
20 Nov 2024
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
26
1
0
31 Oct 2024
Toxicity of the Commons: Curating Open-Source Pre-Training Data
Catherine Arnett
Eliot Jones
Ivan P. Yamshchikov
Pierre-Carl Langlais
21
2
0
29 Oct 2024
Analyzing Nobel Prize Literature with Large Language Models
Yang Zhenyuan
Liu Zhengliang
Zhang Jing
Lu Cen
Tai Jiaxin
...
Ge Bao
Zhang Wei
Qiang Ning
Zhang Tuo
Liu Tianming
18
3
0
22 Oct 2024
Optimizing Low-Resource Language Model Training: Comprehensive Analysis of Multi-Epoch, Multi-Lingual, and Two-Stage Approaches
Kosuke Akimoto
M. Oyamada
21
0
0
16 Oct 2024
Exploring the Meaningfulness of Nearest Neighbor Search in High-Dimensional Space
Zhonghan Chen
Ruiyuan Zhang
Xi Zhao
Xiaojun Cheng
Xiaofang Zhou
35
0
0
08 Oct 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
33
2
0
27 Aug 2024
Beyond English-Centric LLMs: What Language Do Multilingual Language Models Think in?
Chengzhi Zhong
Fei Cheng
Qianying Liu
Junfeng Jiang
Zhen Wan
Chenhui Chu
Yugo Murawaki
Sadao Kurohashi
LRM
26
11
0
20 Aug 2024
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo
T. Pires
Malik Boudiaf
Rui Melo
Dominic Culver
Sofia Morgado
Etienne Malaboeuf
Gabriel Hautreux
Johanne Charpentier
Michael Desa
ELM
AILaw
ALM
27
7
0
28 Jul 2024
LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
Yinquan Lu
Wenhao Zhu
Lei Li
Yu Qiao
Fei Yuan
34
24
0
08 Jul 2024
Reproducibility in Machine Learning-based Research: Overview, Barriers and Drivers
Harald Semmelrock
Tony Ross-Hellauer
Simone Kopeinik
Dieter Theiler
Armin Haberl
Stefan Thalmann
Dominik Kowald
42
5
0
20 Jun 2024
MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis
Mathieu Ciancone
Imene Kerboua
Marion Schaeffer
W. Siblini
31
2
0
30 May 2024
Mosaic Memory: Fuzzy Duplication in Copyright Traps for Large Language Models
Igor Shilov
Matthieu Meeus
Yves-Alexandre de Montjoye
26
3
0
24 May 2024
Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking
Emre Can Acikgoz
Mete Erdogan
Deniz Yuret
25
7
0
07 May 2024
The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments
Anton Schäfer
Shauli Ravfogel
Thomas Hofmann
Tiago Pimentel
Imanol Schlag
47
3
0
11 Apr 2024
Latxa: An Open Language Model and Evaluation Suite for Basque
Julen Etxaniz
Oscar Sainz
Naiara Pérez
Itziar Aldabe
German Rigau
Eneko Agirre
Aitor Ormazabal
Mikel Artetxe
A. Soroa
ELM
23
22
0
29 Mar 2024
Dated Data: Tracing Knowledge Cutoffs in Large Language Models
Jeffrey Cheng
Marc Marone
Orion Weller
Dawn J Lawrie
Daniel Khashabi
Benjamin Van Durme
36
2
0
19 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
23
63
0
06 Mar 2024
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Duarte M. Alves
José P. Pombal
Nuno M. Guerreiro
Pedro H. Martins
Joao Alves
...
Patrick Fernandes
Sweta Agrawal
Pierre Colombo
José G. C. de Souza
André F.T. Martins
LRM
29
128
0
27 Feb 2024
Copyright Traps for Large Language Models
Matthieu Meeus
Igor Shilov
Manuel Faysse
Yves-Alexandre de Montjoye
20
17
0
14 Feb 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
A. Ustun
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM
ELM
SyDa
LRM
24
192
0
12 Feb 2024
Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning
Duarte M. Alves
Nuno M. Guerreiro
Joao Alves
José P. Pombal
Ricardo Rei
José G. C. de Souza
Pierre Colombo
André F.T. Martins
37
47
0
20 Oct 2023
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task
Ricardo Rei
Marcos Vinícius Treviso
Nuno M. Guerreiro
Chrysoula Zerva
Ana C. Farinha
...
T. Glushkova
Duarte M. Alves
A. Lavie
Luísa Coheur
André F. T. Martins
50
137
0
13 Sep 2022
PAGnol: An Extra-Large French Generative Model
Julien Launay
E. L. Tommasone
B. Pannier
François Boniface
A. Chatelain
Alessandro Cappelli
Iacopo Poli
Djamé Seddah
AILaw
MoE
AI4CE
30
8
0
16 Oct 2021
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
69
235
0
31 Dec 2020
BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
Moussa Kamal Eddine
A. Tixier
Michalis Vazirgiannis
BDL
95
64
0
23 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020