arXiv:2212.06742
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
13 December 2022
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua-Hong Wu
Papers citing "ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages"
25 / 25 papers shown
Title
REDO: Execution-Free Runtime Error Detection for COding Agents
Shou Li
Andrey Kan
Laurent Callot
Bhavana Bhasker
Muhammad Shihab Rashid
Timothy B Esler
LRM
31
0
0
10 Oct 2024
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai
Haoran Sun
Huang Fang
Shuohuan Wang
Yu Sun
Hua-Hong Wu
88
1
0
03 Oct 2024
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer
Mingda Li
Abhijit Mishra
Utkarsh Mujumdar
34
0
0
19 Aug 2024
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data
Haoran Sun
Renren Jin
Shaoyang Xu
Leiyu Pan
Supryadi
...
Lei Yang
Ling Shi
Juesi Xiao
Shaolin Zhu
Deyi Xiong
55
0
0
12 Aug 2024
UniCoder: Scaling Code Large Language Model via Universal Code
Tao Sun
Linzheng Chai
Jian Yang
Yuwei Yin
Hongcheng Guo
Jiaheng Liu
Bing Wang
Liqun Yang
Zhoujun Li
OffRL
LRM
60
16
0
24 Jun 2024
Tokenization Falling Short: The Curse of Tokenization
Yekun Chai
Yewei Fang
Qiwei Peng
Xuhong Li
26
0
0
17 Jun 2024
A Survey on Large Language Models for Code Generation
Juyong Jiang
Fan Wang
Jiasi Shen
Sungju Kim
Sunghun Kim
40
158
0
01 Jun 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
37
62
0
25 Apr 2024
On Training Data Influence of GPT Models
Qingyi Liu
Yekun Chai
Shuohuan Wang
Yu Sun
Qiwei Peng
Keze Wang
Hua-Hong Wu
TDI
AI4CE
19
4
0
11 Apr 2024
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers
Libo Qin
Qiguang Chen
Yuhang Zhou
Zhi Chen
Yinghui Li
Lizi Liao
Min Li
Wanxiang Che
Philip S. Yu
LRM
47
36
0
07 Apr 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura
Mayank Mishra
Simone Tedeschi
Yekun Chai
Jason T Stillerman
...
Virendra Mehta
Matthew Blumberg
Victor May
Huu Nguyen
S. Pyysalo
LRM
21
7
0
30 Mar 2024
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
Indraneil Paul
Goran Glavas
Iryna Gurevych
35
12
0
06 Mar 2024
HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
Qiwei Peng
Yekun Chai
Xuhong Li
ELM
LM&MA
34
34
0
26 Feb 2024
Code Needs Comments: Enhancing Code LLMs with Comment Augmentation
Demin Song
Honglin Guo
Yunhua Zhou
Shuhao Xing
Yudong Wang
...
Wenwei Zhang
Qipeng Guo
Hang Yan
Xipeng Qiu
Dahua Lin
SyDa
50
8
0
20 Feb 2024
Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit
Yao Wan
Yang He
Zhangqian Bi
Jianguo Zhang
Hongyu Zhang
Yulei Sui
Guandong Xu
Hai Jin
Philip S. Yu
20
20
0
30 Dec 2023
Safurai-Csharp: Harnessing Synthetic Data to improve language-specific Code LLM
Davide Cifarelli
Leonardo Boiardi
Alessandro Puppo
Leon Jovanovic
SyDa
13
1
0
06 Nov 2023
Zero-Shot Detection of Machine-Generated Codes
Xianjun Yang
Kexun Zhang
Haifeng Chen
Linda R. Petzold
William Yang Wang
Wei Cheng
DeLMO
16
11
0
08 Oct 2023
Tool-Augmented Reward Modeling
Lei Li
Yekun Chai
Shuohuan Wang
Yu Sun
Hao Tian
Ningyu Zhang
Hua-Hong Wu
OffRL
38
13
0
02 Oct 2023
Safurai 001: New Qualitative Approach for Code LLM Evaluation
Davide Cifarelli
Leonardo Boiardi
Alessandro Puppo
ELM
19
0
0
20 Sep 2023
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback
Bo Shen
Jiaxin Zhang
Taihong Chen
Daoguang Zan
Bing Geng
...
Ailun Yu
Jichuan Ji
Jingyang Zhao
Yuenan Guo
Qianxiang Wang
ALM
ELM
25
73
0
27 Jul 2023
Large Language Models Meet NL2Code: A Survey
Daoguang Zan
B. Chen
Fengji Zhang
Di Lu
Bingchao Wu
Bei Guan
Yongji Wang
Jian-Guang Lou
ELM
ALM
26
167
0
19 Dec 2022
A Systematic Evaluation of Large Language Models of Code
Frank F. Xu
Uri Alon
Graham Neubig
Vincent J. Hellendoorn
ELM
ALM
202
628
0
26 Feb 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
210
1,485
0
02 Sep 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
196
853
0
09 Feb 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,724
0
26 Sep 2016