CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
Ambrosio Blanco
Colin B. Clement
Dawn Drain
Daxin Jiang
Duyu Tang
Ge Li
Lidong Zhou
Linjun Shou
Long Zhou
Michele Tufano
Ming Gong
Ming Zhou
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu

Abstract
Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems (BERT-style, GPT-style, and Encoder-Decoder models) to make it easy for researchers to use the platform. The availability of such data and baselines can support the development and validation of new methods that can be applied to various program understanding and generation problems.