Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.03374
Cited By
Evaluating Large Language Models Trained on Code
7 July 2021
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
Jared Kaplan
Harrison Edwards
Yura Burda
Nicholas Joseph
Greg Brockman
Alex Ray
Raul Puri
Gretchen Krueger
Michael Petrov
Heidy Khlaaf
Girish Sastry
Pamela Mishkin
Brooke Chan
Scott Gray
Nick Ryder
Mikhail Pavlov
Alethea Power
Lukasz Kaiser
Mohammad Bavarian
Clemens Winter
Philippe Tillet
F. Such
D. Cummings
Matthias Plappert
Fotios Chantzis
Elizabeth Barnes
Ariel Herbert-Voss
William H. Guss
Alex Nichol
Alex Paino
Nikolas Tezak
Jie Tang
Igor Babuschkin
S. Balaji
Shantanu Jain
William Saunders
Christopher Hesse
A. Carr
Jan Leike
Joshua Achiam
Vedant Misra
Evan Morikawa
Alec Radford
Matthew Knight
Miles Brundage
Mira Murati
Katie Mayer
Peter Welinder
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Evaluating Large Language Models Trained on Code"
50 / 868 papers shown
Title
Towards Reasoning in Large Language Models: A Survey
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
27
580
0
20 Dec 2022
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Yu Gu
Xiang Deng
Yu-Chuan Su
LLMAG
28
52
0
19 Dec 2022
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM
ELM
LRM
56
310
0
19 Dec 2022
Natural Language to Code Generation in Interactive Data Science Notebooks
Pengcheng Yin
Wen-Ding Li
Kefan Xiao
Abhishek Rao
Yeming Wen
...
Paige Bailey
Michele Catasta
Henryk Michalewski
Oleksandr Polozov
Charles Sutton
25
56
0
19 Dec 2022
Emergent Analogical Reasoning in Large Language Models
Taylor W. Webb
K. Holyoak
Hongjing Lu
ReLM
ELM
LRM
AI4CE
38
291
0
19 Dec 2022
JEMMA: An Extensible Java Dataset for ML4Code Applications
Anjan Karmakar
Miltiadis Allamanis
Romain Robbes
VLM
21
3
0
18 Dec 2022
Chatbots in a Botnet World
Forrest McKee
David A. Noever
13
26
0
18 Dec 2022
Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen
Yuhao Zhang
Lan Liu
Rui Dong
Xinchi Chen
Patrick K. L. Ng
William Yang Wang
Zhiheng Huang
AI4CE
24
4
0
17 Dec 2022
Plansformer: Generating Symbolic Plans using Transformers
Vishal Pallagani
Bharath Muppasani
K. Murugesan
F. Rossi
L. Horesh
Biplav Srivastava
F. Fabiano
Andrea Loreggia
LM&Ro
LLMAG
OffRL
15
35
0
16 Dec 2022
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua-Hong Wu
24
35
0
13 Dec 2022
Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
Zhivar Sourati
Vishnu Priya Prasanna Venkatesh
D. Deshpande
Himanshu Rawlani
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
AAML
31
20
0
12 Dec 2022
A Survey on Natural Language Processing for Programming
Qingfu Zhu
Xianzhen Luo
Fang Liu
Cuiyun Gao
Wanxiang Che
23
1
0
12 Dec 2022
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Ferjad Naeem
Muhammad Gul Zain Ali Khan
Yongqin Xian
Muhammad Zeshan Afzal
D. Stricker
Luc Van Gool
F. Tombari
VLM
27
51
0
05 Dec 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
27
11
0
30 Nov 2022
Coder Reviewer Reranking for Code Generation
Tianyi Zhang
Tao Yu
Tatsunori B. Hashimoto
M. Lewis
Wen-tau Yih
Daniel Fried
Sida I. Wang
33
92
0
29 Nov 2022
Deep-Learning-based Vulnerability Detection in Binary Executables
A. Schaad
Dominik Binder
22
8
0
25 Nov 2022
GitHub Considered Harmful? Analyzing Open-Source Projects for the Automatic Generation of Cryptographic API Call Sequences
Catherine Tony
Nicolás E. Díaz Ferreyra
Riccardo Scandariato
11
4
0
24 Nov 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen
Xueguang Ma
Xinyi Wang
William W. Cohen
ReLM
ReCod
LRM
61
732
0
22 Nov 2022
Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task
Shangda Wu
Maosong Sun
17
20
0
21 Nov 2022
A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation
Rose Hirigoyen
Amal Zouaq
Samuel Reyd
21
4
0
18 Nov 2022
Execution-based Evaluation for Data Science Code Generation Models
Junjie Huang
Chenglong Wang
Jipeng Zhang
Cong Yan
Haotian Cui
J. Inala
Colin B. Clement
Nan Duan
Jianfeng Gao
ELM
30
35
0
17 Nov 2022
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
22
0
0
16 Nov 2022
On the Compositional Generalization Gap of In-Context Learning
Arian Hosseini
Ankit Vani
Dzmitry Bahdanau
Alessandro Sordoni
Aaron C. Courville
19
24
0
15 Nov 2022
Evaluating How Fine-tuning on Bimodal Data Effects Code Generation
Gabriel Orlanski
Seonhye Yang
Michael Healy
ALM
21
5
0
15 Nov 2022
Logical Tasks for Measuring Extrapolation and Rule Comprehension
Ippei Fujisawa
Ryota Kanai
ELM
LRM
20
4
0
14 Nov 2022
Calibrated Interpretation: Confidence Estimation in Semantic Parsing
Elias Stengel-Eskin
Benjamin Van Durme
UQLM
37
24
0
14 Nov 2022
Metaphors We Learn By
Roland Memisevic
19
0
0
11 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
101
2,306
0
09 Nov 2022
Do Users Write More Insecure Code with AI Assistants?
Neil Perry
Megha Srivastava
Deepak Kumar
Dan Boneh
ELM
AAML
17
166
0
07 Nov 2022
Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book
Stephen MacNeil
Andrew Tran
Arto Hellas
Joanne Kim
Sami Sarsa
Paul Denny
Seth Bernstein
Juho Leinonen
33
176
0
04 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM
LLMAG
16
829
0
03 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
13
79
0
02 Nov 2022
Emergent Linguistic Structures in Neural Networks are Fragile
Emanuele La Malfa
Matthew Wicker
Marta Kiatkowska
15
1
0
31 Oct 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
10
7
0
31 Oct 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
31
51
0
30 Oct 2022
Multi-Viewpoint and Multi-Evaluation with Felicitous Inductive Bias Boost Machine Abstract Reasoning Ability
Qinglai Wei
Diancheng Chen
Beiming Yuan
32
10
0
26 Oct 2022
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
M. Acher
D. Khelladi
B. Combemale
34
33
0
26 Oct 2022
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
Hussein Mozannar
Gagan Bansal
Adam Fourney
Eric Horvitz
49
109
0
25 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
33
2
0
24 Oct 2022
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar
Arkil Patel
Navin Goyal
ViT
30
10
0
23 Oct 2022
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
65
157
0
21 Oct 2022
Graphically Structured Diffusion Models
Christian Weilbach
William Harvey
Frank D. Wood
DiffM
32
7
0
20 Oct 2022
ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications
Alex Gu
Tamara Mitrovska
D. Vélez
Jacob Andreas
Armando Solar-Lezama
SyDa
25
1
0
20 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
62
2,987
0
20 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
34
155
0
19 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
74
996
0
17 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
3DV
39
9
0
14 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
32
107
0
13 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
23
30
0
08 Oct 2022
Automatic Chain of Thought Prompting in Large Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
47
575
0
07 Oct 2022
Previous
1
2
3
...
14
15
16
17
18
Next