ResearchTrend.AI
Evaluating Large Language Models Trained on Code

7 July 2021
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
Jared Kaplan
Harrison Edwards
Yura Burda
Nicholas Joseph
Greg Brockman
Alex Ray
Raul Puri
Gretchen Krueger
Michael Petrov
Heidy Khlaaf
Girish Sastry
Pamela Mishkin
Brooke Chan
Scott Gray
Nick Ryder
Mikhail Pavlov
Alethea Power
Lukasz Kaiser
Mohammad Bavarian
Clemens Winter
Philippe Tillet
F. Such
D. Cummings
Matthias Plappert
Fotios Chantzis
Elizabeth Barnes
Ariel Herbert-Voss
William H. Guss
Alex Nichol
Alex Paino
Nikolas Tezak
Jie Tang
Igor Babuschkin
S. Balaji
Shantanu Jain
William Saunders
Christopher Hesse
A. Carr
Jan Leike
Joshua Achiam
Vedant Misra
Evan Morikawa
Alec Radford
Matthew Knight
Miles Brundage
Mira Murati
Katie Mayer
Peter Welinder
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
    ELM
    ALM

Papers citing "Evaluating Large Language Models Trained on Code"

50 / 893 papers shown
Title
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Dũng Nguyễn Mạnh
Nam Le Hai
An Dau
A. Nguyen
Khanh N. Nghiem
Jingnan Guo
Nghi D. Q. Bui
26
15
0
09 May 2023
Distilling Script Knowledge from Large Language Models for Constrained Language Planning
Siyu Yuan
Jiangjie Chen
Ziquan Fu
Xuyang Ge
Soham Shah
C. R. Jankowski
Yanghua Xiao
Deqing Yang
43
47
0
09 May 2023
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification
Anastasiia Grishina
Max Hort
Leon Moonen
22
6
0
08 May 2023
Augmented Large Language Models with Parametric Knowledge Guiding
Ziyang Luo
Can Xu
Pu Zhao
Xiubo Geng
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
KELM
RALM
37
44
0
08 May 2023
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs
Jinyang Li
Binyuan Hui
Ge Qu
Jiaxi Yang
Binhua Li
...
Guoliang Li
Kevin C. C. Chang
Fei Huang
Reynold Cheng
Yongbin Li
LMTD
36
356
0
04 May 2023
Beyond Prompts: Exploring the Design Space of Mixed-Initiative Co-Creativity Systems
Zhiyu Lin
Upol Ehsan
Rohan Agarwal
Samihan Dani
Vidushi Vashishth
Mark O. Riedl
18
20
0
03 May 2023
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
149
164
0
03 May 2023
Semantic Compression With Large Language Models
Henry Gilbert
Michael Sandborn
Douglas C. Schmidt
Jesse Spencer-Smith
Jules White
14
22
0
25 Apr 2023
Emergent and Predictable Memorization in Large Language Models
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
24
116
0
21 Apr 2023
Smart Learning to Find Dumb Contracts (Extended Version)
Tamer Abdelaziz
Aquinas Hobor
18
5
0
21 Apr 2023
Progressive-Hint Prompting Improves Reasoning in Large Language Models
Chuanyang Zheng
Zhengying Liu
Enze Xie
Zhenguo Li
Yu Li
LLMAG
ReLM
LRM
32
102
0
19 Apr 2023
A Comprehensive Evaluation of Neural SPARQL Query Generation from Natural Language Questions
Papa Abdou Karim Karou Diallo
Samuel Reyd
Amal Zouaq
11
6
0
16 Apr 2023
Stochastic Code Generation
Swapnil Sharma
Nikita Anand
V. KranthiKiranG.
SyDa
22
0
0
14 Apr 2023
Evaluation of ChatGPT Model for Vulnerability Detection
Anton Cheshkov
Pavel Zadorozhny
Rodion Levichev
13
68
0
12 Apr 2023
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Haoran Li
Dadi Guo
Wei Fan
Mingshi Xu
Jie Huang
Fanpu Meng
Yangqiu Song
SILM
47
321
0
11 Apr 2023
Bayesian Optimization of Catalysis With In-Context Learning
M. C. Ramos
Shane S. Michtavy
Marc D. Porosoff
Andrew D. White
BDL
44
30
0
11 Apr 2023
Scallop: A Language for Neurosymbolic Programming
Ziyang Li
Jiani Huang
Mayur Naik
ReLM
LRM
NAI
21
30
0
10 Apr 2023
Automated Reading Passage Generation with OpenAI's Large Language Model
Ummugul Bezirhan
M. Davier
AI4Ed
19
23
0
10 Apr 2023
Comparing Code Explanations Created by Students and Large Language Models
Juho Leinonen
Paul Denny
Stephen MacNeil
Sami Sarsa
Seth Bernstein
Joanne Kim
Andrew Tran
Arto Hellas
LRM
33
146
0
08 Apr 2023
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions
Sarah Fakhoury
Saikat Chakraborty
Madan Musuvathi
Shuvendu K. Lahiri
38
21
0
07 Apr 2023
"It's Weird That it Knows What I Want": Usability and Interactions with Copilot for Novice Programmers
James Prather
B. Reeves
Paul Denny
Brett A. Becker
Juho Leinonen
Andrew Luxton-Reilly
Garrett B. Powell
James Finnie-Ansley
E. Santos
31
131
0
05 Apr 2023
Document-Level Machine Translation with Large Language Models
Longyue Wang
Chenyang Lyu
Tianbo Ji
Zhirui Zhang
Dian Yu
Shuming Shi
Zhaopeng Tu
ELM
21
115
0
05 Apr 2023
Scientists' Perspectives on the Potential for Generative AI in their Fields
Meredith Ringel Morris
AI4CE
25
38
0
04 Apr 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELM
ALM
SyDa
52
316
0
30 Mar 2023
Did You Mean...? Confidence-based Trade-offs in Semantic Parsing
Elias Stengel-Eskin
Benjamin Van Durme
18
5
0
29 Mar 2023
On Codex Prompt Engineering for OCL Generation: An Empirical Study
Seif Abukhalaf
Mohammad Hamdaqa
Foutse Khomh
34
21
0
28 Mar 2023
Can Large Language Models assist in Hazard Analysis?
Simon Diemert
J. Weber
ELM
11
15
0
25 Mar 2023
Towards Understanding the Generalization of Medical Text-to-SQL Models and Datasets
Richard Tarbell
Kim-Kwang Raymond Choo
Glenn Dietrich
Anthony Rios
15
9
0
22 Mar 2023
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation
Fengji Zhang
B. Chen
Yue Zhang
Jacky Keung
Jin Liu
Daoguang Zan
Yi Mao
Jian-Guang Lou
Weizhu Chen
25
219
0
22 Mar 2023
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Junjie Ye
Xuanting Chen
Nuo Xu
Can Zu
Zekai Shao
...
Jie Zhou
Siming Chen
Tao Gui
Qi Zhang
Xuanjing Huang
ELM
24
308
0
18 Mar 2023
Generate, Transform, Answer: Question Specific Tool Synthesis for Tabular Data
Carlos Gemmell
Jeffrey Stephen Dalton
LMTD
27
13
0
17 Mar 2023
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
Tyna Eloundou
Sam Manning
Pamela Mishkin
Daniel Rock
ELM
30
380
0
17 Mar 2023
ART: Automatic multi-step reasoning and tool-use for large language models
Bhargavi Paranjape
Scott M. Lundberg
Sameer Singh
Hannaneh Hajishirzi
Luke Zettlemoyer
Marco Tulio Ribeiro
KELM
ReLM
LRM
21
140
0
16 Mar 2023
Mirror: A Natural Language Interface for Data Querying, Summarization, and Visualization
Canwen Xu
Julian McAuley
Penghan Wang
VLM
24
4
0
15 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
45
431
0
14 Mar 2023
How Many Demonstrations Do You Need for In-context Learning?
Jiuhai Chen
Lichang Chen
Chen Zhu
Tianyi Zhou
LRM
21
39
0
14 Mar 2023
Exploring ChatGPT's Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
Yunjie Ji
Yan Gong
Yiping Peng
Chao Ni
Peiyan Sun
Dongyu Pan
Baochang Ma
Xiangang Li
ELM
ALM
AI4MH
22
37
0
14 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
29
506
0
07 Mar 2023
ADELT: Transpilation Between Deep Learning Frameworks
Linyuan Gong
Jiayi Wang
Alvin Cheung
30
3
0
07 Mar 2023
Active Prompting with Chain-of-Thought for Large Language Models
Shizhe Diao
Pengcheng Wang
Yong Lin
Tong Zhang
ReLM
KELM
LLMAG
LRM
29
119
0
23 Feb 2023
Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead
S. Parthasarathi
Lu Zeng
Dilek Z. Hakkani-Tür
34
2
0
21 Feb 2023
Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints
Albert Lu
Hongxin Zhang
Yanzhe Zhang
Xuezhi Wang
Diyi Yang
LRM
29
28
0
17 Feb 2023
PAC Prediction Sets for Large Language Models of Code
Adam Khakhar
Stephen Mell
Osbert Bastani
20
6
0
17 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
42
194
0
16 Feb 2023
Conversational AI-Powered Design: ChatGPT as Designer, User, and Product
A. Kocaballi
19
38
0
15 Feb 2023
Guiding Pretraining in Reinforcement Learning with Large Language Models
Yuqing Du
Olivia Watkins
Zihan Wang
Cédric Colas
Trevor Darrell
Pieter Abbeel
Abhishek Gupta
Jacob Andreas
LM&Ro
21
174
0
13 Feb 2023
Scaffolding Progress: How Structured Editors Shape Novice Errors When Transitioning from Blocks to Text
Majeed Kazemitabaar
Viktar Chyhir
David Weintrop
Tovi Grossman
14
4
0
11 Feb 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou
Uri Alon
Sumit Agarwal
Graham Neubig
ELM
ALM
29
98
0
10 Feb 2023
ChatGPT and Other Large Language Models as Evolutionary Engines for Online Interactive Collaborative Game Design
P. Lanzi
Daniele Loiacono
LLMAG
29
49
0
09 Feb 2023
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Changan Niu
Chuanyi Li
Vincent Ng
Bin Luo
ELM
ALM
32
9
0
08 Feb 2023