ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.03374
  4. Cited By
Evaluating Large Language Models Trained on Code

Evaluating Large Language Models Trained on Code

7 July 2021
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
Jared Kaplan
Harrison Edwards
Yura Burda
Nicholas Joseph
Greg Brockman
Alex Ray
Raul Puri
Gretchen Krueger
Michael Petrov
Heidy Khlaaf
Girish Sastry
Pamela Mishkin
Brooke Chan
Scott Gray
Nick Ryder
Mikhail Pavlov
Alethea Power
Lukasz Kaiser
Mohammad Bavarian
Clemens Winter
Philippe Tillet
F. Such
D. Cummings
Matthias Plappert
Fotios Chantzis
Elizabeth Barnes
Ariel Herbert-Voss
William H. Guss
Alex Nichol
Alex Paino
Nikolas Tezak
Jie Tang
Igor Babuschkin
S. Balaji
Shantanu Jain
William Saunders
Christopher Hesse
A. Carr
Jan Leike
Joshua Achiam
Vedant Misra
Evan Morikawa
Alec Radford
Matthew Knight
Miles Brundage
Mira Murati
Katie Mayer
Peter Welinder
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
    ELM
    ALM
ArXivPDFHTML

Papers citing "Evaluating Large Language Models Trained on Code"

50 / 944 papers shown
Title
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
Joseph Spracklen
Raveen Wijewickrama
A. H. M. N. Sakib
Anindya Maiti
Murtuza Jadliwala
Murtuza Jadliwala
45
10
0
12 Jun 2024
DCA-Bench: A Benchmark for Dataset Curation Agents
DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang
Yingzhuo Yu
Jin Huang
Xingjian Zhang
Jiaqi Ma
33
1
0
11 Jun 2024
Validating LLM-Generated Programs with Metamorphic Prompt Testing
Validating LLM-Generated Programs with Metamorphic Prompt Testing
Xiaoyin Wang
Dakai Zhu
38
3
0
11 Jun 2024
Scaling Large Language Model-based Multi-Agent Collaboration
Scaling Large Language Model-based Multi-Agent Collaboration
Chen Qian
Zihao Xie
YiFei Wang
Wei Liu
Yufan Dang
...
Zhuoyun Du
Weize Chen
Cheng Yang
Zhiyuan Liu
Maosong Sun
AI4CE
LLMAG
LM&Ro
61
45
0
11 Jun 2024
Improving Autoformalization using Type Checking
Improving Autoformalization using Type Checking
Auguste Poiroux
Gail Weiss
Viktor Kunčak
Antoine Bosselut
44
2
0
11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
74
56
0
11 Jun 2024
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion
  Models: Injecting Disguised Vulnerabilities against Strong Detection
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection
Shenao Yan
Shen Wang
Yue Duan
Hanbin Hong
Kiho Lee
Doowon Kim
Yuan Hong
AAML
SILM
43
17
0
10 Jun 2024
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Utsav Singh
Pramit Bhattacharyya
Vinay P. Namboodiri
LM&Ro
47
1
0
09 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
45
1
0
08 Jun 2024
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation
Nachiket Kotalwar
Alkis Gotovos
Adish Singla
ALM
59
4
0
07 Jun 2024
Learning Task Decomposition to Assist Humans in Competitive Programming
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen
Ruiqi Zhong
Pei Ke
Zhihong Shao
Hongning Wang
Minlie Huang
ReLM
37
8
0
07 Jun 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Yibo Yang
Xiaojie Li
Zhongzhu Zhou
Shuaiwen Leon Song
Jianlong Wu
Liqiang Nie
Guohao Li
45
6
0
07 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
38
39
0
06 Jun 2024
Aligning Agents like Large Language Models
Aligning Agents like Large Language Models
Adam Jelley
Yuhan Cao
Dave Bignell
Sam Devlin
Tabish Rashid
LM&Ro
41
1
0
06 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial
  Actions across X Community Notes and Wikipedia edits
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip H. S. Torr
João F. Henriques
Jakob N. Foerster
27
1
0
05 Jun 2024
kNN Classification of Malware Data Dependency Graph Features
kNN Classification of Malware Data Dependency Graph Features
John Musgrave
Anca L. Ralescu
23
0
0
04 Jun 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for
  Low-Memory GPUs
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
47
5
0
30 May 2024
A Survey Study on the State of the Art of Programming Exercise
  Generation using Large Language Models
A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models
Eduard Frankford
Ingo Höhn
Clemens Sauerwein
Ruth Breu
ELM
39
2
0
30 May 2024
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
Zichao Hu
Junyi Jessy Li
Arjun Guha
Joydeep Biswas
SyDa
ALM
51
1
0
30 May 2024
Kotlin ML Pack: Technical Report
Kotlin ML Pack: Technical Report
Sergey Titov
Mikhail Evtikhiev
Anton Shapkin
Oleg Smirnov
Sergei Boytsov
...
Dariia Karaeva
Maksim Sheptyakov
Mikhail Arkhipov
T. Bryksin
Egor Bogomolov
32
0
0
29 May 2024
Adaptive In-conversation Team Building for Language Model Agents
Adaptive In-conversation Team Building for Language Model Agents
Linxin Song
Jiale Liu
Jieyu Zhang
Shaokun Zhang
Ao Luo
Shijian Wang
Qingyun Wu
Chi Wang
LLMAG
71
10
0
29 May 2024
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
Jiaqi Shao
Tianjun Yuan
Tao Lin
Xuanyu Cao
50
0
0
28 May 2024
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
Hyunseok Lee
Jihoon Tack
Jinwoo Shin
DeLMO
40
0
0
27 May 2024
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off
  Code Generation
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Aojun Zhou
Junting Pan
Hongsheng Li
SyDa
39
7
0
27 May 2024
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Hao Tang
Keya Hu
Jin Peng Zhou
Sicheng Zhong
Wei-Long Zheng
Xujie Si
Kevin Ellis
39
13
0
26 May 2024
ChatGPT Code Detection: Techniques for Uncovering the Source of Code
ChatGPT Code Detection: Techniques for Uncovering the Source of Code
Marc Oedingen
Raphael C. Engelhardt
Robin Denz
Maximilian Hammer
Wolfgang Konen
DeLMO
45
8
0
24 May 2024
A Misleading Gallery of Fluid Motion by Generative Artificial
  Intelligence
A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence
Ali Kashefi
VGen
48
5
0
24 May 2024
From Frege to chatGPT: Compositionality in language, cognition, and deep
  neural networks
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin
Sam Whitman McGrath
Danielle J. Williams
Lotem Elber-Dorozko
AI4CE
73
3
0
24 May 2024
Proving Theorems Recursively
Proving Theorems Recursively
Haiming Wang
Huajian Xin
Zhengying Liu
Wenda Li
Yinya Huang
...
Zhicheng YANG
Jing Tang
Jian Yin
Zhenguo Li
Xiaodan Liang
LRM
38
12
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
42
0
23 May 2024
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
Christopher Rawles
Sarah Clinckemaillie
Yifan Chang
Jonathan Waltz
Gabrielle Lau
...
Daniel Toyama
Robert Berry
Divya Tyamagundlu
Timothy Lillicrap
Oriana Riva
LLMAG
69
44
0
23 May 2024
Large Language Models Meet NLP: A Survey
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
52
47
0
21 May 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM
  Inference
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
Dongjie Yang
Xiaodong Han
Yan Gao
Yao Hu
Shilin Zhang
Hai Zhao
41
51
0
21 May 2024
Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment
Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment
A. Abdel-Rehim
Hector Zenil
Oghenejokpeme I. Orhobor
Marie Fisher
Ross J. Collins
...
Gareth W. Fearnley
Emma Tate
Holly X. Smith
Larisa B. Soldatova
Ross D. King
LM&MA
63
4
0
20 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Caccia
Alessandro Sordoni
MoMe
41
31
0
18 May 2024
A safety realignment framework via subspace-oriented model fusion for
  large language models
A safety realignment framework via subspace-oriented model fusion for large language models
Xin Yi
Shunfan Zheng
Linlin Wang
Xiaoling Wang
Liang He
60
20
0
15 May 2024
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language
  Models
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
Wenqi Fan
Yujuan Ding
Liang-bo Ning
Shijie Wang
Hengyun Li
Dawei Yin
Tat-Seng Chua
Qing Li
RALM
3DV
40
188
0
10 May 2024
Automated Program Repair: Emerging trends pose and expose problems for
  benchmarks
Automated Program Repair: Emerging trends pose and expose problems for benchmarks
J. Renzullo
Pemma Reiter
Westley Weimer
Stephanie Forrest
36
1
0
08 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
34
23
0
08 May 2024
Granite Code Models: A Family of Open Foundation Models for Code
  Intelligence
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Mayank Mishra
Matt Stallone
Gaoyuan Zhang
Yikang Shen
Aditya Prasad
...
Amith Singhee
Nirmit Desai
David D. Cox
Ruchir Puri
Rameswar Panda
AI4TS
56
55
0
07 May 2024
Enabling High-Sparsity Foundational Llama Models with Efficient
  Pretraining and Deployment
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Abhinav Agarwalla
Abhay Gupta
Alexandre Marques
Shubhra Pandit
Michael Goin
...
Tuan Nguyen
Mahmoud Salem
Dan Alistarh
Sean Lie
Mark Kurtz
MoE
SyDa
40
11
0
06 May 2024
Better & Faster Large Language Models via Multi-token Prediction
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle
Badr Youbi Idrissi
Baptiste Rozière
David Lopez-Paz
Gabriele Synnaeve
26
92
0
30 Apr 2024
Performance-Aligned LLMs for Generating Fast Code
Performance-Aligned LLMs for Generating Fast Code
Daniel Nichols
Pranav Polasam
Harshitha Menon
Aniruddha Marathe
T. Gamblin
A. Bhatele
32
8
0
29 Apr 2024
PECC: Problem Extraction and Coding Challenges
PECC: Problem Extraction and Coding Challenges
Patrick Haller
Jonas Golde
Alan Akbik
ReLM
40
5
0
29 Apr 2024
HFT: Half Fine-Tuning for Large Language Models
HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Weiran Xu
Yu Sun
Hua-Hong Wu
CLL
39
4
0
29 Apr 2024
Exploring the Limits of Fine-grained LLM-based Physics Inference via
  Premise Removal Interventions
Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal Interventions
Jordan Meadows
Tamsin James
André Freitas
ReLM
LRM
AI4CE
35
1
0
29 Apr 2024
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang
Guandao Yang
Leonidas J. Guibas
34
3
0
26 Apr 2024
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Amir Saeidi
Shivanshu Verma
Chitta Baral
Chitta Baral
ALM
35
23
0
23 Apr 2024
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
Bo Lin
Yingjing Xu
Xuanwen Bao
Zhou Zhao
Zuyong Zhang
Zhouyang Wang
59
2
0
23 Apr 2024
Crossing the principle-practice gap in AI ethics with ethical
  problem-solving
Crossing the principle-practice gap in AI ethics with ethical problem-solving
N. Corrêa
James William Santos
Camila Galvão
Marcelo Pasetti
Dieine Schiavon
Faizah Naqvi
Robayet Hossain
N. D. Oliveira
37
4
0
16 Apr 2024
Previous
123...8910...171819
Next