CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
10 February 2023
Shuyan Zhou
Uri Alon
Sumit Agarwal
Graham Neubig
ELM
ALM
Papers citing "CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code"
50 / 56 papers shown
Evaluate-and-Purify: Fortifying Code Language Models Against Adversarial Attacks Using LLM-as-a-Judge
Wenhan Mu
Ling Xu
Shuren Pei
Le Mi
Huichi Zhou
AAML
ELM
48
0
0
28 Apr 2025
Technical Challenges in Maintaining Tax Prep Software with Large Language Models
Sina Gogani-Khiabani
Varsha Dewangan
Nina Olson
Ashutosh Trivedi
Saeid Tizpaz-Niari
29
0
0
25 Apr 2025
CodeVisionary: An Agent-based Framework for Evaluating Large Language Models in Code Generation
Xinchen Wang
Pengfei Gao
Chao Peng
Ruida Hu
Cuiyun Gao
ELM
31
0
0
18 Apr 2025
Evaluating the Diversity and Quality of LLM Generated Content
Alexander Shypula
Shuo Li
Botong Zhang
Vishakh Padmakumar
Kayo Yin
Osbert Bastani
34
1
0
16 Apr 2025
Rubric Is All You Need: Enhancing LLM-based Code Evaluation With Question-Specific Rubrics
Aditya Pathak
Rachit Gandhi
Vaibhav Uttam
Devansh
Yashwanth Nakka
...
Aditya Mittal
Aashna Ased
Chirag Khatri
Jagat Sesh Challa
Dhruv Kumar
40
0
0
31 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated Code
Seonghyeon Lee
Heejae Chon
Joonwon Jang
Dongha Lee
Hwanjo Yu
ALM
39
0
0
02 Mar 2025
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments
Patomporn Payoungkhamdee
Pume Tuchinda
Jinheon Baek
Samuel Cahyawijaya
Can Udomcharoenchaikit
Potsawee Manakul
Peerat Limkonchotiwat
E. Chuangsuwanich
Sarana Nutanong
LRM
44
0
0
25 Feb 2025
StatLLM: A Dataset for Evaluating the Performance of Large Language Models in Statistical Analysis
Xinyi Song
Lina Lee
Kexin Xie
Xueying Liu
Xinwei Deng
Yili Hong
ALM
ELM
50
0
0
24 Feb 2025
AutoParLLM: GNN-guided Context Generation for Zero-Shot Code Parallelization using LLMs
Quazi Ishtiaque Mahmud
Ali TehraniJamsaz
Hung Phan
Le Chen
Mihai Capota
Theodore L. Willke
Nesreen Ahmed
Ali Jannesari
50
5
0
20 Feb 2025
Bridging LLM-Generated Code and Requirements: Reverse Generation technique and SBC Metric for Developer Insights
Ahilan Ayyachamy Nadar Ponnusamy
58
0
0
11 Feb 2025
SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task
Ziije Zhong
Linqing Zhong
Zhaoze Sun
Qingyun Jin
Zengchang Qin
Xiaofan Zhang
50
6
0
28 Jan 2025
Chatting with Logs: An exploratory study on Finetuning LLMs for LogQL
Vishwanath Seshagiri
Siddharth Balyan
Vaastav Anand
Kaustubh Dhole
Ishan Sharma
Avani Wildani
José Cambronero
Andreas Züfle
92
1
0
04 Dec 2024
Human-In-the-Loop Software Development Agents
Wannita Takerngsaksiri
Jirat Pasuksmit
Patanamon Thongtanunam
C. Tantithamthavorn
Ruixiong Zhang
Fan Jiang
Jing Li
Evan Cook
K. Chen
Ming Wu
LLMAG
95
1
0
19 Nov 2024
Model Editing for LLMs4Code: How Far are We?
Xiaopeng Li
Shangwen Wang
Shasha Li
Jun Ma
Jie Yu
Xiaodong Liu
Jing Wang
Bin Ji
Weimin Zhang
KELM
39
1
0
11 Nov 2024
CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming
Ali TehraniJamsaz
Arijit Bhattacharjee
Le Chen
Nesreen Ahmed
Amir Yazdanbakhsh
Ali Jannesari
21
0
0
27 Oct 2024
Mastering the Craft of Data Synthesis for CodeLLMs
Meng Chen
Philip Arthur
Qianyu Feng
Cong Duy Vu Hoang
Yu-Heng Hong
...
Mark Johnson
K. K.
Don Dharmasiri
Long Duong
Yuan-Fang Li
SyDa
46
1
0
16 Oct 2024
Agent-as-a-Judge: Evaluate Agents with Agents
Mingchen Zhuge
Changsheng Zhao
Dylan R. Ashley
Wenyi Wang
Dmitrii Khizbullin
...
Raghuraman Krishnamoorthi
Yuandong Tian
Yangyang Shi
Vikas Chandra
Jürgen Schmidhuber
ELM
57
32
0
14 Oct 2024
Consistent Autoformalization for Constructing Mathematical Libraries
Lan Zhang
Xin Quan
André Freitas
AI4CE
24
2
0
05 Oct 2024
CodeJudge: Evaluating Code Generation with Large Language Models
Weixi Tong
Tianyi Zhang
ELM
ALM
20
7
0
03 Oct 2024
CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells
Atharva Naik
Marcus Alenius
Daniel Fried
Carolyn Rose
21
0
0
29 Sep 2024
Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation
Seonghyeon Lee
Suyeon Kim
Joonwon Jang
Heejae Chon
Dongha Lee
Hwanjo Yu
ELM
16
0
0
20 Sep 2024
CSAD: Unsupervised Component Segmentation for Logical Anomaly Detection
Yu-Hsuan Hsieh
Shang-Hong Lai
14
3
0
28 Aug 2024
What can Large Language Models Capture about Code Functional Equivalence?
Nickil Maveli
Antonio Vergari
Shay B. Cohen
25
2
0
20 Aug 2024
Better Python Programming for all: With the focus on Maintainability
Karthik Shivashankar
Antonio Martini
KELM
34
0
0
17 Aug 2024
Generating Unseen Code Tests In Infinitum
Marcel Zalmanovici
Orna Raz
E. Farchi
Iftach Freund
23
0
0
29 Jul 2024
FuncEvalGMN: Evaluating Functional Correctness of SQL via Graph Matching Network
Yi Zhan
Yang Sun
Han Weng
Longjie Cui
Guifeng Wang
Jiajun Xie
Yu Tian
Xiaoming Yin
Boyi Liu
Dongchi Huang
16
0
0
09 Jul 2024
Harnessing Business and Media Insights with Large Language Models
Yujia Bao
Ankit Parag Shah
Neeru Narang
Jonathan Rivers
Rajeev Maksey
...
Gyuhak Kim
Dengpan Yin
Don Hejna
Mo Nomeli
Wei Wei
AIFin
38
2
0
02 Jun 2024
Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering
Hongyu Yang
Liyang He
Min Hou
Shuanghong Shen
Rui Li
Jiahui Hou
Jianhui Ma
Junda Zhao
19
0
0
27 May 2024
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
Chenhao Zhang
Renhao Li
Minghuan Tan
Min Yang
Jingwei Zhu
Di Yang
Jiahao Zhao
Guancheng Ye
Chengming Li
Xiping Hu
31
18
0
26 May 2024
A Transformer-Based Approach for Smart Invocation of Automatic Code Completion
Aral de Moor
A. van Deursen
M. Izadi
VLM
34
5
0
23 May 2024
On the Limitations of Embedding Based Methods for Measuring Functional Correctness for Code Generation
Atharva Naik
33
2
0
26 Apr 2024
Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation
Marcos Macedo
Yuan Tian
F. Côgo
Bram Adams
25
12
0
25 Mar 2024
Exploring Language Model's Code Generation Ability with Auxiliary Functions
Seonghyeon Lee
Sanghwan Jang
Seongbo Jang
Dongha Lee
Hwanjo Yu
ALM
19
1
0
15 Mar 2024
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor
Y. Butala
M. Russak
Jing Yu Koh
Kiran Kamble
Waseem Alshikh
Ruslan Salakhutdinov
LLMAG
51
44
0
27 Feb 2024
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Miltiadis Allamanis
Sheena Panthaplackel
Pengcheng Yin
ALM
OffRL
LRM
43
9
0
13 Feb 2024
QualEval: Qualitative Evaluation for Model Improvement
Vishvak Murahari
A. Deshpande
Peter Clark
Tanmay Rajpurohit
Ashish Sabharwal
Karthik Narasimhan
A. Kalyan
14
4
0
06 Nov 2023
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Mukul Singh
J. Cambronero
Sumit Gulwani
Vu Le
Carina Negreanu
Gust Verbruggen
20
28
0
26 Oct 2023
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
Marcus J. Min
Yangruibo Ding
Luca Buratti
Saurabh Pujar
Gail E. Kaiser
Suman Jana
Baishakhi Ray
LRM
HILM
14
16
0
21 Oct 2023
CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation
Weixiang Yan
Yuchen Tian
Yunzhe Li
Qian Chen
Wen Wang
18
35
0
08 Oct 2023
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
Ansong Ni
Pengcheng Yin
Yilun Zhao
Chen Wei
Yanjun Wang
...
Mingyuan Zhang
Chen Change Loy
Yingbo Zhou
Dragomir R. Radev
Arman Cohan
ELM
19
16
0
29 Sep 2023
Supersonic: Learning to Generate Source Code Optimizations in C/C++
Zimin Chen
Sen Fang
Martin Monperrus
18
10
0
26 Sep 2023
Context Aware Query Rewriting for Text Rankers using LLM
Abhijit Anand
Venktesh V
Vinay Setty
Avishek Anand
14
10
0
31 Aug 2023
On the Impact of Language Selection for Training and Evaluating Programming Language Models
J. Katzy
M. Izadi
A. van Deursen
26
5
0
25 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
49
116
0
14 Aug 2023
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
John Yang
Akshara Prabhakar
Karthik Narasimhan
Shunyu Yao
14
102
0
26 Jun 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
22
4
0
22 May 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes
Aman Madaan
Emmy Liu
António Farinhas
Pedro Henrique Martins
...
José G. C. de Souza
Shuyan Zhou
Tongshuang Wu
Graham Neubig
André F. T. Martins
ALM
106
56
0
01 May 2023
ICE-Score: Instructing Large Language Models to Evaluate Code
Terry Yue Zhuo
ELM
ALM
31
38
0
27 Apr 2023
Exploring Distributional Shifts in Large Language Models for Code Analysis
Shushan Arakelyan
Rocktim Jyoti Das
Yi Mao
Xiang Ren
ALM
11
18
0
16 Mar 2023