2404.00971
Cited By
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation
1 April 2024
Fang Liu
Yang Liu
Lin Shi
Houkun Huang
Ruifeng Wang
Zhen Yang
Li Zhang
Zhongqi Li
Yuchi Ma
Papers citing
"Exploring and Evaluating Hallucinations in LLM-Powered Code Generation"
50 / 52 papers shown
Assessing LLM code generation quality through path planning tasks
Wanyi Chen
Meng-Wen Su
Mary L. Cummings
ELM
43
0
0
30 Apr 2025
Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges
Yunseo Lee
John Youngeun Song
Dongsun Kim
Jindae Kim
Mijung Kim
Jaechang Nam
HILM
LRM
33
0
0
29 Apr 2025
An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination
Dixiao Wei
Peng Yi
Jinlong Lei
Yiguang Hong
Yuchuan Du
33
0
0
28 Apr 2025
Large Language Models for Validating Network Protocol Parsers
Mingwei Zheng
Danning Xie
X. Zhang
22
1
0
18 Apr 2025
Code Copycat Conundrum: Demystifying Repetition in LLM-based Code Generation
Mingwei Liu
Juntao Li
Ying Wang
Xueying Du
Zuoyu Ou
...
Zhao Wei
Y. Xu
Fangming Zou
Xin Peng
Yiling Lou
35
0
0
17 Apr 2025
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes
Raúl Vázquez
Timothee Mickus
Elaine Zosa
Teemu Vahtola
Jörg Tiedemann
...
Liane Guillou
Ona de Gibert
Jaione Bengoetxea
Joseph Attieh
Marianna Apidianaki
HILM
VLM
LRM
74
0
0
16 Apr 2025
C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation
Xu Zhang
Zhifei Liu
Jiahao Wang
Huixuan Zhang
Fan Xu
Junzhe Zhang
Xiaojun Wan
HILM
21
0
0
14 Apr 2025
Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models
Baolong Bi
Shenghua Liu
Y. Wang
Yilong Xu
Junfeng Fang
Lingrui Mei
Xueqi Cheng
KELM
55
4
0
20 Mar 2025
ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation
Kaiyuan Liu
Youcheng Pan
J. Li
Daojing He
Yang Xiang
Yexing Du
Tianrun Gao
LLMAG
ELM
46
1
0
10 Mar 2025
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
Roham Koohestani
Philippe de Bekker
M. Izadi
VLM
45
0
0
07 Mar 2025
DeepSeek vs. ChatGPT vs. Claude: A Comparative Study for Scientific Computing and Scientific Machine Learning Tasks
Qile Jiang
Zhiwei Gao
George Em Karniadakis
LRM
55
5
0
25 Feb 2025
Pragmatic Reasoning improves LLM Code Generation
Zhuchen Cao
Sven Apel
Adish Singla
Vera Demberg
LRM
34
0
0
20 Feb 2025
Automated Consistency Analysis of LLMs
Aditya Patwardhan
Vivek Vaidya
Ashish Kundu
50
0
0
10 Feb 2025
SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment
Ariful Haque
Sunzida Siddique
M. Rahman
Ahmed Rafi Hasan
Laxmi Rani Das
Marufa Kamal
Tasnim Masura
Kishor Datta Gupta
48
1
0
31 Jan 2025
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
Ziyao Zhang
Yanlin Wang
Chong Wang
Jiachi Chen
Zibin Zheng
108
11
0
20 Jan 2025
Context-DPO: Aligning Language Models for Context-Faithfulness
Baolong Bi
Shaohan Huang
Y. Wang
Tianchi Yang
Zihan Zhang
...
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
Shenghua Liu
94
8
0
18 Dec 2024
A Survey of Calibration Process for Black-Box LLMs
Liangru Xie
Hui Liu
Jingying Zeng
Xianfeng Tang
Yan Han
Chen Luo
Jing Huang
Zhen Li
Suhang Wang
Qi He
74
1
0
17 Dec 2024
VALTEST: Automated Validation of Language Model Generated Test Cases
Hamed Taherkhani
Hadi Hemmati
29
0
0
13 Nov 2024
LProtector: An LLM-driven Vulnerability Detection System
Ze Sheng
Fenghua Wu
Xiangwu Zuo
Chao Li
Yuxin Qiao
Lei Hang
52
4
0
10 Nov 2024
A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why?
QiHong Chen
Jiawei Li
Jiecheng Deng
Jiachen Yu
Justin Tian Jin Chen
Iftekhar Ahmed
38
0
0
03 Nov 2024
An LLM Agent for Automatic Geospatial Data Analysis
Yuxing Chen
Weijie Wang
Sylvain Lobry
Camille Kurtz
LLMAG
25
3
0
24 Oct 2024
Towards Safer Heuristics With XPlain
Pantea Karimi
Solal Pirelli
Siva Kesava Reddy Kakarla
Ryan Beckett
Santiago Segarra
Beibin Li
Pooria Namyar
Behnaz Arzani
29
0
0
19 Oct 2024
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning
Shengjie Sun
Runze Liu
Jiafei Lyu
J. Yang
L. Zhang
Xiu Li
LRM
11
7
0
18 Oct 2024
Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks
Enkhbold Nyamsuren
MQ
11
1
0
18 Oct 2024
ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
Kishan Maharaj
Vitobha Munigala
Srikanth G. Tamilselvam
Prince Kumar
Sayandeep Sen
Palani Kodeswaran
Abhijit Mishra
Pushpak Bhattacharyya
HILM
26
0
0
17 Oct 2024
Controlled Automatic Task-Specific Synthetic Data Generation for Hallucination Detection
Yong Xie
Karan Aggarwal
Aitzaz Ahmad
Stephen Lau
30
0
0
16 Oct 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Mohit Bansal
59
14
0
16 Oct 2024
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM
LRM
57
2
0
14 Oct 2024
Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code
Nan Jiang
Qi Li
Lin Tan
Tianyi Zhang
HILM
21
1
0
13 Oct 2024
One Step at a Time: Combining LLMs and Static Analysis to Generate Next-Step Hints for Programming Tasks
Anastasiia Birillo
Elizaveta Artser
Anna Potriasaeva
Ilya Vlasov
Katsiaryna Dzialets
Yaroslav Golubev
Igor Gerasimov
Hieke Keuning
T. Bryksin
24
3
0
11 Oct 2024
Mitigating Gender Bias in Code Large Language Models via Model Editing
Z. Qin
Haochuan Wang
Zecheng Wang
Deyuan Liu
Cunhang Fan
Zhao Lv
Zhiying Tu
Dianhui Chu
Dianbo Sui
KELM
16
0
0
10 Oct 2024
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders
David A. Noever
Forrest McKee
AAML
37
0
0
09 Oct 2024
Need Help? Designing Proactive AI Assistants for Programming
Valerie Chen
Alan Zhu
Sebastian Zhao
Hussein Mozannar
David Sontag
Ameet Talwalkar
27
4
0
06 Oct 2024
Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection
Ksheeraja Raghavan
Samiran Gode
Ankit Parag Shah
Surabhi Raghavan
Wolfram Burgard
Bhiksha Raj
Rita Singh
20
0
0
04 Oct 2024
Comparing Criteria Development Across Domain Experts, Lay Users, and Models in Large Language Model Evaluation
Annalisa Szymanski
Simret Araya Gebreegziabher
Oghenemaro Anuyah
Ronald A Metoyer
T. Li
ALM
ELM
19
1
0
02 Oct 2024
Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning
Arshiya Khan
Guannan Liu
Xing Gao
KELM
16
1
0
27 Sep 2024
Automated test generation to evaluate tool-augmented LLMs as conversational AI agents
Samuel Arcadinho
David Aparicio
Mariana Almeida
11
4
0
24 Sep 2024
A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models
Yixi Wu
Pengfei He
Zehao Wang
Shaowei Wang
Yuan Tian
Tse-Hsun Chen
ALM
27
0
0
23 Sep 2024
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
20
4
0
31 Jul 2024
Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design
Andre Nakkab
Sai Qian Zhang
Ramesh Karri
Siddharth Garg
30
0
0
23 Jul 2024
On Mitigating Code LLM Hallucinations with API Documentation
Nihal Jain
Robert Kwiatkowski
Baishakhi Ray
M. K. Ramanathan
Varun Kumar
25
7
0
13 Jul 2024
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study
Shihan Dou
Haoxiang Jia
Shenxi Wu
Huiyuan Zheng
Weikang Zhou
...
Xunliang Cai
Tao Gui
Xipeng Qiu
Qi Zhang
Xuanjing Huang
19
22
0
08 Jul 2024
Code Hallucination
Mirza Masfiqur Rahman
Ashish Kundu
LRM
HILM
21
1
0
05 Jul 2024
Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents
Zehao Wang
Dong Jae Kim
Tse-Hsun Chen
17
1
0
18 Jun 2024
Security of AI Agents
Yifeng He
Ethan Wang
Yuyang Rong
Zifei Cheng
Hao Chen
LLMAG
23
7
0
12 Jun 2024
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
Joseph Spracklen
Raveen Wijewickrama
A. H. M. N. Sakib
Anindya Maiti
Murtuza Jadliwala
22
9
0
12 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
24
12
0
03 Jun 2024
Assessing and Verifying Task Utility in LLM-Powered Applications
Negar Arabzadeh
Siqing Huo
Nikhil Mehta
Qingyun Wu
Chi Wang
Ahmed Hassan Awadallah
Charles L. A. Clarke
Julia Kiseleva
20
3
0
03 May 2024
Exploring and Unleashing the Power of Large Language Models in Automated Code Translation
Zhen Yang
Fang Liu
Zhongxing Yu
J. Keung
Jia Li
Shuo Liu
Yifan Hong
Xiaoxue Ma
Zhi Jin
Ge Li
26
9
0
23 Apr 2024
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
133
163
0
03 May 2023