CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells

North American Chapter of the Association for Computational Linguistics (NAACL), 2025
29 September 2024
Atharva Naik
Marcus Alenius
Daniel Fried
Carolyn Rose

Papers citing "CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells"

24 papers
ConCISE: A Reference-Free Conciseness Evaluation Metric for LLM-Generated Answers
Seyed Mohssen Ghafari, Ronny Kol, Juan C. Quiroz, Nella Luan, Monika Patial, Chanaka Rupasinghe, Herman Wandabwa, Luiz Pizzato
20 Nov 2025
MetaLint: Generalizable Idiomatic Code Quality Analysis through Instruction-Following and Easy-to-Hard Generalization
Atharva Naik, Lawanya Baghel, Dhakshin Govindarajan, Darsh Agrawal, Daniel Fried, Carolyn Rose
15 Jul 2025
A GPT-based Code Review System for Programming Language Learning
Lee Dong-Kyu
21 Jun 2024
AI-Assisted Assessment of Coding Practices in Modern Code Review
Manushree Vijayvergiya, M. Salawa, Ivan Budiselic, Dan Zheng, Pascal Lamblin, ..., Jovan Andonov, Goran Petrović, Daniel Tarlow, Petros Maniatis, René Just
22 May 2024
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery, Samuel R. Bowman, Shi Feng
15 Apr 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, ..., Yu-Huan Wu, Yiming Li, Fuli Luo, Yingfei Xiong, W. Liang
25 Jan 2024
Code Llama: Open Foundation Models for Code
Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, ..., Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve
24 Aug 2023
LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning
IEEE International Symposium on Software Reliability Engineering (ISSRE), 2023
Jun Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
22 Aug 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Neural Information Processing Systems (NeurIPS), 2023
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, ..., Dacheng Li, Eric Xing, Haotong Zhang, Joseph E. Gonzalez, Ion Stoica
09 Jun 2023
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin
10 May 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig
10 Feb 2023
Out of the BLEU: how should we assess quality of the Code Generation models?
Journal of Systems and Software (JSS), 2022
Mikhail Evtikhiev, Egor Bogomolov, Yaroslav Sokolov, T. Bryksin
05 Aug 2022
Automated Identification of Toxic Code Reviews Using ToxiCR
ACM Transactions on Software Engineering and Methodology (TOSEM), 2022
Jaydeb Sarker, Asif Kamal Turzo, Mingyou Dong, Amiangshu Bosu
26 Feb 2022
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, Yejin Choi
18 Apr 2021
Towards evaluating and eliciting high-quality documentation for intelligent systems
David Piorkowski, D. González, John T. Richards, Stephanie Houde
17 Nov 2020
CodeBLEU: a Method for Automatic Evaluation of Code Synthesis
Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, M. Zhou, Ambrosio Blanco, Shuai Ma
22 Sep 2020
Unsupervised Evaluation of Interactive Dialog with DialoGPT
SIGDIAL Conferences (SIGDIAL), 2020
Shikib Mehri, M. Eskénazi
23 Jun 2020
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Shikib Mehri, M. Eskénazi
01 May 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Nils Reimers, Iryna Gurevych
27 Aug 2019
VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Pranava Madhyastha, Josiah Wang, Lucia Specia
22 Jul 2019
Does BLEU Score Work for Code Migration?
IEEE International Conference on Program Comprehension (ICPC), 2019
Ngoc M. Tran, H. Tran, S. T. Nguyen, H. Nguyen, Tien N Nguyen
12 Jun 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
21 Apr 2019
ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks
Kavita A. Ganesan
05 Mar 2018
Neural Machine Translation by Jointly Learning to Align and Translate
International Conference on Learning Representations (ICLR), 2015
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
01 Sep 2014