A Survey of Machine Learning for Big Code and Naturalness

18 September 2017

Papers citing "A Survey of Machine Learning for Big Code and Naturalness"

50 / 298 papers shown

From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence

...

Zizheng Zhan

Jiajun Zhang

Jie Zhang

Zhaoxiang Zhang

Bo Zheng

LLMAG ALM ELM

809

23 Nov 2025

Scaling Laws for Code: A More Data-Hungry Regime

109

09 Oct 2025

DeepCodeSeek: Real-Time API Retrieval for Context-Aware Code Generation

30 Sep 2025

RANGER -- Repository-Level Agent for Graph-Enhanced Retrieval

145

27 Sep 2025

On the Soundness and Consistency of LLM Agents for Executing Test Cases Written in Natural Language

Sébastien Salva

Redha Taguelmimt

LLMAG

176

23 Sep 2025

Discovering Software Parallelization Points Using Deep Neural Networks

Izavan dos S. Correia

Henrique C. T. Santos

Tiago Ferreira

100

05 Sep 2025

The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang

184

30 Aug 2025

Previously on... Automating Code Review

Robert Heumüller

Frank Ortmeier

25 Aug 2025

The Fools are Certain; the Wise are Doubtful: Exploring LLM Confidence in Code Completion

120

22 Aug 2025

RAG for Geoscience: What We Expect, Gaps and Opportunities

125

15 Aug 2025

Vibe Coding as a Reconfiguration of Intent Mediation in Software Development: Definition, Implications, and Research Agenda

Christian Meske

Tobias Hermanns

Esther von der Weiden

Kai-Uwe Loser

Thorsten Berger

179

29 Jul 2025

Automated Code Review Using Large Language Models with Symbolic Reasoning

International Service Availability Symposium (ISAS), 2025

Busra Icoz

Goksel Biricik

LRM

154

24 Jul 2025

AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?

...

274

19 Jul 2025

Directed Acyclic Graph Convolutional Networks

176

13 Jun 2025

Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models assembly code deobfuscation capabilities

Anton Tkachenko

Dmitrij Suskevic

Benjamin Adolphi

279

26 May 2025

Towards Leveraging Large Language Model Summaries for Topic Modeling in Source Code

Michele Carissimi

Martina Saletta

C. Ferretti

178

24 Apr 2025

A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs

Musfiqur Rahman

SayedHassan Khatoonabadi

Emad Shihab

ALM

266

22 Apr 2025

TD-Suite: All Batteries Included Framework for Technical Debt Classification

Karthik Shivashankar

Antonio Martini

156

15 Apr 2025

Bringing Structure to Naturalness: On the Naturalness of ASTs

Profir-Petru Pârţachi

Mahito Sugiyama

214

11 Apr 2025

Towards an Understanding of Context Utilization in Code Intelligence

...

256

11 Apr 2025

Deep Learning-based Intrusion Detection Systems: A Survey

332

10 Apr 2025

Semantic Mastery: Enhancing LLMs with Advanced Natural Language Understanding

Mohanakrishnan Hariharan

152

01 Apr 2025

Enhancing Code LLM Training with Programmer Attention

359

19 Mar 2025

Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models

IEEE Transactions on Big Data (IEEE Trans. Big Data), 2025

M. Wong

C. Tan

ALM

303

19 Mar 2025

Fully Autonomous Programming using Iterative Multi-Agent Debugging with Large Language Models

ACM Transactions on Evolutionary Learning and Optimization (TELO), 2025

351

10 Mar 2025

LoRACode: LoRA Adapters for Code Embeddings

Saumya Chaturvedi

Aman Chadha

Laurent Bindschaedler

345

07 Mar 2025

Empirical evaluation of LLMs in predicting fixes of Configuration bugs in Smart Home System

Sheikh Moonwara Anjum Monisha

Atul Bharadwaj

197

16 Feb 2025

Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks

IEEE Working Conference on Mining Software Repositories (MSR), 2025

Kyi Shin Khant

Hong Yi Lin

Patanamon Thongtanunam

ELM

323

06 Feb 2025

Process-Supervised Reinforcement Learning for Code Generation

342

03 Feb 2025

From Critique to Clarity: A Pathway to Faithful and Personalized Code Explanations with Large Language Models

317

28 Jan 2025

Deep Learning-Based Identification of Inconsistent Method Names: How Far Are We?

Empirical Software Engineering (EMSE), 2024

359

22 Jan 2025

Cracks in The Stack: Hidden Vulnerabilities and Licensing Risks in LLM Pre-Training Datasets

Mahmoud Jahanshahi

Audris Mockus

AAML

116

05 Jan 2025

Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey

...

594

29 Dec 2024

EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code

BigData Congress [Services Society] (BSS), 2024

Shahriyar Zaman Ridoy

Md. Shazzad Hossain Shaon

A. Cuzzocrea

Mst. Shapna Akter

248

25 Nov 2024

Mastering the Craft of Data Synthesis for CodeLLMs

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

...

640

16 Oct 2024

In-Context Code-Text Learning for Bimodal Software Engineering

Jian Yang

Zhoujun Li

Haoye Tian

Jacques Klein

Tegawende F. Bissyande

290

08 Oct 2024

Leveraging Reviewer Experience in Code Review Comment Generation

ACM Transactions on Software Engineering and Methodology (TOSEM), 2024

Hong Yi Lin

Patanamon Thongtanunam

Christoph Treude

Michael W. Godfrey

Chunhua Liu

Wachiraphan Charoenwet

247

17 Sep 2024

HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale

361

09 Sep 2024

A Joint Learning Model with Variational Interaction for Multilingual Program Translation

International Conference on Automated Software Engineering (ASE), 2024

Yali Du

Hui Sun

Ming Li

344

25 Aug 2024

Is Generative AI the Next Tactical Cyber Weapon For Threat Actors? Unforeseen Implications of AI Generated Cyber Attacks

255

23 Aug 2024

Deep Code Search with Naming-Agnostic Contrastive Multi-View Learning

ACM Transactions on Knowledge Discovery from Data (TKDD), 2024

189

18 Aug 2024

MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations

Akash Dutta

Ali Jannesari

231

02 Jul 2024

AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology

314

16 Jun 2024

Morescient GAI for Software Engineering

ACM Transactions on Software Engineering and Methodology (TOSEM), 2024

Marcus Kessel

Colin Atkinson

SyDa

227

07 Jun 2024

Requirements are All You Need: The Final Frontier for End-User Software Engineering

229

22 May 2024

A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks

China National Conference on Chinese Computational Linguistics (CCL), 2024

Xuanfan Ni

Piji Li

ELM LRM

201

16 May 2024

Convolutional Learning on Directed Acyclic Graphs

Asilomar Conference on Signals, Systems and Computers (ACSSC), 2024

Samuel Rey

Hamed Ajorlou

Gonzalo Mateos

CML AI4CE GNN

196

05 May 2024

Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

307

125

23 Apr 2024

Vulnerability Detection with Code Language Models: How Far Are We?

265

145

27 Mar 2024

Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models

307

20 Mar 2024