GraphCodeBERT: Pre-training Code Representations with Data Flow

17 September 2020
Daya Guo
Shuo Ren
Shuai Lu
Zhangyin Feng
Duyu Tang
Shujie Liu
Long Zhou
Nan Duan
Alexey Svyatkovskiy
Shengyu Fu
Michele Tufano
Shao Kun Deng
Colin B. Clement
Dawn Drain
Neel Sundaresan
Jian Yin
Daxin Jiang
M. Zhou

Papers citing "GraphCodeBERT: Pre-training Code Representations with Data Flow"

50 / 403 papers shown
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks
Kang Yang
Xinjun Mao
Shangwen Wang
Y. Wang
Tanghaoran Zhang
Bo Lin
Yihao Qin
Zhang Zhang
Yao Lu
Kamal Al-Sabahi
ALM
111
1
0
28 Apr 2025
Iterative Self-Training for Code Generation via Reinforced Re-Ranking
Nikita Sorokin
I. Sedykh
Valentin Malykh
31
0
0
13 Apr 2025
ML For Hardware Design Interpretability: Challenges and Opportunities
Raymond Baartmans
Andrew Ensinger
Victor Agostinelli
Lizhong Chen
29
0
0
11 Apr 2025
DocAgent: A Multi-Agent System for Automated Code Documentation Generation
Dayu Yang
Antoine Simoulin
Xin Qian
Xiaoyi Liu
Yuwei Cao
Zhaopu Teng
Grey Yang
LLMAG
54
0
0
11 Apr 2025
Zero-Shot Cross-Domain Code Search without Fine-Tuning
Keyu Liang
Z. Liu
Chao Liu
Zhiyuan Wan
David Lo
Xiaohu Yang
26
0
0
10 Apr 2025
BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks
Wei Li
Yang Zou
Christopher Ellis
Ruben Purdy
Shawn Blanton
José M. F. Moura
25
0
0
07 Apr 2025
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
Indraneil Paul
Haoyi Yang
Goran Glavas
Kristian Kersting
Iryna Gurevych
AAML
SyDa
34
0
0
27 Mar 2025
Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets
Hamed Jelodar
Mohammad Meymani
Roozbeh Razavi-Far
40
0
0
21 Mar 2025
LLM-Aided Customizable Profiling of Code Data Based On Programming Language Concepts
Pankaj Thorat
Adnan Qidwai
Adrija Dhar
Aishwariya Chakraborty
Anand Eswaran
Hima Patel
Praveen Jayachandran
48
0
0
19 Mar 2025
XOXO: Stealthy Cross-Origin Context Poisoning Attacks against AI Coding Assistants
Adam Storek
Mukur Gupta
Noopur Bhatt
Aditya Gupta
Janie Kim
Prashast Srivastava
Suman Jana
AAML
69
0
0
18 Mar 2025
OASIS: Order-Augmented Strategy for Improved Code Search
Zuchen Gao
Zizheng Zhan
Xianming Li
Erxin Yu
Haotian Zhang
Bin Chen
Yuqun Zhang
Jing Li
58
0
0
11 Mar 2025
R+R: Security Vulnerability Dataset Quality Is Critical
Anurag Swarnim Yadav
Joseph N. Wilson
AAML
45
0
0
09 Mar 2025
LoRACode: LoRA Adapters for Code Embeddings
Saumya Chaturvedi
Aman Chadha
Laurent Bindschaedler
59
0
0
07 Mar 2025
Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?
Qingyuan Liang
Zhao Zhang
Zeyu Sun
Zheng Lin
Qi Luo
...
Yuqun Zhang
Haotian Zhang
Lu Zhang
Bin Chen
Y. Xiong
41
1
0
07 Mar 2025
Multimodal Learning for Just-In-Time Software Defect Prediction in Autonomous Driving Systems
Faisal Mohammad
Duksan Ryu
57
0
0
28 Feb 2025
GNN-Coder: Boosting Semantic Code Retrieval with Combined GNNs and Transformer
Yufan Ye
Pu Pang
Ting Zhang
Hua Huang
61
0
0
24 Feb 2025
UniGenCoder: Merging Seq2Seq and Seq2Tree Paradigms for Unified Code Generation
Liangying Shao
Yanfu Yan
Denys Poshyvanyk
Jinsong Su
36
1
0
18 Feb 2025
URECA: The Chain of Two Minimum Set Cover Problems exists behind Adaptation to Shifts in Semantic Code Search
Seok-Ung Choi
Joonghyuk Hahn
Yo-Sub Han
51
0
0
11 Feb 2025
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
Xin Zhou
M. Weyssow
Ratnadira Widyasari
Ting Zhang
Junda He
Yunbo Lyu
Jianming Chang
Beiqi Zhang
Dan Huang
David Lo
PILM
210
1
0
10 Feb 2025
TCProF: Time-Complexity Prediction SSL Framework
Joonghyuk Hahn
Hyeseon Ahn
Jungin Kim
Soohan Lim
Yo-Sub Han
37
0
0
10 Feb 2025
Can Large Language Models Understand Intermediate Representations?
Hailong Jiang
Jianfeng Zhu
Yao Wan
B. Fang
Hongyu Zhang
Ruoming Jin
Qiang Guan
50
1
0
07 Feb 2025
Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye
Ting Zhang
Wenbin Jiang
Hua Huang
OffRL
LRM
SyDa
55
1
0
03 Feb 2025
Towards Making Flowchart Images Machine Interpretable
S. Kamath S
Prajwal Gatti
Yogesh Kumar
Vikash Yadav
Anand Mishra
53
5
0
29 Jan 2025
Data-efficient Performance Modeling via Pre-training
Chunting Liu
Riyadh Baghdadi
41
0
0
24 Jan 2025
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
Ziyao Zhang
Yanlin Wang
Chong Wang
Jiachi Chen
Zibin Zheng
114
14
0
20 Jan 2025
Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware
Brandon J Walton
Mst Eshita Khatun
James M Ghawaly
Aisha Ali-Gombe
33
2
0
10 Jan 2025
Multi-task retriever fine-tuning for domain-specific and efficient RAG
Patrice Béchard
Orlando Marquez Ayala
35
0
0
08 Jan 2025
CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection
Ruijun Feng
Hammond Pearce
Pietro Liguori
Yulei Sui
33
0
0
08 Jan 2025
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Ten Modalities via Language as a Reference Framework
Run Shao
Cheng Yang
Qiujun Li
Qing Zhu
Yongjun Zhang
...
Yu Liu
Yong Tang
Dapeng Liu
Shizhong Yang
Haifeng Li
106
1
0
08 Jan 2025
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
Giordano d'Aloisio
Luca Traini
Federica Sarro
A. Marco
63
1
0
18 Dec 2024
Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs
Imam Nur Bani Yusuf
Lingxiao Jiang
77
0
0
18 Dec 2024
Optimizing AI-Assisted Code Generation
Simon Torka
Sahin Albayrak
70
0
0
14 Dec 2024
Code LLMs: A Taxonomy-based Survey
Nishat Raihan
Christian D. Newman
Marcos Zampieri
91
1
0
11 Dec 2024
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Tarun Suresh
R. Reddy
Yifei Xu
Zach Nussbaum
Andriy Mulyar
Brandon Duderstadt
Heng Ji
83
3
0
01 Dec 2024
ASSERTIFY: Utilizing Large Language Models to Generate Assertions for Production Code
Mohammad Jalili Torkamani
Abhinav Sharma
Nikita Mehrotra
Rahul Purandare
64
0
0
25 Nov 2024
EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code
Shahriyar Zaman Ridoy
Md. Shazzad Hossain Shaon
A. Cuzzocrea
Mst. Shapna Akter
62
0
0
25 Nov 2024
Prompting and Fine-tuning Large Language Models for Automated Code Review Comment Generation
Md. Asif Haider
Ayesha Binte Mostofa
Sk. Sabit Bin Mosaddek
Anindya Iqbal
Toufique Ahmed
ALM
50
2
0
15 Nov 2024
A Survey on Adversarial Machine Learning for Code Data: Realistic Threats, Countermeasures, and Interpretations
Yulong Yang
Haoran Fan
Chenhao Lin
Qian Li
Zhengyu Zhao
Chao Shen
Xiaohong Guan
AAML
43
0
0
12 Nov 2024
Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation
Lei Yu
Shiqi Chen
Hang Yuan
Peng Wang
Zhirong Huang
J. Zhang
Chenjie Shen
Fengjun Zhang
Li Yang
Jiajia Ma
34
2
0
09 Nov 2024
Leveraging Large Language Models in Code Question Answering: Baselines and Issues
Georgy Andryushchenko
Vladimir Ivanov
Vladimir Makharev
Elizaveta Tukhtina
Aidar Valeev
ELM
29
2
0
05 Nov 2024
Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats
Mohammad Setak
Pooria Madani
32
2
0
29 Oct 2024
Knowledge-Guided Prompt Learning for Request Quality Assurance in Public Code Review
Lin Li
Xinchun Yu
Xinyu Chen
Peng Liang
21
0
0
29 Oct 2024
Building A Coding Assistant via the Retrieval-Augmented Language Model
Xinze Li
Hanbin Wang
Zhenghao Liu
S. Yu
Shuo Wang
Yukun Yan
Yukai Fu
Yu Gu
Ge Yu
3DV
RALM
21
2
0
21 Oct 2024
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay
Yuyang Chen
Kaiyan Zhao
Yiming Wang
Ming Yang
Jian Zhang
Xiaoguang Niu
25
1
0
16 Oct 2024
Network Representation Learning for Biophysical Neural Network Analysis
Youngmok Ha
Yongjoo Kim
Hyun Jae Jang
Seungyeon Lee
Eunji Pak
17
0
0
15 Oct 2024
Do Current Language Models Support Code Intelligence for R Programming Language?
Zixiao Zhao
Fatemeh H. Fard
ELM
42
0
0
10 Oct 2024
Context-Augmented Code Generation Using Programming Knowledge Graphs
Iman Saberi
Fatemeh H. Fard
21
1
0
09 Oct 2024
A Benchmark on Directed Graph Representation Learning in Hardware Designs
Haoyu Wang
Yinan Huang
Nan Wu
Pan Li
OOD
38
1
0
09 Oct 2024
StagedVulBERT: Multi-Granular Vulnerability Detection with a Novel Pre-trained Code Model
Yuan Jiang
Yujian Zhang
Xiaohong Su
Christoph Treude
Tiantian Wang
45
0
0
08 Oct 2024
Showing LLM-Generated Code Selectively Based on Confidence of LLMs
Jia Li
Yuqi Zhu
Yongmin Li
Ge Li
Zhi Jin
31
0
0
04 Oct 2024