
SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

10 August 2021

Xin Jiang

Papers citing "SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation"


SPENCER: Self-Adaptive Model Distillation for Efficient Code Retrieval

ACM Transactions on Software Engineering and Methodology (TOSEM), 2025

256

01 Aug 2025

On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey

370

28 Jul 2025

LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

434

20 May 2025

ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding

International Conference on Learning Representations (ICLR), 2025

282

27 Mar 2025

Speculative Decoding for Verilog: Speed and Quality, All in One

Design Automation Conference (DAC), 2025

250

18 Mar 2025

Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

379

07 Mar 2025

GNN-Coder: Boosting Semantic Code Retrieval with Combined GNNs and Transformer

527

24 Feb 2025

Code LLMs: A Taxonomy-based Survey

BigData Congress [Services Society] (BSS), 2024

Nishat Raihan

Christian D. Newman

Marcos Zampieri

425

11 Dec 2024

GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

622

06 Sep 2024

Top Pass: Improve Code Generation by Pass@k-Maximized Code Ranking

289

11 Aug 2024

Towards Better Code Understanding in Decoder-Only Models with Contrastive Learning

200

18 Jun 2024

Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting

Xuhong Zhang

360

25 May 2024

On the Limitations of Embedding Based Methods for Measuring Functional Correctness for Code Generation

Atharva Naik

299

26 Apr 2024

Analyzing the Performance of Large Language Models on Code Summarization

International Conference on Language Resources and Evaluation (LREC), 2024

Rajarshi Haldar

Anjali Narayan-Chen

241

10 Apr 2024

CSEPrompts: A Benchmark of Introductory Computer Science Prompts

International Symposium on Methodologies for Intelligent Systems (ISMIS), 2024

Md. Nishat Raihan

Dhiman Goswami

Sadiya Sayara Chowdhury Puspo

263

03 Apr 2024

ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search

191

25 Mar 2024

Beyond Self-learned Attention: Mitigating Attention Bias in Transformer-based Models Using Attention Guidance

Jiri Gesi

Iftekhar Ahmed

265

26 Feb 2024

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

...

Qipeng Guo

Xipeng Qiu

Dahua Lin

SyDa

226

20 Feb 2024

Code Representation Learning At Scale

254

02 Feb 2024

Investigating the Efficacy of Large Language Models for Code Clone Detection

IEEE International Conference on Program Comprehension (ICPC), 2024

241

24 Jan 2024

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

ACM Computing Surveys (ACM Comput. Surv.), 2023

Philip S. Yu

307

30 Dec 2023

Language Agnostic Code Embeddings

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Saiteja Utpala

Alex Gu

Pin-Yu Chen

259

25 Oct 2023

Rethinking Negative Pairs in Code Search

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

410

12 Oct 2023

Contrastive Prompt Learning-based Code Search based on Interaction Matrix

243

10 Oct 2023

Laminar: A New Serverless Stream-based Framework with Semantic Code Search and Code Completion

Zaynab Zahra

Zihao Li

Rosa Filgueira

164

01 Sep 2023

Large Language Models for Software Engineering: A Systematic Literature Review

ACM Transactions on Software Engineering and Methodology (TOSEM), 2023

Kailong Wang

Haoyu Wang

484

912

21 Aug 2023

Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation

Xin Peng

257

108

02 Aug 2023

Contrastive Learning for API Aspect Analysis

International Conference on Automated Software Engineering (ASE), 2023

231

31 Jul 2023

Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

Entropy (Entropy), 2023

288

142

04 Jul 2023

Exploring the Robustness of Large Language Models for Solving Programming Problems

374

26 Jun 2023

Multi-target Backdoor Attacks for Code Pre-trained Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yang Liu

302

14 Jun 2023

Understanding Programs by Exploiting (Fuzzing) Test Cases

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Hao Chen

258

23 May 2023

Neural Machine Translation for Code Generation

K. Dharma

Clayton T. Morrison

384

22 May 2023

CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

253

19 May 2023

Searching by Code: a New SearchBySnippet Dataset and SnippeR Retrieval Model for Searching by Code Snippets

International Conference on Language Resources and Evaluation (LREC), 2023

273

19 May 2023

CodeT5+: Open Code Large Language Models for Code Understanding and Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yue Wang

Hung Le

Akhilesh Deepak Gotmare

487

686

13 May 2023

Code Execution with Pre-trained Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

288

08 May 2023

Neuro-symbolic Zero-Shot Code Cloning with Cross-Language Intermediate Representation

Krishnam Hasija

Shrishti Pradhan

Manasi Patwardhan

Raveendra Kumar Medicherla

Lovekesh Vig

Ravindra Naik

177

26 Apr 2023

An Unbiased Transformer Source Code Learning with Semantic Vulnerability Graph

European Symposium on Security and Privacy (Euro S&P), 2023

Nafis Tanveer Islam

G. Parra

Dylan Manuel

E. Bou-Harb

Peyman Najafirad

261

17 Apr 2023

MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion

Cuiyun Gao

275

19 Dec 2022

An Empirical Study of Deep Learning Models for Vulnerability Detection

International Conference on Software Engineering (ICSE), 2022

450

140

15 Dec 2022

CLAWSAT: Towards Both Robust and Accurate Code Models

IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2022

Chuang Gan

423

21 Nov 2022

Exploring Representation-Level Augmentation for Code Search

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Yanxian Huang

234

21 Oct 2022

Soft-Labeled Contrastive Pre-training for Function-level Code Representation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Xipeng Qiu

213

18 Oct 2022

CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

350

07 Oct 2022

Semantic-Preserving Adversarial Code Comprehension

International Conference on Computational Linguistics (COLING), 2022

195

12 Sep 2022

CommitBART: A Large Pre-trained Model for GitHub Commits

Yang Liu

262

17 Aug 2022

Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

European Conference on Software Architecture (ECSA), 2022

259

11 Aug 2022

CoditT5: Pretraining for Source Code and Natural Language Editing

International Conference on Automated Software Engineering (ASE), 2022

343

124

10 Aug 2022

Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code

International Joint Conference on Artificial Intelligence (IJCAI), 2022

320

24 May 2022