arXiv: 2305.06161
StarCoder: may the source be with you!

9 May 2023
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
Chenghao Mou
Marc Marone
Christopher Akiki
Jia Li
Jenny Chim
Qian Liu
Evgenii Zheltonozhskii
Terry Yue Zhuo
Thomas Wang
Olivier Dehaene
Mishig Davaadorj
J. Lamy-Poirier
João Monteiro
Oleh Shliazhko
Nicolas Angelard-Gontier
Nicholas Meade
A. Zebaze
Ming-Ho Yee
Logesh Kumar Umapathi
Jian Zhu
Benjamin Lipkin
Muhtasham Oblokulov
Zhiruo Wang
Rudra Murthy
Jason T Stillerman
S. Patel
Dmitry Abulkhanov
Marco Zocca
Manan Dey
Zhihan Zhang
N. Fahmy
Urvashi Bhattacharyya
W. Yu
Swayam Singh
Sasha Luccioni
Paulo Villegas
M. Kunakov
Fedor Zhdanov
Manuel Romero
Tony Lee
Nadav Timor
Jennifer Ding
Claire Schlesinger
Hailey Schoelkopf
Jana Ebert
Tri Dao
Mayank Mishra
A. Gu
Jennifer Robinson
Carolyn Jane Anderson
Brendan Dolan-Gavitt
Danish Contractor
Siva Reddy
Daniel Fried
Dzmitry Bahdanau
Yacine Jernite
Carlos Muñoz Ferrandis
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries

Papers citing "StarCoder: may the source be with you!"

50 / 81 papers shown
Evaluate-and-Purify: Fortifying Code Language Models Against Adversarial Attacks Using LLM-as-a-Judge
Wenhan Mu
Ling Xu
Shuren Pei
Le Mi
Huichi Zhou
AAML
ELM
48
0
0
28 Apr 2025
Reimagining Urban Science: Scaling Causal Inference with Large Language Models
Yutong Xia
Ao Qu
Yunhan Zheng
Yihong Tang
Dingyi Zhuang
...
Cathy Wu
R. Zimmermann
Lijun Sun
Roger Zimmermann
Jinhua Zhao
AI4CE
53
0
0
15 Apr 2025
DocAgent: A Multi-Agent System for Automated Code Documentation Generation
Dayu Yang
Antoine Simoulin
Xin Qian
Xiaoyi Liu
Yuwei Cao
Zhaopu Teng
Grey Yang
LLMAG
54
0
0
11 Apr 2025
From Token to Line: Enhancing Code Generation with a Long-Term Perspective
Tingwei Lu
Yangning Li
Liyuan Wang
Binghuai Lin
Jiwei Tang
...
Hai-tao Zheng
Yinghui Li
Bingxu An
Zhao Wei
Y. Xu
LLMAG
57
0
0
10 Apr 2025
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning
Yuan Jiang
Yujian Zhang
Liang Lu
Christoph Treude
Xiaohong Su
Shan Huang
Tiantian Wang
ALM
54
0
0
12 Mar 2025
ResBench: Benchmarking LLM-Generated FPGA Designs with Resource Awareness
Ce Guo
Tong Zhao
51
1
0
11 Mar 2025
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation
Wei Li
Xin Zhang
Zhongxin Guo
Shaoguang Mao
Wen Luo
Guangyue Peng
Yangyu Huang
Houfeng Wang
Scarlett Li
53
0
0
09 Mar 2025
Robust Learning of Diverse Code Edits
Tushar Aggarwal
Swayam Singh
Abhijeet Awasthi
Aditya Kanade
Nagarajan Natarajan
SyDa
66
0
0
05 Mar 2025
IterPref: Focal Preference Learning for Code Generation via Iterative Debugging
Jie Wu
Haoling Li
Xin Zhang
Jianwen Luo
Yangyu Huang
Ruihang Chu
Y. Yang
Scarlett Li
67
0
0
04 Mar 2025
Selective Prompt Anchoring for Code Generation
Yuan Tian
Tianyi Zhang
77
3
0
24 Feb 2025
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Ruizhong Qiu
Weiliang Will Zeng
Hanghang Tong
James Ezick
Christopher Lott
82
15
0
20 Feb 2025
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs
Minxuan Lv
Zhenpeng Su
Leiyu Pan
Yizhe Xiong
Zijia Lin
...
Guiguang Ding
Cheng Luo
Di Zhang
Kun Gai
Songlin Hu
MoE
39
0
0
18 Feb 2025
UniGenCoder: Merging Seq2Seq and Seq2Tree Paradigms for Unified Code Generation
Liangying Shao
Yanfu Yan
Denys Poshyvanyk
Jinsong Su
31
0
0
18 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
60
0
0
17 Feb 2025
Can Large Language Models Understand Intermediate Representations?
Hailong Jiang
Jianfeng Zhu
Yao Wan
B. Fang
Hongyu Zhang
Ruoming Jin
Qiang Guan
48
1
0
07 Feb 2025
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad
Elias Stengel-Eskin
Justin Chih-Yao Chen
Zaid Khan
Mohit Bansal
ELM
76
1
0
03 Feb 2025
mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation
Nishat Raihan
Antonios Anastasopoulos
Marcos Zampieri
ELM
31
5
0
28 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Z. Yang
VLM
ALM
OffRL
AI4TS
LRM
106
128
0
22 Jan 2025
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
Ziyao Zhang
Yanlin Wang
Chong Wang
Jiachi Chen
Zibin Zheng
111
11
0
20 Jan 2025
Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey
Junqiao Wang
Zeng Zhang
Yangfan He
Yuyang Song
Tianyu Shi
...
Hengyuan Xu
Kunyu Wu
Guangwu Qian
Qiuwu Chen
Lewei He
38
8
0
03 Jan 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
OSLM
LRM
93
402
0
03 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
76
11
0
31 Dec 2024
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
Huawen Feng
Pu Zhao
Qingfeng Sun
Can Xu
Fangkai Yang
...
Qianli Ma
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
AAML
ALM
62
0
0
23 Dec 2024
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
64
16
0
07 Nov 2024
MdEval: Massively Multilingual Code Debugging
Shukai Liu
Linzheng Chai
Jian Yang
Jiajun Shi
He Zhu
...
Yu Hao
Liqun Yang
Guanglin Niu
Ge Zhang
Z. Li
LRM
ELM
61
6
0
04 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Y. Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
47
3
0
04 Nov 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Hao Peng
Lingpeng Kong
AI4CE
57
14
0
23 Oct 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
41
2
0
23 Oct 2024
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay
Yuyang Chen
Kaiyan Zhao
Yiming Wang
Ming Yang
Jian Zhang
Xiaoguang Niu
17
1
0
16 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
J. Liu
MoE
18
0
0
14 Oct 2024
Decoding Secret Memorization in Code LLMs Through Token-Level Characterization
Yuqing Nie
Chong Wang
K. Wang
Guoai Xu
Guosheng Xu
Haoyu Wang
OffRL
37
0
0
11 Oct 2024
Synthesizing Interpretable Control Policies through Large Language Model Guided Search
Carlo Bosio
Mark W. Mueller
15
0
0
07 Oct 2024
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
Fatemeh Pesaran Zadeh
Juyeon Kim
Jin-Hwa Kim
Gunhee Kim
ALM
42
1
0
05 Oct 2024
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
Siru Ouyang
W. Yu
Kaixin Ma
Zilin Xiao
Z. Zhang
Mengzhao Jia
J. Han
H. Zhang
Dong Yu
47
12
0
03 Oct 2024
Confidential Prompting: Protecting User Prompts from Cloud LLM Providers
In Gim
Caihua Li
Lin Zhong
35
2
0
27 Sep 2024
ANVIL: Anomaly-based Vulnerability Identification without Labelled Training Data
Weizhou Wang
Eric Liu
Xiangyu Guo
Xiao Hu
Ilya Grishchenko
David Lie
22
1
0
28 Aug 2024
The advantages of context specific language models: the case of the Erasmian Language Model
João Gonçalves
Nick Jelicic
Michele Murgia
Evert Stamhuis
26
0
0
13 Aug 2024
Strong Copyright Protection for Language Models via Adaptive Model Fusion
Javier Abad
Konstantin Donhauser
Francesco Pinto
Fanny Yang
35
4
0
29 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
62
7
0
14 Jul 2024
AnnotatedTables: A Large Tabular Dataset with Language Model Annotations
Yaojie Hu
Ilias Fountalis
Jin Tian
N. Vasiloglou
LMTD
24
3
0
24 Jun 2024
CodeRAG-Bench: Can Retrieval Augment Code Generation?
Zora Zhiruo Wang
Akari Asai
Xinyan Velocity Yu
Frank F. Xu
Yiqing Xie
Graham Neubig
Daniel Fried
RALM
67
29
0
20 Jun 2024
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Leonidas Gee
Milan Gritta
Gerasimos Lampouras
Ignacio Iacobacci
16
10
0
18 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
23
30
0
17 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
56
49
0
12 Jun 2024
Leveraging Large Language Models for Efficient Failure Analysis in Game Development
Leonardo Marini
Linus Gisslén
Alessandro Sestini
30
0
0
11 Jun 2024
Kotlin ML Pack: Technical Report
Sergey Titov
Mikhail Evtikhiev
Anton Shapkin
Oleg Smirnov
Sergei Boytsov
...
Dariia Karaeva
Maksim Sheptyakov
Mikhail Arkhipov
T. Bryksin
Egor Bogomolov
24
0
0
29 May 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Yinghui Li
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
38
44
0
21 May 2024
LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs
Yongrae Jo
Seongyun Lee
Minju Seo
Sung Ju Hwang
Moontae Lee
21
3
0
18 May 2024
Performance-Aligned LLMs for Generating Fast Code
Daniel Nichols
Pranav Polasam
Harshitha Menon
Aniruddha Marathe
T. Gamblin
A. Bhatele
21
7
0
29 Apr 2024
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee
Kazem Meidani
Shashank Gupta
A. Farimani
Chandan K. Reddy
37
13
0
29 Apr 2024