Title
Physics of Language Models: Part 1, Learning Hierarchical Language Structures Zeyuan Allen-Zhu Yuanzhi Li 448 39 0 23 May 2023
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs Siddharth Singh Prajwal Singhania Aditya K. Ranjan Zack Sating A. Bhatele 189 6 0 22 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 Giorgos Vernikos Arthur Bražinskas Jakub Adamek Jonathan Mallinson Aliaksei Severyn Eric Malmi BDL LRM 221 21 0 22 May 2023
Neural Machine Translation for Code Generation K. Dharma Clayton T. Morrison 300 7 0 22 May 2023
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference. International Conference on High Performance Computing (HiPC), 2023 Jinghan Yao Nawras Alnaasan Tianrun Chen Hari Subramoni Dhabaleswar K. D. Panda 126 2 0 22 May 2023
MAGE: Machine-generated Text Detection in the Wild. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Yafu Li Qintong Li Leyang Cui Wei Bi Zhilin Wang Longyue Wang Linyi Yang Shuming Shi Yue Zhang DeLMO 271 103 0 22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Yunzhi Yao Peng Wang Bo Tian Shuyang Cheng Zhoubo Li Shumin Deng Huajun Chen Ningyu Zhang KELM 296 391 0 22 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Shayne Longpre Gregory Yauney Emily Reif Katherine Lee Adam Roberts ... Denny Zhou Jason W. Wei Kevin Robinson David M. Mimno Daphne Ippolito 332 206 0 22 May 2023
RWKV: Reinventing RNNs for the Transformer Era. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Bo Peng Eric Alcaide Quentin G. Anthony Alon Albalak Samuel Arcadinho ... Qihang Zhao P. Zhou Qinghua Zhou Jian Zhu Rui-Jie Zhu 534 816 0 22 May 2023
Iterative Forward Tuning Boosts In-Context Learning in Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Jiaxi Yang Binyuan Hui Min Yang Bailin Wang Bowen Li Binhua Li Fei Huang Yongbin Li 247 19 0 22 May 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages Ariel Ekgren Amaru Cuba Gyllensten Felix Stollenwerk Joey Öhman T. Isbister Evangelia Gogoulou F. Carlsson Alice Heiman Judit Casademont Magnus Sahlgren 232 16 0 22 May 2023
Can We Edit Factual Knowledge by In-Context Learning? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Ce Zheng Lei Li Qingxiu Dong Yuxuan Fan Zhiyong Wu Jingjing Xu Baobao Chang KELM 216 276 0 22 May 2023
Quantifying Association Capabilities of Large Language Models and Its Implications on Privacy Leakage. Findings (Findings), 2023 Hanyin Shao Jie Huang Shen Zheng Kevin Chen-Chuan Chang PILM 152 32 0 22 May 2023
LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation. International Conference on Learning Representations (ICLR), 2023 Suhyeon Lee Won Jun Kim Jinho Chang Jong Chul Ye MedIm 499 69 0 19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation. Artificial Intelligence Review (AIR), 2023 Xiaowei Huang Wenjie Ruan Wei Huang Gao Jin Yizhen Dong ... Sihao Wu Peipei Xu Dengyu Wu André Freitas Mustafa A. Mustafa ALM 335 140 0 19 May 2023
Learning In-context Learning for Named Entity Recognition. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Jiawei Chen Yaojie Lu Hongyu Lin Jie Lou Wei Jia Dai Dai Hua Wu Boxi Cao Xianpei Han Le Sun NAI 242 27 0 18 May 2023
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation Xinyu Li Jiang-Tian Xue Zheng Xie Ming Li LRM 169 37 0 18 May 2023
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Dong-Ho Lee Kian Ahrabian Woojeong Jin Fred Morstatter Jay Pujara 314 55 0 17 May 2023
"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation. Conference on Fairness, Accountability and Transparency (FAccT), 2023 Anaelia Ovalle Palash Goyal Jwala Dhamala Zachary Jaggers Kai-Wei Chang Aram Galstyan R. Zemel Rahul Gupta 362 78 0 17 May 2023
A Language Model of Java Methods with Train/Test Deduplication Chia-Yi Su Aakash Bansal Vijayanta Jain S. Ghanavati Collin McMillan SyDa VLM 178 14 0 15 May 2023
CodeT5+: Open Code Large Language Models for Code Understanding and Generation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Yue Wang Hung Le Akhilesh Deepak Gotmare Nghi D. Q. Bui Junnan Li Steven C. H. Hoi ALM 298 609 0 13 May 2023
Evaluating Open-Domain Question Answering in the Era of Large Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Ehsan Kamalloo Nouha Dziri C. Clarke Davood Rafiei ELM 382 144 0 11 May 2023
StarCoder: may the source be with you! Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov ... Sean M. Hughes Thomas Wolf Arjun Guha Leandro von Werra H. D. Vries 448 1,020 0 09 May 2023
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era. Advances in Artificial Intelligence and Machine Learning (AAIML), 2023 Dong Zhang 112 5 0 04 May 2023
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs Deepak Narayanan Keshav Santhanam Peter Henderson Rishi Bommasani Tony Lee Abigail Z. Jacobs 281 3 0 03 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Lokesh Nagalapatti Chun-Liang Li Chih-Kuan Yeh Hootan Nakhost Yasuhisa Fujii Alexander Ratner Ranjay Krishna Chen-Yu Lee Tomas Pfister ALM 720 712 0 03 May 2023
SCOTT: Self-Consistent Chain-of-Thought Distillation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Jamie Yap Zhengyang Wang Zheng Li K. Lynch Bing Yin Xiang Ren LRM 311 118 0 03 May 2023
Automated Code generation for Information Technology Tasks in YAML through Large Language Models. Design Automation Conference (DAC), 2023 Saurabh Pujar Luca Buratti Xiaojie Guo Nicolas Dupuis B. Lewis ... Atin Sood Ganesh Nalawade Matt Jones Alessandro Morari Ruchi Puri 189 6 0 02 May 2023
The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Ariel Gera Roni Friedman Ofir Arviv Chulaka Gunasekara Benjamin Sznajder Noam Slonim Eyal Shnarch 180 30 0 02 May 2023
Beyond Classification: Financial Reasoning in State-of-the-Art Language Models Seunghyeok Hong Han-Na Jung Moonjeong Hahm Keonju Na Sol Jin AIFin LRM 215 23 0 30 Apr 2023
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs George Pu Anirudh Jain Jihan Yin Russell Kaplan 157 48 0 28 Apr 2023
Training and Evaluation of a Multilingual Tokenizer for GPT-SW3 Felix Stollenwerk 185 9 0 28 Apr 2023
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 Haoqiang Kang Terra Blevins Luke Zettlemoyer 127 2 0 26 Apr 2023
Emergent and Predictable Memorization in Large Language Models. Neural Information Processing Systems (NeurIPS), 2023 Stella Biderman USVSN Sai Prashanth Lintang Sutawika Hailey Schoelkopf Quentin G. Anthony Shivanshu Purohit Edward Raff 221 160 0 21 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization. Natural Language Processing Journal (JNLP), 2023 Adrian de Wynter Xun Wang Alex Sokolov Qilong Gu Si-Qing Chen ELM 194 41 0 17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation Yunjie Ji Yan Gong Yong Deng Yiping Peng Qiang Niu Baochang Ma Xiangang Li ALM ELM 209 27 0 16 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue? SIGDIAL Conferences (SIGDIAL), 2023 Vojtěch Hudeček Ondřej Dušek 176 76 0 13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review Li Shen Yan Sun Zhiyuan Yu Liang Ding Xinmei Tian Dacheng Tao VLM 270 51 0 07 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster Nolan Dey Gurpreet Gosal Zhiming Chen Chen Hemant Khachane William Marshall Ribhu Pathria Marvin Tom Joel Hestness MoE LRM 263 121 0 06 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling. International Conference on Machine Learning (ICML), 2023 Stella Biderman Hailey Schoelkopf Quentin G. Anthony Herbie Bradley Kyle O'Brien ... USVSN Sai Prashanth Edward Raff Aviya Skowron Lintang Sutawika Oskar van der Wal 364 1,603 0 03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models Zhihang Yuan Lin Niu Jia-Wen Liu Wenyu Liu Xinggang Wang Yuzhang Shang Guangyu Sun Qiang Wu Jiaxiang Wu Bingzhe Wu MQ 516 110 0 03 Apr 2023
LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models Patrik Puchert Poonam Poonam Christian van Onzenoodt Timo Ropinski 127 11 0 02 Apr 2023
Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT. International Symposium on Software Testing and Analysis (ISSTA), 2023 Chun Xia Lingming Zhang KELM LRM 251 121 0 01 Apr 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X. Knowledge Discovery and Data Mining (KDD), 2023 Qinkai Zheng Xiao Xia Xu Zou Yuxiao Dong Shanshan Wang ... Andi Wang Yang Li Teng Su Zhilin Yang Jie Tang ELM ALM SyDa 366 449 0 30 Mar 2023
BloombergGPT: A Large Language Model for Finance Shijie Wu Ozan Irsoy Steven Lu Vadim Dabravolski Mark Dredze Sebastian Gehrmann P. Kambadur David S. Rosenberg Gideon Mann AIFin 610 1,104 0 30 Mar 2023
The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling Joey Öhman S. Verlinden Ariel Ekgren Amaru Cuba Gyllensten T. Isbister Evangelia Gogoulou F. Carlsson Magnus Sahlgren 110 13 0 30 Mar 2023
Improving Code Generation by Training with Natural Language Feedback Angelica Chen Jérémy Scheurer Tomasz Korbak Jon Ander Campos Jun Shern Chan Samuel R. Bowman Kyunghyun Cho Ethan Perez SyDa ALM AI4CE 221 90 0 28 Mar 2023
Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing Walid Hariri AI4MH LM&MA 855 118 0 27 Mar 2023
LMCanvas: Object-Oriented Interaction to Personalize Large Language Model-Powered Writing Environments Tae Soo Kim Arghya Sarkar Yoonjoo Lee Minsuk Chang Juho Kim LLMAG MLLM 151 10 0 27 Mar 2023
MGTBench: Benchmarking Machine-Generated Text Detection. Conference on Computer and Communications Security (CCS), 2023 Xinlei He Xinyue Shen Sihao Lin Michael Backes Yang Zhang DeLMO 227 138 0 26 Mar 2023