2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 602 papers shown
Title
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
448
39
0
23 May 2023
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
Zack Sating
A. Bhatele
189
6
0
22 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Giorgos Vernikos
Arthur Bražinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL
LRM
221
21
0
22 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
300
7
0
22 May 2023
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
International Conference on High Performance Computing (HiPC), 2023
Jinghan Yao
Nawras Alnaasan
Tianrun Chen
Hari Subramoni
Dhabaleswar K. Panda
126
2
0
22 May 2023
MAGE: Machine-generated Text Detection in the Wild
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yafu Li
Qintong Li
Leyang Cui
Wei Bi
Zhilin Wang
Longyue Wang
Linyi Yang
Shuming Shi
Yue Zhang
DeLMO
271
103
0
22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
296
391
0
22 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
332
206
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
534
816
0
22 May 2023
Iterative Forward Tuning Boosts In-Context Learning in Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiaxi Yang
Binyuan Hui
Min Yang
Bailin Wang
Bowen Li
Binhua Li
Fei Huang
Yongbin Li
247
19
0
22 May 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
232
16
0
22 May 2023
Can We Edit Factual Knowledge by In-Context Learning?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ce Zheng
Lei Li
Qingxiu Dong
Yuxuan Fan
Zhiyong Wu
Jingjing Xu
Baobao Chang
KELM
216
276
0
22 May 2023
Quantifying Association Capabilities of Large Language Models and Its Implications on Privacy Leakage
Findings (Findings), 2023
Hanyin Shao
Jie Huang
Shen Zheng
Kevin Chen-Chuan Chang
PILM
152
32
0
22 May 2023
LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation
International Conference on Learning Representations (ICLR), 2023
Suhyeon Lee
Won Jun Kim
Jinho Chang
Jong Chul Ye
MedIm
499
69
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Artificial Intelligence Review (AIR), 2023
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
335
140
0
19 May 2023
Learning In-context Learning for Named Entity Recognition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiawei Chen
Yaojie Lu
Hongyu Lin
Jie Lou
Wei Jia
Dai Dai
Hua Wu
Boxi Cao
Xianpei Han
Le Sun
NAI
242
27
0
18 May 2023
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Xinyu Li
Jiang-Tian Xue
Zheng Xie
Ming Li
LRM
169
37
0
18 May 2023
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dong-Ho Lee
Kian Ahrabian
Woojeong Jin
Fred Morstatter
Jay Pujara
314
55
0
17 May 2023
"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Anaelia Ovalle
Palash Goyal
Jwala Dhamala
Zachary Jaggers
Kai-Wei Chang
Aram Galstyan
R. Zemel
Rahul Gupta
362
78
0
17 May 2023
A Language Model of Java Methods with Train/Test Deduplication
Chia-Yi Su
Aakash Bansal
Vijayanta Jain
S. Ghanavati
Collin McMillan
SyDa
VLM
178
14
0
15 May 2023
CodeT5+: Open Code Large Language Models for Code Understanding and Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yue Wang
Hung Le
Akhilesh Deepak Gotmare
Nghi D. Q. Bui
Junnan Li
Steven C. H. Hoi
ALM
298
609
0
13 May 2023
Evaluating Open-Domain Question Answering in the Era of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ehsan Kamalloo
Nouha Dziri
C. Clarke
Davood Rafiei
ELM
382
144
0
11 May 2023
StarCoder: may the source be with you!
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
...
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
448
1,020
0
09 May 2023
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era
Advances in Artificial Intelligence and Machine Learning (AAIML), 2023
Dong Zhang
112
5
0
04 May 2023
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs
Deepak Narayanan
Keshav Santhanam
Peter Henderson
Rishi Bommasani
Tony Lee
Abigail Z. Jacobs
281
3
0
03 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
720
712
0
03 May 2023
SCOTT: Self-Consistent Chain-of-Thought Distillation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Peifeng Wang
Zhengyang Wang
Zheng Li
K. Lynch
Bing Yin
Xiang Ren
LRM
311
118
0
03 May 2023
Automated Code generation for Information Technology Tasks in YAML through Large Language Models
Design Automation Conference (DAC), 2023
Saurabh Pujar
Luca Buratti
Xiaojie Guo
Nicolas Dupuis
B. Lewis
...
Atin Sood
Ganesh Nalawade
Matt Jones
Alessandro Morari
Ruchi Puri
189
6
0
02 May 2023
The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ariel Gera
Roni Friedman
Ofir Arviv
Chulaka Gunasekara
Benjamin Sznajder
Noam Slonim
Eyal Shnarch
180
30
0
02 May 2023
Beyond Classification: Financial Reasoning in State-of-the-Art Language Models
Seunghyeok Hong
Han-Na Jung
Moonjeong Hahm
Keonju Na
Sol Jin
AIFin
LRM
215
23
0
30 Apr 2023
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs
George Pu
Anirudh Jain
Jihan Yin
Russell Kaplan
157
48
0
28 Apr 2023
Training and Evaluation of a Multilingual Tokenizer for GPT-SW3
Felix Stollenwerk
185
9
0
28 Apr 2023
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Haoqiang Kang
Terra Blevins
Luke Zettlemoyer
127
2
0
26 Apr 2023
Emergent and Predictable Memorization in Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
221
160
0
21 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Natural Language Processing Journal (JNLP), 2023
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
194
41
0
17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM
ELM
209
27
0
16 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
SIGDIAL Conferences (SIGDIAL), 2023
Vojtěch Hudeček
Ondřej Dušek
176
76
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
270
51
0
07 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
263
121
0
06 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
International Conference on Machine Learning (ICML), 2023
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
364
1,603
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
516
110
0
03 Apr 2023
LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models
Patrik Puchert
Poonam Poonam
Christian van Onzenoodt
Timo Ropinski
127
11
0
02 Apr 2023
Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT
International Symposium on Software Testing and Analysis (ISSTA), 2023
Chun Xia
Lingming Zhang
KELM
LRM
251
121
0
01 Apr 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Knowledge Discovery and Data Mining (KDD), 2023
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELM
ALM
SyDa
366
449
0
30 Mar 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
610
1,104
0
30 Mar 2023
The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling
Joey Öhman
S. Verlinden
Ariel Ekgren
Amaru Cuba Gyllensten
T. Isbister
Evangelia Gogoulou
F. Carlsson
Magnus Sahlgren
110
13
0
30 Mar 2023
Improving Code Generation by Training with Natural Language Feedback
Angelica Chen
Jérémy Scheurer
Tomasz Korbak
Jon Ander Campos
Jun Shern Chan
Samuel R. Bowman
Kyunghyun Cho
Ethan Perez
SyDa
ALM
AI4CE
221
90
0
28 Mar 2023
Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing
Walid Hariri
AI4MH
LM&MA
855
118
0
27 Mar 2023
LMCanvas: Object-Oriented Interaction to Personalize Large Language Model-Powered Writing Environments
Tae Soo Kim
Arghya Sarkar
Yoonjoo Lee
Minsuk Chang
Juho Kim
LLMAG
MLLM
151
10
0
27 Mar 2023
MGTBench: Benchmarking Machine-Generated Text Detection
Conference on Computer and Communications Security (CCS), 2023
Xinlei He
Xinyue Shen
Sihao Lin
Michael Backes
Yang Zhang
DeLMO
227
138
0
26 Mar 2023