GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 602 papers shown
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
448
39
0
23 May 2023
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
Zack Sating
A. Bhatele
189
6
0
22 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Giorgos Vernikos
Arthur Bražinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL, LRM
221
21
0
22 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
300
7
0
22 May 2023
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
International Conference on High Performance Computing (HiPC), 2023
Jinghan Yao
Nawras Alnaasan
Tianrun Chen
Hari Subramoni
Dhabaleswar K. Panda
126
2
0
22 May 2023
MAGE: Machine-generated Text Detection in the Wild
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yafu Li
Qintong Li
Leyang Cui
Wei Bi
Zhilin Wang
Longyue Wang
Linyi Yang
Shuming Shi
Yue Zhang
DeLMO
271
103
0
22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
296
391
0
22 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
332
206
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
534
816
0
22 May 2023
Iterative Forward Tuning Boosts In-Context Learning in Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiaxi Yang
Binyuan Hui
Min Yang
Bailin Wang
Bowen Li
Binhua Li
Fei Huang
Yongbin Li
247
19
0
22 May 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
232
16
0
22 May 2023
Can We Edit Factual Knowledge by In-Context Learning?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ce Zheng
Lei Li
Qingxiu Dong
Yuxuan Fan
Zhiyong Wu
Jingjing Xu
Baobao Chang
KELM
216
276
0
22 May 2023
Quantifying Association Capabilities of Large Language Models and Its Implications on Privacy Leakage
Findings (Findings), 2023
Hanyin Shao
Jie Huang
Shen Zheng
Kevin Chen-Chuan Chang
PILM
152
32
0
22 May 2023
LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation
International Conference on Learning Representations (ICLR), 2023
Suhyeon Lee
Won Jun Kim
Jinho Chang
Jong Chul Ye
MedIm
499
69
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Artificial Intelligence Review (AIR), 2023
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
335
140
0
19 May 2023
Learning In-context Learning for Named Entity Recognition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiawei Chen
Yaojie Lu
Hongyu Lin
Jie Lou
Wei Jia
Dai Dai
Hua Wu
Boxi Cao
Xianpei Han
Le Sun
NAI
242
27
0
18 May 2023
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Xinyu Li
Jiang-Tian Xue
Zheng Xie
Ming Li
LRM
169
37
0
18 May 2023
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dong-Ho Lee
Kian Ahrabian
Woojeong Jin
Fred Morstatter
Jay Pujara
314
55
0
17 May 2023
"I'm fully who I am": Towards Centering Transgender and Non-Binary
  Voices to Measure Biases in Open Language Generation
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Anaelia Ovalle
Palash Goyal
Jwala Dhamala
Zachary Jaggers
Kai-Wei Chang
Aram Galstyan
R. Zemel
Rahul Gupta
362
78
0
17 May 2023
A Language Model of Java Methods with Train/Test Deduplication
Chia-Yi Su
Aakash Bansal
Vijayanta Jain
S. Ghanavati
Collin McMillan
SyDa, VLM
178
14
0
15 May 2023
CodeT5+: Open Code Large Language Models for Code Understanding and Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yue Wang
Hung Le
Akhilesh Deepak Gotmare
Nghi D. Q. Bui
Junnan Li
Steven C. H. Hoi
ALM
298
609
0
13 May 2023
Evaluating Open-Domain Question Answering in the Era of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ehsan Kamalloo
Nouha Dziri
C. Clarke
Davood Rafiei
ELM
382
144
0
11 May 2023
StarCoder: may the source be with you!
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
...
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
448
1,020
0
09 May 2023
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era
Advances in Artificial Intelligence and Machine Learning (AAIML), 2023
Dong Zhang
112
5
0
04 May 2023
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs
Deepak Narayanan
Keshav Santhanam
Peter Henderson
Rishi Bommasani
Tony Lee
Abigail Z. Jacobs
281
3
0
03 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
720
712
0
03 May 2023
SCOTT: Self-Consistent Chain-of-Thought Distillation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Peifeng Wang
Zhengyang Wang
Zheng Li
K. Lynch
Bing Yin
Xiang Ren
LRM
311
118
0
03 May 2023
Automated Code generation for Information Technology Tasks in YAML through Large Language Models
Design Automation Conference (DAC), 2023
Saurabh Pujar
Luca Buratti
Xiaojie Guo
Nicolas Dupuis
B. Lewis
...
Atin Sood
Ganesh Nalawade
Matt Jones
Alessandro Morari
Ruchi Puri
189
6
0
02 May 2023
The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ariel Gera
Roni Friedman
Ofir Arviv
Chulaka Gunasekara
Benjamin Sznajder
Noam Slonim
Eyal Shnarch
180
30
0
02 May 2023
Beyond Classification: Financial Reasoning in State-of-the-Art Language Models
Seunghyeok Hong
Han-Na Jung
Moonjeong Hahm
Keonju Na
Sol Jin
AIFin, LRM
215
23
0
30 Apr 2023
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs
George Pu
Anirudh Jain
Jihan Yin
Russell Kaplan
157
48
0
28 Apr 2023
Training and Evaluation of a Multilingual Tokenizer for GPT-SW3
Felix Stollenwerk
185
9
0
28 Apr 2023
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Haoqiang Kang
Terra Blevins
Luke Zettlemoyer
127
2
0
26 Apr 2023
Emergent and Predictable Memorization in Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
221
160
0
21 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Natural Language Processing Journal (JNLP), 2023
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
194
41
0
17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM, ELM
209
27
0
16 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
SIGDIAL Conferences (SIGDIAL), 2023
Vojtěch Hudeček
Ondřej Dušek
176
76
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
270
51
0
07 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE, LRM
263
121
0
06 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
International Conference on Machine Learning (ICML), 2023
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
364
1,603
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
516
110
0
03 Apr 2023
LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models
Patrik Puchert
Poonam Poonam
Christian van Onzenoodt
Timo Ropinski
127
11
0
02 Apr 2023
Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT
International Symposium on Software Testing and Analysis (ISSTA), 2023
Chun Xia
Lingming Zhang
KELM, LRM
251
121
0
01 Apr 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Knowledge Discovery and Data Mining (KDD), 2023
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELM, ALM, SyDa
366
449
0
30 Mar 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
610
1,104
0
30 Mar 2023
The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling
Joey Öhman
S. Verlinden
Ariel Ekgren
Amaru Cuba Gyllensten
T. Isbister
Evangelia Gogoulou
F. Carlsson
Magnus Sahlgren
110
13
0
30 Mar 2023
Improving Code Generation by Training with Natural Language Feedback
Angelica Chen
Jérémy Scheurer
Tomasz Korbak
Jon Ander Campos
Jun Shern Chan
Samuel R. Bowman
Kyunghyun Cho
Ethan Perez
SyDa, ALM, AI4CE
221
90
0
28 Mar 2023
Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing
Walid Hariri
AI4MH, LM&MA
855
118
0
27 Mar 2023
LMCanvas: Object-Oriented Interaction to Personalize Large Language Model-Powered Writing Environments
Tae Soo Kim
Arghya Sarkar
Yoonjoo Lee
Minsuk Chang
Juho Kim
LLMAG, MLLM
151
10
0
27 Mar 2023
MGTBench: Benchmarking Machine-Generated Text Detection
Conference on Computer and Communications Security (CCS), 2023
Xinlei He
Xinyue Shen
Sihao Lin
Michael Backes
Yang Zhang
DeLMO
227
138
0
26 Mar 2023