GPT-NeoX-20B: An Open-Source Autoregressive Language Model


14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown
User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue
Sam Davidson
Salvatore Romeo
Raphael Shu
James Gung
Arshit Gupta
Saab Mansour
Yi Zhang
ALM, LLMAG
254
3
0
23 Sep 2023
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi
Hidetoshi Shimodaira
KELM
254
37
0
21 Sep 2023
SlimPajama-DC: Understanding Data Combinations for LLM Training
Zhiqiang Shen
Tianhua Tao
Liqun Ma
Willie Neiswanger
Zhengzhong Liu
...
Bowen Tan
Joel Hestness
Natalia Vassilieva
Daria Soboleva
Eric Xing
434
69
0
19 Sep 2023
CFGPT: Chinese Financial Assistant with Large Language Model
Jiangtong Li
Hao Wang
Guoxuan Wang
Yang Lei
Dawei Cheng
Zhijun Ding
Changjun Jiang
179
17
0
19 Sep 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Proceedings of the VLDB Endowment (PVLDB), 2023
Haojun Xia
Zhen Zheng
Yuchao Li
Donglin Zhuang
Zhongzhu Zhou
Xiafei Qiu
Yong Li
Wei Lin
Shuaiwen Leon Song
170
22
0
19 Sep 2023
Generative modeling, design and analysis of spider silk protein sequences for enhanced mechanical properties
Advanced Functional Materials (Adv. Funct. Mater.), 2023
Wei Lu
David L. Kaplan
Markus J. Buehler
159
39
0
18 Sep 2023
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Xiangru Tang
Yiming Zong
Jason Phang
Yilun Zhao
Wangchunshu Zhou
Arman Cohan
Mark B. Gerstein
LMTD, ELM, ALM
268
16
0
16 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
138
6
0
15 Sep 2023
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Rachneet Sachdeva
Martin Tutek
Iryna Gurevych
OODD
311
16
0
14 Sep 2023
EarthPT: a time series foundation model for Earth Observation
Michael J. Smith
Luke Fleming
James E. Geach
AI4TS
219
14
0
13 Sep 2023
From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
BigData Congress [Services Society] (BSS), 2023
Masahiro Suzuki
Masanori Hirano
Hiroki Sakaji
282
7
0
07 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa, VLM
297
59
0
05 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Fengxiang Bie
Jianlong Wu
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
253
55
0
02 Sep 2023
YaRN: Efficient Context Window Extension of Large Language Models
International Conference on Learning Representations (ICLR), 2023
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
OSLM
392
403
0
31 Aug 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Ran Bi
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA, SyDa
199
14
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM, ALM
457
2,786
0
24 Aug 2023
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Alex Nyffenegger
Matthias Sturmer
Joel Niklaus
210
10
0
22 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Leilei Gan
Guoyin Wang
LM&MA
914
759
0
21 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
ACM Transactions on Software Engineering and Methodology (TOSEM), 2023
Xinying Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
358
743
0
21 Aug 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Haipeng Luo
Qingfeng Sun
Can Xu
Lu Wang
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM, OSLM
800
624
0
18 Aug 2023
PMET: Precise Model Editing in a Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
519
178
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
426
1
0
14 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
International Conference on Learning Representations (ICLR), 2023
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM, ALM
359
186
0
14 Aug 2023
Large Language Models for Information Retrieval: A Survey
Yutao Zhu
Huaying Yuan
Shuting Wang
Jiongnan Liu
Wenhan Liu
Chenlong Deng
Haonan Chen
Zheng Liu
Zhicheng Dou
Ji-Rong Wen
KELM
634
452
0
14 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
ALM
167
7
0
12 Aug 2023
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
Artificial Intelligence and Law (ICAIL), 2023
C. M. Greco
Andrea Tagarelli
AILaw
217
41
0
10 Aug 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
International Conference on Learning Representations (ICLR), 2023
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
276
87
0
08 Aug 2023
Large Language Model Prompt Chaining for Long Legal Document Classification
Dietrich Trautmann
ELM, AILaw
149
19
0
08 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
382
135
0
08 Aug 2023
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures
David Nader-Palacio
Alejandro Velasco
Daniel Rodríguez-Cárdenas
Kevin Moran
Denys Poshyvanyk
219
12
0
07 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
274
3
0
07 Aug 2023
Learning to Paraphrase Sentences to Different Complexity Levels
Transactions of the Association for Computational Linguistics (TACL), 2023
Alison Chi
Li-Kuang Chen
Yi-Chen Chang
Shu-Hui Lee
Jason J. S. Chang
170
15
0
04 Aug 2023
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Zhen Qin
Dong Li
Weigao Sun
Weixuan Sun
Xuyang Shen
...
Yunshen Wei
Baohong Lv
Xiao Luo
Yu Qiao
Yiran Zhong
186
32
0
27 Jul 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
107
0
0
27 Jul 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
364
227
0
24 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
International Conference on Language Resources and Evaluation (LREC), 2023
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
237
47
0
22 Jul 2023
FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models
Yuwei Yin
Yazheng Yang
Jian Yang
Qi Liu
147
22
0
22 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
250
87
0
19 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
International Conference on Learning Representations (ICLR), 2023
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
312
72
0
18 Jul 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines
Luca Benedetto
Shiva Taslimipoor
Christopher Davis
Yuan Gao
...
Marek Rei
H. Yannakoudakis
Andrew Mullooly
D. Nicholls
P. Buttery
ELM
261
61
0
17 Jul 2023
Generating Benchmarks for Factuality Evaluation of Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
HILM
247
123
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Lin Wang
OffRL
854
1,173
0
12 Jul 2023
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Tommaso Pegolotti
Elias Frantar
Dan Alistarh
Markus Püschel
MQ
68
5
0
07 Jul 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Shiva Omrani Sabbaghi
Robert Wolfe
Aylin Caliskan
201
29
0
07 Jul 2023
Several categories of Large Language Models (LLMs): A Short Survey
International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023
Saurabh Pahune
Manoj Chandrasekharan
AILaw
202
30
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
Entropy (Entropy), 2023
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
239
122
0
04 Jul 2023
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith
Chris Pan
Mengzhou Xia
Ameet Deshpande
Karthik Narasimhan
ELM
186
22
0
01 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM, OffRL
164
5
0
25 Jun 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Zhenyu Zhang
Ying Sheng
Wanrong Zhu
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zinan Lin
Beidi Chen
VLM
755
474
0
24 Jun 2023
Long-range Language Modeling with Self-retrieval
Transactions of the Association for Computational Linguistics (TACL), 2023
Ohad Rubin
Jonathan Berant
RALM, KELM
219
31
0
23 Jun 2023