GPT-NeoX-20B: An Open-Source Autoregressive Language Model


14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 602 papers shown
Guiding Language Models of Code with Global Context using Monitors
Lakshya A Agrawal
Aditya Kanade
Navin Goyal
Shuvendu K. Lahiri
S. Rajamani
301
33
0
19 Jun 2023
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training
Guanhua Wang
Heyang Qin
S. A. Jacobs
Connor Holmes
Samyam Rajbhandari
Olatunji Ruwase
Feng Yan
Lei Yang
Yuxiong He
VLM
198
77
0
16 Jun 2023
You Don't Need Robust Machine Learning to Manage Adversarial Attack Risks
Edward Raff
M. Benaroch
Andrew L. Farris
AAML
128
4
0
16 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
International Conference on Learning Representations (ICLR), 2023
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM, ALM
241
82
0
15 Jun 2023
ChessGPT: Bridging Policy Learning and Language Modeling
Neural Information Processing Systems (NeurIPS), 2023
Xidong Feng
Yicheng Luo
Ziyan Wang
Hongrui Tang
Mengyue Yang
Youssef Attia El Hili
D. Mguni
Yali Du
Jun Wang
226
66
0
15 Jun 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Ziyang Luo
Can Xu
Lu Wang
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELM, SyDa, ALM
622
827
0
14 Jun 2023
Questioning the Survey Responses of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Ricardo Dominguez-Olmedo
Moritz Hardt
Celestine Mendler-Dünner
257
56
0
13 Jun 2023
Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Akshat Gupta
ELM, LRM
169
7
0
12 Jun 2023
Gradient Ascent Post-training Enhances Language Model Generalization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongkeun Yoon
Joel Jang
Sungdong Kim
Minjoon Seo
VLM, AI4CE
176
3
0
12 Jun 2023
Recurrent Attention Networks for Long-text Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xianming Li
Zongxi Li
Xiaotian Luo
Haoran Xie
Xing Lee
Yingbin Zhao
Fu Lee Wang
Qing Li
RALM
153
20
0
12 Jun 2023
Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability
Jiacheng Ye
Xijia Tao
Lingpeng Kong
LRM
191
32
0
11 Jun 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
Interspeech (Interspeech), 2023
William Chen
Xuankai Chang
Yifan Peng
Zhaoheng Ni
Soumi Maiti
Shinji Watanabe
SSL
233
29
0
11 Jun 2023
A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text
Natural Language Processing Journal (JNLP), 2023
Jessica Nayeli López Espejel
Mahaman Sanoussi Yahaya Alassan
El Mehdi Chouham
Walid Dahhane
E. Ettifouri
187
13
0
10 Jun 2023
S$^{3}$: Increasing GPU Utilization during Generative Inference for Higher Throughput
Neural Information Processing Systems (NeurIPS), 2023
Yunho Jin
Chun-Feng Wu
David Brooks
Gu-Yeon Wei
211
92
0
09 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
APSIPA Transactions on Signal and Information Processing (TASIP), 2023
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
180
33
0
09 Jun 2023
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhouhong Gu
Xiaoxuan Zhu
Haoning Ye
Lin Zhang
Jianchen Wang
...
Zili Wang
Shusen Wang
Weiguo Zheng
Hongwei Feng
Yanghua Xiao
ALM, ELM
241
73
0
09 Jun 2023
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Potsawee Manakul
Yassir Fathullah
Adian Liusie
Vyas Raina
Vatsal Raina
Mark Gales
125
13
0
08 Jun 2023
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance
Qianqian Xie
Weiguang Han
Xiao Zhang
Yanzhao Lai
Min Peng
Alejandro Lopez-Lira
Jimin Huang
ALM
165
220
0
08 Jun 2023
INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
Yew Ken Chia
Pengfei Hong
Lidong Bing
Soujanya Poria
ELM
156
72
0
07 Jun 2023
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2023
Sophie F. Jentzsch
Kristian Kersting
LRM
137
45
0
07 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
USENIX Security Symposium (USENIX Security), 2023
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
159
9
0
05 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions
Michael Ogezi
B. Hauer
Grzegorz Kondrak
VLM
142
1
0
05 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hang Zhang
Xin Li
Lidong Bing
MLLM
498
1,434
0
05 Jun 2023
LexGPT 0.1: pre-trained GPT-J models with Pile of Law
Jieh-Sheng Lee
AILaw
87
14
0
05 Jun 2023
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
123
32
0
04 Jun 2023
Harnessing large-language models to generate private synthetic text
Alexey Kurakin
Natalia Ponomareva
Umar Syed
Liam MacDermed
Seth Neel
SILM, SyDa
215
53
0
02 Jun 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aleksandra Piktus
Odunayo Ogundepo
Christopher Akiki
Akintunde Oladipo
Xinyu Crystina Zhang
Hailey Schoelkopf
Stella Biderman
Martin Potthast
Jimmy J. Lin
CVBM
166
11
0
02 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
348
870
0
01 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Conference on Machine Learning and Systems (MLSys), 2023
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL, MQ
647
909
0
01 Jun 2023
Transformers learn to implement preconditioned gradient descent for in-context learning
Neural Information Processing Systems (NeurIPS), 2023
Kwangjun Ahn
Xiang Cheng
Hadi Daneshmand
S. Sra
ODL
271
229
0
01 Jun 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq Joty
J. Huang
LM&MA, ELM, ALM
343
211
0
29 May 2023
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
International Conference on Learning Representations (ICLR), 2023
Xianjun Yang
Wei Cheng
Yue Wu
Linda R. Petzold
William Yang Wang
Haifeng Chen
DeLMO
269
129
0
27 May 2023
RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Gabriele Sarti
Phu Mon Htut
Xing Niu
B. Hsu
Anna Currey
Georgiana Dinu
Maria Nadejde
LRM
177
13
0
26 May 2023
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
DeLMO
241
23
0
26 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
196
94
0
25 May 2023
Training Data Extraction From Pre-trained Language Models: A Survey
Shotaro Ishihara
229
52
0
25 May 2023
Scaling Data-Constrained Language Models
Neural Information Processing Systems (NeurIPS), 2023
Niklas Muennighoff
Alexander M. Rush
Boaz Barak
Teven Le Scao
Aleksandra Piktus
Nouamane Tazi
S. Pyysalo
Thomas Wolf
Colin Raffel
ALM
525
310
0
25 May 2023
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shufan Wang
Sébastien Jean
Sailik Sengupta
James Gung
Nikolaos Pappas
Yi Zhang
130
6
0
24 May 2023
ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Heming Xia
Qingxiu Dong
Lei Li
Jingjing Xu
Tianyu Liu
Ziwei Qin
Zhifang Sui
MLLM, VLM
113
4
0
24 May 2023
Editing Common Sense in Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Anshita Gupta
Debanjan Mondal
Akshay Krishna Sheshadri
Wenlong Zhao
Xiang Lorraine Li
Sarah Wiegreffe
Niket Tandon
KELM
186
32
0
24 May 2023
Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola
Xuhui Zhou
Elizabeth Clark
Maarten Sap
169
7
0
24 May 2023
Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs
Xiaoyang Song
Akshat Gupta
Kiyan Mohebbizadeh
Shujie Hu
Anant Singh
127
34
0
24 May 2023
BAND: Biomedical Alert News Dataset
AAAI Conference on Artificial Intelligence (AAAI), 2023
Z. Fu
Meiru Zhang
Zaiqiao Meng
Yannan Shen
David L. Buckeridge
Nigel Collier
114
3
0
23 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
International Conference on Learning Representations (ICLR), 2023
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Abigail Z. Jacobs
Tengyu Ma
VLM
478
212
0
23 May 2023
Active Learning Principles for In-Context Learning with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Katerina Margatina
Timo Schick
Nikolaos Aletras
Jane Dwivedi-Yu
339
61
0
23 May 2023
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks
Tiedong Liu
K. H. Low
ALM
178
98
0
23 May 2023
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jinyan Su
Terry Yue Zhuo
Haiyan Zhao
Preslav Nakov
DeLMO
219
198
0
23 May 2023
Learning from Mistakes via Cooperative Study Assistant for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Danqing Wang
Lei Li
180
8
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
173
28
0
23 May 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
225
136
0
23 May 2023