GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 602 papers shown
Large Language Models Struggle to Learn Long-Tail Knowledge
International Conference on Machine Learning (ICML), 2022
Nikhil Kandpal
H. Deng
Adam Roberts
Eric Wallace
Colin Raffel
RALM, KELM
399
539
0
15 Nov 2022
An FNet based Auto Encoder for Long Sequence News Story Generation
Paul K. Mandal
Rakeshkumar V. Mahto
66
2
0
15 Nov 2022
Prompting Language Models for Linguistic Structure
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Terra Blevins
Hila Gonen
Luke Zettlemoyer
LRM
224
52
0
15 Nov 2022
Logical Tasks for Measuring Extrapolation and Rule Comprehension
Ippei Fujisawa
Ryota Kanai
ELM, LRM
136
5
0
14 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
804
2,713
0
09 Nov 2022
nBIIG: A Neural BI Insights Generation System for Table Reporting
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yotam Perlitz
D. Sheinwald
Noam Slonim
Michal Shmueli-Scheuer
84
2
0
08 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Royal Society Open Science (RSOS), 2022
Michael J. Smith
James E. Geach
171
46
0
07 Nov 2022
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
International Conference on Learning Representations (ICLR), 2022
Peifeng Wang
Aaron Chan
Filip Ilievski
Muhao Chen
Xiang Ren
LRM, ReLM
300
69
0
03 Nov 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
International Conference on Learning Representations (ICLR), 2022
Xiaoman Pan
Wenlin Yao
Hongming Zhang
Dian Yu
Dong Yu
Jianshu Chen
KELM
475
26
0
28 Oct 2022
Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Journal of Systems and Software (JSS), 2022
Jean-Baptiste Döderlein
Nguessan Hermann Kouadio
M. Acher
D. Khelladi
B. Combemale
201
36
0
26 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
168
38
0
25 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM, CLL
225
3
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
196
19
0
23 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM, OOD, LRM
355
78
0
21 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
SIGKDD Explorations (SIGKDD Explor.), 2022
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
257
52
0
19 Oct 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
229
126
0
17 Oct 2022
Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning
Louis Castricato
Alexander Havrilla
Shahbuland Matiana
Michael Pieler
Anbang Ye
Ian Yang
Spencer Frazier
Mark O. Riedl
289
13
0
14 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
IEEE Access, 2022
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
338
153
0
13 Oct 2022
Mass-Editing Memory in a Transformer
International Conference on Learning Representations (ICLR), 2022
Kevin Meng
Arnab Sen Sharma
A. Andonian
Yonatan Belinkov
David Bau
KELM, VLM
357
768
0
13 Oct 2022
EleutherAI: Going Beyond "Open Science" to "Science in the Open"
Jason Phang
Herbie Bradley
Leo Gao
Louis Castricato
Stella Biderman
VLM
148
16
0
12 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
International Conference on Learning Representations (ICLR), 2022
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL, LRM
682
1,206
0
05 Oct 2022
Explaining Patterns in Data with Language Models via Interpretable Autoprompting
Chandan Singh
John X. Morris
J. Aneja
Alexander M. Rush
Jianfeng Gao
LRM
143
0
0
04 Oct 2022
Protein structure generation via folding diffusion
Nature Communications (Nat Commun), 2022
Kevin E. Wu
Kevin Kaichuang Yang
Rianne van den Berg
James Zou
Alex X. Lu
Ava P. Amini
DiffM
301
250
0
30 Sep 2022
Unpacking Large Language Models with Conceptual Consistency
Pritish Sahu
Michael Cogswell
Yunye Gong
Ajay Divakaran
LRM
192
19
0
29 Sep 2022
Who is GPT-3? An Exploration of Personality, Values and Demographics
Marilù Miotto
Nicola Rossberg
Bennett Kleinberg
ELM, PILM
148
136
0
28 Sep 2022
Petals: Collaborative Inference and Fine-tuning of Large Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alexander Borzunov
Dmitry Baranchuk
Tim Dettmers
Max Ryabinin
Younes Belkada
Artem Chumachenko
Pavel Samygin
Colin Raffel
VLM
210
89
0
02 Sep 2022
FOLIO: Natural Language Reasoning with First-Order Logic
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Simeng Han
Hailey Schoelkopf
Yilun Zhao
Zhenting Qi
Martin Riddell
...
Yingbo Zhou
Caiming Xiong
Rex Ying
Arman Cohan
Dragomir R. Radev
ReLM, LRM
321
151
0
02 Sep 2022
Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Johann Frei
Frank Kramer
127
2
0
30 Aug 2022
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
481
119
0
17 Aug 2022
Domain-Specific Text Generation for Machine Translation
Conference of the Association for Machine Translation in the Americas (AMTA), 2022
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
145
23
0
11 Aug 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Neural Information Processing Systems (NeurIPS), 2022
Shivam Garg
Dimitris Tsipras
Abigail Z. Jacobs
Gregory Valiant
541
659
0
01 Aug 2022
Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for Specialized Cyber Threat Intelligence
Computers & Security (Comput. Secur.), 2022
Markus Bayer
Tobias Frey
Christian A. Reuter
AAML
125
20
0
22 Jul 2022
Can large language models reason about medical questions?
Patterns, 2022
Valentin Liévin
C. Hother
Andreas Geert Motzfeldt
Ole Winther
ELM, LM&MA, AI4MH, LRM
462
384
0
17 Jul 2022
Recent Developments in AI and USPTO Open Data
Scott Beliveau
Jerry Ma
107
1
0
12 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Conference on Robot Learning (CoRL), 2022
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
487
595
0
10 Jul 2022
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Reza Yazdani Aminabadi
Samyam Rajbhandari
Minjia Zhang
A. A. Awan
Cheng-rong Li
...
Elton Zheng
Jeff Rasley
Shaden Smith
Olatunji Ruwase
Yuxiong He
348
487
0
30 Jun 2022
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
Neural Information Processing Systems (NeurIPS), 2022
Jason Alan Fries
Leon Weber
Natasha Seelam
Gabriel Altay
Debajyoti Datta
...
Minh Chien Vu
Trishala Neeraj
Jonas Golde
Albert Villanova del Moral
Benjamin Beilharz
LM&MA
208
54
0
30 Jun 2022
Repository-Level Prompt Generation for Large Language Models of Code
International Conference on Machine Learning (ICML), 2022
Disha Shrivastava
Hugo Larochelle
Daniel Tarlow
203
172
0
26 Jun 2022
Fault-Aware Neural Code Rankers
Neural Information Processing Systems (NeurIPS), 2022
J. Inala
Chenglong Wang
Mei Yang
Andrés Codas
Mark Encarnación
Shuvendu K. Lahiri
Madan Musuvathi
Jianfeng Gao
ALM
245
51
0
04 Jun 2022
Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code
Patrick Bareiss
Beatriz Souza
Marcelo d’Amorim
Michael Pradel
ELM
202
91
0
02 Jun 2022
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kang Min Yoo
Junyeob Kim
Sungmin Cho
Hyunsoo Cho
Hwiyeol Jo
Sang-Woo Lee
Sang-goo Lee
Taeuk Kim
277
142
0
25 May 2022
Evaluating and Inducing Personality in Pre-trained Language Models
Neural Information Processing Systems (NeurIPS), 2022
Guangyuan Jiang
Manjie Xu
Song-Chun Zhu
Wenjuan Han
Fangqiu Yi
Yixin Zhu
264
119
0
20 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE
799
4,320
0
02 May 2022
Inferring Implicit Relations in Complex Questions with Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Uri Katz
Mor Geva
Jonathan Berant
ReLM, LRM
118
12
0
28 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
European Conference on Computer Vision (ECCV), 2022
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
391
436
0
18 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Transactions of the Association for Computational Linguistics (TACL), 2022
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
312
184
0
15 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
International Conference on Learning Representations (ICLR), 2022
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Anuj Kumar
Luke Zettlemoyer
M. Lewis
SyDa
289
785
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
International Conference on Machine Learning (ICML), 2022
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
258
211
0
12 Apr 2022
GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Archiki Prasad
Peter Hase
Xiang Zhou
Joey Tianyi Zhou
201
146
0
14 Mar 2022
Sustainable Cloud Services for Verbal Interaction with Embodied Agents
Lucrezia Grassi
Carmine Tommaso Recchiuto
A. Sgorbissa
282
10
0
04 Mar 2022