2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 554 papers shown
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
16
32
0
07 Nov 2022
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
Peifeng Wang
Aaron Chan
Filip Ilievski
Muhao Chen
Xiang Ren
LRM
ReLM
13
59
0
03 Nov 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
Xiaoman Pan
Wenlin Yao
Hongming Zhang
Dian Yu
Dong Yu
Jianshu Chen
KELM
189
22
0
28 Oct 2022
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
M. Acher
D. Khelladi
B. Combemale
34
33
0
26 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
28
32
0
25 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
30
2
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
75
15
0
23 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM
OOD
LRM
30
61
0
21 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
19
40
0
19 Oct 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
S. Hoi
MLLM
27
101
0
17 Oct 2022
Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning
Louis Castricato
Alexander Havrilla
Shahbuland Matiana
Michael Pieler
Anbang Ye
Ian Yang
Spencer Frazier
Mark O. Riedl
23
12
0
14 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
20
107
0
13 Oct 2022
Mass-Editing Memory in a Transformer
Kevin Meng
Arnab Sen Sharma
A. Andonian
Yonatan Belinkov
David Bau
KELM
VLM
31
520
0
13 Oct 2022
EleutherAI: Going Beyond "Open Science" to "Science in the Open"
Jason Phang
Herbie Bradley
Leo Gao
Louis Castricato
Stella Biderman
VLM
35
12
0
12 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
245
1,071
0
05 Oct 2022
Explaining Patterns in Data with Language Models via Interpretable Autoprompting
Chandan Singh
John X. Morris
J. Aneja
Alexander M. Rush
Jianfeng Gao
LRM
20
0
0
04 Oct 2022
Protein structure generation via folding diffusion
Kevin E. Wu
Kevin Kaichuang Yang
Rianne van den Berg
James Y. Zou
Alex X. Lu
Ava P. Amini
DiffM
25
191
0
30 Sep 2022
Unpacking Large Language Models with Conceptual Consistency
Pritish Sahu
Michael Cogswell
Yunye Gong
Ajay Divakaran
LRM
79
16
0
29 Sep 2022
Who is GPT-3? An Exploration of Personality, Values and Demographics
Marilù Miotto
Nicola Rossberg
Bennett Kleinberg
ELM
PILM
19
105
0
28 Sep 2022
Petals: Collaborative Inference and Fine-tuning of Large Models
Alexander Borzunov
Dmitry Baranchuk
Tim Dettmers
Max Ryabinin
Younes Belkada
Artem Chumachenko
Pavel Samygin
Colin Raffel
VLM
22
61
0
02 Sep 2022
FOLIO: Natural Language Reasoning with First-Order Logic
Simeng Han
Hailey Schoelkopf
Yilun Zhao
Zhenting Qi
Martin Riddell
...
Yingbo Zhou
Caiming Xiong
Rex Ying
Arman Cohan
Dragomir R. Radev
ReLM
LRM
26
91
0
02 Sep 2022
Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Johann Frei
Frank Kramer
14
1
0
30 Aug 2022
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
22
81
0
17 Aug 2022
Domain-Specific Text Generation for Machine Translation
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
8
16
0
11 Aug 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg
Dimitris Tsipras
Percy Liang
Gregory Valiant
19
447
0
01 Aug 2022
Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for Specialized Cyber Threat Intelligence
Markus Bayer
Tobias Frey
Christian A. Reuter
AAML
16
15
0
22 Jul 2022
Can large language models reason about medical questions?
Valentin Liévin
C. Hother
Andreas Geert Motzfeldt
Ole Winther
ELM
LM&MA
AI4MH
LRM
14
299
0
17 Jul 2022
Recent Developments in AI and USPTO Open Data
Scott Beliveau
Jerry Ma
14
1
0
12 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
139
435
0
10 Jul 2022
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Reza Yazdani Aminabadi
Samyam Rajbhandari
Minjia Zhang
A. A. Awan
Cheng-rong Li
...
Elton Zheng
Jeff Rasley
Shaden Smith
Olatunji Ruwase
Yuxiong He
24
334
0
30 Jun 2022
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
Jason Alan Fries
Leon Weber
Natasha Seelam
Gabriel Altay
Debajyoti Datta
...
Minh Chien Vu
Trishala Neeraj
Jonas Golde
Albert Villanova del Moral
Benjamin Beilharz
LM&MA
93
45
0
30 Jun 2022
Repository-Level Prompt Generation for Large Language Models of Code
Disha Shrivastava
Hugo Larochelle
Daniel Tarlow
15
136
0
26 Jun 2022
Fault-Aware Neural Code Rankers
J. Inala
Chenglong Wang
Mei Yang
Andrés Codas
Mark Encarnación
Shuvendu K. Lahiri
Madan Musuvathi
Jianfeng Gao
ALM
11
41
0
04 Jun 2022
Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code
Patrick Bareiss
Beatriz Souza
Marcelo d’Amorim
Michael Pradel
ELM
8
75
0
02 Jun 2022
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
Kang Min Yoo
Junyeob Kim
Hyuhng Joon Kim
Hyunsoo Cho
Hwiyeol Jo
Sang-Woo Lee
Sang-goo Lee
Taeuk Kim
23
123
0
25 May 2022
Evaluating and Inducing Personality in Pre-trained Language Models
Guangyuan Jiang
Manjie Xu
Song-Chun Zhu
Wenjuan Han
Chi Zhang
Yixin Zhu
26
75
0
20 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
33
3,475
0
02 May 2022
Inferring Implicit Relations in Complex Questions with Language Models
Uri Katz
Mor Geva
Jonathan Berant
ReLM
LRM
14
11
0
28 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
57
367
0
18 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
26
148
0
15 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Wen-tau Yih
Luke Zettlemoyer
M. Lewis
SyDa
22
625
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
18
167
0
12 Apr 2022
GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models
Archiki Prasad
Peter Hase
Xiang Zhou
Mohit Bansal
15
117
0
14 Mar 2022
Sustainable Cloud Services for Verbal Interaction with Embodied Agents
Lucrezia Grassi
C. Recchiuto
A. Sgorbissa
13
8
0
04 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
A Systematic Evaluation of Large Language Models of Code
Frank F. Xu
Uri Alon
Graham Neubig
Vincent J. Hellendoorn
ELM
ALM
202
628
0
26 Feb 2022
Fooling MOSS Detection with Pretrained Language Models
Stella Biderman
Edward Raff
DeLMO
4
35
0
19 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
211
1,654
0
15 Oct 2021
MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators
Zhixing Tan
Xiangwen Zhang
Shuo Wang
Yang Liu
VLM
LRM
205
52
0
13 Oct 2021
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim
Hyoungseok Kim
Sang-Woo Lee
Gichang Lee
Donghyun Kwak
...
Jaewook Kang
Inho Kang
Jung-Woo Ha
W. Park
Nako Sung
VLM
241
121
0
10 Sep 2021