arXiv:2204.06745
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 602 papers shown
Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
Yunjie Ji
Yong Deng
Yan Gong
Yiping Peng
Qiang Niu
Guang Dai
Baochang Ma
Xiangang Li
ALM
143
115
0
26 Mar 2023
Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense
Andrei Kucharavy
Z. Schillaci
Loic Maréchal
Maxime Wursch
Ljiljana Dolamic
Remi Sabonnadiere
Dimitri Percia David
Alain Mermoud
Vincent Lenders
ELM
AI4CE
136
37
0
21 Mar 2023
EVA-02: A Visual Representation for Neon Genesis
Image and Vision Computing (IVC), 2023
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
352
390
0
20 Mar 2023
cito: An R package for training neural networks using torch
Christian Amesoeder
F. Hartig
Maximilian Pichler
151
5
0
16 Mar 2023
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Potsawee Manakul
Adian Liusie
Mark Gales
HILM
LRM
409
636
0
15 Mar 2023
Exploring ChatGPT's Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
Yunjie Ji
Yan Gong
Yiping Peng
Chao Ni
Peiyan Sun
Dongyu Pan
Baochang Ma
Xiangang Li
ELM
ALM
AI4MH
117
40
0
14 Mar 2023
Eliciting Latent Predictions from Transformers with the Tuned Lens
Nora Belrose
Zach Furman
Logan Smith
Danny Halawi
Igor V. Ostrovsky
Lev McKinney
Stella Biderman
Jacob Steinhardt
440
305
0
14 Mar 2023
Baldur: Whole-Proof Generation and Repair with Large Language Models
E. First
M. Rabe
Talia Ringer
Yuriy Brun
276
135
0
08 Mar 2023
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Neural Information Processing Systems (NeurIPS), 2023
Hugo Laurençon
Lucile Saulnier
Thomas Wang
Christopher Akiki
Albert Villanova del Moral
...
Violette Lepercq
Suzana Ilić
Margaret Mitchell
Sasha Luccioni
Yacine Jernite
AI4CE
AILaw
186
194
0
07 Mar 2023
OpenICL: An Open-Source Framework for In-context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhenyu Wu
Yaoxiang Wang
Jiacheng Ye
Jiangtao Feng
Jingjing Xu
Yu Qiao
Zhiyong Wu
157
58
0
06 Mar 2023
Prismer: A Vision-Language Model with Multi-Task Experts
Shikun Liu
Linxi Fan
Edward Johns
Zhiding Yu
Chaowei Xiao
Anima Anandkumar
VLM
MLLM
292
33
0
04 Mar 2023
Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
AAAI Conference on Artificial Intelligence (AAAI), 2023
Seonghyeon Ye
Hyeonbin Hwang
Sohee Yang
Hyeongu Yun
Yireun Kim
Minjoon Seo
LRM
210
46
0
28 Feb 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
2.7K
17,373
0
27 Feb 2023
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
Rui-Jie Zhu
Qihang Zhao
Guoqi Li
Nhan Duy Truong
BDL
VLM
407
111
0
27 Feb 2023
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
IEEE Data Engineering Bulletin (IEEE Data Eng. Bull.), 2023
Yongfeng Zhang
Xixu Hu
Wenxin Hou
Hao Chen
Runkai Zheng
...
Weirong Ye
Xiubo Geng
Binxing Jiao
Yue Zhang
Xingxu Xie
AI4MH
450
284
0
22 Feb 2023
In-context Example Selection with Influences
Nguyen Tai
Eric Wong
288
67
0
21 Feb 2023
Conversation Style Transfer using Few-Shot Learning
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Shamik Roy
Raphael Shu
Nikolaos Pappas
Elman Mansimov
Yi Zhang
Saab Mansour
Dan Roth
191
9
0
16 Feb 2023
Do We Still Need Clinical Language Models?
ACM Conference on Health, Inference, and Learning (CHIL), 2023
Eric P. Lehman
Evan Hernandez
Diwakar Mahajan
Jonas Wulff
Micah J. Smith
Zachary M. Ziegler
Daniel Nadler
Peter Szolovits
Alistair E. W. Johnson
Emily Alsentzer
LM&MA
AI4MH
224
149
0
16 Feb 2023
Transformer models: an introduction and catalog
X. Amatriain
Ananth Sankar
Jie Bing
Praveen Kumar Bodigutla
Timothy J. Hazen
Michaeel Kazi
441
70
0
12 Feb 2023
In-Context Learning with Many Demonstration Examples
Mukai Li
Shansan Gong
Jiangtao Feng
Yiheng Xu
Jinchao Zhang
Zhiyong Wu
Lingpeng Kong
222
42
0
09 Feb 2023
ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots
Reham Omar
Omij Mangukiya
Panos Kalnis
Essam Mansour
AI4MH
135
88
0
08 Feb 2023
The Gradient of Generative AI Release: Methods and Considerations
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Irene Solaiman
166
125
0
05 Feb 2023
Down the Rabbit Hole: Detecting Online Extremism, Radicalisation, and Politicised Hate Speech
ACM Computing Surveys (ACM Comput. Surv.), 2023
Jarod Govers
Philip G. Feldman
Aaron Dant
Panos Patros
84
36
0
27 Jan 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
International Conference on Machine Learning (ICML), 2023
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
645
822
0
26 Jan 2023
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
Malte Ostendorff
Georg Rehm
CLIP
VLM
CLL
254
34
0
23 Jan 2023
Blind Judgement: Agent-Based Supreme Court Modelling With GPT
S. Hamilton
LLMAG
ELM
132
44
0
12 Jan 2023
Cramming: Training a Language Model on a Single GPU in One Day
International Conference on Machine Learning (ICML), 2022
Jonas Geiping
Tom Goldstein
MoE
252
100
0
28 Dec 2022
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?
Transactions of the Association for Computational Linguistics (TACL), 2022
Byung-Doh Oh
William Schuler
137
146
0
23 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
176
13
0
21 Dec 2022
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering
International Conference on Computational Logic (ICCL), 2022
Akshay Chaturvedi
Swarnadeep Bhar
Soumadeep Saha
Utpal Garain
Nicholas Asher
205
8
0
21 Dec 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alex Troy Mallen
Akari Asai
Victor Zhong
Rajarshi Das
Daniel Khashabi
Hannaneh Hajishirzi
RALM
HILM
KELM
317
849
0
20 Dec 2022
Is GPT-3 a Good Data Annotator?
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Bosheng Ding
Chengwei Qin
Linlin Liu
Yew Ken Chia
Shafiq Joty
Boyang Albert Li
Lidong Bing
290
298
0
20 Dec 2022
CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning
Xiaoming Liu
Zhaohan Zhang
Yichen Wang
Hang Pu
Y. Lan
Chao Shen
239
49
0
20 Dec 2022
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context
International Conference on Language Resources and Evaluation (LREC), 2022
Yangruibo Ding
Zijian Wang
Wasi Uddin Ahmad
M. K. Ramanathan
Ramesh Nallapati
Parminder Bhatia
Dan Roth
Bing Xiang
288
89
0
20 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
205
15
0
19 Dec 2022
Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Xinxi Lyu
Sewon Min
Iz Beltagy
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
141
78
0
19 Dec 2022
KNIFE: Distilling Reasoning Knowledge From Free-Text Rationales
Aaron Chan
Zhiyuan Zeng
Wyatt Lake
Brihi Joshi
Hanjie Chen
Xiang Ren
ReLM
LRM
175
1
0
19 Dec 2022
The case for 4-bit precision: k-bit Inference Scaling Laws
International Conference on Machine Learning (ICML), 2022
Tim Dettmers
Luke Zettlemoyer
MQ
333
282
0
19 Dec 2022
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zheng-Xin Yong
Hailey Schoelkopf
Niklas Muennighoff
Alham Fikri Aji
David Ifeoluwa Adelani
...
Genta Indra Winata
Stella Biderman
Edward Raff
Dragomir R. Radev
Vassilina Nikoulina
CLL
VLM
AI4CE
LRM
330
105
0
19 Dec 2022
Large Language Models Meet NL2Code: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Daoguang Zan
B. Chen
Fengji Zhang
Di Lu
Bingchao Wu
Bei Guan
Yongji Wang
Jian-Guang Lou
ELM
ALM
201
226
0
19 Dec 2022
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hritik Bansal
Karthik Gopalakrishnan
Saket Dingliwal
S. Bodapati
Katrin Kirchhoff
Dan Roth
LRM
212
63
0
18 Dec 2022
Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons
Pacific Asia Conference on Language, Information and Computation (PACLIC), 2022
Akila Peiris
Nisansa de Silva
114
5
0
18 Dec 2022
Self-Prompting Large Language Models for Zero-Shot Open-Domain QA
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Junlong Li
Jinyuan Wang
Zhuosheng Zhang
Hai Zhao
LRM
180
51
0
16 Dec 2022
Implicit causality in GPT-2: a case study
International Conference on Computational Semantics (IWCS), 2022
H. Huynh
T. Lentz
Emiel van Miltenburg
LRM
156
3
0
08 Dec 2022
Legal Prompt Engineering for Multilingual Legal Judgement Prediction
Dietrich Trautmann
Alina Petrova
Frank Schilder
ELM
AILaw
193
90
0
05 Dec 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2022
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
176
62
0
27 Nov 2022
Understanding BLOOM: An empirical study on diverse NLP tasks
Parag Dakle
Sai Krishna Rallabandi
Preethi Raghavan
AI4CE
183
4
0
27 Nov 2022
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
International Conference on Machine Learning (ICML), 2022
Yuhang Lai
Chengxi Li
Yiming Wang
Tianyi Zhang
Ruiqi Zhong
Luke Zettlemoyer
Scott Yih
Daniel Fried
Si-yi Wang
Tao Yu
ELM
ALM
259
434
0
18 Nov 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
343
912
0
16 Nov 2022
Evaluating the Factual Consistency of Large Language Models Through News Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Derek Tam
Anisha Mascarenhas
Shiyue Zhang
Sarah Kwan
Joey Tianyi Zhou
Colin Raffel
HILM
253
126
0
15 Nov 2022