2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
ArXiv (abs)
PDF
HTML
HuggingFace (1 upvote)
Github (7200★)
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 602 papers shown
Title
Guiding Language Models of Code with Global Context using Monitors
Lakshya A Agrawal
Aditya Kanade
Navin Goyal
Shuvendu K. Lahiri
S. Rajamani
301
33
0
19 Jun 2023
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training
Guanhua Wang
Heyang Qin
S. A. Jacobs
Connor Holmes
Samyam Rajbhandari
Olatunji Ruwase
Feng Yan
Lei Yang
Yuxiong He
VLM
198
77
0
16 Jun 2023
You Don't Need Robust Machine Learning to Manage Adversarial Attack Risks
Edward Raff
M. Benaroch
Andrew L. Farris
AAML
128
4
0
16 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
International Conference on Learning Representations (ICLR), 2023
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM
ALM
241
82
0
15 Jun 2023
ChessGPT: Bridging Policy Learning and Language Modeling
Neural Information Processing Systems (NeurIPS), 2023
Xidong Feng
Yicheng Luo
Ziyan Wang
Hongrui Tang
Mengyue Yang
Youssef Attia El Hili
D. Mguni
Yali Du
Jun Wang
226
66
0
15 Jun 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Ziyang Luo
Can Xu
Lu Wang
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELM
SyDa
ALM
622
827
0
14 Jun 2023
Questioning the Survey Responses of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Ricardo Dominguez-Olmedo
Moritz Hardt
Celestine Mendler-Dünner
257
56
0
13 Jun 2023
Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Akshat Gupta
ELM
LRM
169
7
0
12 Jun 2023
Gradient Ascent Post-training Enhances Language Model Generalization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongkeun Yoon
Joel Jang
Sungdong Kim
Minjoon Seo
VLM
AI4CE
176
3
0
12 Jun 2023
Recurrent Attention Networks for Long-text Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xianming Li
Zongxi Li
Xiaotian Luo
Haoran Xie
Xing Lee
Yingbin Zhao
Fu Lee Wang
Qing Li
RALM
153
20
0
12 Jun 2023
Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability
Jiacheng Ye
Xijia Tao
Lingpeng Kong
LRM
191
32
0
11 Jun 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
Interspeech (Interspeech), 2023
William Chen
Xuankai Chang
Yifan Peng
Zhaoheng Ni
Soumi Maiti
Shinji Watanabe
SSL
233
29
0
11 Jun 2023
A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text
Natural Language Processing Journal (JNLP), 2023
Jessica Nayeli López Espejel
Mahaman Sanoussi Yahaya Alassan
El Mehdi Chouham
Walid Dahhane
E. Ettifouri
187
13
0
10 Jun 2023
S$^{3}$: Increasing GPU Utilization during Generative Inference for Higher Throughput
Neural Information Processing Systems (NeurIPS), 2023
Yunho Jin
Chun-Feng Wu
David Brooks
Gu-Yeon Wei
211
92
0
09 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
APSIPA Transactions on Signal and Information Processing (TASIP), 2023
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
180
33
0
09 Jun 2023
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhouhong Gu
Xiaoxuan Zhu
Haoning Ye
Lin Zhang
Jianchen Wang
...
Zili Wang
Shusen Wang
Weiguo Zheng
Hongwei Feng
Yanghua Xiao
ALM
ELM
241
73
0
09 Jun 2023
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Potsawee Manakul
Yassir Fathullah
Adian Liusie
Vyas Raina
Vatsal Raina
Mark Gales
125
13
0
08 Jun 2023
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance
Qianqian Xie
Weiguang Han
Xiao Zhang
Yanzhao Lai
Min Peng
Alejandro Lopez-Lira
Jimin Huang
ALM
165
220
0
08 Jun 2023
INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
Yew Ken Chia
Pengfei Hong
Lidong Bing
Soujanya Poria
ELM
156
72
0
07 Jun 2023
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2023
Sophie F. Jentzsch
Kristian Kersting
LRM
137
45
0
07 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
USENIX Security Symposium (USENIX Security), 2023
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
159
9
0
05 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions
Michael Ogezi
B. Hauer
Grzegorz Kondrak
VLM
142
1
0
05 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hang Zhang
Xin Li
Lidong Bing
MLLM
498
1,434
0
05 Jun 2023
LexGPT 0.1: pre-trained GPT-J models with Pile of Law
Jieh-Sheng Lee
AILaw
87
14
0
05 Jun 2023
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
123
32
0
04 Jun 2023
Harnessing large-language models to generate private synthetic text
Alexey Kurakin
Natalia Ponomareva
Umar Syed
Liam MacDermed
Seth Neel
SILM
SyDa
215
53
0
02 Jun 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aleksandra Piktus
Odunayo Ogundepo
Christopher Akiki
Akintunde Oladipo
Xinyu Crystina Zhang
Hailey Schoelkopf
Stella Biderman
Martin Potthast
Jimmy J. Lin
CVBM
166
11
0
02 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
348
870
0
01 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Conference on Machine Learning and Systems (MLSys), 2023
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
647
909
0
01 Jun 2023
Transformers learn to implement preconditioned gradient descent for in-context learning
Neural Information Processing Systems (NeurIPS), 2023
Kwangjun Ahn
Xiang Cheng
Hadi Daneshmand
S. Sra
ODL
271
229
0
01 Jun 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq Joty
J. Huang
LM&MA
ELM
ALM
343
211
0
29 May 2023
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
International Conference on Learning Representations (ICLR), 2023
Xianjun Yang
Wei Cheng
Yue Wu
Linda R. Petzold
William Yang Wang
Haifeng Chen
DeLMO
269
129
0
27 May 2023
RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Gabriele Sarti
Phu Mon Htut
Xing Niu
B. Hsu
Anna Currey
Georgiana Dinu
Maria Nadejde
LRM
177
13
0
26 May 2023
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
DeLMO
241
23
0
26 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
196
94
0
25 May 2023
Training Data Extraction From Pre-trained Language Models: A Survey
Shotaro Ishihara
229
52
0
25 May 2023
Scaling Data-Constrained Language Models
Neural Information Processing Systems (NeurIPS), 2023
Niklas Muennighoff
Alexander M. Rush
Boaz Barak
Teven Le Scao
Aleksandra Piktus
Nouamane Tazi
S. Pyysalo
Thomas Wolf
Colin Raffel
ALM
525
310
0
25 May 2023
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shufan Wang
Sébastien Jean
Sailik Sengupta
James Gung
Nikolaos Pappas
Yi Zhang
130
6
0
24 May 2023
ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Heming Xia
Qingxiu Dong
Lei Li
Jingjing Xu
Tianyu Liu
Ziwei Qin
Zhifang Sui
MLLM
VLM
113
4
0
24 May 2023
Editing Common Sense in Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Anshita Gupta
Debanjan Mondal
Akshay Krishna Sheshadri
Wenlong Zhao
Xiang Lorraine Li
Sarah Wiegreffe
Niket Tandon
KELM
186
32
0
24 May 2023
Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola
Xuhui Zhou
Elizabeth Clark
Maarten Sap
169
7
0
24 May 2023
Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs
Xiaoyang Song
Akshat Gupta
Kiyan Mohebbizadeh
Shujie Hu
Anant Singh
127
34
0
24 May 2023
BAND: Biomedical Alert News Dataset
AAAI Conference on Artificial Intelligence (AAAI), 2023
Z. Fu
Meiru Zhang
Zaiqiao Meng
Yannan Shen
David L. Buckeridge
Nigel Collier
114
3
0
23 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
International Conference on Learning Representations (ICLR), 2023
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Abigail Z. Jacobs
Tengyu Ma
VLM
478
212
0
23 May 2023
Active Learning Principles for In-Context Learning with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Katerina Margatina
Timo Schick
Nikolaos Aletras
Jane Dwivedi-Yu
339
61
0
23 May 2023
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks
Tiedong Liu
K. H. Low
ALM
178
98
0
23 May 2023
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jinyan Su
Terry Yue Zhuo
Haiyan Zhao
Preslav Nakov
DeLMO
219
198
0
23 May 2023
Learning from Mistakes via Cooperative Study Assistant for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Danqing Wang
Lei Li
180
8
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
173
28
0
23 May 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
225
136
0
23 May 2023