2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
HuggingFace (1 upvote)
Github (7200★)
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 603 papers shown
User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue
Sam Davidson
Salvatore Romeo
Raphael Shu
James Gung
Arshit Gupta
Saab Mansour
Yi Zhang
ALM
LLMAG
254
3
0
23 Sep 2023
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi
Hidetoshi Shimodaira
KELM
254
37
0
21 Sep 2023
SlimPajama-DC: Understanding Data Combinations for LLM Training
Zhiqiang Shen
Tianhua Tao
Liqun Ma
Willie Neiswanger
Zhengzhong Liu
...
Bowen Tan
Joel Hestness
Natalia Vassilieva
Daria Soboleva
Eric Xing
434
69
0
19 Sep 2023
CFGPT: Chinese Financial Assistant with Large Language Model
Jiangtong Li
Hao Wang
Guoxuan Wang
Yang Lei
Dawei Cheng
Zhijun Ding
Changjun Jiang
179
17
0
19 Sep 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Proceedings of the VLDB Endowment (PVLDB), 2023
Haojun Xia
Zhen Zheng
Yuchao Li
Donglin Zhuang
Zhongzhu Zhou
Xiafei Qiu
Yong Li
Wei Lin
Shuaiwen Leon Song
170
22
0
19 Sep 2023
Generative modeling, design and analysis of spider silk protein sequences for enhanced mechanical properties
Advanced Functional Materials (Adv. Funct. Mater.), 2023
Wei Lu
David L. Kaplan
Markus J. Buehler
159
39
0
18 Sep 2023
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Xiangru Tang
Yiming Zong
Jason Phang
Yilun Zhao
Wangchunshu Zhou
Arman Cohan
Mark B. Gerstein
LMTD
ELM
ALM
268
16
0
16 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
138
6
0
15 Sep 2023
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Rachneet Sachdeva
Martin Tutek
Iryna Gurevych
OODD
311
16
0
14 Sep 2023
EarthPT: a time series foundation model for Earth Observation
Michael J. Smith
Luke Fleming
James E. Geach
AI4TS
219
14
0
13 Sep 2023
From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
BigData Congress [Services Society] (BSS), 2023
Masahiro Suzuki
Masanori Hirano
Hiroki Sakaji
282
7
0
07 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa
VLM
297
59
0
05 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Fengxiang Bie
Jianlong Wu
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
253
55
0
02 Sep 2023
YaRN: Efficient Context Window Extension of Large Language Models
International Conference on Learning Representations (ICLR), 2023
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
OSLM
392
403
0
31 Aug 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Ran Bi
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA
SyDa
199
14
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
457
2,786
0
24 Aug 2023
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Alex Nyffenegger
Matthias Sturmer
Joel Niklaus
210
10
0
22 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Leilei Gan
Guoyin Wang
LM&MA
914
759
0
21 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
ACM Transactions on Software Engineering and Methodology (TOSEM), 2023
Xinying Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
358
743
0
21 Aug 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Haipeng Luo
Qingfeng Sun
Can Xu
Lu Wang
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
800
624
0
18 Aug 2023
PMET: Precise Model Editing in a Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
519
178
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
426
1
0
14 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
International Conference on Learning Representations (ICLR), 2023
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
359
186
0
14 Aug 2023
Large Language Models for Information Retrieval: A Survey
Yutao Zhu
Huaying Yuan
Shuting Wang
Jiongnan Liu
Wenhan Liu
Chenlong Deng
Haonan Chen
Zheng Liu
Zhicheng Dou
Ji-Rong Wen
KELM
634
452
0
14 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
ALM
167
7
0
12 Aug 2023
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
Artificial Intelligence and Law (ICAIL), 2023
C. M. Greco
Andrea Tagarelli
AILaw
217
41
0
10 Aug 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
International Conference on Learning Representations (ICLR), 2023
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
276
87
0
08 Aug 2023
Large Language Model Prompt Chaining for Long Legal Document Classification
Dietrich Trautmann
ELM
AILaw
149
19
0
08 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
382
135
0
08 Aug 2023
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures
David Nader-Palacio
Alejandro Velasco
Daniel Rodríguez-Cárdenas
Kevin Moran
Denys Poshyvanyk
219
12
0
07 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
274
3
0
07 Aug 2023
Learning to Paraphrase Sentences to Different Complexity Levels
Transactions of the Association for Computational Linguistics (TACL), 2023
Alison Chi
Li-Kuang Chen
Yi-Chen Chang
Shu-Hui Lee
Jason J. S. Chang
170
15
0
04 Aug 2023
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Zhen Qin
Dong Li
Weigao Sun
Weixuan Sun
Xuyang Shen
...
Yunshen Wei
Baohong Lv
Xiao Luo
Yu Qiao
Yiran Zhong
186
32
0
27 Jul 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
107
0
0
27 Jul 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
364
227
0
24 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
International Conference on Language Resources and Evaluation (LREC), 2023
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
237
47
0
22 Jul 2023
FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models
Yuwei Yin
Yazheng Yang
Jian Yang
Qi Liu
147
22
0
22 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
250
87
0
19 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
International Conference on Learning Representations (ICLR), 2023
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
312
72
0
18 Jul 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines
Luca Benedetto
Shiva Taslimipoor
Christopher Davis
Yuan Gao
...
Marek Rei
H. Yannakoudakis
Andrew Mullooly
D. Nicholls
P. Buttery
ELM
261
61
0
17 Jul 2023
Generating Benchmarks for Factuality Evaluation of Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
HILM
247
123
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Lin Wang
OffRL
854
1,173
0
12 Jul 2023
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Tommaso Pegolotti
Elias Frantar
Dan Alistarh
Markus Püschel
MQ
68
5
0
07 Jul 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Shiva Omrani Sabbaghi
Robert Wolfe
Aylin Caliskan
201
29
0
07 Jul 2023
Several categories of Large Language Models (LLMs): A Short Survey
International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023
Saurabh Pahune
Manoj Chandrasekharan
AILaw
202
30
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
Entropy (Entropy), 2023
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
239
122
0
04 Jul 2023
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith
Chris Pan
Mengzhou Xia
Ameet Deshpande
Karthik Narasimhan
ELM
186
22
0
01 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
164
5
0
25 Jun 2023
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Zhenyu Zhang
Ying Sheng
Wanrong Zhu
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zinan Lin
Beidi Chen
VLM
755
474
0
24 Jun 2023
Long-range Language Modeling with Self-retrieval
Transactions of the Association for Computational Linguistics (TACL), 2023
Ohad Rubin
Jonathan Berant
RALM
KELM
219
31
0
23 Jun 2023