arXiv:2204.06745
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 554 papers shown
Title
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
27
4
0
15 Sep 2023
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Sachdeva
Martin Tutek
Iryna Gurevych
OODD
10
10
0
14 Sep 2023
EarthPT: a time series foundation model for Earth Observation
Michael J. Smith
Luke Fleming
James E. Geach
AI4TS
22
7
0
13 Sep 2023
From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
Masahiro Suzuki
Masanori Hirano
Hiroki Sakaji
39
6
0
07 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa
VLM
18
29
0
05 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
S. Song
EGVM
25
18
0
02 Sep 2023
YaRN: Efficient Context Window Extension of Large Language Models
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
OSLM
11
224
0
31 Aug 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Kaiyuan Gao
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA
SyDa
30
4
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
63
1,890
0
24 Aug 2023
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Alex Nyffenegger
Matthias Sturmer
Joel Niklaus
26
5
0
22 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Fei Wu
Guoyin Wang
LM&MA
21
532
0
21 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
Xinying Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
25
322
0
21 Aug 2023
PMET: Precise Model Editing in a Transformer
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
26
115
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
29
1
0
14 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
60
117
0
14 Aug 2023
Large Language Models for Information Retrieval: A Survey
Yutao Zhu
Huaying Yuan
Shuting Wang
Jiongnan Liu
Wenhan Liu
Chenlong Deng
Haonan Chen
Zhicheng Dou
Ji-Rong Wen
KELM
44
283
0
14 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
ALM
19
5
0
12 Aug 2023
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
C. M. Greco
Andrea Tagarelli
AILaw
22
19
0
10 Aug 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
22
63
0
08 Aug 2023
Large Language Model Prompt Chaining for Long Legal Document Classification
Dietrich Trautmann
ELM
AILaw
21
10
0
08 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
22
99
0
08 Aug 2023
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures
David Nader-Palacio
Alejandro Velasco
Daniel Rodríguez-Cárdenas
Kevin Moran
Denys Poshyvanyk
34
8
0
07 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
23
3
0
07 Aug 2023
Learning to Paraphrase Sentences to Different Complexity Levels
Alison Chi
Li-Kuang Chen
Yi-Chen Chang
Shu-Hui Lee
Jason J. S. Chang
19
10
0
04 Aug 2023
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Zhen Qin
Dong Li
Weigao Sun
Weixuan Sun
Xuyang Shen
...
Yunshen Wei
Baohong Lv
Xiao Luo
Yu Qiao
Yiran Zhong
43
15
0
27 Jul 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
28
0
0
27 Jul 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
33
155
0
24 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
10
25
0
22 Jul 2023
FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models
Yuwei Yin
Yazheng Yang
Jian Yang
Qi Liu
13
12
0
22 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
36
42
0
19 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
28
52
0
18 Jul 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines
Luca Benedetto
Shiva Taslimipoor
Christopher Davis
Yuan Gao
...
Marek Rei
H. Yannakoudakis
Andrew Mullooly
D. Nicholls
P. Buttery
ELM
14
41
0
17 Jul 2023
Generating Benchmarks for Factuality Evaluation of Language Models
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
HILM
25
91
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
46
523
0
12 Jul 2023
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Tommaso Pegolotti
Elias Frantar
Dan Alistarh
Markus Püschel
MQ
17
3
0
07 Jul 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
Shiva Omrani Sabbaghi
Robert Wolfe
Aylin Caliskan
23
22
0
07 Jul 2023
Several categories of Large Language Models (LLMs): A Short Survey
Saurabh Pahune
Manoj Chandrasekharan
AILaw
17
14
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
33
78
0
04 Jul 2023
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith
Chris Pan
Mengzhou Xia
A. Deshpande
Karthik Narasimhan
ELM
25
16
0
01 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
8
3
0
25 Jun 2023
H₂O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu (Allen) Zhang
Ying Sheng
Tianyi Zhou
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
47
248
0
24 Jun 2023
Long-range Language Modeling with Self-retrieval
Ohad Rubin
Jonathan Berant
RALM
KELM
19
18
0
23 Jun 2023
Textbooks Are All You Need
Suriya Gunasekar
Yi Zhang
J. Aneja
C. C. T. Mendes
Allison Del Giorno
...
Sébastien Bubeck
Ronen Eldan
Adam Tauman Kalai
Y. Lee
Yuan-Fang Li
AI4CE
ALM
SyDa
22
386
0
20 Jun 2023
Guiding Language Models of Code with Global Context using Monitors
Lakshya A Agrawal
Aditya Kanade
Navin Goyal
Shuvendu K. Lahiri
S. Rajamani
38
23
0
19 Jun 2023
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training
Guanhua Wang
Heyang Qin
S. A. Jacobs
Connor Holmes
Samyam Rajbhandari
Olatunji Ruwase
Feng Yan
Lei Yang
Yuxiong He
VLM
55
56
0
16 Jun 2023
You Don't Need Robust Machine Learning to Manage Adversarial Attack Risks
Edward Raff
M. Benaroch
Andrew L. Farris
AAML
22
2
0
16 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM
ALM
31
66
0
15 Jun 2023
ChessGPT: Bridging Policy Learning and Language Modeling
Xidong Feng
Yicheng Luo
Ziyan Wang
Hongrui Tang
Mengyue Yang
Kun Shao
D. Mguni
Yali Du
Jun Wang
14
38
0
15 Jun 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Ziyang Luo
Can Xu
Pu Zhao
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
SyDa
ALM
ELM
17
631
0
14 Jun 2023
Questioning the Survey Responses of Large Language Models
Ricardo Dominguez-Olmedo
Moritz Hardt
Celestine Mendler-Dünner
26
30
0
13 Jun 2023