2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
ArXiv (abs)
PDF
HTML
HuggingFace (1 upvote)
Github (7200★)
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 603 papers shown
Getting the most out of your tokenizer for pre-training and domain adaptation
Gautier Dagan
Gabriele Synnaeve
Baptiste Rozière
356
57
0
01 Feb 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
651
550
0
01 Feb 2024
Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better
Shengchao Liu
Xiaoming Liu
Yichen Wang
Zehua Cheng
Chengzhengxu Li
Zhaohan Zhang
Y. Lan
Chao Shen
DeLMO
224
14
0
01 Feb 2024
Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction
Philipp Wicke
132
4
0
31 Jan 2024
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
268
22
0
30 Jan 2024
NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness
Manav Singhal
Tushar Aggarwal
Abhijeet Awasthi
Nagarajan Natarajan
Aditya Kanade
287
24
0
29 Jan 2024
Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
Timothy Baldwin
LRM
250
49
0
28 Jan 2024
OMPGPT: A Generative Pre-trained Transformer Model for OpenMP
European Conference on Parallel Processing (Euro-Par), 2024
Le Chen
Arijit Bhattacharjee
Nesreen Ahmed
N. Hasabnis
Gal Oren
Vy A. Vo
Ali Jannesari
VLM
224
23
0
28 Jan 2024
A Survey on Data Augmentation in Large Model Era
Yue Zhou
Chenlu Guo
Xu Wang
Yi-Ju Chang
Yuan Wu
LM&MA
VLM
485
49
0
27 Jan 2024
DsDm: Model-Aware Dataset Selection with Datamodels
International Conference on Machine Learning (ICML), 2024
Logan Engstrom
Axel Feldmann
Aleksander Madry
OODD
286
90
0
23 Jan 2024
Enhancing In-context Learning via Linear Probe Calibration
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Momin Abbas
Yi Zhou
Parikshit Ram
Nathalie Baracaldo
Horst Samulowitz
Theodoros Salonidis
Tianyi Chen
242
17
0
22 Jan 2024
Text Embedding Inversion Security for Multilingual Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yiyi Chen
Heather Lent
Johannes Bjerva
441
24
0
22 Jan 2024
AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology
Rongqing Cong
Wenyang He
Mingxuan Li
Bangning Luo
Zebin Yang
Yuchao Yang
Ru Huang
Bonan Yan
55
4
0
21 Jan 2024
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation
Zdeněk Kasner
Ondrej Dusek
331
22
0
18 Jan 2024
Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text
USENIX Security Symposium (USENIX Security), 2024
Mazal Bethany
Brandon Wherry
Emet Bethany
Nishant Vishwamitra
Anthony Rios
Peyman Najafirad
DeLMO
224
13
0
17 Jan 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Vasu Sharma
Amitava Das
215
41
0
15 Jan 2024
Extending LLMs' Context Window with 100 Samples
Yikai Zhang
Junlong Li
Pengfei Liu
213
17
0
13 Jan 2024
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Anton Voronov
Lena Wolf
Max Ryabinin
332
72
0
12 Jan 2024
Chain of History: Learning and Forecasting with LLMs for Temporal Knowledge Graph Completion
Ruilin Luo
Tianle Gu
Haoling Li
Junzhe Li
Zicheng Lin
Jiayi Li
Yujiu Yang
AI4CE
425
15
0
11 Jan 2024
Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Shuai Zhao
Meihuizi Jia
Anh Tuan Luu
Fengjun Pan
Jinming Wen
AAML
489
70
0
11 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
ZuJie Wen
Ke Xu
Qi Li
321
99
0
11 Jan 2024
How predictable is language model benchmark performance?
David Owen
ELM
LRM
248
32
0
09 Jan 2024
Exploring Prompt-Based Methods for Zero-Shot Hypernym Prediction with Large Language Models
M. Tikhomirov
Natalia Loukachevitch
117
0
0
09 Jan 2024
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models
Junyi Li
Jie Chen
Ruiyang Ren
Xiaoxue Cheng
Wayne Xin Zhao
Jian-Yun Nie
Ji-Rong Wen
HILM
265
109
0
06 Jan 2024
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2024
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
532
50
0
03 Jan 2024
Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning
ACM Transactions on Management Information Systems (ACM TMIS), 2023
Xiao-Yang Liu
Rongyi Zhu
Daochen Zha
Jiechao Gao
Shan Zhong
Matt White
Yijia Zhao
279
48
0
29 Dec 2023
Spike No More: Stabilizing the Pre-training of Large Language Models
Sho Takase
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
425
28
0
28 Dec 2023
Large Language Models for Conducting Advanced Text Analytics Information Systems Research
Benjamin Ampel
Chi-Heng Yang
Junjie Hu
Hsinchun Chen
350
12
0
27 Dec 2023
MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks
Jingyao Li
Pengguang Chen
Hong Xu
Jiaya Jia
LRM
211
10
0
26 Dec 2023
Efficient LLM inference solution on Intel GPU
Hui Wu
Yi Gan
Feng Yuan
Jing Ma
Wei Zhu
...
Hong Zhu
Yuhua Zhu
Xiaoli Liu
Jinghui Gu
Peng Zhao
170
4
0
19 Dec 2023
kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning
Wenting Zhao
Ye Liu
Yao Wan
Yibo Wang
Qingyang Wu
Zhongfen Deng
Jiangshu Du
Shuaiqi Liu
Yunlong Xu
Philip S. Yu
200
11
0
17 Dec 2023
Paloma: A Benchmark for Evaluating Language Model Fit
Ian H. Magnusson
Akshita Bhagia
Valentin Hofmann
Luca Soldaini
A. Jha
...
Iz Beltagy
Hanna Hajishirzi
Noah A. Smith
Kyle Richardson
Jesse Dodge
332
47
0
16 Dec 2023
SECap: Speech Emotion Captioning with Large Language Model
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yaoxun Xu
Hangting Chen
Jianwei Yu
Qiaochu Huang
Zhiyong Wu
Shixiong Zhang
Guangzhi Li
Yi Luo
Rongzhi Gu
257
55
0
16 Dec 2023
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
Dirk Groeneveld
Anas Awadalla
Iz Beltagy
Akshita Bhagia
Ian H. Magnusson
Hao Peng
Oyvind Tafjord
Pete Walsh
Kyle Richardson
Jesse Dodge
265
2
0
15 Dec 2023
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Junhao Zheng
Shengjie Qiu
Qianli Ma
393
13
0
13 Dec 2023
An LLM Compiler for Parallel Function Calling
Sehoon Kim
Suhong Moon
Ryan Tabrizi
Nicholas Lee
Michael W. Mahoney
Kurt Keutzer
A. Gholami
LRM
380
114
0
07 Dec 2023
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition
Yukiya Hono
Koh Mitsuda
Tianyu Zhao
Kentaro Mitsui
Toshiaki Wakatsuki
Kei Sawada
AuLLM
258
16
0
06 Dec 2023
SmoothQuant+: Accurate and Efficient 4-bit Post-Training Weight Quantization for LLM
Jiayi Pan
Chengcan Wang
Kaifu Zheng
Yangguang Li
Zhenyu Wang
Bin Feng
MQ
185
7
0
06 Dec 2023
Scaling Laws for Adversarial Attacks on Language Model Activations
Stanislav Fort
140
21
0
05 Dec 2023
Efficient Online Data Mixing For Language Model Pre-Training
Alon Albalak
Liangming Pan
Colin Raffel
Wenjie Wang
310
65
0
05 Dec 2023
FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity
Shiyao Cui
Zhenyu Zhang
Yilong Chen
Wenyuan Zhang
Tianyun Liu
Siqi Wang
Tingwen Liu
220
21
0
30 Nov 2023
LLMs for Science: Usage for Code Generation and Data Analysis
Mohamed Nejjar
Luca Zacharias
Fabian Stiehle
Ingo Weber
235
60
0
28 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
144
3
0
27 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
264
91
0
22 Nov 2023
LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms
Aditi Jha
Sam Havens
Jeremey Dohmann
Alex Trott
Jacob P. Portes
ALM
123
15
0
22 Nov 2023
Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper
Chengyu Wang
Junbing Yan
Wei Zhang
Jun Huang
ALM
191
4
0
22 Nov 2023
AcademicGPT: Empowering Academic Research
Shufa Wei
Xiaolong Xu
Xianbiao Qi
Xi Yin
Jun Xia
...
Chihao Dai
Lihua Wang
Xiaohui Liu
Lei Zhang
Yutao Xie
LM&MA
219
5
0
21 Nov 2023
Investigating Data Contamination in Modern Benchmarks for Large Language Models
Chunyuan Deng
Yilun Zhao
Xiangru Tang
Mark B. Gerstein
Arman Cohan
AAML
ELM
390
111
0
16 Nov 2023
LongBoX: Evaluating Transformers on Long-Sequence Clinical Tasks
Mihir Parmar
Aakanksha Naik
Himanshu Gupta
Disha Agrawal
Chitta Baral
LM&MA
152
3
0
16 Nov 2023
zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Zifeng Ding
Heling Cai
Jingpei Wu
Yunpu Ma
Ruotong Liao
Bo Xiong
Volker Tresp
AI4TS
269
22
0
15 Nov 2023