GPT-NeoX-20B: An Open-Source Autoregressive Language Model


14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown
"We Demand Justice!": Towards Social Context Grounding of Political
  Texts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Rajkumar Pujari
Chengfei Wu
Dan Goldwasser
AILaw
345
5
0
15 Nov 2023
XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zichen Chen
Jianda Chen
Mitali Gaidhani
Ambuj K. Singh
Misha Sra
216
4
0
15 Nov 2023
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Nafis Irtiza Tripto
Saranya Venkatraman
Dominik Macko
Robert Moro
Ivan Srba
Adaku Uchendu
Thai V. Le
Dongwon Lee
DeLMO
360
26
0
14 Nov 2023
STEER: Unified Style Transfer with Expert Reinforcement
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Skyler Hallinan
Faeze Brahman
Ximing Lu
Jaehun Jung
Sean Welleck
Yejin Choi
OffRL
181
17
0
13 Nov 2023
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
International Conference on Machine Learning (ICML), 2023
Sheng Liu
Haotian Ye
Lei Xing
James Y. Zou
250
215
0
11 Nov 2023
Chain of Images for Intuitively Reasoning
Fanxu Meng
Haotong Yang
Yiding Wang
Muhan Zhang
LRM
239
16
0
09 Nov 2023
Ziya2: Data-centric Learning is All LLMs Need
Ruyi Gan
Ziwei Wu
Renliang Sun
Junyu Lu
Xiaojun Wu
...
Ping Yang
Qi Yang
Hao Wang
Jiaxing Zhang
Yan Song
VLM ALM
299
26
0
06 Nov 2023
Large language models implicitly learn to straighten neural sentence trajectories to construct a predictive representation of natural language
bioRxiv (bioRxiv), 2023
Eghbal A. Hosseini
Evelina Fedorenko
LLMSV
189
13
0
05 Nov 2023
Vision-Language Foundation Models as Effective Robot Imitators
International Conference on Learning Representations (ICLR), 2023
Xinghang Li
Minghuan Liu
Hanbo Zhang
Cunjun Yu
Jie Xu
...
Ya Jing
Weinan Zhang
Huaping Liu
Hang Li
Tao Kong
LM&Ro
494
308
0
02 Nov 2023
Predicting Question-Answering Performance of Large Language Models through Semantic Consistency
IEEE Games Entertainment Media Conference (IEEE GEM), 2023
Ella Rabinovich
Samuel Ackerman
Orna Raz
E. Farchi
Ateret Anaby-Tavor
554
32
0
02 Nov 2023
InstructCoder: Instruction Tuning Large Language Models for Code Editing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kaixin Li
Qisheng Hu
Xu Zhao
Hui Chen
Yuxi Xie
Tiedong Liu
Qizhe Xie
Junxian He
ALM SyDa
227
27
0
31 Oct 2023
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Nan He
Hanyu Lai
Chenyang Zhao
Zirui Cheng
Junting Pan
...
Zhaohui Hou
Zhiyuan Huang
Shaoqing Lu
Ding Liang
Mingjie Zhan
LRM
254
14
0
29 Oct 2023
FP8-LM: Training FP8 Large Language Models
Houwen Peng
Kan Wu
Yixuan Wei
Guoshuai Zhao
Yuxiang Yang
...
Zheng Zhang
Shuguang Liu
Joe Chau
Han Hu
Jun Zhou
MQ
307
63
0
27 Oct 2023
Evaluation of large language models using an Indian language LGBTI+ lexicon
AI Ethics Journal (JAE), 2023
Aditya Joshi
S. Rawat
A. Dange
109
1
0
26 Oct 2023
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
International Conference on Machine Learning (ICML), 2023
Alex Tamkin
Mohammad Taufeeque
Noah D. Goodman
214
41
0
26 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
International Conference on Machine Learning (ICML), 2023
Zichang Liu
Jue Wang
Tri Dao
Wanrong Zhu
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
357
280
0
26 Oct 2023
Detecting Pretraining Data from Large Language Models
International Conference on Learning Representations (ICLR), 2023
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
MIALM
436
315
0
25 Oct 2023
CLEX: Continuous Length Extrapolation for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Guanzheng Chen
Xin Li
Zaiqiao Meng
Shangsong Liang
Li Bing
287
38
0
25 Oct 2023
Locally Differentially Private Document Generation Using Zero Shot Prompting
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Saiteja Utpala
Sara Hooker
Pin-Yu Chen
274
59
0
24 Oct 2023
BLESS: Benchmarking Large Language Models on Sentence Simplification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tannon Kew
Alison Chi
Laura Vásquez-Rodríguez
Sweta Agrawal
Dennis Aumiller
Fernando Alva-Manchego
Teven Le Scao
239
53
0
24 Oct 2023
Function Vectors in Large Language Models
International Conference on Learning Representations (ICLR), 2023
Eric Todd
Millicent Li
Arnab Sen Sharma
Aaron Mueller
Byron C. Wallace
David Bau
324
182
0
23 Oct 2023
Geographical Erasure in Language Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Pola Schwöbel
Jacek Golebiowski
Michele Donini
Cédric Archambeau
Danish Pruthi
181
11
0
23 Oct 2023
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
Young-Suk Lee
Md Arafat Sultan
Yousef El-Kurdi
Tahira Naseem
Asim Munawar
Radu Florian
Salim Roukos
Ramón Fernández Astudillo
SyDa
223
7
0
21 Oct 2023
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Neural Information Processing Systems (NeurIPS), 2023
Yangruibo Ding
Zijian Wang
Wasi Uddin Ahmad
Hantian Ding
Ming Tan
...
M. K. Ramanathan
Ramesh Nallapati
Parminder Bhatia
Dan Roth
Bing Xiang
ELM
280
194
0
17 Oct 2023
H2O Open Ecosystem for State-of-the-art Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Arno Candel
Jon McKinney
Philipp Singer
Pascal Pfeiffer
Maximilian Jeblick
Chun Ming Lee
Marcos V. Conde
VLM
152
5
0
17 Oct 2023
Llemma: An Open Language Model For Mathematics
International Conference on Learning Representations (ICLR), 2023
Zhangir Azerbayev
Hailey Schoelkopf
Keiran Paster
Marco Dos Santos
Alexander Shmakov
Albert Q. Jiang
Jia Deng
Stella Biderman
Sean Welleck
CLL
331
388
0
16 Oct 2023
Generative Calibration for In-context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zhongtao Jiang
Yuanzhe Zhang
Cao Liu
Jun Zhao
Kang Liu
416
22
0
16 Oct 2023
Unsupervised Domain Adaption for Neural Information Retrieval
Carlos Dominguez
Jon Ander Campos
Eneko Agirre
Gorka Azkune
167
0
0
13 Oct 2023
SeqXGPT: Sentence-Level AI-Generated Text Detection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Pengyu Wang
Linyang Li
Ke Ren
Botian Jiang
Dong Zhang
Xipeng Qiu
DeLMO
389
87
0
13 Oct 2023
Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model
Pacific Asia Conference on Language, Information and Computation (PACLIC), 2023
Kosuke Takahashi
Takahiro Omi
Kosuke Arima
Tatsuya Ishigaki
162
2
0
12 Oct 2023
GenTKG: Generative Forecasting on Temporal Knowledge Graph with Large Language Models
Ruotong Liao
Xu Jia
Yangzhe Li
Yunpu Ma
Volker Tresp
AI4TS
404
49
0
11 Oct 2023
LLMs Killed the Script Kiddie: How Agents Supported by Large Language Models Change the Landscape of Network Threat Testing
Stephen Moskal
Sam Laney
Erik Hemberg
Una-May O’Reilly
207
31
0
10 Oct 2023
CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model
Peng Di
Jianguo Li
Hang Yu
Wei Jiang
Wenting Cai
...
Zelin Zhao
Xunjin Zheng
Hailian Zhou
Lifu Zhu
Xianying Zhu
ELM ALM AI4CE
224
26
0
10 Oct 2023
FABRIC: Automated Scoring and Feedback Generation for Essays
Jieun Han
Haneul Yoo
Jun-Hee Myung
Minsun Kim
Hyunseung Lim
...
Tak Yeon Lee
Hwajung Hong
Juho Kim
So-Yeon Ahn
Alice Oh
104
9
0
08 Oct 2023
MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yifan Wei
Yisong Su
Huanhuan Ma
Xiaoyan Yu
Fangyu Lei
Yuanzhe Zhang
Jun Zhao
Kang Liu
LRM
242
19
0
08 Oct 2023
Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature
International Conference on Learning Representations (ICLR), 2023
Guangsheng Bao
Yanbin Zhao
Zhiyang Teng
Linyi Yang
Yue Zhang
346
251
0
08 Oct 2023
How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tharindu Kumarage
Paras Sheth
Raha Moraffah
Joshua Garland
Huan Liu
DeLMO
181
32
0
08 Oct 2023
Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain
Gerald Woo
Chenghao Liu
Akshat Kumar
Doyen Sahoo
AI4TS AI4CE
361
18
0
08 Oct 2023
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction
International Conference on Learning Representations (ICLR), 2023
Oscar Sainz
Iker García-Ferrero
Rodrigo Agerri
Oier López de Lacalle
German Rigau
Eneko Agirre
706
139
0
05 Oct 2023
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zeyuan Wang
Qiang Zhang
Keyan Ding
Ming Qin
Zhuang Xiang
Xiaotong Li
Huajun Chen
245
36
0
05 Oct 2023
Low Resource Summarization using Pre-trained Language Models
Mubashir Munaf
Hammad Afzal
N. Iltaf
Khawir Mahmood
180
15
0
04 Oct 2023
Large Language Models for Test-Free Fault Localization
International Conference on Software Engineering (ICSE), 2023
Aidan Z. H. Yang
Ruben Martins
Claire Le Goues
Vincent J. Hellendoorn
LRM
234
161
0
03 Oct 2023
Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models
Jean Kaddour
Qi Liu
SyDa
241
3
0
02 Oct 2023
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Chia-Yuan Chang
Helen Zhou
157
15
0
01 Oct 2023
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Ansong Ni
Pengcheng Yin
Yilun Zhao
Chen Wei
Yanjun Wang
...
Mingyuan Zhang
Chen Change Loy
Yingbo Zhou
Dragomir R. Radev
Arman Cohan
ELM
246
29
0
29 Sep 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
822
3,094
0
28 Sep 2023
Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith
Ali Shahin Shamsabadi
Carolyn Ashurst
Adrian Weller
PILM
484
41
0
27 Sep 2023
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning
Automatic Speech Recognition & Understanding (ASRU), 2023
William Chen
Jiatong Shi
Brian Yan
Dan Berrebbi
Wangyou Zhang
Yifan Peng
Xuankai Chang
Soumi Maiti
Shinji Watanabe
262
13
0
26 Sep 2023
Physics of Language Models: Part 3.2, Knowledge Manipulation
International Conference on Learning Representations (ICLR), 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
406
142
0
25 Sep 2023
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
International Conference on Machine Learning (ICML), 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
533
237
0
25 Sep 2023