GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022

ArXiv (abs) | PDF | HTML | HuggingFace (1 upvote) | GitHub (7200★)

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown

User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue

254

23 Sep 2023

Knowledge Sanitization of Large Language Models

Yoichi Ishibashi

Hidetoshi Shimodaira

KELM

254

21 Sep 2023

SlimPajama-DC: Understanding Data Combinations for LLM Training

...

434

19 Sep 2023

CFGPT: Chinese Financial Assistant with Large Language Model

Dawei Cheng

179

19 Sep 2023

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Proceedings of the VLDB Endowment (PVLDB), 2023

Zhen Zheng

Yong Li

170

19 Sep 2023

Generative modeling, design and analysis of spider silk protein sequences for enhanced mechanical properties

Advanced Functional Materials (Adv. Funct. Mater.), 2023

Wei Lu

David L. Kaplan

Markus J. Buehler

159

18 Sep 2023

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?

Arman Cohan

268

16 Sep 2023

CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

138

15 Sep 2023

CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Rachneet Sachdeva

Martin Tutek

Iryna Gurevych

OODD

311

14 Sep 2023

EarthPT: a time series foundation model for Earth Observation

219

13 Sep 2023

From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models

BigData Congress [Services Society] (BSS), 2023

Masahiro Suzuki

Masanori Hirano

Hiroki Sakaji

282

07 Sep 2023

Data-Juicer: A One-Stop Data Processing System for Large Language Models

...

Jingren Zhou

297

05 Sep 2023

RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

...

Pareesa Ameneh Golnari

Yuxiong He

253

02 Sep 2023

YaRN: Efficient Context Window Extension of Large Language Models

International Conference on Learning Representations (ICLR), 2023

392

403

31 Aug 2023

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

Wei Zhang

199

27 Aug 2023

Code Llama: Open Foundation Models for Code

Baptiste Rozière

...

Louis Martin

457

2,786

24 Aug 2023

Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models

Alex Nyffenegger

Matthias Sturmer

Joel Niklaus

210

22 Aug 2023

Instruction Tuning for Large Language Models: A Survey

...

Jiwei Li

914

759

21 Aug 2023

Large Language Models for Software Engineering: A Systematic Literature Review

ACM Transactions on Software Engineering and Methodology (TOSEM), 2023

Kailong Wang

Haoyu Wang

358

743

21 Aug 2023

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

International Conference on Learning Representations (ICLR), 2023

...

800

624

18 Aug 2023

PMET: Precise Model Editing in a Transformer

AAAI Conference on Artificial Intelligence (AAAI), 2023

Shasha Li

519

178

17 Aug 2023

AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes

Zhaohui Li

Haitao Wang

Xinghua Jiang

426

14 Aug 2023

OctoPack: Instruction Tuning Code Large Language Models

International Conference on Learning Representations (ICLR), 2023

Niklas Muennighoff

359

186

14 Aug 2023

Large Language Models for Information Retrieval: A Survey

634

452

14 Aug 2023

Three Ways of Using Large Language Models to Evaluate Chat

167

12 Aug 2023

Bringing order into the realm of Transformer-based language models for artificial intelligence and law

Artificial Intelligence and Law (ICAIL), 2023

C. M. Greco

Andrea Tagarelli

AILaw

217

10 Aug 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

International Conference on Learning Representations (ICLR), 2023

Luke Zettlemoyer

276

08 Aug 2023

Large Language Model Prompt Chaining for Long Legal Document Classification

Dietrich Trautmann

ELM AILaw

149

08 Aug 2023

Continual Pre-Training of Large Language Models: How to (re)warm your model?

382

135

08 Aug 2023

Evaluating and Explaining Large Language Models for Code Using Syntactic Structures

David Nader-Palacio

Alejandro Velasco

Daniel Rodríguez-Cárdenas

Kevin Moran

Denys Poshyvanyk

219

07 Aug 2023

RecycleGPT: An Autoregressive Language Model with Recyclable Module

274

07 Aug 2023

Learning to Paraphrase Sentences to Different Complexity Levels

Transactions of the Association for Computational Linguistics (TACL), 2023

170

04 Aug 2023

TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer

Zhen Qin

...

Xiao Luo

Yu Qiao

Yiran Zhong

186

27 Jul 2023

Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners

107

27 Jul 2023

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Transactions of the Association for Computational Linguistics (TACL), 2023

364

227

24 Jul 2023

A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks

International Conference on Language Resources and Evaluation (LREC), 2023

237

22 Jul 2023

FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models

Yuwei Yin

Yazheng Yang

Jian Yang

Qi Liu

147

22 Jul 2023

FinGPT: Democratizing Internet-scale Data for Financial Large Language Models

Daochen Zha

250

19 Jul 2023

Overthinking the Truth: Understanding how Language Models Process False Demonstrations

International Conference on Learning Representations (ICLR), 2023

Danny Halawi

Jean-Stanislas Denain

Jacob Steinhardt

312

18 Jul 2023

On the application of Large Language Models for language teaching and assessment technology

...

261

17 Jul 2023

Generating Benchmarks for Factuality Evaluation of Language Models

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

247

123

13 Jul 2023

A Comprehensive Overview of Large Language Models

ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023

Saeed Anwar

Muhammad Usman

854

1,173

12 Jul 2023

QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models

Tommaso Pegolotti

Elias Frantar

Dan Alistarh

Markus Püschel

07 Jul 2023

Evaluating Biased Attitude Associations of Language Models in an Intersectional Context

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023

Shiva Omrani Sabbaghi

Robert Wolfe

Aylin Caliskan

201

07 Jul 2023

Several categories of Large Language Models (LLMs): A Short Survey

International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023

Saurabh Pahune

Manoj Chandrasekharan

AILaw

202

05 Jul 2023

Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

Entropy (Entropy), 2023

239

122

04 Jul 2023

InstructEval: Systematic Evaluation of Instruction Selection Methods

186

01 Jul 2023

Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023

Qi-Dong Ding

Pengfei Zheng

Shreyas Kudari

Shivaram Venkataraman

Zhao-jie Zhang

VLM OffRL

164

25 Jun 2023

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

...

755

474

24 Jun 2023

Long-range Language Modeling with Self-retrieval

Transactions of the Association for Computational Linguistics (TACL), 2023

Ohad Rubin

Jonathan Berant

RALM KELM

219

23 Jun 2023