GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown

Explaining Large Language Models with gSMILE

Zeinab Dehghani

Mohammed Naveed Akram

Adil Khan

Y. Papadopoulos

MILM LRM

569

27 May 2025

Domain Gating Ensemble Networks for AI-Generated Text Detection

211

20 May 2025

Vectors from Larger Language Models Predict Human Reading Time and fMRI Data More Poorly when Dimensionality Expansion is Controlled

Yi-Chien Lin

Hongao Zhu

William Schuler

206

18 May 2025

Automatic Calibration for Membership Inference Attack on Large Language Models

Mohammad Amin Roshani

Prashant Khanduri

Dongxiao Zhu

267

06 May 2025

Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents

Christian Schroeder de Witt

AAML AI4CE

1.1K

04 May 2025

Demystifying optimized prompts in language models

Rimon Melamed

Lucas H. McCabe

H. H. Huang

262

04 May 2025

An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding

258

30 Apr 2025

From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising

Jingwen Cai

Sara Leckner

Johanna Björklund

218

30 Apr 2025

Modes of Sequence Models and Learning Coefficients

Zhongtian Chen

Daniel Murfet

344

25 Apr 2025

DataS^3: Dataset Subset Selection for Specialization

...

260

22 Apr 2025

How Private is Your Attention? Bridging Privacy with In-Context Learning

322

22 Apr 2025

Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability

369

22 Apr 2025

Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

836

15 Apr 2025

Iterative Self-Training for Code Generation via Reinforced Re-Ranking

European Conference on Information Retrieval (ECIR), 2025

Nikita Sorokin

I. Sedykh

Valentin Malykh

171

13 Apr 2025

Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models

International Conference on Artificial Intelligence and Statistics (AISTATS), 2025

210

12 Apr 2025

Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries

275

11 Apr 2025

Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation

Manvi Agarwal

Changhong Wang

Gaël Richard

177

07 Apr 2025

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Jeffrey Li

Mohammadreza Armandpour

...

429

02 Apr 2025

Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion

264

01 Apr 2025

Shared Global and Local Geometry of Language Model Embeddings

468

27 Mar 2025

Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization

European Conference on Computer Systems (EuroSys), 2025

Zhanda Zhu

Christina Giannoula

Muralidhar Andoorveedu

225

24 Mar 2025

Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters

International Conference on Learning Representations (ICLR), 2025

472

23 Mar 2025

Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

Hamed Jelodar

Mohammad Meymani

Roozbeh Razavi-Far

263

21 Mar 2025

LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Ying Shen

Lifu Huang

342

20 Mar 2025

xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference

275

17 Mar 2025

Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning

288

12 Mar 2025

DependEval: Benchmarking LLMs for Repository Dependency Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

169

09 Mar 2025

L^2M: Mutual Information Scaling Law for Long-Context Language Modeling

311

06 Mar 2025

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

247

05 Mar 2025

Zero-Shot Multi-Label Classification of Bangla Documents: Large Decoders Vs. Classic Encoders

Souvika Sarkar

M. Hasan

S. Karmaker

256

04 Mar 2025

296

25 Feb 2025

LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation

368

25 Feb 2025

UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings

220

24 Feb 2025

Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization

International Conference on Learning Representations (ICLR), 2025

333

24 Feb 2025

SAE-V: Interpreting Multimodal Models for Enhanced Alignment

360

22 Feb 2025

Revealing and Mitigating Over-Attention in Knowledge Editing

International Conference on Learning Representations (ICLR), 2025

577

21 Feb 2025

Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models

Ranjan Sapkota

Shaina Raza

Manoj Karkee

266

21 Feb 2025

EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models

627

10 Feb 2025

LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks

994

10 Feb 2025

LCTG Bench: LLM Controlled Text Generation Benchmark

275

28 Jan 2025

Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs

449

28 Jan 2025

Complete Chess Games Enable LLM Become A Chess Master

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

275

26 Jan 2025

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Samira Abnar

Harshay Shah

Dan Busbridge

Alaaeldin Mohamed Elnouby Ali

J. Susskind

Vimal Thilak

MoE LRM

537

21 Jan 2025

LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation

430

20 Jan 2025

On the Consideration of AI Openness: Can Good Intent Be Abused?

AAAI Conference on Artificial Intelligence (AAAI), 2024

337

08 Jan 2025

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

Neural Information Processing Systems (NeurIPS), 2024

Hadi Pouransari

Chun-Liang Li

Jen-Hao Rick Chang

Pavan Kumar Anasosalu Vasu

Cem Koc

Vaishaal Shankar

Oncel Tuzel

341

08 Jan 2025

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

482

202

08 Jan 2025

Clinical Insights: A Comprehensive Review of Language Models in Medicine

PLOS Digital Health (PDH), 2024

570

08 Jan 2025

Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning

International Conference on High Performance Computing (HiPC), 2024

335

08 Jan 2025

HuRef: HUman-REadable Fingerprint for Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

376

08 Jan 2025