GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown
Explaining Large Language Models with gSMILE
Zeinab Dehghani
Mohammed Naveed Akram
Adil Khan
Y. Papadopoulos
MILM
LRM
569
0
0
27 May 2025
Domain Gating Ensemble Networks for AI-Generated Text Detection
Arihant Tripathi
Liam Dugan
Charis Gao
Maggie Huan
Emma Jin
Peter Zhang
David Zhang
Julia Zhao
Chris Callison-Burch
VLM
211
0
0
20 May 2025
Vectors from Larger Language Models Predict Human Reading Time and fMRI Data More Poorly when Dimensionality Expansion is Controlled
Yi-Chien Lin
Hongao Zhu
William Schuler
206
3
0
18 May 2025
Automatic Calibration for Membership Inference Attack on Large Language Models
Saleh Zare Zade
Yao Qiang
Xiangyu Zhou
Hui Zhu
Mohammad Amin Roshani
Prashant Khanduri
Dongxiao Zhu
267
3
0
06 May 2025
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Christian Schroeder de Witt
AAML
AI4CE
1.1K
35
0
04 May 2025
Demystifying optimized prompts in language models
Rimon Melamed
Lucas H. McCabe
H. H. Huang
262
0
0
04 May 2025
An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding
Xiuwei Shang
Zhenkan Fu
Shaoyin Cheng
Guoqiang Chen
Gangyang Li
Li Hu
Weinan Zhang
N. Yu
258
1
0
30 Apr 2025
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising
Jingwen Cai
Sara Leckner
Johanna Björklund
218
0
0
30 Apr 2025
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
344
3
0
25 Apr 2025
DataS^3: Dataset Subset Selection for Specialization
Neha Hulkund
Alaa Maalouf
Levi Cai
Daniel Yang
Tsun-Hsuan Wang
...
Ken Goldberg
Hannah Kerner
Irene Chen
Yogesh A. Girdhar
Sara Beery
260
2
0
22 Apr 2025
How Private is Your Attention? Bridging Privacy with In-Context Learning
Soham Bonnerjee
Zhen Wei Yeon
Anna Asch
Sagnik Nandy
Promit Ghosal
322
0
0
22 Apr 2025
Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability
Daniel Hendriks
Philipp Spitzer
Niklas Kühl
G. Satzger
369
3
0
22 Apr 2025
Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Quanyu Long
Jianda Chen
Zhengyuan Liu
Nancy F. Chen
Wenya Wang
Sinno Jialin Pan
KELM
RALM
LRM
836
1
0
15 Apr 2025
Iterative Self-Training for Code Generation via Reinforced Re-Ranking
European Conference on Information Retrieval (ECIR), 2025
Nikita Sorokin
I. Sedykh
Valentin Malykh
171
2
0
13 Apr 2025
Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Haotian Ye
Himanshu Jain
Chong You
A. Suresh
Haowei Lin
James Zou
Felix X. Yu
210
2
0
12 Apr 2025
Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries
Neil He
Jiahong Liu
Buze Zhang
N. Bui
Ali Maatouk
Menglin Yang
Irwin King
Melanie Weber
Rex Ying
275
4
0
11 Apr 2025
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
Manvi Agarwal
Changhong Wang
Gaël Richard
177
0
0
07 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
429
4
0
02 Apr 2025
Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion
Dongjun Wei
Minjia Mao
Xiao Fang
Michael Chau
DeLMO
264
3
0
01 Apr 2025
Shared Global and Local Geometry of Language Model Embeddings
Andrew Lee
Melanie Weber
F. Viégas
Martin Wattenberg
FedML
468
13
0
27 Mar 2025
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
European Conference on Computer Systems (EuroSys), 2025
Zhanda Zhu
Christina Giannoula
Muralidhar Andoorveedu
Qidong Su
Karttikeya Mangalam
Bojian Zheng
Gennady Pekhimenko
VLM
MoE
225
6
0
24 Mar 2025
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
International Conference on Learning Representations (ICLR), 2025
Roberto Garcia
Jerry Liu
Daniel Sorvisto
Sabri Eyuboglu
472
0
0
23 Mar 2025
Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets
Hamed Jelodar
Mohammad Meymani
Roozbeh Razavi-Far
263
17
0
21 Mar 2025
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ying Shen
Lifu Huang
342
2
0
20 Mar 2025
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference
M. Beck
Korbinian Poppel
Phillip Lippe
Richard Kurle
P. Blies
Günter Klambauer
Sebastian Böck
Sepp Hochreiter
LRM
275
11
0
17 Mar 2025
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning
Yuan Jiang
Yujian Zhang
Liang Lu
Christoph Treude
Xiaohong Su
Shan Huang
Tiantian Wang
ALM
288
2
0
12 Mar 2025
DependEval: Benchmarking LLMs for Repository Dependency Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junjia Du
Yadi Liu
Hongcheng Guo
Jiawei Wang
Haojian Huang
Yunyi Ni
Zhiyu Li
169
9
0
09 Mar 2025
L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen
Oriol Mayné i Comas
Zhuotao Jin
Di Luo
Marin Soljacic
311
5
0
06 Mar 2025
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Kristian Kuznetsov
Laida Kushnareva
Polina Druzhinina
Anton Razzhigaev
Anastasia Voznyuk
Irina Piontkovskaya
Evgeny Burnaev
Serguei Barannikov
247
5
0
05 Mar 2025
Zero-Shot Multi-Label Classification of Bangla Documents: Large Decoders Vs. Classic Encoders
Souvika Sarkar
M. Hasan
S. Karmaker
256
1
0
04 Mar 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Zhiyu Li
Yu Li
296
3
0
25 Feb 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
368
5
0
25 Feb 2025
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
Layba Fiaz
Munief Hassan Tahir
Sana Shams
Sarmad Hussain
220
1
0
24 Feb 2025
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
International Conference on Learning Representations (ICLR), 2025
Zixuan Gong
Xiaolin Hu
Huayi Tang
Yong Liu
333
2
0
24 Feb 2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
Hantao Lou
Changye Li
Yalan Qin
Yaodong Yang
360
6
0
22 Feb 2025
Revealing and Mitigating Over-Attention in Knowledge Editing
International Conference on Learning Representations (ICLR), 2025
Pinzheng Wang
Zecheng Tang
Keyan Zhou
Junlin Li
Qiaoming Zhu
Hao Fei
KELM
577
4
0
21 Feb 2025
Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models
Ranjan Sapkota
Shaina Raza
Manoj Karkee
266
15
0
21 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
627
8
0
10 Feb 2025
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
Xin Zhou
Martin Weyssow
Ratnadira Widyasari
Ting Zhang
Junda He
Yunbo Lyu
Jianming Chang
Beiqi Zhang
Dan Huang
David Lo
PILM
994
30
0
10 Feb 2025
LCTG Bench: LLM Controlled Text Generation Benchmark
Kemal Kurniawan
Masato Mita
Peinan Zhang
S. Sasaki
Ryosuke Ishigami
Naoaki Okazaki
275
0
0
28 Jan 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Nicolas Boizard
Kevin El Haddad
Céline Hudelot
Pierre Colombo
449
27
0
28 Jan 2025
Complete Chess Games Enable LLM Become A Chess Master
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Yinqi Zhang
Xintian Han
Haolong Li
Kedi Chen
Shaohui Lin
ReLM
ELM
275
9
0
26 Jan 2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar
Harshay Shah
Dan Busbridge
Alaaeldin Mohamed Elnouby Ali
J. Susskind
Vimal Thilak
MoE
LRM
537
25
0
21 Jan 2025
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
Ziyao Zhang
Yanlin Wang
Chong Wang
Jiachi Chen
Zibin Zheng
430
84
0
20 Jan 2025
On the Consideration of AI Openness: Can Good Intent Be Abused?
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yeeun Kim
Eunkyung Choi
Hyunjun Kim
Hongseok Oh
Hyunseo Shin
Wonseok Hwang
SILM
337
3
0
08 Jan 2025
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Neural Information Processing Systems (NeurIPS), 2024
Hadi Pouransari
Chun-Liang Li
Jen-Hao Rick Chang
Pavan Kumar Anasosalu Vasu
Cem Koc
Vaishaal Shankar
Oncel Tuzel
341
23
0
08 Jan 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tianyu Zheng
Ge Zhang
Shangda Wu
Xueling Liu
Bill Yuchen Lin
Jie Fu
Lei Ma
Xiang Yue
SyDa
482
202
0
08 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
PLOS Digital Health (PDH), 2024
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
570
17
0
08 Jan 2025
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning
International Conference on High Performance Computing (HiPC), 2024
Lang Xu
Quentin G. Anthony
Jacob Hatef
Hari Subramoni
Dhabaleswar K. Panda
335
1
0
08 Jan 2025
HuRef: HUman-REadable Fingerprint for Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Boyi Zeng
Cheng Zhou
Yuncong Hu
Yi Xu
Chenghu Zhou
Xiang Wang
Yu Yu
Zhouhan Lin
376
30
0
08 Jan 2025