arXiv:2204.06745
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
ArXiv (abs) | PDF | HTML | HuggingFace (1 upvote) | GitHub (7200★)

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
430
35
0
09 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
162
4
0
08 Jul 2024
Looking into Black Box Code Language Models
Muhammad Umair Haider
Umar Farooq
A.B. Siddique
Mark Marron
244
6
0
05 Jul 2024
Leveraging Graph Structures to Detect Hallucinations in Large Language Models
Noa Nonkes
Sergei Agaronian
Evangelos Kanoulas
Roxana Petcu
166
5
0
05 Jul 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
...
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
605
184
0
05 Jul 2024
Universal Length Generalization with Turing Programs
Kaiying Hou
David Brandfonbrener
Sham Kakade
Samy Jelassi
Eran Malach
228
18
0
03 Jul 2024
Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model
Xia Hou
Qifeng Li
Jian Yang
Tongliang Li
Linzheng Chai
...
Hangyuan Ji
Zhoujun Li
Jixuan Nie
Jingbo Dun
Wenfeng Song
168
4
0
03 Jul 2024
Efficient Training of Language Models with Compact and Consistent Next Token Distributions
Ashutosh Sathe
Sunita Sarawagi
204
0
0
03 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
353
8
0
02 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
369
93
1
01 Jul 2024
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
Boyao Wang
Dylan Zhang
Hanning Zhang
Xingyuan Pan
Minrui Xu
Jipeng Zhang
Renjie Pi
Xiaoyu Wang
Tong Zhang
415
24
0
28 Jun 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
443
50
0
27 Jun 2024
RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulaiton
Fanfan Liu
Feng Yan
Liming Zheng
Chengjian Feng
Yiyang Huang
Lin Ma
LM&Ro
435
22
0
27 Jun 2024
Enhancing Data Privacy in Large Language Models through Private Association Editing
Davide Venditti
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Cristina Giannone
Andrea Favalli
Raniero Romagnoli
Fabio Massimo Zanzotto
KELM
176
7
0
26 Jun 2024
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
402
32
0
25 Jun 2024
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
305
8
0
24 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
689
56
1
23 Jun 2024
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods
Roy Xie
Junlin Wang
Ruomin Huang
Minxing Zhang
Rong Ge
Jian Pei
Neil Zhenqiang Gong
Bhuwan Dhingra
MIALM
568
38
0
23 Jun 2024
Evaluating Diversity in Automatic Poetry Generation
Yanran Chen
Hannes Groner
Sina Zarrieß
Steffen Eger
287
14
0
21 Jun 2024
Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models
Dohyun Lee
Daniel Rim
Minseok Choi
Jaegul Choo
PILM MU
192
11
0
20 Jun 2024
VELO: A Vector Database-Assisted Cloud-Edge Collaborative LLM QoS Optimization Framework
Zhi Yao
Zhiqing Tang
Jiong Lou
Ping Shen
Weijia Jia
270
18
0
19 Jun 2024
Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG
William Merrill
Noah A. Smith
Yanai Elazar
ELM TDI
395
14
0
18 Jun 2024
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding
Zayd Muhammad Kawakibi Zuhri
Muhammad Farid Adilazuarda
Ayu Purwarianti
Alham Fikri Aji
248
16
0
13 Jun 2024
State Soup: In-Context Skill Learning, Retrieval and Mixing
Maciej Pióro
Maciej Wołczyk
Razvan Pascanu
J. Oswald
João Sacramento
117
1
0
12 Jun 2024
Evaluating Zero-Shot Long-Context LLM Compression
Chenyu Wang
Yihan Wang
Kai Li
277
0
0
10 Jun 2024
Causal Estimation of Memorisation Profiles
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Pietro Lesci
Clara Meister
Thomas Hofmann
Andreas Vlachos
Tiago Pimentel
268
11
0
06 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
240
3
0
06 Jun 2024
BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning
Artem Zholus
Maksim Kuznetsov
Roman Schutski
Rim Shayakhmetov
Daniil Polykovskiy
Sarath Chandar
Alex Zhavoronkov
DiffM AI4CE
188
15
0
06 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
306
27
0
04 Jun 2024
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models
Cheng-Hsun Hsueh
Paul Kuo-Ming Huang
Tzu-Han Lin
Che-Wei Liao
Hung-Chieh Fang
Chao-Wei Huang
Yun-Nung Chen
KELM
218
9
0
03 Jun 2024
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models
Ken Deng
Jiaheng Liu
He Zhu
Congnan Liu
Jingxin Li
...
Yuchi Xu
Bangyu Xiang
Bo Zheng
B. Zheng
322
8
0
03 Jun 2024
A Survey on Large Language Models for Code Generation
Juyong Jiang
Fan Wang
Jiasi Shen
Sungju Kim
Sunghun Kim
508
503
0
01 Jun 2024
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
Jinyin Chen
Xiaoming Zhao
Haibin Zheng
Xiao Li
Sheng Xiang
Haifeng Guo
AAML
152
7
0
01 Jun 2024
Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations
Zilin Ma
Susannah Su
Nathan Zhao
Linn Bieske
...
Boxiang Wang
Jinglun Gao
Zihan Wen
Claude Bruderlein
Weiwei Pan
73
2
0
30 May 2024
Faster Cascades via Speculative Decoding
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
Seungyeon Kim
Neha Gupta
A. Menon
Sanjiv Kumar
LRM
360
19
0
29 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
269
22
0
27 May 2024
Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering
Hongyu Yang
Liyang He
Min Hou
Shuanghong Shen
Rui Li
Jiahui Hou
Jianhui Ma
Junda Zhao
154
5
0
27 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
François Yvon
Andy Zou
ELM ALM
357
102
3
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
880
164
0
23 May 2024
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub
Cailean Osborne
Jennifer Ding
Hannah Rose Kirk
228
59
0
20 May 2024
Alternators For Sequence Modeling
Mohammad Reza Rezaei
Adji Bousso Dieng
219
2
0
20 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
438
37
0
17 May 2024
IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining
Dawei Feng
Yihai Zhang
Zhixuan Xu
SyDa
115
0
0
16 May 2024
Zero-Shot Tokenizer Transfer
Neural Information Processing Systems (NeurIPS), 2024
Benjamin Minixhofer
Edoardo Ponti
Ivan Vulić
VLM
274
25
0
13 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
222
1
0
09 May 2024
Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sander Land
Max Bartolo
271
36
0
08 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Yang Li
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
284
18
0
07 May 2024
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models
Hongzhan Lin
Zixin Chen
Ziyang Luo
Mingfei Cheng
Jing Ma
Guang Chen
221
14
0
01 May 2024
A Careful Examination of Large Language Model Performance on Grade School Arithmetic
Hugh Zhang
Jeff Da
Dean Lee
Vaughn Robinson
Catherine Wu
...
Qin Lyu
Sean Hendryx
Russell Kaplan
Michele Lunati
Summer Yue
ALM LRM ELM
398
164
0
01 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
397
31
0
30 Apr 2024