ResearchTrend.AI
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 554 papers shown
Universal Length Generalization with Turing Programs
Kaiying Hou
David Brandfonbrener
Sham Kakade
Samy Jelassi
Eran Malach
40
7
0
03 Jul 2024
Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model
Xia Hou
Qifeng Li
Jian Yang
Tongliang Li
Linzheng Chai
...
Hangyuan Ji
Zhoujun Li
Jixuan Nie
Jingbo Dun
Wenfeng Song
25
3
0
03 Jul 2024
Efficient Training of Language Models with Compact and Consistent Next Token Distributions
Ashutosh Sathe
Sunita Sarawagi
32
0
0
03 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
66
6
0
02 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min-Bin Lin
MoE
67
36
1
01 Jul 2024
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
Rui Pan
Jipeng Zhang
Xingyuan Pan
Renjie Pi
Xiaoyu Wang
Tong Zhang
45
5
0
28 Jun 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
50
19
0
27 Jun 2024
RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulaiton
Fanfan Liu
Feng Yan
Liming Zheng
Chengjian Feng
Yiyang Huang
Lin Ma
LM&Ro
23
11
0
27 Jun 2024
Enhancing Data Privacy in Large Language Models through Private Association Editing
Davide Venditti
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Cristina Giannone
Andrea Favalli
Raniero Romagnoli
Fabio Massimo Zanzotto
KELM
30
2
0
26 Jun 2024
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
34
11
0
25 Jun 2024
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
37
3
0
24 Jun 2024
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods
Roy Xie
Junlin Wang
Ruomin Huang
Minxing Zhang
Rong Ge
Jian Pei
Neil Zhenqiang Gong
Bhuwan Dhingra
MIALM
40
11
0
23 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
72
28
1
23 Jun 2024
Evaluating Diversity in Automatic Poetry Generation
Yanran Chen
Hannes Groner
Sina Zarrieß
Steffen Eger
34
8
0
21 Jun 2024
Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models
Dohyun Lee
Daniel Rim
Minseok Choi
Jaegul Choo
PILM
MU
57
4
0
20 Jun 2024
VELO: A Vector Database-Assisted Cloud-Edge Collaborative LLM QoS Optimization Framework
Zhi Yao
Zhiqing Tang
Jiong Lou
Ping Shen
Weijia Jia
40
7
0
19 Jun 2024
Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG
William Merrill
Noah A. Smith
Yanai Elazar
ELM
TDI
35
9
0
18 Jun 2024
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding
Zayd Muhammad Kawakibi Zuhri
Muhammad Farid Adilazuarda
Ayu Purwarianti
Alham Fikri Aji
37
7
0
13 Jun 2024
State Soup: In-Context Skill Learning, Retrieval and Mixing
Maciej Pióro
Maciej Wołczyk
Razvan Pascanu
J. Oswald
João Sacramento
25
1
0
12 Jun 2024
Evaluating Zero-Shot Long-Context LLM Compression
Chenyu Wang
Yihan Wang
Kai Li
49
0
0
10 Jun 2024
Causal Estimation of Memorisation Profiles
Pietro Lesci
Clara Meister
Thomas Hofmann
Andreas Vlachos
Tiago Pimentel
43
5
0
06 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
39
2
0
06 Jun 2024
BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning
Artem Zholus
Maksim Kuznetsov
Roman Schutski
Rim Shayakhmetov
Daniil Polykovskiy
Sarath Chandar
Alex Zhavoronkov
DiffM
AI4CE
35
4
0
06 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
45
7
0
04 Jun 2024
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models
Cheng-Hsun Hsueh
Paul Kuo-Ming Huang
Tzu-Han Lin
Che-Wei Liao
Hung-Chieh Fang
Chao-Wei Huang
Yun-Nung Chen
KELM
31
5
0
03 Jun 2024
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models
Ken Deng
Jiaheng Liu
He Zhu
Congnan Liu
Jingxin Li
...
Yuanxing Zhang
Wenbo Su
Bangyu Xiang
Tiezheng Ge
Bo Zheng
40
2
0
03 Jun 2024
A Survey on Large Language Models for Code Generation
Juyong Jiang
Fan Wang
Jiasi Shen
Sungju Kim
Sunghun Kim
40
158
0
01 Jun 2024
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
Jinyin Chen
Xiaoming Zhao
Haibin Zheng
Xiao Li
Sheng Xiang
Haifeng Guo
AAML
25
3
0
01 Jun 2024
Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations
Zilin Ma
Susannah Su
Nathan Zhao
Linn Bieske
...
Boxiang Wang
Jinglun Gao
Zihan Wen
Claude Bruderlein
Weiwei Pan
17
0
0
30 May 2024
Faster Cascades via Speculative Decoding
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
Seungyeon Kim
Neha Gupta
A. Menon
Sanjiv Kumar
LRM
44
6
0
29 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
46
9
0
27 May 2024
Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering
Hongyu Yang
Liyang He
Min Hou
Shuanghong Shen
Rui Li
Jiahui Hou
Jianhui Ma
Junda Zhao
27
4
0
27 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
François Yvon
Andy Zou
ELM
ALM
130
52
3
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub
Cailean Osborne
Jennifer Ding
Hannah Rose Kirk
22
41
0
20 May 2024
Alternators For Sequence Modeling
Mohammad Reza Rezaei
Adji Bousso Dieng
26
0
0
20 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
33
12
0
17 May 2024
IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining
Dawei Feng
Yihai Zhang
Zhixuan Xu
SyDa
22
0
0
16 May 2024
Zero-Shot Tokenizer Transfer
Benjamin Minixhofer
E. Ponti
Ivan Vulić
VLM
44
9
0
13 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
32
0
0
09 May 2024
Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models
Sander Land
Max Bartolo
26
20
0
08 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
33
3
0
07 May 2024
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models
Hongzhan Lin
Zixin Chen
Ziyang Luo
Mingfei Cheng
Jing Ma
Guang Chen
31
6
0
01 May 2024
A Careful Examination of Large Language Model Performance on Grade School Arithmetic
Hugh Zhang
Jeff Da
Dean Lee
Vaughn Robinson
Catherine Wu
...
Qin Lyu
Sean Hendryx
Russell Kaplan
Michele Lunati
Summer Yue
ALM
LRM
ELM
27
92
0
01 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
58
17
0
30 Apr 2024
Text Quality-Based Pruning for Efficient Training of Language Models
Vasu Sharma
Karthik Padthe
Newsha Ardalani
Kushal Tirumala
Russell Howes
...
Po-Yao Huang
Shang-Wen Li
Armen Aghajanyan
Gargi Ghosh
Luke Zettlemoyer
44
5
0
26 Apr 2024
A Survey on Retrieval-Augmented Text Generation for Large Language Models
Yizheng Huang
Jimmy X. Huang
3DV
RALM
58
44
0
17 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
41
42
0
15 Apr 2024
JaFIn: Japanese Financial Instruction Dataset
Kota Tanabe
Masahiro Suzuki
Hiroki Sakaji
Itsuki Noda
39
1
0
14 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
30
7
0
13 Apr 2024