ResearchTrend.AI
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 554 papers shown
Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling
Akshat Gupta
ELM
LRM
16
7
0
12 Jun 2023
Gradient Ascent Post-training Enhances Language Model Generalization
Dongkeun Yoon
Joel Jang
Sungdong Kim
Minjoon Seo
VLM
AI4CE
11
3
0
12 Jun 2023
Recurrent Attention Networks for Long-text Modeling
Xianming Li
Zongxi Li
Xiaotian Luo
Haoran Xie
Xing Lee
Yingbin Zhao
Fu Lee Wang
Qing Li
RALM
20
15
0
12 Jun 2023
Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability
Jiacheng Ye
Xijia Tao
Lingpeng Kong
LRM
33
22
0
11 Jun 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
William Chen
Xuankai Chang
Yifan Peng
Zhaoheng Ni
Soumi Maiti
Shinji Watanabe
SSL
18
25
0
11 Jun 2023
A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text
Jessica Nayeli López Espejel
Mahaman Sanoussi Yahaya Alassan
El Mehdi Chouham
Walid Dahhane
E. Ettifouri
21
12
0
10 Jun 2023
S$^{3}$: Increasing GPU Utilization during Generative Inference for Higher Throughput
Yunho Jin
Chun-Feng Wu
David Brooks
Gu-Yeon Wei
29
62
0
09 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
42
29
0
09 Jun 2023
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
Zhouhong Gu
Xiaoxuan Zhu
Haoning Ye
Lin Zhang
Jianchen Wang
...
Zili Wang
Shusen Wang
Weiguo Zheng
Hongwei Feng
Yanghua Xiao
ALM
ELM
22
58
0
09 Jun 2023
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Potsawee Manakul
Yassir Fathullah
Adian Liusie
Vyas Raina
Vatsal Raina
Mark J. F. Gales
27
12
0
08 Jun 2023
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance
Qianqian Xie
Weiguang Han
Xiao Zhang
Yanzhao Lai
Min Peng
Alejandro Lopez-Lira
Jimin Huang
ALM
10
135
0
08 Jun 2023
INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
Yew Ken Chia
Pengfei Hong
Lidong Bing
Soujanya Poria
ELM
25
61
0
07 Jun 2023
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Sophie F. Jentzsch
Kristian Kersting
LRM
6
33
0
07 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
19
6
0
05 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions
Michael Ogezi
B. Hauer
Grzegorz Kondrak
VLM
8
0
0
05 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Hang Zhang
Xin Li
Lidong Bing
MLLM
53
948
0
05 Jun 2023
LexGPT 0.1: pre-trained GPT-J models with Pile of Law
Jieh-Sheng Lee
AILaw
11
10
0
05 Jun 2023
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
28
29
0
04 Jun 2023
Harnessing large-language models to generate private synthetic text
Alexey Kurakin
Natalia Ponomareva
Umar Syed
Liam MacDermed
Andreas Terzis
SILM
SyDa
25
34
0
02 Jun 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
Aleksandra Piktus
Odunayo Ogundepo
Christopher Akiki
Akintunde Oladipo
Xinyu Crystina Zhang
Hailey Schoelkopf
Stella Biderman
Martin Potthast
Jimmy J. Lin
CVBM
36
10
0
02 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
27
744
0
01 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
25
463
0
01 Jun 2023
Transformers learn to implement preconditioned gradient descent for in-context learning
Kwangjun Ahn
Xiang Cheng
Hadi Daneshmand
S. Sra
ODL
17
147
0
01 Jun 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq R. Joty
J. Huang
LM&MA
ELM
ALM
41
178
0
29 May 2023
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
Xianjun Yang
Wei Cheng
Yue Wu
Linda R. Petzold
William Yang Wang
Haifeng Chen
DeLMO
28
84
0
27 May 2023
RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation
Gabriele Sarti
Phu Mon Htut
Xing Niu
B. Hsu
Anna Currey
Georgiana Dinu
Maria Nadejde
LRM
37
9
0
26 May 2023
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
DeLMO
30
20
0
26 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
B. Li
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
19
68
0
25 May 2023
Scaling Data-Constrained Language Models
Niklas Muennighoff
Alexander M. Rush
Boaz Barak
Teven Le Scao
Aleksandra Piktus
Nouamane Tazi
S. Pyysalo
Thomas Wolf
Colin Raffel
ALM
21
197
0
25 May 2023
Training Data Extraction From Pre-trained Language Models: A Survey
Shotaro Ishihara
24
46
0
25 May 2023
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
Shufan Wang
Sébastien Jean
Sailik Sengupta
James Gung
Nikolaos Pappas
Yi Zhang
31
6
0
24 May 2023
ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
Heming Xia
Qingxiu Dong
Lei Li
Jingjing Xu
Tianyu Liu
Ziwei Qin
Zhifang Sui
MLLM
VLM
16
3
0
24 May 2023
Editing Common Sense in Transformers
Anshita Gupta
Debanjan Mondal
Akshay Krishna Sheshadri
Wenlong Zhao
Xiang Lorraine Li
Sarah Wiegreffe
Niket Tandon
KELM
29
20
0
24 May 2023
Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola
Xuhui Zhou
Elizabeth Clark
Maarten Sap
23
6
0
24 May 2023
Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs
Xiaoyang Song
Akshat Gupta
Kiyan Mohebbizadeh
Shujie Hu
Anant Singh
24
25
0
24 May 2023
BAND: Biomedical Alert News Dataset
Z. Fu
Meiru Zhang
Zaiqiao Meng
Yannan Shen
David L. Buckeridge
Nigel Collier
17
3
0
23 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
15
128
0
23 May 2023
Active Learning Principles for In-Context Learning with Large Language Models
Katerina Margatina
Timo Schick
Nikolaos Aletras
Jane Dwivedi-Yu
22
39
0
23 May 2023
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks
Tiedong Liu
K. H. Low
ALM
28
81
0
23 May 2023
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text
Jinyan Su
Terry Yue Zhuo
Di Wang
Preslav Nakov
DeLMO
27
121
0
23 May 2023
Learning from Mistakes via Cooperative Study Assistant for Large Language Models
Danqing Wang
Lei Li
23
6
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
30
19
0
23 May 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
34
80
0
23 May 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
27
15
0
23 May 2023
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
Zack Sating
A. Bhatele
25
3
0
22 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos
Arthur Bražinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL
LRM
20
14
0
22 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
30
4
0
22 May 2023
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
Jinghan Yao
Nawras Alnaasan
Tianrun Chen
A. Shafi
Hari Subramoni
Dhabaleswar K. Panda
24
2
0
22 May 2023
MAGE: Machine-generated Text Detection in the Wild
Yafu Li
Qintong Li
Leyang Cui
Wei Bi
Zhilin Wang
Longyue Wang
Linyi Yang
Shuming Shi
Yue Zhang
DeLMO
39
41
0
22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
30
278
0
22 May 2023