GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
arXiv:2204.06745

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 603 papers shown
Text Quality-Based Pruning for Efficient Training of Language Models
Vasu Sharma
Karthik Padthe
Newsha Ardalani
Kushal Tirumala
Russell Howes
...
Po-Yao Huang
Shang-Wen Li
Armen Aghajanyan
Gargi Ghosh
Luke Zettlemoyer
278
8
0
26 Apr 2024
A Survey on Retrieval-Augmented Text Generation for Large Language Models
Yizheng Huang
Jimmy X. Huang
3DV, RALM
321
91
0
17 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
461
90
0
15 Apr 2024
JaFIn: Japanese Financial Instruction Dataset
Kota Tanabe
Masahiro Suzuki
Hiroki Sakaji
Itsuki Noda
166
2
0
14 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
284
15
0
13 Apr 2024
Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension
Mengnan Qi
Yufan Huang
Yongqiang Yao
Maoquan Wang
Bin Gu
Neel Sundaresan
204
6
0
13 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
317
136
0
08 Apr 2024
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2024
Yantao Liu
Zijun Yao
Xin Lv
Yuchen Fan
S. Cao
Jifan Yu
Lei Hou
Juanzi Li
236
3
0
04 Apr 2024
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models
Jingyang Zhang
Jingwei Sun
Eric C. Yeats
Ouyang Yang
Martin Kuo
Jianyi Zhang
Hao Frank Yang
Hai "Helen" Li
710
78
0
03 Apr 2024
Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs
International Symposium on Experimental Robotics (ISER), 2024
F. Lotfi
F. Faraji
Nikhil Kakodkar
Travis Manderson
David Meger
Gregory Dudek
LM&Ro
140
0
0
02 Apr 2024
Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments
Qianhui Zhao
Fang Liu
Li Zhang
Yang Liu
Zhen Yan
Zhenghao Chen
Yufei Zhou
Jing Jiang
Ge Li
183
0
0
02 Apr 2024
Release of Pre-Trained Models for the Japanese Language
International Conference on Language Resources and Evaluation (LREC), 2024
Kei Sawada
Tianyu Zhao
Makoto Shing
Kentaro Mitsui
Akio Kaga
Yukiya Hono
Toshiaki Wakatsuki
Koh Mitsuda
207
29
0
02 Apr 2024
Beyond One-Size-Fits-All: Multi-Domain, Multi-Task Framework for Embedding Model Selection
Vivek Khetan
86
0
0
30 Mar 2024
The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian
Andrea Esuli
Giovanni Puccetti
ELM
252
7
0
27 Mar 2024
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs
Guoqiang Chen
Xiuwei Shang
Shaoyin Cheng
Yanming Zhang
Weiming Zhang
Neng H. Yu
N. Yu
352
7
0
27 Mar 2024
Continual Few-shot Event Detection via Hierarchical Augmentation Networks
Chenlong Zhang
Pengfei Cao
Yubo Chen
Kang Liu
Qing Cui
Mengshu Sun
Jun Zhao
227
5
0
26 Mar 2024
Language Models for Text Classification: Is In-Context Learning Enough?
A. Edwards
Jose Camacho-Collados
LRM
253
53
0
26 Mar 2024
Large Language Models in Biomedical and Health Informatics: A Bibliometric Review
Huizi Yu
Lizhou Fan
Jinkui Chi
Jiayan Zhou
Zihui Ma
...
Sijia He
Haoyang Ling
Yongfeng Zhang
Ashvin Gandhi
Xin Ma
LM&MA
436
2
0
24 Mar 2024
Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach
Kun Sun
Rong Wang
Anders Søgaard
291
6
0
22 Mar 2024
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Orion Weller
Benjamin Chang
Sean MacAvaney
Kyle Lo
Arman Cohan
Benjamin Van Durme
Dawn J Lawrie
Luca Soldaini
309
62
0
22 Mar 2024
ChatGPT Alternative Solutions: Large Language Models Survey
H. Alipour
Nick Pendar
Kohinoor Roy
LM&MA
158
10
0
21 Mar 2024
Dated Data: Tracing Knowledge Cutoffs in Large Language Models
Jeffrey Cheng
Marc Marone
Orion Weller
Dawn J Lawrie
Daniel Khashabi
Benjamin Van Durme
286
47
0
19 Mar 2024
Rectifying Demonstration Shortcut in In-Context Learning
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Joonwon Jang
Sanghwan Jang
Wonbin Kweon
Minjin Jeon
Hwanjo Yu
347
4
0
14 Mar 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
165
1
0
13 Mar 2024
Language models scale reliably with over-training and on downstream tasks
International Conference on Learning Representations (ICLR), 2024
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM, ELM, LRM
345
76
0
13 Mar 2024
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models
Ning Ding
Yulin Chen
Ganqu Cui
Xingtai Lv
Weilin Zhao
Ruobing Xie
Bowen Zhou
Zhiyuan Liu
Maosong Sun
ALM, MoMe, AI4CE
451
8
0
13 Mar 2024
MEIT: Multimodal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan
Che Liu
Xin Wang
Chaofan Tao
Hui Shen
Zhenwu Peng
Jie Fu
Rossella Arcucci
Huaxiu Yao
404
18
0
07 Mar 2024
Reliable, Adaptable, and Attributable Language Models with Retrieval
Akari Asai
Zexuan Zhong
Danqi Chen
Pang Wei Koh
Luke Zettlemoyer
Hanna Hajishirzi
Anuj Kumar
KELM, RALM
322
82
0
05 Mar 2024
How Well Can Transformers Emulate In-context Newton's Method?
Angeliki Giannou
Liu Yang
Tianhao Wang
Dimitris Papailiopoulos
Jason D. Lee
243
27
0
05 Mar 2024
Online Training of Large Language Models: Learn while chatting
Juhao Liang
Ziwei Wang
Zhuoheng Ma
Jianquan Li
Zhiyi Zhang
Xiangbo Wu
Benyou Wang
KELM
266
7
0
04 Mar 2024
An Improved Traditional Chinese Evaluation Suite for Foundation Model
Zhi Rui Tam
Ya-Ting Pai
Yen-Wei Lee
Jun-Da Chen
Wei-Min Chu
Sega Cheng
Hong-Han Shuai
ELM
489
15
0
04 Mar 2024
Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: A Benchmark Study
Prottay Kumar Adhikary
Aseem Srivastava
Shivani Kumar
Salam Michael Singh
Puneet Manuja
Jini K. Gopinath
Vijay Krishnan
Swati Kedia
K. Deb
Tanmoy Chakraborty
AI4MH
285
22
0
29 Feb 2024
On the Societal Impact of Open Foundation Models
Sayash Kapoor
Rishi Bommasani
Kevin Klyman
Shayne Longpre
Ashwin Ramaswami
...
Victor Storchan
Daniel Zhang
Mark A. Lemley
Abigail Z. Jacobs
Arvind Narayanan
305
85
0
27 Feb 2024
Language Models for Code Completion: A Practical Evaluation
Maliheh Izadi
Jonathan Katzy
Tim van Dam
Marc Otten
R. Popescu
Arie van Deursen
ALM, ELM
173
67
0
25 Feb 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
Soheil Feizi
MIALM
336
69
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
342
185
0
22 Feb 2024
Chain-of-Thought Unfaithfulness as Disguised Accuracy
Oliver Bentham
Nathan Stringham
Ana Marasović
LRM, HILM
347
23
0
22 Feb 2024
$Se^2$: Sequential Example Selection for In-Context Learning
Haoyu Liu
Jianfeng Liu
Shaohan Huang
Yuefeng Zhan
Hao Sun
Weiwei Deng
Furu Wei
Qi Zhang
238
6
0
21 Feb 2024
DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing
Haneul Yoo
Jieun Han
So-Yeon Ahn
Alice Oh
152
8
0
21 Feb 2024
DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain
Yanis Labrak
Adrien Bazoge
Oumaima El Khettari
Mickael Rouvier
Pacome Constant dit Beaufils
...
B. Daille
Solen Quiniou
Emmanuel Morin
P. Gourraud
Richard Dufour
LM&MA
225
7
0
20 Feb 2024
The Hidden Space of Transformer Language Adapters
Jesujoba Oluwadara Alabi
Marius Mosbach
Matan Eyal
Dietrich Klakow
Mor Geva
363
15
1
20 Feb 2024
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries
Seanie Lee
Jianpeng Cheng
Joris Driesen
Alexandru Coca
Anders Johannsen
RALM
365
7
0
20 Feb 2024
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak
Adrien Bazoge
Emmanuel Morin
P. Gourraud
Mickael Rouvier
Richard Dufour
487
367
0
15 Feb 2024
Personalized Large Language Models
Stanisław Woźniak
Bartlomiej Koptyra
Arkadiusz Janz
P. Kazienko
Jan Kocoń
205
35
0
14 Feb 2024
Can LLMs Learn New Concepts Incrementally without Forgetting?
Junhao Zheng
Shengjie Qiu
Qianli Ma
CLL
266
0
0
13 Feb 2024
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
Jae-Woo Choi
Youngwoo Yoon
Hyobin Ong
Jaehong Kim
Minsu Jang
203
41
0
13 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled
Chi Jin
237
13
0
12 Feb 2024
ZeroPP: Unleashing Exceptional Parallelism Efficiency through Tensor-Parallelism-Free Methodology
Ding Tang
Lijuan Jiang
Jiecheng Zhou
Minxi Jin
Hengjie Li
Xingcheng Zhang
Zhiling Pei
Jidong Zhai
420
3
0
06 Feb 2024
Enhancing Transformer RNNs with Multiple Temporal Perspectives
Razvan-Gabriel Dumitru
Darius Peteleaza
Mihai Surdeanu
AI4TS
237
3
0
04 Feb 2024
Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times
Byung-Doh Oh
Shisen Yue
William Schuler
298
33
0
03 Feb 2024