ResearchTrend.AI
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
arXiv: 2204.06745

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 554 papers shown
Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension
Mengnan Qi
Yufan Huang
Yongqiang Yao
Maoquan Wang
Bin Gu
Neel Sundaresan
27
2
0
13 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
51
74
0
08 Apr 2024
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
Yantao Liu
Zijun Yao
Xin Lv
Yuchen Fan
S. Cao
Jifan Yu
Lei Hou
Juanzi Li
41
2
0
04 Apr 2024
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models
Jingyang Zhang
Jingwei Sun
Eric C. Yeats
Ouyang Yang
Martin Kuo
Jianyi Zhang
Hao Frank Yang
Hai Li
32
41
0
03 Apr 2024
Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs
F. Lotfi
F. Faraji
Nikhil Kakodkar
Travis Manderson
D. Meger
Gregory Dudek
LM&Ro
12
0
0
02 Apr 2024
Release of Pre-Trained Models for the Japanese Language
Kei Sawada
Tianyu Zhao
Makoto Shing
Kentaro Mitsui
Akio Kaga
Yukiya Hono
Toshiaki Wakatsuki
Koh Mitsuda
27
10
0
02 Apr 2024
Beyond One-Size-Fits-All: Multi-Domain, Multi-Task Framework for Embedding Model Selection
Vivek Khetan
22
0
0
30 Mar 2024
The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian
Andrea Esuli
Giovanni Puccetti
ELM
22
0
0
27 Mar 2024
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs
Guoqiang Chen
Xiuwei Shang
Shaoyin Cheng
Yanming Zhang
Weiming Zhang
Neng H. Yu
N. Yu
92
2
0
27 Mar 2024
Continual Few-shot Event Detection via Hierarchical Augmentation Networks
Chenlong Zhang
Pengfei Cao
Yubo Chen
Kang Liu
Zhiqiang Zhang
Mengshu Sun
Jun Zhao
30
3
0
26 Mar 2024
Language Models for Text Classification: Is In-Context Learning Enough?
A. Edwards
Jose Camacho-Collados
LRM
41
17
0
26 Mar 2024
Large Language Models in Biomedical and Health Informatics: A Bibliometric Review
Huizi Yu
Lizhou Fan
Lingyao Li
Jiayan Zhou
Zihui Ma
...
Sijia He
Mingyu Jin
Yongfeng Zhang
Ashvin Gandhi
Xin Ma
LM&MA
32
11
0
24 Mar 2024
Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach
Kun Sun
Rong Wang
Anders Sogaard
29
3
0
22 Mar 2024
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Orion Weller
Benjamin Chang
Sean MacAvaney
Kyle Lo
Arman Cohan
Benjamin Van Durme
Dawn J Lawrie
Luca Soldaini
63
27
0
22 Mar 2024
ChatGPT Alternative Solutions: Large Language Models Survey
H. Alipour
Nick Pendar
Kohinoor Roy
LM&MA
27
4
0
21 Mar 2024
Dated Data: Tracing Knowledge Cutoffs in Large Language Models
Jeffrey Cheng
Marc Marone
Orion Weller
Dawn J Lawrie
Daniel Khashabi
Benjamin Van Durme
59
12
0
19 Mar 2024
Rectifying Demonstration Shortcut in In-Context Learning
Joonwon Jang
Sanghwan Jang
Wonbin Kweon
Minjin Jeon
Hwanjo Yu
29
1
0
14 Mar 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
26
1
0
13 Mar 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
103
40
0
13 Mar 2024
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models
Ning Ding
Yulin Chen
Ganqu Cui
Xingtai Lv
Weilin Zhao
Ruobing Xie
Bowen Zhou
Zhiyuan Liu
Maosong Sun
ALM
MoMe
AI4CE
38
7
0
13 Mar 2024
MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan
Che Liu
Xin Wang
Chaofan Tao
Hui Shen
Zhenwu Peng
Jie Fu
Rossella Arcucci
Huaxiu Yao
Mi Zhang
47
7
0
07 Mar 2024
Reliable, Adaptable, and Attributable Language Models with Retrieval
Akari Asai
Zexuan Zhong
Danqi Chen
Pang Wei Koh
Luke Zettlemoyer
Hanna Hajishirzi
Wen-tau Yih
KELM
RALM
41
53
0
05 Mar 2024
How Well Can Transformers Emulate In-context Newton's Method?
Angeliki Giannou
Liu Yang
Tianhao Wang
Dimitris Papailiopoulos
Jason D. Lee
27
16
0
05 Mar 2024
Online Training of Large Language Models: Learn while chatting
Juhao Liang
Ziwei Wang
Zhuoheng Ma
Jianquan Li
Zhiyi Zhang
Xiangbo Wu
Benyou Wang
KELM
37
3
0
04 Mar 2024
An Improved Traditional Chinese Evaluation Suite for Foundation Model
Zhi Rui Tam
Ya-Ting Pai
Yen-Wei Lee
Jun-Da Chen
Wei-Min Chu
Sega Cheng
Hong-Han Shuai
ELM
32
11
0
04 Mar 2024
Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: A Benchmark Study
Prottay Kumar Adhikary
Aseem Srivastava
Shivani Kumar
Salam Michael Singh
Puneet Manuja
Jini K. Gopinath
Vijay Krishnan
Swati Kedia
K. Deb
Tanmoy Chakraborty
AI4MH
36
8
0
29 Feb 2024
On the Societal Impact of Open Foundation Models
Sayash Kapoor
Rishi Bommasani
Kevin Klyman
Shayne Longpre
Ashwin Ramaswami
...
Victor Storchan
Daniel Zhang
Daniel E. Ho
Percy Liang
Arvind Narayanan
26
54
0
27 Feb 2024
Language Models for Code Completion: A Practical Evaluation
M. Izadi
J. Katzy
Tim van Dam
Marc Otten
R. Popescu
A. van Deursen
ALM
ELM
39
22
0
25 Feb 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
S. Feizi
MIALM
30
32
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
38
74
0
22 Feb 2024
Chain-of-Thought Unfaithfulness as Disguised Accuracy
Oliver Bentham
Nathan Stringham
Ana Marasović
LRM
HILM
37
8
0
22 Feb 2024
$Se^2$: Sequential Example Selection for In-Context Learning
Haoyu Liu
Jianfeng Liu
Shaohan Huang
Yuefeng Zhan
Hao Sun
Weiwei Deng
Furu Wei
Qi Zhang
25
3
0
21 Feb 2024
DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing
Haneul Yoo
Jieun Han
So-Yeon Ahn
Alice H. Oh
19
4
0
21 Feb 2024
DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain
Yanis Labrak
Adrien Bazoge
Oumaima El Khettari
Mickael Rouvier
Pacome Constant dit Beaufils
...
B. Daille
Solen Quiniou
Emmanuel Morin
P. Gourraud
Richard Dufour
LM&MA
17
6
0
20 Feb 2024
The Hidden Space of Transformer Language Adapters
Jesujoba Oluwadara Alabi
Marius Mosbach
Matan Eyal
Dietrich Klakow
Mor Geva
48
7
1
20 Feb 2024
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries
Seanie Lee
Jianpeng Cheng
Joris Driesen
Alexandru Coca
Anders Johannsen
RALM
28
1
0
20 Feb 2024
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak
Adrien Bazoge
Emmanuel Morin
P. Gourraud
Mickael Rouvier
Richard Dufour
96
190
0
15 Feb 2024
Personalized Large Language Models
Stanisław Woźniak
Bartlomiej Koptyra
Arkadiusz Janz
P. Kazienko
Jan Kocoń
16
18
0
14 Feb 2024
Can LLMs Learn New Concepts Incrementally without Forgetting?
Junhao Zheng
Shengjie Qiu
Qianli Ma
CLL
27
0
0
13 Feb 2024
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
Jae-Woo Choi
Youngwoo Yoon
Hyobin Ong
Jaehong Kim
Minsu Jang
19
12
0
13 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled
Chi Jin
30
7
0
12 Feb 2024
ZeroPP: Unleashing Exceptional Parallelism Efficiency through Tensor-Parallelism-Free Methodology
Ding Tang
Lijuan Jiang
Jiecheng Zhou
Minxi Jin
Hengjie Li
Xingcheng Zhang
Zhiling Pei
Jidong Zhai
62
3
0
06 Feb 2024
Enhancing Transformer RNNs with Multiple Temporal Perspectives
Razvan-Gabriel Dumitru
Darius Peteleaza
Mihai Surdeanu
AI4TS
8
2
0
04 Feb 2024
Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times
Byung-Doh Oh
Shisen Yue
William Schuler
38
14
0
03 Feb 2024
Getting the most out of your tokenizer for pre-training and domain adaptation
Gautier Dagan
Gabriele Synnaeve
Baptiste Rozière
32
20
0
01 Feb 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
130
355
0
01 Feb 2024
Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better
Shengchao Liu
Xiaoming Liu
Yichen Wang
Zehua Cheng
Chengzhengxu Li
Zhaohan Zhang
Y. Lan
Chao Shen
DeLMO
30
2
0
01 Feb 2024
Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction
Philipp Wicke
27
2
0
31 Jan 2024
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
20
9
0
30 Jan 2024
NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness
Manav Singhal
Tushar Aggarwal
Abhijeet Awasthi
Nagarajan Natarajan
Aditya Kanade
24
12
0
29 Jan 2024