2204.06745
Cited By
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
HuggingFace (1 upvotes)
Github (7200★)
Papers citing
"GPT-NeoX-20B: An Open-Source Autoregressive Language Model"
50 / 603 papers shown
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
458
402
0
18 Dec 2024
Optimizing AI-Assisted Code Generation
Simon Torka
Sahin Albayrak
266
1
0
14 Dec 2024
Code LLMs: A Taxonomy-based Survey
BigData Congress [Services Society] (BSS), 2024
Nishat Raihan
Christian D. Newman
Marcos Zampieri
377
4
0
11 Dec 2024
LA4SR: illuminating the dark proteome with generative AI
David R. Nelson
Ashish Kumar Jaiswal
Noha Ismail
Alexandra Mystikou
Kourosh Salehi-Ashtiani
174
0
0
11 Nov 2024
Towards Low-Resource Harmful Meme Detection with LMM Agents
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jianzhao Huang
Hongzhan Lin
Ziyan Liu
Ziyang Luo
Guang Chen
Jing Ma
224
23
0
08 Nov 2024
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
498
84
0
07 Nov 2024
Photon: Federated LLM Pre-Training
Lorenzo Sani
Alex Iacob
Zeyu Cao
Royson Lee
Bill Marino
...
Dongqi Cai
Zexi Li
Wanru Zhao
Xinchi Qiu
Nicholas D. Lane
AI4CE
317
17
0
05 Nov 2024
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
Neural Information Processing Systems (NeurIPS), 2024
Gavia Gray
Aman Tiwari
Shane Bergsma
Joel Hestness
357
2
0
01 Nov 2024
GigaCheck: Detecting LLM-generated Content
Irina Tolstykh
Aleksandra Tsybina
Sergey Yakubson
Aleksandr Gordeev
Vladimir Dokholyan
Maksim Kuprashevich
DeLMO
306
4
0
31 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
International Conference on Learning Representations (ICLR), 2024
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
386
9
0
30 Oct 2024
SVIP: Towards Verifiable Inference of Open-source Large Language Models
Yifan Sun
Yuhang Li
Yue Zhang
Yuchen Jin
Huan Zhang
328
3
0
29 Oct 2024
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Qingbin Liu
Ken Deng
Congnan Liu
Zhiqiang Wang
Shukai Liu
...
Zekun Wang
Guoan Zhang
Bangyu Xiang
Yuchi Xu
Jian Xu
208
16
0
28 Oct 2024
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
Neural Information Processing Systems (NeurIPS), 2024
Xun Guo
Shan Zhang
Yongxin He
Ting Zhang
Wanquan Feng
Haibin Huang
Chongyang Ma
DeLMO
315
48
0
28 Oct 2024
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
International Middleware Conference (Middleware), 2024
Avinash Maurya
Jie Ye
M. Rafique
Franck Cappello
Bogdan Nicolae
178
7
0
26 Oct 2024
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Mohamed Salim Aissi
Clément Romac
Thomas Carta
Sylvain Lamprier
Pierre-Yves Oudeyer
Olivier Sigaud
Laure Soulier
Nicolas Thome
279
3
0
25 Oct 2024
Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan
Mouxiang Chen
Zhongxin Liu
299
2
0
21 Oct 2024
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Clara Na
Ian H. Magnusson
A. Jha
Tom Sherborne
Emma Strubell
Jesse Dodge
Pradeep Dasigi
MoMe
165
9
0
21 Oct 2024
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shahrad Mohammadzadeh
Juan D. Guerra
Marco Bonizzato
Reihaneh Rabbany
Golnoosh Farnadi
HILM
412
0
0
20 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
International Conference on Learning Representations (ICLR), 2024
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
235
35
0
15 Oct 2024
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai
Reuben Tan
Jianrui Zhang
Bocheng Zou
Kai Zhang
...
Yao Dou
J. Park
Jianfeng Gao
Yong Jae Lee
Jianwei Yang
271
155
0
14 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
Qingbin Liu
MoE
292
8
0
14 Oct 2024
LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection
Zhiyuan Wei
Jing Sun
Zijiang Zhang
Xianhao Zhang
Meng Li
Zhe Hou
283
25
0
12 Oct 2024
Enterprise Benchmarks for Large Language Model Evaluation
Bing Zhang
Mikio Takeuchi
Ryo Kawahara
Shubhi Asthana
Md. Maruf Hossain
Guang-Jie Ren
Kate Soule
Yada Zhu
ELM
227
4
0
11 Oct 2024
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
Preferred Elements:
Kenshin Abe
Kaizaburo Chubachi
Yasuhiro Fujita
...
Yoshihiko Ozaki
Shotaro Sano
Shuji Suzuki
Tianqi Xu
Toshihiko Yanase
252
0
0
10 Oct 2024
LecPrompt: A Prompt-based Approach for Logical Error Correction with CodeBERT
Zhenyu Xu
Victor S. Sheng
KELM
230
2
0
10 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
MIALM
722
9
2
10 Oct 2024
FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text
Zhenyu Xu
Kun Zhang
Victor S. Sheng
WaLM
197
6
0
09 Oct 2024
Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Fumiya Uchiyama
Takeshi Kojima
Andrew Gambardella
Qi Cao
Yusuke Iwasawa
Y. Matsuo
LRM
ReLM
206
4
0
09 Oct 2024
Fine-tuning can Help Detect Pretraining Data from Large Language Models
International Conference on Learning Representations (ICLR), 2024
Han Zhang
Songxin Zhang
Bingyi Jing
Jianguo Huang
445
4
0
09 Oct 2024
Round and Round We Go! What makes Rotary Positional Encodings useful?
International Conference on Learning Representations (ICLR), 2024
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Velickovic
458
66
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
International Conference on Learning Representations (ICLR), 2024
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
1.4K
2
0
07 Oct 2024
LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Małgorzata Łazuka
Andreea Anghel
Thomas Parnell
260
19
0
03 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
International Conference on Learning Representations (ICLR), 2024
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa
448
7
0
03 Oct 2024
Creative and Context-Aware Translation of East Asian Idioms with GPT-4
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Kenan Tang
Peiyang Song
Yao Qin
Xifeng Yan
302
6
0
01 Oct 2024
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Shixuan Ma
Quan Wang
228
12
0
25 Sep 2024
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Weichao Zhang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
450
50
0
23 Sep 2024
Expanding Expressivity in Transformer Models with MöbiusAttention
Anna-Maria Halacheva
M. Nayyeri
Steffen Staab
227
1
0
08 Sep 2024
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
International Conference on Computational Linguistics (COLING), 2024
Cheng Wang
Yiwei Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
403
15
0
05 Sep 2024
The AdEMAMix Optimizer: Better, Faster, Older
International Conference on Learning Representations (ICLR), 2024
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
328
23
0
05 Sep 2024
Comparing Discrete and Continuous Space LLMs for Speech Recognition
Interspeech, 2024
Yaoxun Xu
Shi-Xiong Zhang
Jianwei Yu
Zhiyong Wu
Dong Yu
AuLLM
273
14
0
01 Sep 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
385
6
0
27 Aug 2024
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
Haowei Du
Dongyan Zhao
KELM
182
0
0
23 Aug 2024
ONSEP: A Novel Online Neural-Symbolic Framework for Event Prediction Based on Large Language Model
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xuanqing Yu
Wangtao Sun
Jingwei Li
Kang Liu
Chengbao Liu
Jie Tan
OffRL
AI4TS
247
8
0
14 Aug 2024
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?
J. Hayase
Alisa Liu
Yejin Choi
Sewoong Oh
Noah A. Smith
366
19
0
23 Jul 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre
Robert Mahari
Ariel N. Lee
Campbell Lund
Hamidah Oderinwale
...
Hanlin Li
Daphne Ippolito
Sara Hooker
Jad Kabbara
Sandy Pentland
353
66
0
20 Jul 2024
The 2024 Foundation Model Transparency Index
Rishi Bommasani
Kevin Klyman
Sayash Kapoor
Shayne Longpre
Betty Xiong
Nestor Maslej
Abigail Z. Jacobs
ELM
321
5
0
17 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
288
24
0
12 Jul 2024
AutoBencher: Towards Declarative Benchmark Construction
Xiang Lisa Li
Emmy Liu
Abigail Z. Jacobs
Tatsunori Hashimoto
Percy Liang
191
1
0
11 Jul 2024
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
198
8
0
10 Jul 2024
Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models
Zara Siddique
Liam D. Turner
Luis Espinosa-Anke
225
2
0
09 Jul 2024