2205.01068
Cited By
v1
v2
v3
v4 (latest)
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,922 papers shown
Rethinking Memorization Measures and their Implications in Large Language Models
Bishwamittra Ghosh
Soumi Das
Qinyuan Wu
Mohammad Aflah Khan
Krishna P. Gummadi
Evimaria Terzi
Deepak Garg
PILM
243
0
0
20 Jul 2025
Exploring the Dynamic Scheduling Space of Real-Time Generative AI Applications on Emerging Heterogeneous Systems
Rachid Karami
Rajeev Patwari
Hyoukjun Kwon
Ashish Sirasao
95
1
0
19 Jul 2025
ELK: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques
Yiqi Liu
Y. Xue
Noelle Crawford
Jilong Xue
Jian Huang
133
2
0
15 Jul 2025
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving
Wonung Kim
Yubin Lee
Yoonsung Kim
Jinwoo Hwang
Seongryong Oh
...
Aziz Huseynov
Woong Gyu Park
Chang Hyun Park
Divya Mahajan
Jongse Park
634
3
0
14 Jul 2025
Heterogeneous User Modeling for LLM-based Recommendation
Honghui Bao
Wenjie Wang
Xinyu Lin
Fengbin Zhu
Teng Sun
Fuli Feng
Tat-Seng Chua
198
0
0
07 Jul 2025
Pay Attention to Small Weights
Chao Zhou
Tom Jacobs
Advait Gadhikar
R. Burkholz
240
0
0
26 Jun 2025
Multi-Amateur Contrastive Decoding for Text Generation
Jaydip Sen
S. Dasgupta
Hetvi Waghela
189
2
0
22 Jun 2025
Large Language Models as Psychological Simulators: A Methodological Guide
Zhicheng Lin
LLMAG
247
2
0
20 Jun 2025
SmartGuard: Leveraging Large Language Models for Network Attack Detection through Audit Log Analysis and Summarization
Hao Zhang
Shuo Shao
Song Li
Zhenyu Zhong
Yan Liu
Zhan Qin
K. Ren
261
1
0
20 Jun 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
171
0
0
20 Jun 2025
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi
Fan Nie
Alexandre Alahi
James Zou
Himabindu Lakkaraju
Yilun Du
Eric P. Xing
Sham Kakade
Hanlin Zhang
319
2
0
19 Jun 2025
REIS: A High-Performance and Energy-Efficient Retrieval System with In-Storage Processing
International Symposium on Computer Architecture (ISCA), 2025
Kangqi Chen
Andreas Kosmas Kakolyris
Rakesh Nadig
Manos Frouzakis
Nika Mansouri-Ghiasi
Yu Liang
Haiyu Mao
Mohammad Sadrosadati
Onur Mutlu
RALM
278
1
0
19 Jun 2025
Memory-Efficient Differentially Private Training with Gradient Random Projection
Alex Mulrooney
Devansh Gupta
James Flemings
Huanyu Zhang
Murali Annavaram
Meisam Razaviyayn
Xinwei Zhang
248
1
0
18 Jun 2025
All is Not Lost: LLM Recovery without Checkpoints
Nikolay Blagoev
Oğuzhan Ersoy
Lydia Yiyu Chen
226
1
0
18 Jun 2025
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization
Guanghui Song
Dongping Liao
Yiren Zhao
Kejiang Ye
Cheng-zhong Xu
X. Gao
MoE
182
0
0
16 Jun 2025
MEraser: An Effective Fingerprint Erasure Approach for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jingxuan Zhang
Zhenhua Xu
Rui Hu
Wenpeng Xing
Xuhong Zhang
Meng Han
AAML
204
13
0
14 Jun 2025
Exploring Cultural Variations in Moral Judgments with Large Language Models
Hadi Mohammadi
Efthymia Papadopoulou
203
1
0
14 Jun 2025
Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning with Heterogeneous LoRA Allocation
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Zikai Zhang
Ping Liu
Jiahao Xu
Rui Hu
228
5
0
13 Jun 2025
NoLoCo: No-all-reduce Low Communication Training Method for Large Models
Jari Kolehmainen
Nikolay Blagoev
John Donaghy
Oğuzhan Ersoy
Christopher Nies
285
0
0
12 Jun 2025
SLICK: Selective Localization and Instance Calibration for Knowledge-Enhanced Car Damage Segmentation in Automotive Insurance
Teerapong Panboonyuen
331
34
0
12 Jun 2025
Surprisal from Larger Transformer-based Language Models Predicts fMRI Data More Poorly
Yi-Chien Lin
William Schuler
139
1
0
12 Jun 2025
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Xiangchen Li
Dimitrios Spatharakis
Saeid Ghafouri
Jiakun Fan
Dimitrios Nikolopoulos
Deepu John
Bo Ji
Dimitrios S. Nikolopoulos
411
6
0
11 Jun 2025
How Good LLM-Generated Password Policies Are?
Vivek Vaidya
Aditya Patwardhan
Ashish Kundu
214
1
0
10 Jun 2025
LlamaRec-LKG-RAG: A Single-Pass, Learnable Knowledge Graph-RAG Framework for LLM-Based Ranking
Vahid Azizi
Fatemeh Koochaki
AI4TS
158
0
0
09 Jun 2025
ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
Yongkang Li
Kaixin Xiong
Xiangyu Guo
Fang Li
Sixu Yan
...
Guang Chen
Hangjun Ye
Wenyu Liu
Xinggang Wang
VLM
270
5
0
09 Jun 2025
FREE: Fast and Robust Vision Language Models with Early Exits
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Divya J. Bajpai
M. Hanawal
VLM
149
3
0
07 Jun 2025
Being Strong Progressively! Enhancing Knowledge Distillation of Large Language Models through a Curriculum Learning Framework
Lingyuan Liu
Mengxiang Zhang
187
0
0
06 Jun 2025
BAQ: Efficient Bit Allocation Quantization for Large Language Models
Chao Zhang
Li Wang
S. Lasaulce
Mérouane Debbah
MQ
219
0
0
06 Jun 2025
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
Yuanzhe Hu
Kinshuk Goel
Vlad Killiakov
Yaoqing Yang
311
3
0
06 Jun 2025
SECNEURON: Reliable and Flexible Abuse Control in Local LLMs via Hybrid Neuron Encryption
Zhiqiang Wang
Haohua Du
Junyang Wang
Haifeng Sun
Kaiwen Guo
Haikuo Yu
Chao Liu
Xiang-Yang Li
AAML
327
0
0
05 Jun 2025
Kinetics: Rethinking Test-Time Scaling Laws
Ranajoy Sadhukhan
Zhuoming Chen
Haizhong Zheng
Yang Zhou
Emma Strubell
Beidi Chen
458
6
0
05 Jun 2025
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Seungcheol Park
Jeongin Bae
Beomseok Kwon
Minjun Kim
Byeongwook Kim
S. Kwon
U. Kang
Dongsoo Lee
MQ
381
0
0
04 Jun 2025
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
Egor Petrov
Grigoriy Evseev
Aleksey Antonov
Andrey Veprikov
Nikolay Bushkov
Stanislav Moiseev
425
2
0
04 Jun 2025
Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning
Wesley Scivetti
Tatsuya Aoyama
Ethan Wilcox
Nathan Schneider
192
4
0
04 Jun 2025
Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Seungcheol Park
Sojin Lee
Jongjin Kim
Jinsik Lee
Hyunjik Jo
U. Kang
285
3
0
04 Jun 2025
Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
Computer Vision and Pattern Recognition (CVPR), 2025
Tomoya Yoshida
Shuhei Kurita
Taichi Nishimura
Shinsuke Mori
292
2
0
04 Jun 2025
AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models
Yifeng Gu
Zicong Jiang
Jianxiu Jin
K. Guo
Ziyang Zhang
Xiangmin Xu
251
0
0
04 Jun 2025
Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
Fangrui Zhu
Hanhui Wang
Yiming Xie
Jing Gu
Tianye Ding
Jianwei Yang
Huaizu Jiang
3DV
LRM
464
0
0
04 Jun 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jonas F. Lotz
António V. Lopes
Stephan Peitz
Hendra Setiawan
Leonardo Emili
278
3
0
03 Jun 2025
ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ekaterina Grishina
Mikhail Gorbunov
Maxim Rakhuba
176
0
0
03 Jun 2025
Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs
Yudong Zhang
Ruobing Xie
Yiqing Huang
Jiansheng Chen
Xingwu Sun
Zhanhui Kang
Di Wang
Yu Wang
AAML
342
1
0
01 Jun 2025
Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Haesung Pyun
Yoonah Park
Yohan Jo
203
0
0
31 May 2025
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee
Dongkyu Kim
Youngcheon You
Youngmin Kim
MQ
236
4
0
30 May 2025
Curse of High Dimensionality Issue in Transformer for Long-context Modeling
Shuhai Zhang
Zeng You
Yaofo Chen
Z. Wen
Qianyue Wang
Zhijie Qiu
Yuanqing Li
Mingkui Tan
402
1
0
28 May 2025
Enhancing Long-Chain Reasoning Distillation through Error-Aware Self-Reflection
Z. Wu
Xinze Li
Zhenghao Liu
Shi Yu
Zhiyuan Liu
Minghe Yu
Cheng Yang
Yu Gu
Ge Yu
Maosong Sun
LRM
292
0
0
28 May 2025
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
244
2
0
28 May 2025
Beyond path selection: Better LLMs for Scientific Information Extraction with MimicSFT and Relevance and Rule-induced ($R^2$) GRPO
Ran Li
Hanmo Liu
Yuchen Liu
Chen Jing
Yu Qiu
Lei Chen
LRM
267
0
0
28 May 2025
Evaluation of LLMs in Speech is Often Flawed: Test Set Contamination in Large Language Models for Speech Recognition
Yuan Tseng
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
Sourav Bhattacharya
299
2
0
28 May 2025
RISE: Reasoning Enhancement via Iterative Self-Exploration in Multi-hop Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Bolei He
Xinran He
Mengke Chen
Xianwei Xue
Ying Zhu
Zhenhua Ling
ReLM
LRM
254
1
0
28 May 2025
Highly Efficient and Effective LLMs with Multi-Boolean Architectures
Ba-Hien Tran
Van Minh Nguyen
MQ
378
0
0
28 May 2025