OPT: Open Pre-trained Transformer Language Models
Versions: v1, v2, v3, v4 (latest)

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,922 papers shown
SoK: Are Watermarks in LLMs Ready for Deployment?
Kieu Dang
Phung Lai
Nhathai Phan
Yelong Shen
Ruoming Jin
Abdallah Khreishah
My T. Thai
170
1
0
24 Dec 2025
KVNAND: Efficient On-Device Large Language Model Inference Using DRAM-Free In-Flash Computing
Lishuo Deng
Shaojie Xu
Jinwu Chen
Changwei Yan
Jiajie Wang
Zhe Jiang
Weiwei Shan
82
0
0
03 Dec 2025
TokenPowerBench: Benchmarking the Power Consumption of LLM Inference
Chenxu Niu
Wei Zhang
Jie Li
Yongjian Zhao
Tongyang Wang
Xi-Zhao Wang
Yong Chen
57
1
0
02 Dec 2025
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in $\{\pm 1, \pm i\}$
Feiyu Wang
Xinyu Tan
Bokai Huang
Yihao Zhang
Guoan Wang
Peizhuang Cong
Tong Yang
MQ
476
0
0
02 Dec 2025
Context-Enriched Contrastive Loss: Enhancing Presentation of Inherent Sample Connections in Contrastive Learning Framework
IEEE Transactions on Multimedia (TMM), 2025
Haojin Deng
Yimin Yang
69
0
0
01 Dec 2025
Tangram: Accelerating Serverless LLM Loading through GPU Memory Reuse and Affinity
Wenbin Zhu
Zhaoyan Shen
Z. Shao
Hongjun Dai
Feng Chen
23
0
0
01 Dec 2025
Comparative Analysis of 47 Context-Based Question Answer Models Across 8 Diverse Datasets
Muhammad Muneeb
David B. Ascher
Ahsan Baidar Bakht
94
0
0
29 Nov 2025
Serving Heterogeneous LoRA Adapters in Distributed LLM Inference Systems
Shashwat Jaiswal
Shrikara Arun
Anjaly Parayil
Ankur Mallick
Spyros Mastorakis
...
Chloi Alverti
Renée St. Amant
Chetan Bansal
Victor Rühle
Josep Torrellas
111
0
0
28 Nov 2025
Experts are all you need: A Composable Framework for Large Language Model Inference
S. Sridharan
Sourjya Roy
A. Raghunathan
Kaushik Roy
MoE
173
0
0
28 Nov 2025
Towards Audio Token Compression in Large Audio Language Models
Saurabhchand Bhati
Samuel Thomas
Hilde Kuehne
Rogerio Feris
James R. Glass
AuLLM
310
0
0
26 Nov 2025
CDLM: Consistency Diffusion Language Models For Faster Sampling
Minseo Kim
Chenfeng Xu
Coleman Hooper
Harman Singh
Ben Athiwaratkun
Ce Zhang
Kurt Keutzer
Amir Gholami
200
0
0
24 Nov 2025
FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning
Xin Yuan
S. Li
Jiateng Wei
Chengrui Zhu
Yanming Wu
Qingpeng Li
Jiajun Lv
Xiaoke Lan
Jun Chen
Yong-Jin Liu
OffRL
377
0
0
24 Nov 2025
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Y. Fu
Xin Dong
Shizhe Diao
Matthijs Van Keirsbilck
Hanrong Ye
...
Maksim Khadkevich
A. Keller
Jan Kautz
Y. Lin
Pavlo Molchanov
164
0
0
24 Nov 2025
Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
G. Carneiro
Jianfei Cai
Thanh-Toan Do
MQ
146
0
0
21 Nov 2025
Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
G. Carneiro
Thanh-Toan Do
MQ
136
0
0
21 Nov 2025
R2Q: Towards Robust 2-Bit Large Language Models via Residual Refinement Quantization
Jiayi Chen
Jieqi Shi
Jing Huo
Chen Wu
MQ
176
0
0
21 Nov 2025
Robot Confirmation Generation and Action Planning Using Long-context Q-Former Integrated with Multimodal LLM
Chiori Hori
Yoshiki Masuyama
Siddarth Jain
Radu Corcodel
Devesh K. Jha
Diego Romeres
Jonathan Le Roux
101
0
0
21 Nov 2025
An Image Is Worth Ten Thousand Words: Verbose-Text Induction Attacks on VLMs
Zhi Luo
Zenghui Yuan
Wenqi Wei
Daizong Liu
P. Zhou
VLM
207
0
0
20 Nov 2025
10Cache: Heterogeneous Resource-Aware Tensor Caching and Migration for LLM Training
Sabiha Afroz
Redwan Ibne Seraj Khan
Hadeel Albahar
Jingoo Han
A. R. Butt
157
0
0
18 Nov 2025
GPS: General Per-Sample Prompter
Pawel Batorski
Paul Swoboda
64
1
0
18 Nov 2025
Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration
Changhun Oh
Seongryong Oh
Jinwoo Hwang
Yoonsung Kim
Hardik Sharma
Jongse Park
3DGS
211
0
0
17 Nov 2025
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Shalini Maiti
Amar Budhiraja
Bhavul Gauri
Gaurav Chaurasia
Anton Protopopov
...
Michael Slater
Despoina Magka
Tatiana Shavrina
Roberta Raileanu
Yoram Bachrach
MoMe
168
1
0
17 Nov 2025
MACKO: Sparse Matrix-Vector Multiplication for Low Sparsity
Vladimír Macko
Vladimír Boža
136
0
0
17 Nov 2025
BitSnap: Checkpoint Sparsification and Quantization in LLM Training
Yanxin Peng
Qingping Li
Baodong Wu
Shigang Li
Guohao Dai
Shengen Yan
Yu Wang
MQ
326
0
0
15 Nov 2025
Don't Think of the White Bear: Ironic Negation in Transformer Models Under Cognitive Load
Logan Mann
Nayan Saxena
Sarah Tandon
Chenhao Sun
Savar Toteja
Kevin Zhu
92
0
0
15 Nov 2025
OAD-Promoter: Enhancing Zero-shot VQA using Large Language Models with Object Attribute Description
Quanxing Xu
Ling Zhou
Feifei Zhang
Jinyu Tian
Rubing Huang
VLM
261
0
0
15 Nov 2025
Dynamic Temperature Scheduler for Knowledge Distillation
Sibgat Ul Islam
Jawad Ibn Ahad
Fuad Rahman
M. R. Amin
Nabeel Mohammed
Shafin Rahman
102
0
0
14 Nov 2025
iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification
Zixun Xiong
Gaoyi Wu
Qingyang Yu
Mingyu Derek Ma
Lingfeng Yao
Miao Pan
Xiaojiang Du
Hao Wang
160
0
0
12 Nov 2025
LLM-GROP: Visually Grounded Robot Task and Motion Planning with Large Language Models
The International Journal of Robotics Research (IJRR), 2025
Xiaohan Zhang
Yan Ding
Yohei Hayamizu
Zainab Altaweel
Yifeng Zhu
Yuke Zhu
Peter Stone
Chris Paxton
Shiqi Zhang
LM&Ro
245
1
0
11 Nov 2025
ProcGen3D: Learning Neural Procedural Graph Representations for Image-to-3D Reconstruction
Xinyi Zhang
Daoyi Gao
Naiqi Li
Angela Dai
204
0
0
10 Nov 2025
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Sean McLeish
Ang Li
John Kirchenbauer
Dayal Singh Kalra
Brian Bartoldson
B. Kailkhura
Avi Schwarzschild
Jonas Geiping
Tom Goldstein
Micah Goldblum
279
2
0
10 Nov 2025
Rethinking Parameter Sharing as Graph Coloring for Structured Compression
Boyang Zhang
Daning Cheng
Yunquan Zhang
184
0
0
10 Nov 2025
Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private
Ruihan Wu
Erchi Wang
Zhiyuan Zhang
Yu-Xiang Wang
SILM
266
0
0
10 Nov 2025
Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures
Suqing Wang
Ziyang Ma
Xinyi Li
Zuchao Li
160
0
0
09 Nov 2025
Chain-of-Thought as a Lens: Evaluating Structured Reasoning Alignment between Human Preferences and Large Language Models
Boxuan Wang
Z. Li
Xinmiao Huang
Xiaowei Huang
Yi Dong
LRM
113
1
0
09 Nov 2025
HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection
Irina Proskurina
Marc-Antoine Carpentier
Julien Velcin
VLM
129
0
0
09 Nov 2025
DRAGON: Guard LLM Unlearning in Context via Negative Detection and Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Y. Wang
Chris Yuhao Liu
Quan Liu
Jinglong Pang
Wei Wei
Yujia Bao
Yang Liu
MU
355
2
0
08 Nov 2025
The Future of Fully Homomorphic Encryption System: from a Storage I/O Perspective
Lei Chen
Erci Xu
Yiming Sun
Shengyu Fan
Xianglong Deng
...
Guang Fan
Liang Kong
Yilan Zhu
Shoumeng Yan
Mingzhe Zhang
84
1
0
07 Nov 2025
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
Yuantian Shao
Yuanteng Chen
Peisong Wang
Jianlin Yu
Jing Lin
Yiwu Yao
Zhihui Wei
Jian Cheng
MQ
365
1
0
06 Nov 2025
From Prompts to Power: Measuring the Energy Footprint of LLM Inference
Francisco Caravaca
Ángel Cuevas
R. Cuevas
116
0
0
05 Nov 2025
FP8-Flow-MoE: A Casting-Free FP8 Recipe without Double Quantization Error
Fengjuan Wang
Zhiyi Su
Xingzhu Hu
Cheng Wang
Mou Sun
MQ
127
0
0
04 Nov 2025
ConMeZO: Adaptive Descent-Direction Sampling for Gradient-Free Finetuning of Large Language Models
Lejs Deen Behric
Liang Zhang
Bingcong Li
K. K. Thekumparampil
141
0
0
04 Nov 2025
Analyzing the Power of Chain of Thought through Memorization Capabilities
Lijia Yu
Xiao-Shan Gao
Lijun Zhang
LRM, ELM
216
0
0
03 Nov 2025
A CPU-Centric Perspective on Agentic AI
Ritik Raj
Hong Wang
Tushar Krishna
313
0
0
01 Nov 2025
Encoder-Decoder or Decoder-Only? Revisiting Encoder-Decoder Large Language Model
Biao Zhang
Yong Cheng
Siamak Shakeri
Xinyi Wang
Min Ma
Orhan Firat
148
1
0
30 Oct 2025
MMEdge: Accelerating On-device Multimodal Inference via Pipelined Sensing and Encoding
Runxi Huang
Mingxuan Yu
Mingyu Tsoi
Xiaomin Ouyang
281
0
0
29 Oct 2025
Layer of Truth: Probing Belief Shifts under Continual Pre-Training Poisoning
S. Churina
Niranjan Chebrolu
Kokil Jaidka
KELM, HILM, CLL
369
0
0
29 Oct 2025
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
Yuxi Liu
Renjia Deng
Yutong He
Xue Wang
Tao Yao
Kun Yuan
148
0
0
28 Oct 2025
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
J. Michaelov
Roger P. Levy
Benjamin Bergen
AI4TS
133
0
0
28 Oct 2025
DualCap: Enhancing Lightweight Image Captioning via Dual Retrieval with Similar Scenes Visual Prompts
Binbin Li
Guimiao Yang
Zisen Qi
Haiping Wang
Yu Ding
VLM
337
0
0
28 Oct 2025
Page 1 of 59