2205.01068
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Victoria Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 647 papers shown
Title
Rectifying Demonstration Shortcut in In-Context Learning
Joonwon Jang
Sanghwan Jang
Wonbin Kweon
Minjin Jeon
Hwanjo Yu
29
1
0
14 Mar 2024
UniCode: Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng
Bohan Zhou
Yicheng Feng
Ye Wang
Zongqing Lu
VLM
MLLM
38
7
0
14 Mar 2024
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
Minbin Huang
Yanxin Long
Xinchi Deng
Ruihang Chu
Jiangfeng Xiong
Xiaodan Liang
Hong Cheng
Qinglin Lu
Wei Liu
MLLM
EGVM
65
8
0
13 Mar 2024
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension
Lei Zhu
Fangyun Wei
Yanye Lu
MLLM
VLM
44
17
0
12 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
55
43
0
12 Mar 2024
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model
Zhiwei Liu
Boyang Liu
Paul Thompson
Kailai Yang
Sophia Ananiadou
32
3
0
11 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
39
63
0
06 Mar 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
25
14
0
06 Mar 2024
WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off
Eva Giboulot
Teddy Furon
WaLM
37
12
0
06 Mar 2024
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao
Bin Jia
Hao Zhou
Ziming Liu
Shenggan Cheng
Yang You
19
4
0
02 Mar 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
94
9
0
29 Feb 2024
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada
Kanta Kaneda
Daichi Saito
Komei Sugiura
34
24
0
28 Feb 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows
Yuting Yang
Andrea Merlina
Weijia Song
Tiancheng Yuan
Ken Birman
Roman Vitenberg
41
0
0
27 Feb 2024
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
Wenqi Zhang
Ke Tang
Hai Wu
Mengna Wang
Yongliang Shen
Guiyang Hou
Zeqi Tan
Peng Li
Y. Zhuang
Weiming Lu
LLMAG
36
36
0
27 Feb 2024
Multi-Bit Distortion-Free Watermarking for Large Language Models
Massieh Kordi Boroujeny
Ya Jiang
Kai Zeng
Brian L. Mark
WaLM
VLM
40
4
0
26 Feb 2024
An LLM-Enhanced Adversarial Editing System for Lexical Simplification
Keren Tan
Kangyang Luo
Yunshi Lan
Zheng Yuan
Jinlong Shu
AAML
24
5
0
22 Feb 2024
COPR: Continual Human Preference Learning via Optimal Policy Regularization
Han Zhang
Lin Gui
Yu Lei
Yuanzhao Zhai
Yehong Zhang
...
Hui Wang
Yue Yu
Kam-Fai Wong
Bin Liang
Ruifeng Xu
CLL
34
4
0
22 Feb 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao
Yuanbin Qu
Konrad Staniszewski
Szymon Tworkowski
Wei Liu
Piotr Miłoś
Yuxiang Wu
Pasquale Minervini
34
14
0
21 Feb 2024
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Yuxuan Yue
Zhihang Yuan
Haojie Duanmu
Sifan Zhou
Jianlong Wu
Liqiang Nie
MQ
32
42
0
19 Feb 2024
Machine-Generated Text Localization
Zhongping Zhang
Wenda Qin
Bryan A. Plummer
DeLMO
34
5
0
19 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Färber
34
3
0
18 Feb 2024
Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation
Hongbin Na
Zimu Wang
M. Maimaiti
Tong Chen
Wei Wang
Tao Shen
Ling Chen
LRM
20
5
0
16 Feb 2024
Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models
Dheeraj Mekala
Alex Nguyen
Jingbo Shang
ALM
20
18
0
16 Feb 2024
Quantized Embedding Vectors for Controllable Diffusion Language Models
Cheng Kang
Xinye Chen
Yong Hu
Daniel Novak
23
0
0
15 Feb 2024
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori
Tian Tang
Yile Gu
Kan Zhu
Baris Kasikci
63
20
0
10 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
120
364
0
09 Feb 2024
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
Zhikai Li
Xuewen Liu
Jing Zhang
Qingyi Gu
MQ
37
7
0
08 Feb 2024
Pretrained Generative Language Models as General Learning Frameworks for Sequence-Based Tasks
Ben Fauber
21
2
0
08 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
128
107
0
08 Feb 2024
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test
Aditi Khandelwal
Utkarsh Agarwal
Kumar Tanmay
Monojit Choudhury
ELM
LRM
27
6
0
03 Feb 2024
Decoding Speculative Decoding
Minghao Yan
Saurabh Agarwal
Shivaram Venkataraman
LRM
27
5
0
02 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
55
29
0
02 Feb 2024
CroissantLLM: A Truly Bilingual French-English Language Model
Manuel Faysse
Patrick Fernandes
Nuno M. Guerreiro
António Loison
Duarte M. Alves
...
François Yvon
André F.T. Martins
Gautier Viaud
Céline Hudelot
Pierre Colombo
43
32
0
01 Feb 2024
Defining and Extracting generalizable interaction primitives from DNNs
Lu Chen
Siyu Lou
Benhao Huang
Quanshi Zhang
26
9
0
29 Jan 2024
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large Models
Yi Zhao
Yilin Zhang
Rong Xiang
Jing Li
Hillming Li
31
16
0
29 Jan 2024
Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
Timothy Baldwin
LRM
29
27
0
28 Jan 2024
To Burst or Not to Burst: Generating and Quantifying Improbable Text
Kuleen Sasse
Samuel Barham
Efsun Sarioglu Kayi
Edward W. Staley
DeLMO
21
1
0
27 Jan 2024
Large Language Model Adaptation for Financial Sentiment Analysis
Pau Rodriguez Inserte
Mariam Nakhlé
Raheel Qader
Gaëtan Caillaut
Jingshu Liu
17
13
0
26 Jan 2024
Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling
David Dukić
Jan Šnajder
24
12
0
25 Jan 2024
Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4
Xuchao Zhang
Supriyo Ghosh
Chetan Bansal
Rujia Wang
Ming-Jie Ma
Yu Kang
Saravan Rajmohan
38
23
0
24 Jan 2024
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
Yu Zhang
Mei Di
Haozheng Luo
Chenwei Xu
Richard Tzong-Han Tsai
57
7
0
22 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
De-huai Chen
Tri Dao
44
247
0
19 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
60
35
0
16 Jan 2024
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Yun-Wei Chu
Dong-Jun Han
Christopher G. Brinton
24
4
0
15 Jan 2024
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang
Ashwinee Panda
Milad Nasr
Saeed Mahloujifar
Prateek Mittal
44
18
0
09 Jan 2024
FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference
Zirui Liu
Qingquan Song
Q. Xiao
Sathiya Keerthi Selvaraj
Rahul Mazumder
Aman Gupta
Xia Hu
32
4
0
08 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Licai Sun
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Bin Liu
Jianhua Tao
15
12
0
07 Jan 2024
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
37
24
0
03 Jan 2024
GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation
Zifan Wang
Junyu Chen
Ziqing Chen
Pengwei Xie
Rui Chen
Li Yi
29
9
0
01 Jan 2024