Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Victoria Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 587 papers shown
Title
Controllable Text Generation via Probability Density Estimation in the Latent Space
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Weihong Zhong
Bing Qin
19
18
0
16 Dec 2022
Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines
Andrew Lee
David Wu
Emily Dinan
M. Lewis
LRM
25
7
0
15 Dec 2022
Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models
Bernd Bohnet
Vinh Q. Tran
Pat Verga
Roee Aharoni
D. Andor
...
Michael Collins
Dipanjan Das
Donald Metzler
Slav Petrov
Kellie Webster
41
59
0
15 Dec 2022
Prompting Is Programming: A Query Language for Large Language Models
Luca Beurer-Kellner
Marc Fischer
Martin Vechev
LRM
28
94
0
12 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
25
195
0
08 Dec 2022
The problem with AI consciousness: A neurogenetic case against synthetic sentience
Yoshija Walter
L. Zbinden
11
1
0
07 Dec 2022
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Ferjad Naeem
Muhammad Gul Zain Ali Khan
Yongqin Xian
Muhammad Zeshan Afzal
D. Stricker
Luc Van Gool
F. Tombari
VLM
22
51
0
05 Dec 2022
Momentum Decoding: Open-ended Text Generation As Graph Exploration
Tian Lan
Yixuan Su
Shuhang Liu
Heyan Huang
Xian-Ling Mao
34
5
0
05 Dec 2022
Nonparametric Masked Language Modeling
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Wen-tau Yih
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
40
48
0
02 Dec 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
16
47
0
27 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
11
95
0
22 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
59
728
0
18 Nov 2022
Ignore Previous Prompt: Attack Techniques For Language Models
Fábio Perez
Ian Ribeiro
SILM
20
396
0
17 Nov 2022
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
17
0
0
16 Nov 2022
On the Compositional Generalization Gap of In-Context Learning
Arian Hosseini
Ankit Vani
Dzmitry Bahdanau
Alessandro Sordoni
Aaron C. Courville
19
24
0
15 Nov 2022
Mind Your Bias: A Critical Review of Bias Detection Methods for Contextual Language Models
Silke Husse
Andreas Spitz
14
6
0
15 Nov 2022
Measuring Reliability of Large Language Models through Semantic Consistency
Harsh Raj
Domenic Rosati
S. Majumdar
HILM
22
30
0
10 Nov 2022
Collateral facilitation in humans and language models
J. Michaelov
Benjamin Bergen
9
11
0
09 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
89
2,301
0
09 Nov 2022
Active Example Selection for In-Context Learning
Yiming Zhang
Shi Feng
Chenhao Tan
SILM
LRM
30
186
0
08 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
27
19
0
03 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM
LLMAG
14
826
0
03 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
30
79
0
31 Oct 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
14
882
0
31 Oct 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
29
51
0
30 Oct 2022
Class Based Thresholding in Early Exit Semantic Segmentation Networks
Alperen Görmez
Erdem Koyuncu
23
5
0
27 Oct 2022
Personalized Dialogue Generation with Persona-Adaptive Attention
Qiushi Huang
Yu Zhang
Tom Ko
Xubo Liu
Boyong Wu
Wenwu Wang
Lilian H. Y. Tang
26
19
0
27 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
39
32
0
25 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
33
2
0
24 Oct 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
19
205
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
77
15
0
23 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
19
24
0
19 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
19
40
0
19 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
25
107
0
13 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
21
30
0
08 Oct 2022
Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts
Nghia T. Le
Fan Bai
Alan Ritter
29
12
0
07 Oct 2022
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Seonghyeon Ye
Doyoung Kim
Joel Jang
Joongbo Shin
Minjoon Seo
FedML
VLM
UQCV
LRM
11
25
0
06 Oct 2022
Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors
Mohammad Reza Taesiri
Finlay Macklon
Yihe Wang
Hengshuo Shen
C. Bezemer
ELM
LLMAG
MLLM
29
13
0
05 Oct 2022
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
206
206
0
05 Oct 2022
Robot Task Planning and Situation Handling in Open Worlds
Yan Ding
Xiaohan Zhang
S. Amiri
Nieqing Cao
Hao Yang
Chad Esselink
Shiqi Zhang
LM&Ro
22
19
0
04 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
17
289
0
30 Sep 2022
Bidirectional Language Models Are Also Few-shot Learners
Ajay Patel
Bryan Li
Mohammad Sadegh Rasooli
Noah Constant
Colin Raffel
Chris Callison-Burch
LRM
62
45
0
29 Sep 2022
Deep Generative Multimedia Children's Literature
Matthew Lyle Olson
11
0
0
27 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Ðorðe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
21
5
0
26 Sep 2022
Variational Open-Domain Question Answering
Valentin Liévin
Andreas Geert Motzfeldt
Ida Riis Jensen
Ole Winther
OOD
BDL
26
8
0
23 Sep 2022
Generate rather than Retrieve: Large Language Models are Strong Context Generators
W. Yu
Dan Iter
Shuohang Wang
Yichong Xu
Mingxuan Ju
Soumya Sanyal
Chenguang Zhu
Michael Zeng
Meng-Long Jiang
RALM
AIMat
221
321
0
21 Sep 2022
Extremely Simple Activation Shaping for Out-of-Distribution Detection
Andrija Djurisic
Nebojsa Bozanic
Arjun Ashok
Rosanne Liu
OODD
158
150
0
20 Sep 2022
FP8 Formats for Deep Learning
Paulius Micikevicius
Dusan Stosic
N. Burgess
Marius Cornea
Pradeep Dubey
...
Naveen Mellempudi
S. Oberman
M. Shoeybi
Michael Siu
Hao Wu
BDL
VLM
MQ
67
121
0
12 Sep 2022
Analyzing Transformers in Embedding Space
Guy Dar
Mor Geva
Ankit Gupta
Jonathan Berant
19
83
0
06 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
224
1,300
0
02 Sep 2022
Previous
1
2
3
...
10
11
12
Next