2205.01068
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,924 papers shown
Toolformer: Language Models Can Teach Themselves to Use Tools
Neural Information Processing Systems (NeurIPS), 2023
Timo Schick
Jane Dwivedi-Yu
Roberto Dessì
Roberta Raileanu
Maria Lomeli
Luke Zettlemoyer
Nicola Cancedda
Thomas Scialom
SyDa
RALM
472
2,744
0
09 Feb 2023
GPTScore: Evaluate as You Desire
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jinlan Fu
See-Kiong Ng
Zhengbao Jiang
Pengfei Liu
LM&MA
ALM
ELM
394
407
0
08 Feb 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Findings (Findings), 2023
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
154
1
0
08 Feb 2023
ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots
Reham Omar
Omij Mangukiya
Panos Kalnis
Essam Mansour
AI4MH
147
91
0
08 Feb 2023
Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Suyu Ge
Chenyan Xiong
Corby Rosset
Arnold Overwijk
Jiawei Han
Paul N. Bennett
VLM
157
11
0
07 Feb 2023
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
Neural Information Processing Systems (NeurIPS), 2023
Yuxin Wen
Neel Jain
John Kirchenbauer
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
DiffM
336
363
1
07 Feb 2023
PLACES: Prompting Language Models for Social Conversation Synthesis
Findings (Findings), 2023
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Seokhwan Kim
Andrew Rosenbaum
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
284
97
0
07 Feb 2023
Chain of Hindsight Aligns Language Models with Feedback
International Conference on Learning Representations (ICLR), 2023
Hao Liu
Carmelo Sferrazza
Pieter Abbeel
ALM
810
149
0
06 Feb 2023
The Gradient of Generative AI Release: Methods and Considerations
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Irene Solaiman
197
125
0
05 Feb 2023
FineDeb: A Debiasing Framework for Language Models
Akash Saravanan
Dhruv Mullick
Habibur Rahman
Nidhi Hegde
FedML
AI4CE
176
7
0
05 Feb 2023
Quantized Distributed Training of Large Models with Convergence Guarantees
International Conference on Machine Learning (ICML), 2023
I. Markov
Adrian Vladu
Qi Guo
Dan Alistarh
MQ
269
16
0
05 Feb 2023
The Science of Detecting LLM-Generated Texts
Communications of the ACM (CACM), 2023
Ruixiang Tang
Yu-Neng Chuang
Helen Zhou
DeLMO
395
236
0
04 Feb 2023
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
Zihao Wang
Shaofei Cai
Guanzhou Chen
Hoang Trung-Dung
Xiaojian Ma
Yitao Liang
LM&Ro
LLMAG
512
435
0
03 Feb 2023
Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment
Neural Information Processing Systems (NeurIPS), 2023
Hao Liu
Wilson Yan
Pieter Abbeel
254
34
0
02 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
268
50
0
02 Feb 2023
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
International Conference on Machine Learning (ICML), 2023
Zhihong Shao
Yeyun Gong
Yelong Shen
Shiyu Huang
Nan Duan
Weizhu Chen
ReLM
LRM
220
89
0
01 Feb 2023
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
International Conference on Learning Representations (ICLR), 2023
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
487
26
0
01 Feb 2023
In-Context Retrieval-Augmented Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Ori Ram
Yoav Levine
Itay Dalmedigos
Dor Muhlgay
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
KELM
RALM
LRM
572
858
0
31 Jan 2023
Benchmarking Large Language Models for News Summarization
Transactions of the Association for Computational Linguistics (TACL), 2023
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Abigail Z. Jacobs
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
327
676
0
31 Jan 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
International Conference on Machine Learning (ICML), 2023
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
448
151
0
31 Jan 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
International Conference on Machine Learning (ICML), 2023
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
444
853
0
31 Jan 2023
Direct Preference-based Policy Optimization without Reward Modeling
Neural Information Processing Systems (NeurIPS), 2023
Gaon An
Junhyeok Lee
Xingdong Zuo
Norio Kosaka
KyungHyun Kim
Hyun Oh Song
OffRL
260
40
0
30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM
VLM
KELM
729
866
0
30 Jan 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
1.3K
6,781
0
30 Jan 2023
Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Vahan Arsenyan
Spartak Bughdaryan
Fadi Shaya
Kent Small
Davit Shahnazaryan
211
28
0
29 Jan 2023
Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation
Jessica Huynh
Cathy Jiao
Prakhar Gupta
Shikib Mehri
Payal Bajaj
Vishrav Chaudhary
M. Eskénazi
ELM
LM&MA
223
18
0
27 Jan 2023
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Neural Information Processing Systems (NeurIPS), 2023
Xinyi Wang
Wanrong Zhu
Michael Stephen Saxon
Mark Steyvers
William Yang Wang
BDL
539
163
0
27 Jan 2023
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Alex Warstadt
Leshem Choshen
Aaron Mueller
Adina Williams
Ethan Gotlieb Wilcox
Chengxu Zhuang
221
69
0
27 Jan 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
International Conference on Machine Learning (ICML), 2023
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
671
856
0
26 Jan 2023
Affective Faces for Goal-Driven Dyadic Communication
Scott Geng
Revant Teotia
Purva Tendulkar
Sachit Menon
Carl Vondrick
VGen
130
31
0
26 Jan 2023
PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
Symposium on Operating Systems Principles (SOSP), 2023
Ningxin Zheng
Huiqiang Jiang
Quan Zhang
Zhenhua Han
Yuqing Yang
...
Fan Yang
Chengruidong Zhang
Lili Qiu
Mao Yang
Lidong Zhou
204
36
0
26 Jan 2023
Explainable AI does not provide the explanations end-users are asking for
Savio Rozario
G. Cevora
XAI
189
3
0
25 Jan 2023
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
Malte Ostendorff
Georg Rehm
CLIP
VLM
CLL
310
35
0
23 Jan 2023
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
International Conference on Learning Representations (ICLR), 2023
Shuaichen Chang
Jun Wang
Mingwen Dong
Lin Pan
Henghui Zhu
...
William Yang Wang
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Bing Xiang
OOD
303
54
0
21 Jan 2023
Prompting Large Language Model for Machine Translation: A Case Study
International Conference on Machine Learning (ICML), 2023
Biao Zhang
Barry Haddow
Alexandra Birch
LRM
435
376
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Computer Vision and Pattern Recognition (CVPR), 2023
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
194
14
0
17 Jan 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
ACM Multimedia (ACM MM), 2023
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
209
13
0
14 Jan 2023
Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Jing Wei
Sungdong Kim
Hyunhoon Jung
Young-Ho Kim
300
127
0
14 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
284
53
0
12 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
153
15
0
11 Jan 2023
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models
International Conference on Software Engineering (ICSE), 2023
Toufique Ahmed
Supriyo Ghosh
Chetan Bansal
Thomas Zimmermann
Xuchao Zhang
Saravan Rajmohan
AI4CE
177
82
0
10 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
International Conference on Machine Learning (ICML), 2023
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoE
VLM
314
137
0
10 Jan 2023
Does compressing activations help model parallel training?
Conference on Machine Learning and Systems (MLSys), 2023
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
230
12
0
06 Jan 2023
UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?
Dennis Aumiller
Michael Gertz
200
25
0
04 Jan 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
International Conference on Machine Learning (ICML), 2023
Elias Frantar
Dan Alistarh
VLM
607
1,054
0
02 Jan 2023
Rethinking with Retrieval: Faithful Large Language Model Inference
Hangfeng He
Hongming Zhang
Dan Roth
KELM
LRM
488
205
0
31 Dec 2022
Targeted Phishing Campaigns using Large Scale Language Models
Rabimba Karanjai
250
45
0
30 Dec 2022
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
International Conference on Learning Representations (ICLR), 2022
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
440
556
0
28 Dec 2022
Large Language Models Encode Clinical Knowledge
Nature, 2022
K. Singhal
Shekoofeh Azizi
T. Tu
S. S. Mahdavi
Jason W. Wei
...
A. Rajkomar
Joelle Barral
Christopher Semturs
Alan Karthikesalingam
Vivek Natarajan
LM&MA
ELM
AI4MH
608
3,513
0
26 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
IEEE International Conference on Computer Vision (ICCV), 2022
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
226
14
0
23 Dec 2022