OPT: Open Pre-trained Transformer Language Models (arXiv:2205.01068)
Versions: v1, v2, v3, v4 (latest)

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,924 papers shown
Toolformer: Language Models Can Teach Themselves to Use Tools
Neural Information Processing Systems (NeurIPS), 2023
Timo Schick
Jane Dwivedi-Yu
Roberto Dessì
Roberta Raileanu
Maria Lomeli
Luke Zettlemoyer
Nicola Cancedda
Thomas Scialom
SyDa, RALM
472
2,744
0
09 Feb 2023
GPTScore: Evaluate as You Desire
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jinlan Fu
See-Kiong Ng
Zhengbao Jiang
Pengfei Liu
LM&MA, ALM, ELM
394
407
0
08 Feb 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Findings, 2023
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
154
1
0
08 Feb 2023
ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots
Reham Omar
Omij Mangukiya
Panos Kalnis
Essam Mansour
AI4MH
147
91
0
08 Feb 2023
Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Suyu Ge
Chenyan Xiong
Corby Rosset
Arnold Overwijk
Jiawei Han
Paul N. Bennett
VLM
157
11
0
07 Feb 2023
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
Neural Information Processing Systems (NeurIPS), 2023
Yuxin Wen
Neel Jain
John Kirchenbauer
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM, DiffM
336
363
1
07 Feb 2023
PLACES: Prompting Language Models for Social Conversation Synthesis
Findings, 2023
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Seokhwan Kim
Andrew Rosenbaum
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
284
97
0
07 Feb 2023
Chain of Hindsight Aligns Language Models with Feedback
International Conference on Learning Representations (ICLR), 2023
Hao Liu
Carmelo Sferrazza
Pieter Abbeel
ALM
810
149
0
06 Feb 2023
The Gradient of Generative AI Release: Methods and Considerations
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Irene Solaiman
197
125
0
05 Feb 2023
FineDeb: A Debiasing Framework for Language Models
Akash Saravanan
Dhruv Mullick
Habibur Rahman
Nidhi Hegde
FedML, AI4CE
176
7
0
05 Feb 2023
Quantized Distributed Training of Large Models with Convergence Guarantees
International Conference on Machine Learning (ICML), 2023
I. Markov
Adrian Vladu
Qi Guo
Dan Alistarh
MQ
269
16
0
05 Feb 2023
The Science of Detecting LLM-Generated Texts
Communications of the ACM (CACM), 2023
Ruixiang Tang
Yu-Neng Chuang
Helen Zhou
DeLMO
395
236
0
04 Feb 2023
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
Zihao Wang
Shaofei Cai
Guanzhou Chen
Hoang Trung-Dung
Xiaojian Ma
Yitao Liang
LM&Ro, LLMAG
512
435
0
03 Feb 2023
Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment
Neural Information Processing Systems (NeurIPS), 2023
Hao Liu
Wilson Yan
Pieter Abbeel
254
34
0
02 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
268
50
0
02 Feb 2023
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
International Conference on Machine Learning (ICML), 2023
Zhihong Shao
Yeyun Gong
Yelong Shen
Shiyu Huang
Nan Duan
Weizhu Chen
ReLM, LRM
220
89
0
01 Feb 2023
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
International Conference on Learning Representations (ICLR), 2023
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
487
26
0
01 Feb 2023
In-Context Retrieval-Augmented Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Ori Ram
Yoav Levine
Itay Dalmedigos
Dor Muhlgay
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
KELM, RALM, LRM
572
858
0
31 Jan 2023
Benchmarking Large Language Models for News Summarization
Transactions of the Association for Computational Linguistics (TACL), 2023
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Abigail Z. Jacobs
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
327
676
0
31 Jan 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
International Conference on Machine Learning (ICML), 2023
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
448
151
0
31 Jan 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
International Conference on Machine Learning (ICML), 2023
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
444
853
0
31 Jan 2023
Direct Preference-based Policy Optimization without Reward Modeling
Neural Information Processing Systems (NeurIPS), 2023
Gaon An
Junhyeok Lee
Xingdong Zuo
Norio Kosaka
KyungHyun Kim
Hyun Oh Song
OffRL
260
40
0
30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM, VLM, KELM
729
866
0
30 Jan 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM, MLLM
1.3K
6,781
0
30 Jan 2023
Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Vahan Arsenyan
Spartak Bughdaryan
Fadi Shaya
Kent Small
Davit Shahnazaryan
211
28
0
29 Jan 2023
Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation
Jessica Huynh
Cathy Jiao
Prakhar Gupta
Shikib Mehri
Payal Bajaj
Vishrav Chaudhary
M. Eskénazi
ELM, LM&MA
223
18
0
27 Jan 2023
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Neural Information Processing Systems (NeurIPS), 2023
Xinyi Wang
Wanrong Zhu
Michael Stephen Saxon
Mark Steyvers
William Yang Wang
BDL
539
163
0
27 Jan 2023
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Alex Warstadt
Leshem Choshen
Aaron Mueller
Adina Williams
Ethan Gotlieb Wilcox
Chengxu Zhuang
221
69
0
27 Jan 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
International Conference on Machine Learning (ICML), 2023
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
671
856
0
26 Jan 2023
Affective Faces for Goal-Driven Dyadic Communication
Scott Geng
Revant Teotia
Purva Tendulkar
Sachit Menon
Carl Vondrick
VGen
130
31
0
26 Jan 2023
PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
Symposium on Operating Systems Principles (SOSP), 2023
Ningxin Zheng
Huiqiang Jiang
Quan Zhang
Zhenhua Han
Yuqing Yang
...
Fan Yang
Chengruidong Zhang
Lili Qiu
Mao Yang
Lidong Zhou
204
36
0
26 Jan 2023
Explainable AI does not provide the explanations end-users are asking for
Savio Rozario
G. Cevora
XAI
189
3
0
25 Jan 2023
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
Malte Ostendorff
Georg Rehm
CLIP, VLM, CLL
310
35
0
23 Jan 2023
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
International Conference on Learning Representations (ICLR), 2023
Shuaichen Chang
Jun Wang
Mingwen Dong
Lin Pan
Henghui Zhu
...
William Yang Wang
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Bing Xiang
OOD
303
54
0
21 Jan 2023
Prompting Large Language Model for Machine Translation: A Case Study
International Conference on Machine Learning (ICML), 2023
Biao Zhang
Barry Haddow
Alexandra Birch
LRM
435
376
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Computer Vision and Pattern Recognition (CVPR), 2023
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
194
14
0
17 Jan 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
ACM Multimedia (ACM MM), 2023
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
209
13
0
14 Jan 2023
Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Jing Wei
Sungdong Kim
Hyunhoon Jung
Young-Ho Kim
300
127
0
14 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM, VLM
284
53
0
12 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
153
15
0
11 Jan 2023
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models
International Conference on Software Engineering (ICSE), 2023
Toufique Ahmed
Supriyo Ghosh
Chetan Bansal
Thomas Zimmermann
Xuchao Zhang
Saravan Rajmohan
AI4CE
177
82
0
10 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
International Conference on Machine Learning (ICML), 2023
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoE, VLM
314
137
0
10 Jan 2023
Does compressing activations help model parallel training?
Conference on Machine Learning and Systems (MLSys), 2023
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
230
12
0
06 Jan 2023
UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?
Dennis Aumiller
Michael Gertz
200
25
0
04 Jan 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
International Conference on Machine Learning (ICML), 2023
Elias Frantar
Dan Alistarh
VLM
607
1,054
0
02 Jan 2023
Rethinking with Retrieval: Faithful Large Language Model Inference
Hangfeng He
Hongming Zhang
Dan Roth
KELM, LRM
488
205
0
31 Dec 2022
Targeted Phishing Campaigns using Large Scale Language Models
Rabimba Karanjai
250
45
0
30 Dec 2022
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
International Conference on Learning Representations (ICLR), 2022
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
440
556
0
28 Dec 2022
Large Language Models Encode Clinical Knowledge
Nature, 2022
K. Singhal
Shekoofeh Azizi
T. Tu
S. S. Mahdavi
Jason W. Wei
...
A. Rajkomar
Joelle Barral
Christopher Semturs
Alan Karthikesalingam
Vivek Natarajan
LM&MA, ELM, AI4MH
608
3,513
0
26 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
IEEE International Conference on Computer Vision (ICCV), 2022
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
226
14
0
23 Dec 2022