OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM, OSLM, AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

24 / 2,924 papers shown
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Neural Information Processing Systems (NeurIPS), 2022
Zhenhailong Wang
Pengfei Yu
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Joey Tianyi Zhou
Heng Ji
MLLM, VLM
542
163
0
22 May 2022
A Study on Transformer Configuration and Training Objective
International Conference on Machine Learning (ICML), 2022
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
208
10
0
21 May 2022
Visually-Augmented Language Modeling
International Conference on Learning Representations (ICLR), 2022
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
232
22
0
20 May 2022
Clinical Prompt Learning with Frozen Language Models
Niall Taylor
Yi Zhang
Dan W Joyce
A. Nevado-Holgado
Andrey Kormilitzin
VLM, LM&MA
141
37
0
11 May 2022
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning
Neural Information Processing Systems (NeurIPS), 2022
Xi Ye
Greg Durrett
ReLM, LRM
319
229
0
06 May 2022
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
Proceedings of the VLDB Endowment (PVLDB), 2022
Zhen Zhang
Shuai Zheng
Yida Wang
Justin Chiu
George Karypis
Trishul Chilimbi
Mu Li
Xin Jin
477
47
0
30 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Transactions of the Association for Computational Linguistics (TACL), 2022
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
364
192
0
15 Apr 2022
REx: Data-Free Residual Quantization Error Expansion
Neural Information Processing Systems (NeurIPS), 2022
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
345
9
0
28 Mar 2022
In-Context Learning for Few-Shot Dialogue State Tracking
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yushi Hu
Chia-Hsuan Lee
Tianbao Xie
Tao Yu
Noah A. Smith
Mari Ostendorf
BDL
343
70
0
16 Mar 2022
GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Archiki Prasad
Peter Hase
Xiang Zhou
Mohit Bansal
249
148
0
14 Mar 2022
Internet-augmented language models through few-shot prompting for open-domain question answering
Angeliki Lazaridou
E. Gribovskaya
Wojciech Stokowiec
N. Grigorev
KELM, LRM
244
159
0
10 Mar 2022
LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models
Neural Information Processing Systems (NeurIPS), 2022
Mojan Javaheripi
Gustavo de Rosa
Subhabrata Mukherjee
S. Shah
Tomasz Religa
C. C. T. Mendes
Sébastien Bubeck
F. Koushanfar
Debadeepta Dey
240
23
0
04 Mar 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
344
275
0
16 Feb 2022
Quantifying Memorization Across Neural Language Models
International Conference on Learning Representations (ICLR), 2022
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
506
778
0
15 Feb 2022
Fooling MOSS Detection with Pretrained Language Models
International Conference on Information and Knowledge Management (CIKM), 2022
Stella Biderman
Edward Raff
DeLMO
172
40
0
19 Jan 2022
Counterfactual Memorization in Neural Language Models
Neural Information Processing Systems (NeurIPS), 2021
Chiyuan Zhang
Daphne Ippolito
Katherine Lee
Matthew Jagielski
Florian Tramèr
Nicholas Carlini
318
169
0
24 Dec 2021
Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
Marcella Cornia
Lorenzo Baraldi
G. Fiameni
Rita Cucchiara
321
14
0
24 Nov 2021
How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN
R. Thomas McCoy
P. Smolensky
Tal Linzen
Jianfeng Gao
Asli Celikyilmaz
SyDa
233
161
0
18 Nov 2021
Understanding Jargon: Combining Extraction and Generation for Definition Modeling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jie Huang
Hanyin Shao
Kevin Chen-Chuan Chang
Jinjun Xiong
Wen-mei W. Hwu
183
20
0
14 Nov 2021
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
International Conference on Parallel Processing (ICPP), 2021
Yongbin Li
Hongxin Liu
Zhengda Bian
Boxiang Wang
Haichen Huang
Fan Cui
Chuan-Qing Wang
Yang You
GNN
297
190
0
28 Oct 2021
Can Machines Learn Morality? The Delphi Experiment
Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny T Liang
...
Yulia Tsvetkov
Oren Etzioni
Maarten Sap
Regina A. Rini
Yejin Choi
FaML
355
153
0
14 Oct 2021
Creativity and Machine Learning: A Survey
ACM Computing Surveys (CSUR), 2021
Giorgio Franceschelli
Mirco Musolesi
VLM, AI4CE
554
56
0
06 Apr 2021
Graphmax for Text Generation
Journal of Artificial Intelligence Research (JAIR), 2021
Bin Liu
Guosheng Yin
218
3
0
01 Jan 2021
NarrativeTime: Dense Temporal Annotation on a Timeline
International Conference on Language Resources and Evaluation (LREC), 2019
Anna Rogers
Marzena Karpinska
Ankita Gupta
Vladislav Lialin
Gregory Smelkov
Anna Rumshisky
180
6
0
29 Aug 2019