OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,924 papers shown
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?
Transactions of the Association for Computational Linguistics (TACL), 2022
Byung-Doh Oh
William Schuler
173
151
0
23 Dec 2022
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
Srinivasan Iyer
Xi Lin
Ramakanth Pasunuru
Todor Mihaylov
Daniel Simig
...
Jeff Wang
Christopher Dewan
Asli Celikyilmaz
Luke Zettlemoyer
Veselin Stoyanov
ALM
487
303
0
22 Dec 2022
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
Computer Vision and Pattern Recognition (CVPR), 2022
Jiaxian Guo
Junnan Li
Dongxu Li
A. M. H. Tiong
Boyang Albert Li
Dacheng Tao
Steven C. H. Hoi
VLM, MLLM
385
163
0
21 Dec 2022
SERENGETI: Massively Multilingual Language Models for Africa
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ife Adebara
AbdelRahim Elmadany
Muhammad Abdul-Mageed
Alcides Alcoba Inciarte
288
43
0
21 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
204
13
0
21 Dec 2022
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Prakhar Gupta
Yang Liu
Di Jin
Behnam Hedayatnia
Spandana Gella
Sijia Liu
P. Lange
Julia Hirschberg
Dilek Z. Hakkani-Tür
233
6
0
20 Dec 2022
T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Iker García-Ferrero
Rodrigo Agerri
German Rigau
253
16
0
20 Dec 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alex Troy Mallen
Akari Asai
Victor Zhong
Rajarshi Das
Daniel Khashabi
Hannaneh Hajishirzi
RALM, HILM, KELM
407
898
0
20 Dec 2022
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Hyunwoo J. Kim
Jack Hessel
Liwei Jiang
Peter West
Ximing Lu
...
Ronan Le Bras
Malihe Alikhani
Gunhee Kim
Maarten Sap
Yejin Choi
HILM
463
193
0
20 Dec 2022
Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models
Jingjing Xu
Qingxiu Dong
Hongyi Liu
Lei Li
ALM, LRM
115
2
0
20 Dec 2022
Is GPT-3 a Good Data Annotator?
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Bosheng Ding
Chengwei Qin
Linlin Liu
Yew Ken Chia
Shafiq Joty
Boyang Albert Li
Lidong Bing
345
312
0
20 Dec 2022
Geographic and Geopolitical Biases of Language Models
Fahim Faisal
Antonios Anastasopoulos
222
31
0
20 Dec 2022
Towards Reasoning in Large Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jie Huang
Kevin Chen-Chuan Chang
LM&MA, ELM, LRM
1.1K
814
0
20 Dec 2022
Data Curation Alone Can Stabilize In-context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ting-Yun Chang
Robin Jia
165
62
0
20 Dec 2022
HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Michal Guerquin
Akshita Bhagia
Yizhong Wang
Hannaneh Hajishirzi
Matthew E. Peters
310
24
0
20 Dec 2022
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Dong Wang
James R. Glass
Yulia Tsvetkov
391
59
0
20 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
233
15
0
19 Dec 2022
Training Trajectories of Language Models Across Scales
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin, LRM
269
71
0
19 Dec 2022
The case for 4-bit precision: k-bit Inference Scaling Laws
International Conference on Machine Learning (ICML), 2022
Tim Dettmers
Luke Zettlemoyer
MQ
392
292
0
19 Dec 2022
Explanation Regeneration via Information Bottleneck
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Qintong Li
Zhiyong Wu
Lingpeng Kong
Wei Bi
263
4
0
19 Dec 2022
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zheng-Xin Yong
Hailey Schoelkopf
Niklas Muennighoff
Alham Fikri Aji
David Ifeoluwa Adelani
...
Genta Indra Winata
Stella Biderman
Edward Raff
Dragomir R. Radev
Vassilina Nikoulina
CLL, VLM, AI4CE, LRM
386
106
0
19 Dec 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
212
38
0
19 Dec 2022
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hritik Bansal
Karthik Gopalakrishnan
Saket Dingliwal
S. Bodapati
Katrin Kirchhoff
Dan Roth
LRM
247
64
0
18 Dec 2022
Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
Ajay Patel
Nicholas Andrews
Chris Callison-Burch
210
10
0
18 Dec 2022
Language model acceptability judgements are not always robust to context
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Koustuv Sinha
Jon Gauthier
Aaron Mueller
Kanishka Misra
Keren Fuentes
R. Levy
Adina Williams
210
20
0
18 Dec 2022
Graph Learning and Its Advancements on Large Language Models: A Holistic Survey
Shaopeng Wei
Yu Zhao
Xingyan Chen
Qing Li
Fuzhen Zhuang
Ji Liu
Fuji Ren
Gang Kou
AI4CE
422
6
0
17 Dec 2022
Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
J. Michaelov
Benjamin Bergen
198
18
0
16 Dec 2022
Self-Prompting Large Language Models for Zero-Shot Open-Domain QA
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Junlong Li
Jinyuan Wang
Zhuosheng Zhang
Hai Zhao
LRM
223
55
0
16 Dec 2022
MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Swarnadeep Saha
Xinyan Velocity Yu
Joey Tianyi Zhou
Ramakanth Pasunuru
Asli Celikyilmaz
ReLM, LRM
227
13
0
16 Dec 2022
Lessons learned from the evaluation of Spanish Language Models
Rodrigo Agerri
Eneko Agirre
ELM
265
16
0
16 Dec 2022
Controllable Text Generation via Probability Density Estimation in the Latent Space
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Weihong Zhong
Bing Qin
230
28
0
16 Dec 2022
ALERT: Adapting Language Models to Reasoning Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ping Yu
Tianlu Wang
O. Yu. Golovneva
Badr AlKhamissi
Siddharth Verma
Zhijing Jin
Gargi Ghosh
Mona T. Diab
Asli Celikyilmaz
ReLM, LRM
269
20
0
16 Dec 2022
Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines
Andrew Lee
David Wu
Emily Dinan
M. Lewis
LRM
165
9
0
15 Dec 2022
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Omar Shaikh
Hongxin Zhang
William B. Held
Michael S. Bernstein
Diyi Yang
ReLM, LRM
488
241
0
15 Dec 2022
Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models
Bernd Bohnet
Vinh Q. Tran
Pat Verga
Roee Aharoni
D. Andor
...
Michael Collins
Dipanjan Das
Donald Metzler
Slav Petrov
Kellie Webster
320
81
0
15 Dec 2022
Prompting Is Programming: A Query Language for Large Language Models
Luca Beurer-Kellner
Marc Fischer
Martin Vechev
LRM
388
143
0
12 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
250
10
0
10 Dec 2022
Structured information extraction from complex scientific text with fine-tuned large language models
Alex Dunn
John Dagdelen
Nicholas Walker
Sanghoon Lee
Andrew S. Rosen
Gerbrand Ceder
Kristin A. Persson
Anubhav Jain
248
108
0
10 Dec 2022
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
International Conference on Language Resources and Evaluation (LREC), 2022
Jianhao Yan
Jin Xu
Fandong Meng
Jie Zhou
Yue Zhang
348
4
0
08 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
408
278
0
08 Dec 2022
The problem with AI consciousness: A neurogenetic case against synthetic sentience
Yoshija Walter
L. Zbinden
90
1
0
07 Dec 2022
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Computer Vision and Pattern Recognition (CVPR), 2022
Muhammad Ferjad Naeem
Muhammad Gul Zain Ali Khan
Yongqin Xian
Muhammad Zeshan Afzal
D. Stricker
Luc Van Gool
F. Tombari
VLM
206
84
0
05 Dec 2022
Momentum Decoding: Open-ended Text Generation As Graph Exploration
Tian Lan
Yixuan Su
Shuhang Liu
Heyan Huang
Xian-Ling Mao
140
5
0
05 Dec 2022
Understanding How Model Size Affects Few-shot Instruction Prompting
Ayrton San Joaquin
Ardy Haroen
71
0
0
04 Dec 2022
Nonparametric Masked Language Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Anuj Kumar
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
339
55
0
02 Dec 2022
Extensible Prompts for Language Models on Zero-shot Language Style Customization
Neural Information Processing Systems (NeurIPS), 2022
Tao Ge
Jing Hu
Li Dong
Shaoguang Mao
Yanqiu Xia
Xun Wang
Si-Qing Chen
Furu Wei
VLM
203
8
0
01 Dec 2022
What learning algorithm is in-context learning? Investigations with linear models
International Conference on Learning Representations (ICLR), 2022
Ekin Akyürek
Dale Schuurmans
Jacob Andreas
Tengyu Ma
Denny Zhou
543
620
0
28 Nov 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2022
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
241
62
0
27 Nov 2022
Complementary Explanations for Effective In-Context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Xi Ye
Srini Iyer
Asli Celikyilmaz
Ves Stoyanov
Greg Durrett
Ramakanth Pasunuru
ReLM, LRM
268
114
0
25 Nov 2022
Undesirable Biases in NLP: Addressing Challenges of Measurement
Oskar van der Wal
Dominik Bachmann
Alina Leidinger
L. Maanen
Willem H. Zuidema
K. Schulz
475
8
0
24 Nov 2022
Page 55 of 59