OPT: Open Pre-trained Transformer Language Models

2 May 2022

Xian Li

Luke Zettlemoyer

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,924 papers shown

Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?

Transactions of the Association for Computational Linguistics (TACL), 2022

Byung-Doh Oh

William Schuler

173

151

23 Dec 2022

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

...

Luke Zettlemoyer

487

303

22 Dec 2022

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models

Computer Vision and Pattern Recognition (CVPR), 2022

385

163

21 Dec 2022

SERENGETI: Massively Multilingual Language Models for Africa

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Ife Adebara

AbdelRahim Elmadany

Muhammad Abdul-Mageed

Alcides Alcoba Inciarte

288

21 Dec 2022

JASMINE: Arabic GPT Models for Few-Shot Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

El Moatez Billah Nagoudi

Muhammad Abdul-Mageed

AbdelRahim Elmadany

Alcides Alcoba Inciarte

Md. Tawkat Islam Khondaker

204

21 Dec 2022

DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Yang Liu

233

20 Dec 2022

T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Iker García-Ferrero

Rodrigo Agerri

German Rigau

253

20 Dec 2022

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Daniel Khashabi

407

898

20 Dec 2022

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

...

Yejin Choi

463

193

20 Dec 2022

Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models

Jingjing Xu

Qingxiu Dong

Hongyi Liu

Lei Li

ALM LRM

115

20 Dec 2022

Is GPT-3 a Good Data Annotator?

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

345

312

20 Dec 2022

Geographic and Geopolitical Biases of Language Models

Fahim Faisal

Antonios Anastasopoulos

222

20 Dec 2022

Towards Reasoning in Large Language Models: A Survey

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Jie Huang

Kevin Chen-Chuan Chang

LM&MA ELM LRM

1.1K

814

20 Dec 2022

Data Curation Alone Can Stabilize In-context Learning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Ting-Yun Chang

Robin Jia

165

20 Dec 2022

HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Akshita Bhagia

310

20 Dec 2022

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Tianxing He

Jingyu Zhang

Tianle Wang

391

20 Dec 2022

Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

233

19 Dec 2022

Training Trajectories of Language Models Across Scales

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Luke Zettlemoyer

269

19 Dec 2022

The case for 4-bit precision: k-bit Inference Scaling Laws

International Conference on Machine Learning (ICML), 2022

Tim Dettmers

Luke Zettlemoyer

392

292

19 Dec 2022

Explanation Regeneration via Information Bottleneck

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Qintong Li

Zhiyong Wu

Lingpeng Kong

Wei Bi

263

19 Dec 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Zheng-Xin Yong

Hailey Schoelkopf

Niklas Muennighoff

Alham Fikri Aji

David Ifeoluwa Adelani

...

386

106

19 Dec 2022

I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Keisuke Sakaguchi

Yejin Choi

212

19 Dec 2022

Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Hritik Bansal

Karthik Gopalakrishnan

247

18 Dec 2022

Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?

Ajay Patel

Nicholas Andrews

Chris Callison-Burch

210

18 Dec 2022

Language model acceptability judgements are not always robust to context

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

210

18 Dec 2022

Graph Learning and Its Advancements on Large Language Models: A Holistic Survey

422

17 Dec 2022

Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

J. Michaelov

Benjamin Bergen

198

16 Dec 2022

Self-Prompting Large Language Models for Zero-Shot Open-Domain QA

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

223

16 Dec 2022

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

227

16 Dec 2022

Lessons learned from the evaluation of Spanish Language Models

Rodrigo Agerri

Eneko Agirre

ELM

265

16 Dec 2022

Controllable Text Generation via Probability Density Estimation in the Latent Space

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

230

16 Dec 2022

ALERT: Adapting Language Models to Reasoning Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

269

16 Dec 2022

Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines

165

15 Dec 2022

On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Diyi Yang

488

241

15 Dec 2022

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

...

320

15 Dec 2022

Prompting Is Programming: A Query Language for Large Language Models

Luca Beurer-Kellner

Marc Fischer

Martin Vechev

LRM

388

143

12 Dec 2022

Elixir: Train a Large Language Model on a Small GPU Cluster

Yang You

250

10 Dec 2022

Structured information extraction from complex scientific text with fine-tuned large language models

248

108

10 Dec 2022

DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding

International Conference on Language Resources and Evaluation (LREC), 2022

Jianhao Yan

Jin Xu

Fandong Meng

Jie Zhou

Yue Zhang

348

08 Dec 2022

Demystifying Prompts in Language Models via Perplexity Estimation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Luke Zettlemoyer

408

278

08 Dec 2022

The problem with AI consciousness: A neurogenetic case against synthetic sentience

Yoshija Walter

L. Zbinden

07 Dec 2022

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

Computer Vision and Pattern Recognition (CVPR), 2022

Muhammad Ferjad Naeem

Muhammad Gul Zain Ali Khan

Yongqin Xian

Muhammad Zeshan Afzal

D. Stricker

Luc Van Gool

F. Tombari

VLM

206

05 Dec 2022

Momentum Decoding: Open-ended Text Generation As Graph Exploration

140

05 Dec 2022

Understanding How Model Size Affects Few-shot Instruction Prompting

Ayrton San Joaquin

Ardy Haroen

04 Dec 2022

Nonparametric Masked Language Modeling

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Weijia Shi

Luke Zettlemoyer

339

02 Dec 2022

Extensible Prompts for Language Models on Zero-shot Language Style Customization

Neural Information Processing Systems (NeurIPS), 2022

203

01 Dec 2022

What learning algorithm is in-context learning? Investigations with linear models

International Conference on Learning Representations (ICLR), 2022

543

620

28 Nov 2022

Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2022

Peter Henderson

E. Mitchell

Christopher D. Manning

Dan Jurafsky

Chelsea Finn

241

27 Nov 2022

Complementary Explanations for Effective In-Context Learning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

268

114

25 Nov 2022

Undesirable Biases in NLP: Addressing Challenges of Measurement

475

24 Nov 2022