OPT: Open Pre-trained Transformer Language Models

2 May 2022

Xian Li

Luke Zettlemoyer

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,924 papers shown

Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation

International Conference on Learning Representations (ICLR), 2022

Heng Ji

262

21 Oct 2022

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Alireza Mohammadshahi

246

20 Oct 2022

lo-fi: distributed fine-tuning without communication

349

19 Oct 2022

Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

SIGKDD Explorations (SIGKDD Explor.), 2022

316

19 Oct 2022

Prompting GPT-3 To Be Reliable

International Conference on Learning Representations (ICLR), 2022

Jordan L. Boyd-Graber

Lijuan Wang

KELM LRM

414

343

17 Oct 2022

Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

Sachin Kumar

Vidhisha Balachandran

Lucille Njoo

Antonios Anastasopoulos

Yulia Tsvetkov

ELM

452

106

14 Oct 2022

Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

Andrea Madotto

178

14 Oct 2022

Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

IEEE Access (IEEE Access), 2022

Evan Crothers

Nathalie Japkowicz

H. Viktor

DeLMO

386

158

13 Oct 2022

Bootstrapping Multilingual Semantic Parsers using Large Language Models

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

274

13 Oct 2022

Visual Classification via Description from Large Language Models

International Conference on Learning Representations (ICLR), 2022

Sachit Menon

Carl Vondrick

VLM

385

374

13 Oct 2022

MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

264

13 Oct 2022

On Divergence Measures for Bayesian Pseudocoresets

Neural Information Processing Systems (NeurIPS), 2022

188

12 Oct 2022

Generating Executable Action Plans with Environmentally-Aware Language Models

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022

Maitrey Gramopadhye

D. Szafir

LM&Ro LLMAG

325

10 Oct 2022

Controllable Dialogue Simulation with In-Context Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

475

09 Oct 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

283

08 Oct 2022

Large Language Models can Implement Policy Iteration

Neural Information Processing Systems (NeurIPS), 2022

387

07 Oct 2022

LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models

ACM Transactions on Software Engineering and Methodology (TOSEM), 2022

262

07 Oct 2022

Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Nghia T. Le

Fan Bai

Alan Ritter

380

07 Oct 2022

State-of-the-art generalisation research in NLP: A taxonomy and review

Nature Machine Intelligence (Nat. Mach. Intell.), 2022

Verna Dankers

...

631

132

06 Oct 2022

Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

International Conference on Learning Representations (ICLR), 2022

451

06 Oct 2022

A Distributional Lens for Multi-Aspect Controllable Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

344

06 Oct 2022

Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors

Mohammad Reza Taesiri

Yihe Wang

178

05 Oct 2022

Ask Me Anything: A simple strategy for prompting language models

International Conference on Learning Representations (ICLR), 2022

650

256

05 Oct 2022

GLM-130B: An Open Bilingual Pre-trained Model

International Conference on Learning Representations (ICLR), 2022

Xiao Liu

...

Yuxiao Dong

805

1,221

05 Oct 2022

Explaining Patterns in Data with Language Models via Interpretable Autoprompting

189

04 Oct 2022

Text Characterization Toolkit

Daniel Simig

Tianlu Wang

Verna Dankers

Peter Henderson

Khuyagbaatar Batsuren

Dieuwke Hupkes

Mona T. Diab

171

04 Oct 2022

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

507

365

04 Oct 2022

Recitation-Augmented Language Models

International Conference on Learning Representations (ICLR), 2022

868

04 Oct 2022

Robot Task Planning and Situation Handling in Open Worlds

166

04 Oct 2022

FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation

Transactions of the Association for Computational Linguistics (TACL), 2022

325

01 Oct 2022

Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Dian Yu

Heng Ji

271

01 Oct 2022

AudioGen: Textually Guided Audio Generation

International Conference on Learning Representations (ICLR), 2022

Devi Parikh

Yossi Adi

433

394

30 Sep 2022

SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation

Computer Vision and Pattern Recognition (CVPR), 2022

R. Ramos

Bruno Martins

Desmond Elliott

Yova Kementchedjhieva

VLM

206

121

30 Sep 2022

Unpacking Large Language Models with Conceptual Consistency

217

29 Sep 2022

Bidirectional Language Models Are Also Few-shot Learners

International Conference on Learning Representations (ICLR), 2022

Ajay Patel

Bryan Li

Mohammad Sadegh Rasooli

229

29 Sep 2022

EditEval: An Instruction-Based Benchmark for Text Improvements

Conference on Computational Natural Language Learning (CoNLL), 2022

Patrick Lewis

199

27 Sep 2022

Deep Generative Multimedia Children's Literature

Matthew Lyle Olson

27 Sep 2022

Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts

255

26 Sep 2022

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Neural Information Processing Systems (NeurIPS), 2022

324

26 Sep 2022

Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Gabriel Simmons

371

24 Sep 2022

Variational Open-Domain Question Answering

International Conference on Machine Learning (ICML), 2022

Valentin Liévin

Andreas Geert Motzfeldt

Ida Riis Jensen

Ole Winther

OOD BDL

210

23 Sep 2022

A Case Report On The "A.I. Locked-In Problem": social concerns with modern NLP

Yoshija Walter

LLMAG

129

22 Sep 2022

WeLM: A Well-Read Pre-trained Language Model for Chinese

Xiao Zhou

269

21 Sep 2022

Generate rather than Retrieve: Large Language Models are Strong Context Generators

International Conference on Learning Representations (ICLR), 2022

1.2K

398

21 Sep 2022

Extremely Simple Activation Shaping for Out-of-Distribution Detection

International Conference on Learning Representations (ICLR), 2022

Andrija Djurisic

445

206

20 Sep 2022

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

329

15 Sep 2022

FP8 Formats for Deep Learning

Paulius Micikevicius

...

802

202

12 Sep 2022

Open-Domain Dialog Evaluation using Follow-Ups Likelihood

International Conference on Computational Linguistics (COLING), 2022

206

12 Sep 2022

Chain of Explanation: New Prompting Method to Generate Higher Quality Natural Language Explanation for Implicit Hate Speech

The Web Conference (WWW), 2022

210

11 Sep 2022

Analyzing Transformers in Embedding Space

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

332

124

06 Sep 2022