Measuring Massive Multitask Language Understanding

International Conference on Learning Representations (ICLR), 2020

7 September 2020

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 4,502 papers shown

nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales

Xuezhi Fang

...

Kang Liu

240

14 Apr 2023

Learning Personalized Decision Support Policies

AAAI Conference on Artificial Intelligence (AAAI), 2023

Umang Bhatt

Valerie Chen

Katherine M. Collins

Parameswaran Kamalaruban

576

13 Apr 2023

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

385

765

13 Apr 2023

Can Large Language Models Transform Computational Social Science?

International Conference on Computational Logic (ICCL), 2023

Jiaao Chen

Diyi Yang

518

453

12 Apr 2023

Boosted Prompt Ensembles for Large Language Models

Silviu Pitis

Michael Ruogu Zhang

Andrew Wang

Jimmy Ba

LRM LLMAG

237

12 Apr 2023

LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models

Patrik Puchert

Poonam Poonam

Christian van Onzenoodt

Timo Ropinski

187

02 Apr 2023

BloombergGPT: A Large Language Model for Finance

715

1,195

30 Mar 2023

Whose Opinions Do Language Models Reflect?

International Conference on Machine Learning (ICML), 2023

Esin Durmus

Tatsunori Hashimoto

383

688

30 Mar 2023

Natural Language Reasoning, A Survey

ACM Computing Surveys (ACM Comput. Surv.), 2023

Hongbo Zhang

359

101

26 Mar 2023

$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference

International Conference on Learning Representations (ICLR), 2023

Benfeng Xu

318

24 Mar 2023

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

International Conference on Machine Learning (ICML), 2023

467

21 Mar 2023

Language Model Behavior: A Comprehensive Survey

International Conference on Computational Logic (ICCL), 2023

Tyler A. Chang

Benjamin Bergen

VLM LRM LM&MA

410

148

20 Mar 2023

eP-ALM: Efficient Perceptual Augmentation of Language Models

IEEE International Conference on Computer Vision (ICCV), 2023

432

20 Mar 2023

Capabilities of GPT-4 on Medical Challenge Problems

504

1,120

20 Mar 2023

Large Language Model Instruction Following: A Survey of Progresses and Challenges

Computational Linguistics (CL), 2023

877

18 Mar 2023

Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?

190

121

16 Mar 2023

ART: Automatic multi-step reasoning and tool-use for large language models

Luke Zettlemoyer

327

198

16 Mar 2023

The Learnability of In-Context Learning

Neural Information Processing Systems (NeurIPS), 2023

Noam Wies

Yoav Levine

Amnon Shashua

348

169

14 Mar 2023

Generating multiple-choice questions for medical question answering with distractors and cue-masking

International Conference on Language Resources and Evaluation (LREC), 2023

Damien Sileo

Kanimozhi Uma

Marie-Francine Moens

238

13 Mar 2023

LLaMA: Open and Efficient Foundation Language Models

...

16.6K

18,610

27 Feb 2023

Testing AI on language comprehension tasks reveals insensitivity to underlying meaning

Scientific Reports (Sci Rep), 2023

Elliot Murphy

453

23 Feb 2023

Complex QA and language models hybrid architectures, Survey

769

17 Feb 2023

Augmented Language Models: a Survey

Grégoire Mialon

Roberto Dessì

Maria Lomeli

Christoforos Nalmpantis

...

317

509

15 Feb 2023

STREET: A Multi-Task Structured Reasoning and Explanation Benchmark

International Conference on Learning Representations (ICLR), 2023

...

George Karypis

180

13 Feb 2023

Can GPT-3 Perform Statutory Reasoning?

International Conference on Artificial Intelligence and Law (ICAIL), 2023

344

125

13 Feb 2023

Mathematical Capabilities of ChatGPT

Neural Information Processing Systems (NeurIPS), 2023

546

542

31 Jan 2023

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

International Conference on Machine Learning (ICML), 2023

...

464

875

31 Jan 2023

LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

389

30 Jan 2023

REPLUG: Retrieval-Augmented Black-Box Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Weijia Shi

Luke Zettlemoyer

747

886

30 Jan 2023

ThoughtSource: A central hub for large language model reasoning data

Scientific Data (Sci Data), 2023

Simon Ott

Konstantin Hebenstreit

Valentin Liévin

C. Hother

M. Moradi

Maximilian Mayrhauser

576

27 Jan 2023

MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

227

02 Jan 2023

Inconsistencies in Masked Language Models

Tom Young

Yunan Chen

Yang You

311

30 Dec 2022

Large Language Models Encode Clinical Knowledge

Nature (Nature), 2022

...

Alan Karthikesalingam

Vivek Natarajan

LM&MA ELM AI4MH

632

3,659

26 Dec 2022

Quality at the Tail of Machine Learning Inference

198

25 Dec 2022

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

...

Luke Zettlemoyer

508

306

22 Dec 2022

ORCA: A Challenging Benchmark for Arabic Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

AbdelRahim Elmadany

El Moatez Billah Nagoudi

Muhammad Abdul-Mageed

ELM

308

21 Dec 2022

A Survey of Deep Learning for Mathematical Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Wenhao Yu

318

188

20 Dec 2022

Evaluating Human-Language Model Interaction

Esin Durmus

...

359

122

19 Dec 2022

ALERT: Adapting Language Models to Reasoning Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

286

16 Dec 2022

On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Diyi Yang

499

249

15 Dec 2022

Automaton-Based Representations of Task Knowledge from Generative Language Models

Yunhao Yang

Jean-Raphael Gaglione

Cyrus Neary

Ufuk Topcu

478

04 Dec 2022

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

International Conference on Machine Learning (ICML), 2022

Song Han

874

1,339

18 Nov 2022

Galactica: A Large Language Model for Science

425

960

16 Nov 2022

Calibrated Interpretation: Confidence Estimation in Semantic Parsing

Transactions of the Association for Computational Linguistics (TACL), 2022

Elias Stengel-Eskin

Benjamin Van Durme

UQLM

461

14 Nov 2022

Measuring Progress on Scalable Oversight for Large Language Models

...

353

182

04 Nov 2022

LMentry: A Language Model Benchmark of Elementary Language Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Avia Efrat

Or Honovich

Omer Levy

264

03 Nov 2022

RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Alireza Mohammadshahi

Angela Fan

275

02 Nov 2022

Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models

International Conference on Learning Representations (ICLR), 2022

Wenlin Yao

Dian Yu

600

28 Oct 2022

Leveraging Large Language Models for Multiple Choice Question Answering

International Conference on Learning Representations (ICLR), 2022

439

251

22 Oct 2022

Scaling Instruction-Finetuned Language Models

Journal of Machine Learning Research (JMLR), 2022

...

1.7K

3,929

20 Oct 2022