Measuring Massive Multitask Language Understanding

International Conference on Learning Representations (ICLR), 2020

7 September 2020

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 4,481 papers shown

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

256

07 Jun 2023

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Neural Information Processing Systems (NeurIPS), 2023

...

352

469

07 Jun 2023

PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts

Hao Chen

...

Yue Zhang

430

211

07 Jun 2023

Benchmarking Foundation Models with Language-Model-as-an-Examiner

Neural Information Processing Systems (NeurIPS), 2023

Yuze He

...

Yijia Xiao

Haozhe Lyu

Jiayin Zhang

Juanzi Li

Lei Hou

ALM ELM

269

199

07 Jun 2023

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Neural Information Processing Systems (NeurIPS), 2023

276

06 Jun 2023

Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models

Jose Berengueres

Marybeth Sandell

181

06 Jun 2023

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

Neural Information Processing Systems (NeurIPS), 2023

746

833

06 Jun 2023

Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset

Neural Information Processing Systems (NeurIPS), 2023

...

Helin Wang

Lei Zhu

493

115

05 Jun 2023

MultiLegalPile: A 689GB Multilingual Legal Corpus

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

422

03 Jun 2023

Reimagining Retrieval Augmented Language Models for Answering Queries

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

305

01 Jun 2023

A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Md Tahmid Rahman Laskar

M Saiful Bari

Mizanur Rahman

Md Amran Hossen Bhuiyan

Shafiq Joty

J. Huang

LM&MA ELM ALM

500

212

29 May 2023

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yashar Mehdad

Raghuraman Krishnamoorthi

Vikas Chandra

263

294

29 May 2023

Conformal Prediction with Large Language Models for Multi-Choice Question Answering

442

101

28 May 2023

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Neural Information Processing Systems (NeurIPS), 2023

518

210

27 May 2023

Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Zhiyuan Liu

289

27 May 2023

Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

265

27 May 2023

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

212

125

26 May 2023

Training Socially Aligned Language Models on Simulated Social Interactions

International Conference on Learning Representations (ICLR), 2023

Ruibo Liu

Diyi Yang

285

26 May 2023

Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

International Conference on Learning Representations (ICLR), 2023

Niels Mündler

Jingxuan He

Slobodan Jenko

Martin Vechev

HILM

308

156

25 May 2023

The False Promise of Imitating Proprietary LLMs

Pieter Abbeel

331

250

25 May 2023

On Degrees of Freedom in Defining and Testing Natural Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Saku Sugawara

S. Tsugita

ELM

323

24 May 2023

C-STS: Conditional Semantic Textual Similarity

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

182

24 May 2023

Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

412

24 May 2023

The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

152

24 May 2023

How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Qinyuan Ye

Harvey Yiyun Fu

Xiang Ren

Robin Jia

ELM

270

24 May 2023

In-Context Impersonation Reveals Large Language Models' Strengths and Biases

Neural Information Processing Systems (NeurIPS), 2023

305

181

24 May 2023

Estimating Large Language Model Capabilities without Labeled Test Data

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Harvey Yiyun Fu

Qinyuan Ye

Albert Xu

Xiang Ren

Robin Jia

268

24 May 2023

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models

International Conference on Learning Representations (ICLR), 2023

...

442

24 May 2023

Emergent inabilities? Inverse scaling over the course of pretraining

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

J. Michaelov

Benjamin Bergen

LRM ReLM

183

24 May 2023

Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Oyvind Tafjord

191

24 May 2023

Sources of Hallucination by Large Language Models on Inference Tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Nick McKenna

Tianyi Li

Liang Cheng

Mohammad Javad Hosseini

Mark Johnson

Mark Steedman

LRM HILM

293

243

23 May 2023

RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning

Alexander Scarlatos

Andrew Lan

OffRL LRM

260

23 May 2023

Improving Factuality and Reasoning in Language Models through Multiagent Debate

International Conference on Machine Learning (ICML), 2023

Yilun Du

Shuang Li

Antonio Torralba

J. Tenenbaum

Igor Mordatch

LLMAG LRM

351

1,182

23 May 2023

QLoRA: Efficient Finetuning of Quantized LLMs

Neural Information Processing Systems (NeurIPS), 2023

Tim Dettmers

Artidoro Pagnoni

Ari Holtzman

Luke Zettlemoyer

ALM

616

3,684

23 May 2023

Query Rewriting for Retrieval-Augmented Large Language Models

236

192

23 May 2023

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Zhiyuan Liu

Maosong Sun

Bowen Zhou

ALM

365

747

23 May 2023

Skill-Based Few-Shot Selection for In-Context Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

398

23 May 2023

Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization

Neural Information Processing Systems (NeurIPS), 2023

374

131

23 May 2023

Can Large Language Models Capture Dissenting Human Voices?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

319

23 May 2023

Aligning Large Language Models through Synthetic Feedback

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

273

23 May 2023

Exploring Self-supervised Logic-enhanced Training for Large Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Nancy F. Chen

245

23 May 2023

Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

398

23 May 2023

Polyglot or Not? Measuring Multilingual Encyclopedic Knowledge in Foundation Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

283

23 May 2023

CLASS: A Design Framework for building Intelligent Tutoring Systems based on Learning Science principles

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Shashank Sonkar

Lucy Liu

D. B. Mallick

Richard G. Baraniuk

309

22 May 2023

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

International Conference on Learning Representations (ICLR), 2023

468

144

22 May 2023

Should We Attend More or Less? Modulating Attention for Fairness

264

22 May 2023

RWKV: Reinventing RNNs for the Transformer Era

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

...

Rui-Jie Zhu

579

856

22 May 2023

Iterative Forward Tuning Boosts In-Context Learning in Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Jiaxi Yang

Binyuan Hui

Min Yang

Bailin Wang

Bowen Li

Binhua Li

Fei Huang

Yongbin Li

267

22 May 2023

ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Baotian Hu

178

22 May 2023

Meta-in-context learning in large language models

Neural Information Processing Systems (NeurIPS), 2023

429

22 May 2023