2009.03300
Measuring Massive Multitask Language Understanding
International Conference on Learning Representations (ICLR), 2020
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 4,502 papers shown
nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
Yiqun Yao
Siqi Fan
Xiusheng Huang
Xuezhi Fang
Xiang Li
...
Peng Han
Shuo Shang
Kang Liu
Aixin Sun
Yequan Wang
240
8
0
14 Apr 2023
Learning Personalized Decision Support Policies
AAAI Conference on Artificial Intelligence (AAAI), 2023
Umang Bhatt
Valerie Chen
Katherine M. Collins
Parameswaran Kamalaruban
Emma Kallina
Adrian Weller
Ameet Talwalkar
OffRL
576
12
0
13 Apr 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
Wanjun Zhong
Ruixiang Cui
Yiduo Guo
Yaobo Liang
Shuai Lu
Yanlin Wang
Amin Saied
Weizhu Chen
Nan Duan
ALM
ELM
385
765
0
13 Apr 2023
Can Large Language Models Transform Computational Social Science?
Computational Linguistics (CL), 2023
Caleb Ziems
William B. Held
Omar Shaikh
Jiaao Chen
Zhehao Zhang
Diyi Yang
LLMAG
518
453
0
12 Apr 2023
Boosted Prompt Ensembles for Large Language Models
Silviu Pitis
Michael Ruogu Zhang
Andrew Wang
Jimmy Ba
LRM
LLMAG
237
55
0
12 Apr 2023
LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models
Patrik Puchert
Poonam Poonam
Christian van Onzenoodt
Timo Ropinski
187
12
0
02 Apr 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
715
1,195
0
30 Mar 2023
Whose Opinions Do Language Models Reflect?
International Conference on Machine Learning (ICML), 2023
Shibani Santurkar
Esin Durmus
Faisal Ladhak
Cinoo Lee
Abigail Z. Jacobs
Tatsunori Hashimoto
383
688
0
30 Mar 2023
Natural Language Reasoning, A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Fei Yu
Hongbo Zhang
Prayag Tiwari
Benyou Wang
ReLM
LRM
359
101
0
26 Mar 2023
kNN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference
International Conference on Learning Representations (ICLR), 2023
Benfeng Xu
Quan Wang
Zhendong Mao
Yajuan Lyu
Qiaoqiao She
Yongdong Zhang
318
65
0
24 Mar 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
International Conference on Machine Learning (ICML), 2023
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
467
8
0
21 Mar 2023
Language Model Behavior: A Comprehensive Survey
Computational Linguistics (CL), 2023
Tyler A. Chang
Benjamin Bergen
VLM
LRM
LM&MA
410
148
0
20 Mar 2023
eP-ALM: Efficient Perceptual Augmentation of Language Models
IEEE International Conference on Computer Vision (ICCV), 2023
Mustafa Shukor
Corentin Dancette
Matthieu Cord
MLLM
VLM
432
34
0
20 Mar 2023
Capabilities of GPT-4 on Medical Challenge Problems
Harsha Nori
Nicholas King
S. McKinney
Dean Carignan
Eric Horvitz
LM&MA
ELM
AI4MH
504
1,120
0
20 Mar 2023
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Computational Linguistics (CL), 2023
Renze Lou
Kai Zhang
Wenpeng Yin
ALM
LRM
877
42
0
18 Mar 2023
Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?
Jaromír Šavelka
Arav Agarwal
Chris Bogart
Yifan Song
M. Sakr
ELM
190
121
0
16 Mar 2023
ART: Automatic multi-step reasoning and tool-use for large language models
Bhargavi Paranjape
Scott M. Lundberg
Sameer Singh
Hannaneh Hajishirzi
Luke Zettlemoyer
Marco Tulio Ribeiro
KELM
ReLM
LRM
327
198
0
16 Mar 2023
The Learnability of In-Context Learning
Neural Information Processing Systems (NeurIPS), 2023
Noam Wies
Yoav Levine
Amnon Shashua
348
169
0
14 Mar 2023
Generating multiple-choice questions for medical question answering with distractors and cue-masking
International Conference on Language Resources and Evaluation (LREC), 2023
Damien Sileo
Kanimozhi Uma
Marie-Francine Moens
238
5
0
13 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
16.6K
18,610
0
27 Feb 2023
Testing AI on language comprehension tasks reveals insensitivity to underlying meaning
Scientific Reports (Sci Rep), 2023
Vittoria Dentella
Fritz Guenther
Elliot Murphy
G. Marcus
Evelina Leivada
ELM
453
61
0
23 Feb 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull
P. Bellot
Emmanuel Bruno
Vincent Martin
Elisabeth Murisasco
ELM
769
17
0
17 Feb 2023
Augmented Language Models: a Survey
Grégoire Mialon
Roberto Dessì
Maria Lomeli
Christoforos Nalmpantis
Ramakanth Pasunuru
...
Jane Dwivedi-Yu
Asli Celikyilmaz
Edouard Grave
Yann LeCun
Thomas Scialom
LRM
KELM
317
509
0
15 Feb 2023
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark
International Conference on Learning Representations (ICLR), 2023
D. Ribeiro
Shen Wang
Xiaofei Ma
He Zhu
Rui Dong
...
William Yang Wang
Zhiheng Huang
George Karypis
Bing Xiang
Dan Roth
LRM
ReLM
180
26
0
13 Feb 2023
Can GPT-3 Perform Statutory Reasoning?
International Conference on Artificial Intelligence and Law (ICAIL), 2023
Andrew Blair-Stanek
Nils Holzenberger
Benjamin Van Durme
ELM
LRM
344
125
0
13 Feb 2023
Mathematical Capabilities of ChatGPT
Neural Information Processing Systems (NeurIPS), 2023
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELM
AI4MH
546
542
0
31 Jan 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
International Conference on Machine Learning (ICML), 2023
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
464
875
0
31 Jan 2023
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Joel Niklaus
Veton Matoshi
Pooja Rani
Andrea Galassi
Matthias Stürmer
Ilias Chalkidis
ELM
AILaw
389
82
0
30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM
VLM
KELM
747
886
0
30 Jan 2023
ThoughtSource: A central hub for large language model reasoning data
Scientific Data (Sci Data), 2023
Simon Ott
Konstantin Hebenstreit
Valentin Liévin
C. Hother
M. Moradi
Maximilian Mayrhauser
Robert Praas
Ole Winther
Matthias Samwald
ReLM
LRM
576
62
0
27 Jan 2023
MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Steven H. Wang
Antoine Scardigli
Leonard Tang
Wei Chen
D.M. Levkin
Anya Chen
Spencer Ball
Thomas Woodside
Oliver Zhang
Dan Hendrycks
AILaw
ELM
227
42
0
02 Jan 2023
Inconsistencies in Masked Language Models
Tom Young
Yunan Chen
Yang You
311
2
0
30 Dec 2022
Large Language Models Encode Clinical Knowledge
Nature (Nature), 2022
K. Singhal
Shekoofeh Azizi
T. Tu
S. S. Mahdavi
Jason W. Wei
...
A. Rajkomar
Joelle Barral
Christopher Semturs
Alan Karthikesalingam
Vivek Natarajan
LM&MA
ELM
AI4MH
632
3,659
0
26 Dec 2022
Quality at the Tail of Machine Learning Inference
Zhengxin Yang
Wanling Gao
Chunjie Luo
Lei Wang
Fei Tang
Xu Wen
Jianfeng Zhan
198
1
0
25 Dec 2022
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
Srinivasan Iyer
Xi Lin
Ramakanth Pasunuru
Todor Mihaylov
Daniel Simig
...
Jeff Wang
Christopher Dewan
Asli Celikyilmaz
Luke Zettlemoyer
Veselin Stoyanov
ALM
508
306
0
22 Dec 2022
ORCA: A Challenging Benchmark for Arabic Language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
AbdelRahim Elmadany
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
308
61
0
21 Dec 2022
A Survey of Deep Learning for Mathematical Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Pan Lu
Liang Qiu
Wenhao Yu
Sean Welleck
Kai-Wei Chang
ReLM
LRM
318
188
0
20 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Abigail Z. Jacobs
LM&MA
ALM
359
122
0
19 Dec 2022
ALERT: Adapting Language Models to Reasoning Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ping Yu
Tianlu Wang
O. Yu. Golovneva
Badr AlKhamissi
Siddharth Verma
Zhijing Jin
Gargi Ghosh
Mona T. Diab
Asli Celikyilmaz
ReLM
LRM
286
20
0
16 Dec 2022
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Omar Shaikh
Hongxin Zhang
William B. Held
Michael S. Bernstein
Diyi Yang
ReLM
LRM
499
249
0
15 Dec 2022
Automaton-Based Representations of Task Knowledge from Generative Language Models
Yunhao Yang
Jean-Raphael Gaglione
Cyrus Neary
Ufuk Topcu
478
14
0
04 Dec 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
International Conference on Machine Learning (ICML), 2022
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
874
1,339
0
18 Nov 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
425
960
0
16 Nov 2022
Calibrated Interpretation: Confidence Estimation in Semantic Parsing
Transactions of the Association for Computational Linguistics (TACL), 2022
Elias Stengel-Eskin
Benjamin Van Durme
UQLM
461
36
0
14 Nov 2022
Measuring Progress on Scalable Oversight for Large Language Models
Sam Bowman
Jeeyoon Hyun
Ethan Perez
Edwin Chen
Craig Pettit
...
Tristan Hume
Yuntao Bai
Zac Hatfield-Dodds
Benjamin Mann
Jared Kaplan
ALM
ELM
353
182
0
04 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Avia Efrat
Or Honovich
Omer Levy
264
31
0
03 Nov 2022
RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alireza Mohammadshahi
Thomas Scialom
Majid Yazdani
Pouya Yanki
Angela Fan
James Henderson
Marzieh Saeidi
275
24
0
02 Nov 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
International Conference on Learning Representations (ICLR), 2022
Xiaoman Pan
Wenlin Yao
Hongming Zhang
Dian Yu
Dong Yu
Jianshu Chen
KELM
600
28
0
28 Oct 2022
Leveraging Large Language Models for Multiple Choice Question Answering
International Conference on Learning Representations (ICLR), 2022
Joshua Robinson
Christopher Rytting
David Wingate
ELM
439
251
0
22 Oct 2022
Scaling Instruction-Finetuned Language Models
Journal of Machine Learning Research (JMLR), 2022
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
1.7K
3,929
0
20 Oct 2022