2009.03300
Measuring Massive Multitask Language Understanding
International Conference on Learning Representations (ICLR), 2021
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing "Measuring Massive Multitask Language Understanding"
50 / 4,478 papers shown
Title
CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task
Haosi Mo
Xinyu Ma
Xuebo Liu
Yang Li
Yu Li
Jie Liu
Min Zhang
ELM
114
0
0
29 Sep 2025
RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs
Nigel Fernandez
Branislav Kveton
Ryan Rossi
Andrew Lan
Zichao Wang
LRM
202
0
0
29 Sep 2025
LLM DNA: Tracing Model Evolution via Functional Representations
Zhaomin Wu
Haodong Zhao
Ziyang Wang
Jizhou Guo
Qian Wang
Bingsheng He
108
1
0
29 Sep 2025
VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models
Ravikumar Balakrishnan
Mansi Phute
LLMSV
159
1
0
29 Sep 2025
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution
Tianrui Qin
Qianben Chen
S. Wang
He Xing
King Zhu
...
G. Zhang
Jiaheng Liu
Yuchen Eleanor Jiang
Xitong Gao
Wangchunshu Zhou
LLMAG
LRM
159
4
0
29 Sep 2025
Vision Function Layer in Multimodal LLMs
Cheng Shi
Yizhou Yu
Sibei Yang
116
1
0
29 Sep 2025
Fingerprinting LLMs via Prompt Injection
Yuepeng Hu
Zhengyuan Jiang
Mengyuan Li
Osama Ahmed
Zhicong Huang
Cheng Hong
Neil Zhenqiang Gong
178
0
0
29 Sep 2025
Generalized Correctness Models: Learning Calibrated and Model-Agnostic Correctness Predictors from Historical Patterns
Hanqi Xiao
Vaidehi Patil
Hyunji Lee
Elias Stengel-Eskin
Mohit Bansal
164
1
0
29 Sep 2025
Query Circuits: Explaining How Language Models Answer User Prompts
Tung-Yu Wu
Fazl Barez
ReLM
LRM
137
0
0
29 Sep 2025
DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models
Zherui Li
Zheng Nie
Zhenhong Zhou
Yufei Guo
Yue Liu
Y. Zhang
Yu Cheng
Qingsong Wen
Kun Wang
Jiaheng Zhang
AAML
127
0
0
29 Sep 2025
Beyond Repetition: Text Simplification and Curriculum Learning for Data-Constrained Pretraining
M. R
Dan John Velasco
89
0
0
29 Sep 2025
Mechanisms of Matter: Language Inferential Benchmark on Physicochemical Hypothesis in Materials Synthesis
Yingming Pu
Tao Lin
Hongyu Chen
145
0
0
29 Sep 2025
Intra-request branch orchestration for efficient LLM reasoning
Weifan Jiang
Rana Shahout
Yilun Du
Michael Mitzenmacher
Minlan Yu
LRM
108
0
0
29 Sep 2025
LLaDA-MoE: A Sparse MoE Diffusion Language Model
Fengqi Zhu
Zebin You
Yipeng Xing
Zenan Huang
Lin Liu
...
Junbo Zhao
Da Zheng
Chongxuan Li
Jianguo Li
J. Wen
MoE
208
10
0
29 Sep 2025
Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models
Yuhui Wang
Changjiang Li
Guangke Chen
Jiacheng Liang
Ting Wang
ReLM
KELM
LRM
105
1
0
29 Sep 2025
SecInfer: Preventing Prompt Injection via Inference-time Scaling
Yupei Liu
Yanting Wang
Yuqi Jia
Jinyuan Jia
Neil Zhenqiang Gong
LRM
SILM
AAML
425
3
0
29 Sep 2025
A Hierarchical Error Framework for Reliable Automated Coding in Communication Research: Applications to Health and Political Communication
Zhilong Zhao
Yindi Liu
AILaw
177
1
0
29 Sep 2025
Expanding Computation Spaces of LLMs at Inference Time
Yoonna Jang
Kisu Yang
Isabelle Augenstein
LLMAG
ReLM
LRM
68
0
0
29 Sep 2025
Anchored Supervised Fine-Tuning
He Zhu
Junyou Su
Peng Lai
Ren Ma
Wenjia Zhang
L. Yang
Guanhua Chen
OffRL
168
0
0
28 Sep 2025
Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms
Jiahao Ying
Mingbao Lin
Qianru Sun
Yixin Cao
MoE
48
0
0
28 Sep 2025
Dynamic Orthogonal Continual Fine-tuning for Mitigating Catastrophic Forgettings
Zhixin Zhang
Zeming Wei
Meng Sun
CLL
132
0
0
28 Sep 2025
Toward Preference-aligned Large Language Models via Residual-based Model Steering
Lucio La Cava
Andrea Tagarelli
LLMSV
148
0
0
28 Sep 2025
The Impossibility of Inverse Permutation Learning in Transformer Models
Rohan Alur
Chris Hays
Manish Raghavan
Devavrat Shah
175
0
0
28 Sep 2025
ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference
Haojie Ouyang
Jianwei Lv
Lei Ren
Chen Wei
Xiaojie Wang
Fangxiang Feng
VLM
156
0
0
28 Sep 2025
Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions
Yoonah Park
Haesung Pyun
Yohan Jo
KELM
327
0
0
28 Sep 2025
Sequential Diffusion Language Models
Yangzhou Liu
Yue Cao
Hao-Wen Li
Gen Luo
Z. Chen
...
Yuqiang Li
Tong Lu
Yu Qiao
Jifeng Dai
Wenhai Wang
96
5
0
28 Sep 2025
Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE
Guancheng Wan
Lucheng Fu
Haoxin Liu
Yiqiao Jin
Hui Yi Leong
...
Yunpu Ma
Xiangru Tang
B. A. Prakash
Yizhou Sun
Wei Wang
KELM
135
0
0
28 Sep 2025
Do LLMs Understand Romanian Driving Laws? A Study on Multimodal and Fine-Tuned Question Answering
Eduard Barbu
Adrian Marius Dumitran
44
0
0
28 Sep 2025
Singleton-Optimized Conformal Prediction
Tao Wang
Yan Sun
Edgar Dobriban
108
0
0
28 Sep 2025
Explore-Execute Chain: Towards an Efficient Structured Reasoning Paradigm
Kaisen Yang
Lixuan He
Rushi Shah
Kaicheng Yang
Qinwei Ma
Dianbo Liu
Alex Lamb
OffRL
LRM
154
0
0
28 Sep 2025
Knowledge Homophily in Large Language Models
Utkarsh Sahu
Zhisheng Qi
M. Halappanavar
Nedim Lipka
Ryan Rossi
Franck Dernoncourt
Yu Zhang
Yao Ma
Yu Wang
81
0
0
28 Sep 2025
Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs
Chenxing Wei
Hong Wang
Ying He
Fei Richard Yu
Yao Shu
96
1
0
27 Sep 2025
Dual-Space Smoothness for Robust and Balanced LLM Unlearning
Han Yan
Zheyuan Liu
Meng Jiang
MU
AAML
108
0
0
27 Sep 2025
Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models
Morgan McCarty
Jorge Morales
LRM
97
0
0
27 Sep 2025
SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts
Bingshuai Liu
Ante Wang
Zijun Min
Liang Yao
Haibo Zhang
Yang Liu
Anxiang Zeng
Jinsong Su
OffRL
LRM
87
5
0
27 Sep 2025
Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
Vage Egiazarian
Roberto L. Castro
Denis Kuznedelev
Andrei Panferov
Eldar Kurtic
...
Alexandre Marques
Mark Kurtz
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
216
1
0
27 Sep 2025
DOoM: Difficult Olympiads of Math
Ilya Kuleshov
Ilin Pavel
Nikolay Kompanets
Ksenia Sycheva
Aleksandr Nikolich
AIMat
250
0
0
27 Sep 2025
Multiplayer Nash Preference Optimization
Fang Wu
X. Y. Huang
Weihao Xuan
Zhiwei Zhang
Yijia Xiao
...
Xiaomin Li
Bing Hu
Peng Xia
Jure Leskovec
Yejin Choi
124
1
0
27 Sep 2025
Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores
Ashwin Ramaswamy
Nestor Demeure
Ermal Rrapaj
ALM
ELM
104
0
0
27 Sep 2025
A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models
Wonje Jeung
Sangyeon Yoon
Yoonjun Cho
Dongjae Jeon
Sangwoo Shin
Hyesoo Hong
Albert No
DiffM
137
0
0
27 Sep 2025
Mapping Overlaps in Benchmarks through Perplexity in the Wild
Siyang Wu
Honglin Bao
Sida Li
Ari Holtzman
James A. Evans
287
0
0
27 Sep 2025
Scaling LLM Test-Time Compute with Mobile NPU on Smartphones
Zixu Hao
Jianyu Wei
Tuowei Wang
Minxing Huang
Huiqiang Jiang
Shiqi Jiang
Ting Cao
Ju Ren
246
1
0
27 Sep 2025
Train Once, Answer All: Many Pretraining Experiments for the Cost of One
Sebastian Bordt
Martin Pawelczyk
CLL
168
1
0
27 Sep 2025
Memory-Efficient Fine-Tuning via Low-Rank Activation Compression
Jiang-Xin Shi
Wen-Da Wei
Jin-Fei Qi
Xuanyu Chen
Tong Wei
Yu-Feng Li
120
0
0
27 Sep 2025
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
Tianao Zhang
Zhiteng Li
Xianglong Yan
Haotong Qin
Yong Guo
Yulun Zhang
MQ
113
0
0
27 Sep 2025
Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data
Syeda Nahida Akter
Shrimai Prabhumoye
Eric Nyberg
M. Patwary
Mohammad Shoeybi
Yejin Choi
Bryan Catanzaro
AIFin
LRM
AI4CE
116
4
0
26 Sep 2025
Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning
Nakyeong Yang
Dong-Kyum Kim
Jea Kwon
Minsung Kim
Kyomin Jung
M. Cha
MU
KELM
104
0
0
26 Sep 2025
What Matters More For In-Context Learning under Matched Compute Budgets: Pretraining on Natural Text or Incorporating Targeted Synthetic Examples?
Mohammed Sabry
Anya Belz
91
0
0
26 Sep 2025
AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts
Jiří Milička
Anna Marklová
Václav Cvrček
DeLMO
168
0
0
26 Sep 2025
SBFA: Single Sneaky Bit Flip Attack to Break Large Language Models
Jingkai Guo
C. Chakrabarti
Deliang Fan
AAML
44
3
0
26 Sep 2025