Measuring Massive Multitask Language Understanding

International Conference on Learning Representations (ICLR), 2021

7 September 2020

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 4,486 papers shown

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

...

Aleksandra Krasnodębska

264

29 Sep 2025

MAS$^2$: Self-Generative, Self-Configuring, Self-Rectifying Multi-Agent Systems

132

29 Sep 2025

Query Circuits: Explaining How Language Models Answer User Prompts

Tung-Yu Wu

Fazl Barez

ReLM LRM

163

29 Sep 2025

LLM DNA: Tracing Model Evolution via Functional Representations

137

29 Sep 2025

Mechanisms of Matter: Language Inferential Benchmark on Physicochemical Hypothesis in Materials Synthesis

Yingming Pu

Tao Lin

Hongyu Chen

157

29 Sep 2025

Beyond Repetition: Text Simplification and Curriculum Learning for Data-Constrained Pretraining

M. R

Dan John Velasco

121

29 Sep 2025

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

154

29 Sep 2025

RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs

221

29 Sep 2025

SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models

148

29 Sep 2025

Vision Function Layer in Multimodal LLMs

Cheng Shi

Yizhou Yu

Sibei Yang

138

29 Sep 2025

MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes

...

Raghuraman Krishnamoorthi

Yangyang Shi

Vikas Chandra

ReLM LRM

222

29 Sep 2025

A Hierarchical Error Framework for Reliable Automated Coding in Communication Research: Applications to Health and Political Communication

Zhilong Zhao

Yindi Liu

AILaw

209

29 Sep 2025

LLaDA-MoE: A Sparse MoE Diffusion Language Model

...

263

29 Sep 2025

Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models

152

29 Sep 2025

Alternatives To Next Token Prediction In Text Generation - A Survey

Charlie Wyatt

Aditya Joshi

Flora D. Salim

121

29 Sep 2025

SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching

Xinye Zhao

Spyridon Mastorakis

150

29 Sep 2025

UniAPL: A Unified Adversarial Preference Learning Framework for Instruct-Following

128

29 Sep 2025

Intra-request branch orchestration for efficient LLM reasoning

124

29 Sep 2025

Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution

...

173

29 Sep 2025

Anchored Supervised Fine-Tuning

199

28 Sep 2025

The Impossibility of Inverse Permutation Learning in Transformer Models

199

28 Sep 2025

Singleton-Optimized Conformal Prediction

Tao Wang

Yan Sun

Edgar Dobriban

144

28 Sep 2025

Toward Preference-aligned Large Language Models via Residual-based Model Steering

Lucio La Cava

Andrea Tagarelli

LLMSV

163

28 Sep 2025

Dynamic Orthogonal Continual Fine-tuning for Mitigating Catastrophic Forgettings

152

28 Sep 2025

Do LLMs Understand Romanian Driving Laws? A Study on Multimodal and Fine-Tuned Question Answering

Eduard Barbu

Adrian Marius Dumitran

28 Sep 2025

Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE

...

169

28 Sep 2025

ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference

175

28 Sep 2025

Knowledge Homophily in Large Language Models

125

28 Sep 2025

Sequential Diffusion Language Models

...

122

28 Sep 2025

Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms

28 Sep 2025

Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions

375

28 Sep 2025

Explore-Execute Chain: Towards an Efficient Structured Reasoning Paradigm

174

28 Sep 2025

Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs

111

27 Sep 2025

Train Once, Answer All: Many Pretraining Experiments for the Cost of One

Sebastian Bordt

Martin Pawelczyk

CLL

186

27 Sep 2025

Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models

Morgan McCarty

Jorge Morales

LRM

105

27 Sep 2025

SPEC-RL: Accelerating On-Policy Reinforcement Learning with Speculative Rollouts

Anxiang Zeng

Jinsong Su

OffRL LRM

206

27 Sep 2025

DOoM: Difficult Olympiads of Math

258

27 Sep 2025

A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models

154

27 Sep 2025

Scaling LLM Test-Time Compute with Mobile NPU on Smartphones

273

27 Sep 2025

Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

...

243

27 Sep 2025

Multiplayer Nash Preference Optimization

...

145

27 Sep 2025

Mapping Overlaps in Benchmarks through Perplexity in the Wild

307

27 Sep 2025

Dual-Space Smoothness for Robust and Balanced LLM Unlearning

121

27 Sep 2025

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models

143

27 Sep 2025

SysMoBench: Evaluating AI on Formally Modeling Complex Real-World Systems

146

27 Sep 2025

Memory-Efficient Fine-Tuning via Low-Rank Activation Compression

132

27 Sep 2025

Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores

118

27 Sep 2025

Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data

120

26 Sep 2025

What Matters More For In-Context Learning under Matched Compute Budgets: Pretraining on Natural Text or Incorporating Targeted Synthetic Examples?

Mohammed Sabry

Anya Belz

104

26 Sep 2025

Fine-tuning Done Right in Model Editing

196

26 Sep 2025