Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.03300
Cited By
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 769 papers shown
Title
Quantifying the Capability Boundary of DeepSeek Models: An Application-Driven Performance Analysis
Kaikai Zhao
Zhaoxiang Liu
Xuejiao Lei
Ning Wang
Zhenhong Long
...
Minjie Hua
Kai Wang
W. Liu
Kai Wang
Shiguo Lian
ELM
LRM
52
1
0
16 Feb 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
58
0
0
16 Feb 2025
Leveraging Uncertainty Estimation for Efficient LLM Routing
Tuo Zhang
Asal Mehradfar
Dimitrios Dimitriadis
Salman Avestimehr
51
1
0
16 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
56
1
0
16 Feb 2025
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
110
14
0
14 Feb 2025
Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages
Shreyan Biswas
Alexander Erlei
U. Gadiraju
105
4
0
13 Feb 2025
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Xu Huang
Wenhao Zhu
Hanxu Hu
Conghui He
Lei Li
Shujian Huang
Fei Yuan
ELM
51
3
0
11 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Z. Wang
Muneeza Azmart
Ang Li
R. Horesh
Mikhail Yurochkin
112
1
0
11 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
61
1
0
10 Feb 2025
Unbiased Evaluation of Large Language Models from a Causal Perspective
Meilin Chen
Jian Tian
Liang Ma
Di Xie
Weijie Chen
Jiang Zhu
ALM
ELM
52
0
0
10 Feb 2025
Reinforced Lifelong Editing for Language Models
Zherui Li
Houcheng Jiang
Hao Chen
Baolong Bi
Z. Zhou
Fei Sun
Junfeng Fang
X. Wang
KELM
51
5
0
09 Feb 2025
Evaluating Vision-Language Models for Emotion Recognition
Sree Bhattacharyya
James Z. Wang
VLM
57
0
0
08 Feb 2025
SEER: Self-Explainability Enhancement of Large Language Models' Representations
Guanxu Chen
Dongrui Liu
Tao Luo
Jing Shao
LRM
MILM
65
1
0
07 Feb 2025
The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs
Bryan Guan
Tanya Roosta
Peyman Passban
Mehdi Rezagholizadeh
97
0
0
06 Feb 2025
Decoder-Only LLMs are Better Controllers for Diffusion Models
Ziyi Dong
Yao Xiao
Pengxu Wei
Liang Lin
DiffM
86
0
0
06 Feb 2025
\Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents
Ilia Karmanov
A. Deshmukh
Lukas Voegtle
Philipp Fischer
Kateryna Chumachenko
...
Jarno Seppänen
Jupinder Parmar
Joseph Jennings
Andrew Tao
Karan Sapra
73
0
0
06 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
121
9
0
05 Feb 2025
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang
Michelle Phan
Daniel E. Ho
Sanmi Koyejo
49
2
0
04 Feb 2025
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Shuangyi Chen
Yuanxin Guo
Yue Ju
Harik Dalal
Ashish Khisti
48
1
0
03 Feb 2025
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Zora Che
Stephen Casper
Robert Kirk
Anirudh Satheesh
Stewart Slocum
...
Zikui Cai
Bilal Chughtai
Y. Gal
Furong Huang
Dylan Hadfield-Menell
MU
AAML
ELM
85
3
0
03 Feb 2025
Advanced Weakly-Supervised Formula Exploration for Neuro-Symbolic Mathematical Reasoning
Yuxuan Wu
Hideki Nakayama
NAI
53
0
0
02 Feb 2025
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Wenzhe Li
Yong Lin
Mengzhou Xia
Chi Jin
MoE
91
2
0
02 Feb 2025
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
Xin Xu
Qiyun Xu
Tong Xiao
Tianhao Chen
Yuchen Yan
Jiaxin Zhang
Shizhe Diao
Can Yang
Yang Wang
ELM
LRM
AI4CE
102
2
0
01 Feb 2025
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Antoine Simoulin
Namyong Park
Xiaoyi Liu
Grey Yang
110
0
0
31 Jan 2025
DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance
Seffi Cohen
Niv Goldshlager
Nurit Cohen-Inger
Bracha Shapira
L. Rokach
43
0
0
29 Jan 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
78
5
0
28 Jan 2025
Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation
Zijie Zhong
Hanwen Liu
Xiaoya Cui
Xiaofan Zhang
Zengchang Qin
82
6
0
28 Jan 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
J. Liu
Tao Zhang
Tao Zhang
S. Chen
...
Jianhua Xu
Haoze Sun
Mingan Lin
Zenan Zhou
Weipeng Chen
AuLLM
72
10
0
28 Jan 2025
GUIDE: A Global Unified Inference Engine for Deploying Large Language Models in Heterogeneous Environments
Yanyu Chen
Ganhong Huang
103
0
0
28 Jan 2025
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
LRM
56
0
0
28 Jan 2025
StringLLM: Understanding the String Processing Capability of Large Language Models
Xilong Wang
Hao Fu
Jindong Wang
Neil Zhenqiang Gong
49
0
0
28 Jan 2025
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
Ran Xu
Hui Liu
Sreyashi Nag
Zhenwei Dai
Yaochen Xie
...
Chen Luo
Yang Li
Joyce C. Ho
Carl Yang
Qi He
RALM
68
8
0
28 Jan 2025
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Siyuan Lu
Yaliang Li
Ji-Rong Wen
LRM
56
13
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Erik Cambria
LM&MA
AILaw
93
153
0
28 Jan 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
H. Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
110
6
0
28 Jan 2025
AdaCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Chain-of-Thought
Xin Huang
Tarun K. Vangani
Zhengyuan Liu
Bowei Zou
A. Aw
LRM
AI4CE
55
2
0
27 Jan 2025
Option-ID Based Elimination For Multiple Choice Questions
Zhenhao Zhu
Bulou Liu
Qingyao Ai
Y. Liu
54
0
0
25 Jan 2025
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
44
3
0
25 Jan 2025
The Karp Dataset
Mason DiCicco
Eamon Worden
Conner Olsen
Nikhil Gangaram
Daniel Reichman
Neil T. Heffernan
ReLM
LRM
56
0
0
24 Jan 2025
HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor
Zihui Wu
Haichang Gao
Jiacheng Luo
Zhaoxiang Liu
35
0
0
23 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Z. Yang
VLM
ALM
OffRL
AI4TS
LRM
106
136
0
22 Jan 2025
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
...
Xiangru Tang
Zhenwen Liang
Y. Liu
Chen Zhao
Arman Cohan
53
5
0
21 Jan 2025
A Collection of Question Answering Datasets for Norwegian
Vladislav Mikhailov
Petter Mæhlum
Victoria Ovedie Chruickshank Langø
Erik Velldal
Lilja Øvrelid
RALM
41
4
0
19 Jan 2025
Knowledge Retrieval Based on Generative AI
Te-Lun Yang
Jyi-Shane Liu
Yuen-Hsien Tseng
Jyh-Shing Roger Jang
3DV
RALM
53
1
0
17 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
J. Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
Jiajun Zhang
86
1
0
16 Jan 2025
Optimization Strategies for Enhancing Resource Efficiency in Transformers & Large Language Models
Tom Wallace
Naser Ezzati-Jivan
Beatrice Ombuki-Berman
MQ
33
1
0
16 Jan 2025
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
137
1
0
15 Jan 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
77
9
0
11 Jan 2025
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
Yilong Li
Jingyu Liu
Hao Zhang
M Badri Narayanan
Utkarsh Sharma
Shuai Zhang
Pan Hu
Yijing Zeng
Jayaram Raghuram
Suman Banerjee
MQ
39
2
0
10 Jan 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song
Zhaochen Su
Xiaoye Qu
Jiawei Zhou
Yu-Xi Cheng
LRM
53
29
0
06 Jan 2025
Previous
1
2
3
4
5
6
...
14
15
16
Next