arXiv:1905.07830
HellaSwag: Can a Machine Really Finish Your Sentence?
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
Papers citing "HellaSwag: Can a Machine Really Finish Your Sentence?"
50 / 2,251 papers shown
SBVR: Summation of BitVector Representation for Efficient LLM Quantization
Wonjun Bang
Jongseok Park
Hongseung Yu
Kyungmin Bin
Kyunghan Lee
MQ
132
0
0
17 Sep 2025
DSFT: Inspiring Diffusion Large Language Models to Comprehend Mathematical and Logical Patterns
Ranfei Chen
Ming Chen
DiffM
AI4CE
77
0
0
17 Sep 2025
ZERA: Zero-init Instruction Evolving Refinement Agent - From Zero Instructions to Structured Prompts via Principle-based Optimization
Seungyoun Yi
Minsoo Khang
Sungrae Park
LLMAG
84
0
0
17 Sep 2025
NIRVANA: Structured pruning reimagined for large language models compression
Mengting Ai
Tianxin Wei
Sirui Chen
Jingrui He
VLM
1.6K
1
0
17 Sep 2025
Instance-level Randomization: Toward More Stable LLM Evaluations
Yiyang Li
Y. Wu
Ying Luo
Liangtai Sun
Zishu Qin
Lin Qiu
Xuezhi Cao
Xunliang Cai
149
0
0
16 Sep 2025
CBP-Tuning: Efficient Local Customization for Black-box Large Language Models
Jiaxuan Zhao
Naibin Gu
Yuchen Feng
Xiyu Liu
Peng Fu
Zheng Lin
Weiping Wang
96
0
0
15 Sep 2025
AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
Sangjun Lee
Seung-taek Woo
Jungyu Jin
Changhun Lee
Eunhyeok Park
MQ
109
2
0
15 Sep 2025
From Parameters to Performance: A Data-Driven Study on LLM Structure and Development
Suqing Wang
Zuchao Li
Luohe Shi
Bo Du
Hai Zhao
Yun Li
Qianren Wang
121
0
0
14 Sep 2025
Fluid Language Model Benchmarking
Valentin Hofmann
David Heineman
Ian H. Magnusson
Kyle Lo
Jesse Dodge
Maarten Sap
Pang Wei Koh
Chun Wang
Hannaneh Hajishirzi
Noah A. Smith
125
7
0
14 Sep 2025
Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
Hang Guo
Yawei Li
Luca Benini
MQ
207
0
0
14 Sep 2025
AQUA: Attention via QUery mAgnitudes for Memory and Compute Efficient Inference in LLMs
S. Shah
Saurav Prakash
Balaraman Ravindran
65
0
0
14 Sep 2025
Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
Yixiao Zhou
Ziyu Zhao
Dongzhou Cheng
Zhiliang Wu
Jie Gui
Yi-feng Yang
Fei Wu
Yu Cheng
Hehe Fan
MoMe
MoE
147
3
0
12 Sep 2025
Towards Understanding Visual Grounding in Visual Language Models
Georgios Pantazopoulos
Eda B. Özyiğit
ObjD
300
3
0
12 Sep 2025
ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms
Bingxin Xu
Zhen Dong
Oussama Elachqar
Yuzhang Shang
MQ
188
1
0
11 Sep 2025
LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures
Hai Huang
Yann LeCun
Randall Balestriero
187
4
0
11 Sep 2025
Benchmarking Energy Efficiency of Large Language Models Using vLLM
K. Pronk
Q. Zhao
84
0
0
10 Sep 2025
Open-sci-ref-0.01: open and reproducible reference baselines for language model and dataset comparison
Marianna Nezhurina
Jörg Franke
Taishi Nakamura
Timur Carstensen
Niccolò Ajroldi
Ville Komulainen
David Salinas
J. Jitsev
165
2
0
10 Sep 2025
ForTIFAI: Fending Off Recursive Training Induced Failure for AI Model Collapse
Soheil Zibakhsh Shabgahi
Pedram Aghazadeh
Azalia Mirhoseini
F. Koushanfar
263
0
0
10 Sep 2025
Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation
Nakyung Lee
Yeongoon Kim
Minhae Oh
Suhwan Kim
Jin Woo Koo
Hyewon Jo
Jungwoo Lee
134
1
0
09 Sep 2025
Causal Attention with Lookahead Keys
Zhuoqing Song
Peng Sun
Huizhuo Yuan
Quanquan Gu
CML
188
0
0
09 Sep 2025
LoaQ: Layer-wise Output Approximation Quantization
Li Lin
Xiaojun Wan
MQ
84
1
0
08 Sep 2025
IPR: Intelligent Prompt Routing with User-Controlled Quality-Cost Trade-offs
Aosong Feng
Zhichao Xu
Xian Wu
Kang Zhou
Sheng Guan
...
Soumya Smruti Mishra
Yifei Teng
Darren Yow-Bang Wang
Haibo Ding
Lin Lee Cheong
238
0
0
08 Sep 2025
COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
Eugene Kwek
Wenpeng Yin
VLM
248
0
0
08 Sep 2025
AntiDote: Bi-level Adversarial Training for Tamper-Resistant LLMs
Debdeep Sanyal
Manodeep Ray
Murari Mandal
AAML
188
0
0
06 Sep 2025
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian
Michael Hoffmann
Jophin John
Stefan Schweter
Gokul Ramakrishnan
Hoi-Fong Mak
Alice Zhang
Dmitry Gaynullin
Nicolay J. Hammer
CLL
160
1
0
06 Sep 2025
Hyperbolic Large Language Models
Sarang Patil
Zeyong Zhang
Yiran Huang
Tengfei Ma
Mengjia Xu
AI4CE
210
0
0
06 Sep 2025
Delta Activations: A Representation for Finetuned Large Language Models
Zhiqiu Xu
Amish Sethi
Mayur Naik
Ser-Nam Lim
146
0
0
04 Sep 2025
Set Block Decoding is a Language Model Inference Accelerator
Itai Gat
Heli Ben-Hamu
Marton Havasi
Daniel Haziza
Jeremy Reizenstein
Gabriel Synnaeve
David Lopez-Paz
Brian Karrer
Y. Lipman
142
6
0
04 Sep 2025
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang
Chenghao Xiao
Chia-Yi Hsiao
Zi Yan Chang
Chi-Li Chen
Tyler Loakman
Chenghua Lin
235
1
0
04 Sep 2025
On Robustness and Reliability of Benchmark-Based Evaluation of LLMs
Riccardo Lunardi
V. D. Mea
Stefano Mizzaro
Kevin Roitero
164
5
0
04 Sep 2025
SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment
Yuqing Huang
Rongyang Zhang
Qimeng Wang
Chengqiang Lu
Yan Gao
...
Xuyang Zhi
Guiquan Liu
Xin Li
Hao Wang
Tong Xu
CLL
167
2
0
04 Sep 2025
RL's Razor: Why Online Reinforcement Learning Forgets Less
Idan Shenfeld
Jyothish Pari
Pulkit Agrawal
CLL
183
41
0
04 Sep 2025
Adaptive Preference Optimization with Uncertainty-aware Utility Anchor
Xiaobo Wang
Zixia Jia
Jiaqi Li
Qi Liu
Zilong Zheng
104
0
0
03 Sep 2025
LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference
Krishna Teja Chitty-Venkata
Sandeep Madireddy
M. Emani
V. Vishwanath
MoE
159
1
0
02 Sep 2025
Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
Fangzhou Wu
Sandeep Silwal
229
0
0
02 Sep 2025
Implicit Reasoning in Large Language Models: A Comprehensive Survey
Jindong Li
Yali Fu
Li Fan
Jiahong Liu
Yao Shu
Chengwei Qin
Menglin Yang
Irwin King
Rex Ying
OffRL
LRM
AI4CE
212
11
0
02 Sep 2025
Causal Consistency Regularization: Training Verifiably Sensitive Reasoning in Large Language Models
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
ReLM
LRM
154
0
0
01 Sep 2025
Dream-Coder 7B: An Open Diffusion Language Model for Code
Zhihui Xie
Jiacheng Ye
Lin Zheng
Lei Li
Jingwei Dong
...
Xueliang Zhao
Shansan Gong
Xin Jiang
Zhenguo Li
Lingpeng Kong
DiffM
127
18
0
01 Sep 2025
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving
Huanqi Hu
Bowen Xiao
Shixuan Sun
Jianian Yin
Zhexi Zhang
...
Chengquan Jiang
Weiqi Xu
Xiaoying Jia
Xin Liu
Minyi Guo
MQ
VLM
106
4
0
01 Sep 2025
GradES: Significantly Faster Training in Transformers with Gradient-Based Early Stopping
Qifu Wen
Xi Zeng
Zihan Zhou
Shuaijun Liu
M. Hosseinzadeh
Ningxin Su
Reza Rawassizadeh
255
0
0
01 Sep 2025
Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs
Andong Hua
Kenan Tang
Chenhe Gu
Jindong Gu
Eric Wong
Yao Qin
LRM
107
2
0
01 Sep 2025
DTRNet: Dynamic Token Routing Network to Reduce Quadratic Costs in Transformers
Aman Sharma
Saeed Najafi
Parsa Farinneya
Benyamin Jamialahmadi
Marzieh S. Tahaei
Yuhe Fan
Mehdi Rezagholizadeh
Boxing Chen
A. Jafari
85
1
0
31 Aug 2025
Router Upcycling: Leveraging Mixture-of-Routers in Mixture-of-Experts Upcycling
Junfeng Ran
Guangxiang Zhao
Yuhan Wu
Dawei Zhu
Longyun Wu
Yikai Zhao
Tong Yang
Lin Sun
Xiangzheng Zhang
Sujian Li
MoE
MoMe
84
0
0
31 Aug 2025
Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning
Zinan Tang
Xin Gao
Qizhi Pei
Zhuoshi Pan
Mengzhang Cai
Jiang Wu
Conghui He
Lijun Wu
SyDa
305
2
0
29 Aug 2025
Standard vs. Modular Sampling: Best Practices for Reliable LLM Unlearning
Praveen Bushipaka
Lucia Passaro
Tommaso Cucinotta
MU
108
0
0
29 Aug 2025
PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference
Hao Zhang
Mengsi Lyu
Zhuo Chen
Xingrun Xing
Yulong Ao
Yonghua Lin
467
1
0
29 Aug 2025
Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection
Harethah Shairah
Hasan Hammoud
G. Turkiyyah
Bernard Ghanem
LLMSV
140
1
0
28 Aug 2025
Provable Benefits of In-Tool Learning for Large Language Models
Sam Houliston
Ambroise Odonnat
Charles Arnal
Vivien A. Cabannes
RALM
152
1
0
28 Aug 2025
InSQuAD: In-Context Learning for Efficient Retrieval via Submodular Mutual Information to Enforce Quality and Diversity
Souradeep Nanda
Anay Majee
Rishabh K. Iyer
123
0
0
28 Aug 2025
UI-Bench: A Benchmark for Evaluating Design Capabilities of AI Text-to-App Tools
Sam Jung
Agustin Garcinuno
Spencer Mateega
ELM
208
0
0
28 Aug 2025