HellaSwag: Can a Machine Really Finish Your Sentence?

Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi

Papers citing "HellaSwag: Can a Machine Really Finish Your Sentence?"

50 / 2,253 papers shown
Tensorized Clustered LoRA Merging for Multi-Task Interference
Zhan Su
Fengran Mo
G. Liang
Jinghan Zhang
Bingbing Wen
Prayag Tiwari
Jian-Yun Nie
MoMe
178
0
0
06 Aug 2025
Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning
Magauiya Zhussip
Dmitriy Shopkhoev
Ammar Ali
Stamatios Lefkimmiatis
107
2
0
06 Aug 2025
Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models
He Xiao
Qingyao Yang
Dirui Xie
Wendong Xu
Wenyong Zhou
Haobo Liu
Zhengwu Liu
Ngai Wong
MQ
114
0
0
05 Aug 2025
FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing
Shida Wang
Chaohu Liu
Yubo Wang
Linli Xu
KELM
268
1
0
04 Aug 2025
Parameter-Efficient Routed Fine-Tuning: Mixture-of-Experts Demands Mixture of Adaptation Modules
Yilun Liu
Yunpu Ma
Yuetian Lu
Shuo Chen
Zifeng Ding
Volker Tresp
MoE
120
0
0
04 Aug 2025
Kron-LoRA: Hybrid Kronecker-LoRA Adapters for Scalable, Sustainable Fine-tuning
Yixin Shen
143
1
0
04 Aug 2025
Trainable Dynamic Mask Sparse Attention
Jingze Shi
Yifan Wu
Yiran Peng
Liangdong Wang
Guang Liu
Yuyu Luo
351
3
0
04 Aug 2025
Beyond Manually Designed Pruning Policies with Second-Level Performance Prediction: A Pruning Framework for LLMs
Zuxin Ma
Yunhe Cui
Yongbin Qin
141
0
0
04 Aug 2025
FlashCommunication V2: Bit Splitting and Spike Reserving for Any Bit Communication
Qingyuan Li
Bo Zhang
Hui Kang
T. Xu
YuLei Qian
Yuchen Xie
Lin Ma
138
0
0
04 Aug 2025
CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis
Yuzhuang Xu
Xu Han
Yuanchi Zhang
Yixuan Wang
Yijun Liu
Shiyu Ji
Qingfu Zhu
Wanxiang Che
MoE, MQ
409
1
0
04 Aug 2025
EAC-MoE: Expert-Selection Aware Compressor for Mixture-of-Experts Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yuanteng Chen
Yuantian Shao
Peisong Wang
Jian Cheng
MoE
161
2
0
03 Aug 2025
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
Istabrak Abbes
G. Subbaraj
Matthew D Riemer
Nizar Islah
Benjamin Thérien
Tsuguchika Tabaru
Hiroaki Kingetsu
Sarath Chandar
Irina Rish
CLL
115
2
0
03 Aug 2025
LinkQA: Synthesizing Diverse QA from Multiple Seeds Strongly Linked by Knowledge Points
Xuemiao Zhang
Can Ren
Chengying Tu
Rongxiang Weng
Hongfei Yan
Jingang Wang
Xunliang Cai
208
2
0
02 Aug 2025
Large-Scale Diverse Synthesis for Mid-Training
Xuemiao Zhang
Chengying Tu
Can Ren
Rongxiang Weng
Hongfei Yan
Jingang Wang
Xunliang Cai
SyDa
151
3
0
02 Aug 2025
Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning
Jianyi Zhang
Xu Ji
Ziyin Zhou
Yuchen Zhou
Shubo Shi
Haoyu Wu
Zhen Li
Shizhao Liu
ReLM, CoGe, LRM, VLM
153
1
0
01 Aug 2025
Unveiling Super Experts in Mixture-of-Experts Large Language Models
Zunhai Su
Qingyuan Li
Hao Zhang
Weihao Ye
Qibo Xue
YuLei Qian
Yuchen Xie
Ngai Wong
Kehong Yuan
MoE
277
2
0
31 Jul 2025
KLLM: Fast LLM Inference with K-Means Quantization
Xueying Wu
Baijun Zhou
Zhihui Gao
Yuzhe Fu
Qilin Zheng
Yintao He
Hai Helen Li
MQ
254
0
0
30 Jul 2025
ISO-Bench: Benchmarking Multimodal Causal Reasoning in Visual-Language Models through Procedural Plans
Ananya Sadana
Yash Kumar Lal
Jiawei Zhou
CML, VLM
159
0
0
30 Jul 2025
League of LLMs: A Benchmark-Free Paradigm for Mutual Evaluation of Large Language Models
Q. Guo
Wei Xie
Xiaofang Cai
Enze Wang
Shuoyoucheng Ma
Kai Chen
Xiaofeng Wang
Baosheng Wang
ELM
191
0
0
30 Jul 2025
Strategic Deflection: Defending LLMs from Logit Manipulation
Yassine Rachidy
Jihad Rbaiti
Youssef Hmamouche
Faissal Sehbaoui
Amal El Fallah Seghrouchni
AAML, LLMSV
155
1
0
29 Jul 2025
Kimi K2: Open Agentic Intelligence
Kimi Team
Yifan Bai
Yiping Bao
Guanduo Chen
Jiahao Chen
...
Qifeng Teng
Chensi Wang
Dinglu Wang
Feng Wang
Haiming Wang
MoE, VLM, LRM
179
81
0
28 Jul 2025
Intent Aware Context Retrieval for Multi-Turn Agricultural Question Answering
Abhay Vijayvargia
Ajay Nagpal
Kundeshwar Pundalik
Atharva Savarkar
Smita Gautam
Pankaj Singh
Rohit Saluja
Ganesh Ramakrishnan
56
0
0
28 Jul 2025
MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge
Guangchen Lan
Sipeng Zhang
Tianle Wang
Yuwei Zhang
Daoan Zhang
Xinpeng Wei
Xiaoman Pan
Hongming Zhang
Dong-Jun Han
Christopher G. Brinton
290
2
0
27 Jul 2025
IQ Test for LLMs: An Evaluation Framework for Uncovering Core Skills in LLMs
Aviya Maimon
Amir D. N. Cohen
Gal Vishne
Shauli Ravfogel
Reut Tsarfaty
137
0
0
27 Jul 2025
DeltaLLM: A Training-Free Framework Exploiting Temporal Sparsity for Efficient Edge LLM Inference
Jiawen Qi
Chang Gao
Zhaochun Ren
Qinyu Chen
162
2
0
25 Jul 2025
Technical Report of TeleChat2, TeleChat2.5 and T1
Zihan Wang
Xinzhang Liu
Yitong Yao
Chao Wang
Yu Zhao
...
Bingkai Yang
Shuangyong Song
Yongxiang Li
Zhongjiang He
Xuelong Li
AI4TS, LRM
422
6
0
24 Jul 2025
Squeeze10-LLM: Squeezing LLMs' Weights by 10 Times via a Staged Mixed-Precision Quantization Method
Qingcheng Zhu
Yangyang Ren
L. Yang
Mingbao Lin
Yanjing Li
...
Haodong Zhu
Yuguang Yang
Juan Zhang
Runqi Wang
Baochang Zhang
MQ
161
0
0
24 Jul 2025
Innovator: Scientific Continued Pretraining with Fine-grained MoE Upcycling
Ning Liao
Xiaoxing Wang
Peng Liu
Weiyang Guo
Feng Hong
...
Junchi Yan
Zhiyu Li
Feiyu Xiong
Yanfeng Wang
Linfeng Zhang
CLL
243
1
0
24 Jul 2025
A Comprehensive Evaluation on Quantization Techniques for Large Language Models
Yutong Liu
Cairong Zhao
Guosheng Hu
MQ
215
0
0
23 Jul 2025
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models
Changxin Tian
Kunlong Chen
Jia-Ling Liu
Ziqi Liu
Zhiqiang Zhang
Jun Zhou
MoE
385
12
0
23 Jul 2025
MultiNRC: A Challenging and Native Multilingual Reasoning Evaluation Benchmark for LLMs
Alexander R. Fabbri
Diego Mares
Jorge Flores
Meher Mankikar
Ernesto Hernandez
Dean Lee
Bing Liu
Chen Xing
LRM
326
2
0
23 Jul 2025
WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training
Changxin Tian
Jiapeng Wang
Qian Zhao
Kunlong Chen
Jia-Ling Liu
Ziqi Liu
Jiaxin Mao
Wayne Xin Zhao
Zhiqiang Zhang
Jun Zhou
MoMe, CLL
251
6
0
23 Jul 2025
LLM Data Selection and Utilization via Dynamic Bi-level Optimization
Yang Yu
Kai Han
Hang Zhou
Yehui Tang
Kaiqi Huang
Yunhe Wang
Dacheng Tao
239
1
0
22 Jul 2025
Diffusion Beats Autoregressive in Data-Constrained Settings
Mihir Prabhudesai
Mengning Wu
Amir Zadeh
Katerina Fragkiadaki
Deepak Pathak
DiffM
331
22
0
21 Jul 2025
Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Seok Hwan Song
Mohna Chakraborty
Qi Li
Wallapak Tavanapong
ELM, LRM
184
1
0
21 Jul 2025
Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training
Kailai Yang
Xiao Liu
Lei Ji
Hao Li
Yeyun Gong
Peng Cheng
M. Yang
CLL
171
2
0
21 Jul 2025
StackTrans: From Large Language Model to Large Pushdown Automata Model
Kechi Zhang
Ge Li
Jia Li
Huangzhao Zhang
Yihong Dong
Jia Li
Jingjing Xu
Zhi Jin
171
0
0
21 Jul 2025
Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs
Boyi Deng
Yu Wan
Baosong Yang
Fei Huang
Wenjie Wang
Fuli Feng
168
0
0
20 Jul 2025
LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters
Vladimir Bogachev
Vladimir Aletov
Alexander Molozhavenko
Denis Bobkov
Vera Soboleva
Aibek Alanov
Maxim Rakhuba
147
1
0
16 Jul 2025
First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
Xingyu Zheng
Haotong Qin
Yuye Li
Haoran Chu
Jiakai Wang
Jinyang Guo
Michele Magno
Xianglong Liu
MQ
285
0
0
15 Jul 2025
Composing Linear Layers from Irreducibles
Travis Pence
Daisuke Yamada
Vikas Singh
207
0
0
15 Jul 2025
FusionFactory: Fusing LLM Capabilities with Multi-LLM Log Data
Tao Feng
Haozhen Zhang
Zijie Lei
Pengrui Han
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
Jiaxuan You
MoMe
203
0
0
14 Jul 2025
PRM-Free Security Alignment of Large Models via Red Teaming and Adversarial Training
Pengfei Du
AAML
151
2
0
14 Jul 2025
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving
Wonung Kim
Yubin Lee
Yoonsung Kim
Jinwoo Hwang
Seongryong Oh
...
Aziz Huseynov
Woong Gyu Park
Chang Hyun Park
Divya Mahajan
Jongse Park
627
3
0
14 Jul 2025
DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models
Cathy Jiao
Yijun Pan
Emily Xiao
Daisy Sheng
Niket Jain
H. C. Zhao
Ishita Dasgupta
Jiaqi W. Ma
Chenyan Xiong
215
0
0
12 Jul 2025
Advancing Large Language Models for Tibetan with Curated Data and Continual Pre-Training
Leiyu Pan
Bojian Xiong
Lei Yang
Renren Jin
Shaowei Zhang
...
Tianyu Dong
Zhuowen Han
Zhuo Chen
Yuqi Ren
Deyi Xiong
CLL
373
3
0
12 Jul 2025
Lizard: An Efficient Linearization Framework for Large Language Models
Chien Van Nguyen
Ruiyi Zhang
Hanieh Deilamsalehy
Puneet Mathur
Viet Dac Lai
...
Ryan Rossi
Trung H. Bui
N. Vlassis
Franck Dernoncourt
T. Nguyen
KELM
247
2
0
11 Jul 2025
AbbIE: Autoregressive Block-Based Iterative Encoder for Efficient Sequence Modeling
Preslav Aleksandrov
Meghdad Kurmanji
Fernando Garcia Redondo
David O'Shea
William F. Shen
Alex Iacob
Lorenzo Sani
Xinchi Qiu
Nicola Cancedda
Nicholas D. Lane
186
4
0
11 Jul 2025
BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity
Chenyang Song
Weilin Zhao
Xu Han
Chaojun Xiao
Yingfa Chen
Yuxuan Li
Zhiyuan Liu
Maosong Sun
MoE
260
0
0
11 Jul 2025
Pre-Training LLMs on a budget: A comparison of three optimizers
Joel Schlotthauer
Christian Kroos
Chris Hinze
Viktor Hangya
Luzian Hahn
Fabian Küch
197
0
0
11 Jul 2025