HellaSwag: Can a Machine Really Finish Your Sentence?

Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi
arXiv (abs) · PDF · HTML

Papers citing "HellaSwag: Can a Machine Really Finish Your Sentence?"

50 / 2,243 papers shown
Train Once, Answer All: Many Pretraining Experiments for the Cost of One
Sebastian Bordt, Martin Pawelczyk · CLL · 27 Sep 2025

Multiplayer Nash Preference Optimization
Fang Wu, X. Y. Huang, Weihao Xuan, Zhiwei Zhang, Yijia Xiao, …, Xiaomin Li, Bing Hu, Peng Xia, Jure Leskovec, Yejin Choi · 27 Sep 2025

Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning
Nakyeong Yang, Dong-Kyum Kim, Jea Kwon, Minsung Kim, Kyomin Jung, M. Cha · MU, KELM · 26 Sep 2025

COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Dmitriy Shopkhoev, Denis Makhov, Magauiya Zhussip, Ammar Ali, Stamatios Lefkimmiatis · 26 Sep 2025

Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data
Syeda Nahida Akter, Shrimai Prabhumoye, Eric Nyberg, M. Patwary, Mohammad Shoeybi, Yejin Choi, Bryan Catanzaro · AIFin, LRM, AI4CE · 26 Sep 2025

Rethinking RoPE Scaling in Quantized LLM: Theory, Outlier, and Channel-Band Analysis with Weight Rescaling
Ye Qiao, Haocheng Xu, Xiaofan Zhang, Sitao Huang · MQ · 26 Sep 2025

Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs
Shirin Alanova, Kristina Kazistova, Ekaterina Galaeva, Alina Kostromina, Vladimir Smirnov, Redko Dmitry, Alexey Dontsov, Maxim Zhelnin, Evgeny Burnaev, Egor Shvetsov · 26 Sep 2025

Stochastic activations
Maria Lomeli, Matthijs Douze, Gergely Szilvasy, Loic Cabannes, Jade Copet, Sainbayar Sukhbaatar, Jason Weston, Gabriel Synnaeve, Pierre-Emmanuel Mazaré, Hervé Jégou · LLMSV · 26 Sep 2025

HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space
Ke Li, Zheng Yang, Zhongbin Zhou, Feng Xue, Zhonglin Jiang, Wenxiao Wang · MoE · 26 Sep 2025

What Matters More For In-Context Learning under Matched Compute Budgets: Pretraining on Natural Text or Incorporating Targeted Synthetic Examples?
Mohammed Sabry, Anya Belz · 26 Sep 2025

Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
Naibin Gu, Zhenyu Zhang, Yuchen Feng, Yilong Chen, Peng Fu, …, Shuohuan Wang, Yu Sun, Hua Wu, Weiping Wang, Haifeng Wang · MoE · 26 Sep 2025

IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method
Xinyu Liu, Bei Li, Jiahao Liu, Junhao Ruan, Kechen Jiao, Hongyin Tang, Jingang Wang, Xiao Tong, Jingbo Zhu · 26 Sep 2025

Predicting LLM Reasoning Performance with Small Proxy Model
Woosung Koh, Juyoung Suk, Sungjun Han, Se-Young Yun, Jay Shin · LRM, AI4CE · 25 Sep 2025

Blockwise Hadamard high-Rank Adaptation for Parameter-Efficient LLM Fine-Tuning
Feng Yu, Jia Hu, Geyong Min · 25 Sep 2025

SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
J. Lin, Zhongruo Wang, Kun Qian, Tian Wang, Arvind Srinivasan, …, Weiqi Zhang, Sujay Sanghavi, C. L. P. Chen, Hyokun Yun, Lihong Li · CLL · 25 Sep 2025

Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
Xuemiao Zhang, Can Ren, Chengying Tu, Rongxiang Weng, Shuo Wang, Hongfei Yan, Jingang Wang, Xunliang Cai · LRM, AI4CE · 25 Sep 2025

CLUE: Conflict-guided Localization for LLM Unlearning Framework
Hang Chen, Jiaying Zhu, Xinyu Yang, Wenya Wang · MU · 25 Sep 2025

Integrated Framework for LLM Evaluation with Answer Generation
Sujeong Lee, Hayoung Lee, Seongsoo Heo, Wonik Choi · 24 Sep 2025

Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
Yisong Xiao, Aishan Liu, Siyuan Liang, Zonghao Ying, Xianglong Liu, Dacheng Tao · KELM · 24 Sep 2025

Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
Deokjae Lee, Hyun Oh Song · MQ · 24 Sep 2025

Enhancing Linear Attention with Residual Learning
Xunhao Lai, Jialiang Kang, Jianqiao Lu, Tong Lin, Pengyu Zhao · KELM, CLL · 24 Sep 2025

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Yan Chen, Qiang Wu, Dawei Yang · MQ · 24 Sep 2025

HyperAdapt: Simple High-Rank Adaptation
Abel Gurung, Joseph Campbell · 23 Sep 2025

ExPe: Exact Positional Encodings for Generative Transformer Models with Extrapolating Capabilities
Aleksis Datseris, Sylvia Vassileva, Ivan Koychev, Svetla Boytcheva · 23 Sep 2025

Soft Tokens, Hard Truths
Natasha Butt, Ariel Kwiatkowski, Ismail Labiad, Julia Kempe, Yann Ollivier · OffRL, CLL, LRM · 23 Sep 2025

Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Dongjun Kim, Gyuho Shim, YongChan Chun, Minhyuk Kim, Chanjun Park, Heuiseok Lim · 23 Sep 2025

Prior-based Noisy Text Data Filtering: Fast and Strong Alternative For Perplexity
Yeongbin Seo, Gayoung Kim, Jaehyung Kim, Jinyoung Yeo · 23 Sep 2025

Diversity Boosts AI-Generated Text Detection
Advik Raj Basani, Pin-Yu Chen · DeLMO · 23 Sep 2025

Training-free Truthfulness Detection via Value Vectors in LLMs
Runheng Liu, Heyan Huang, Xingchen Xiao, Zhijing Wu · 22 Sep 2025

QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models
Hyesung Jeon, Seojune Lee, Beomseok Kang, Yulhwa Kim, Jae-Joon Kim · MQ · 22 Sep 2025

SEQR: Secure and Efficient QR-based LoRA Routing
William Fleshman, Benjamin Van Durme · 22 Sep 2025

Dynamic Expert Specialization: Towards Catastrophic Forgetting-Free Multi-Domain MoE Adaptation
Junzhuo Li, Bo Wang, Xiuze Zhou, Xuming Hu · MoMe, CLL, MoE · 21 Sep 2025

PTQTP: Post-Training Quantization to Trit-Planes for Large Language Models
He Xiao, Runming Yang, Qingyao Yang, Wendong Xu, Zheng Li, Yupeng Su, Zhengwu Liu, Hongxia Yang, Ngai Wong · MQ · 21 Sep 2025

MoEs Are Stronger than You Think: Hyper-Parallel Inference Scaling with RoE
Soheil Zibakhsh, Mohammad Samragh, K. Nishu, Lauren Hannah, Arnav Kundu, Minsik Cho · MoE, BDL, LRM · 21 Sep 2025

EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs
Zhengge Cai, Haowen Hou · 20 Sep 2025

Rethinking the Role of Text Complexity in Language Model Pretraining
Dan John Velasco, M. R · 20 Sep 2025

DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning
Sikai Bai, Haoxi Li, Jie Zhang, Zicong Hong, Song Guo · MoE · 19 Sep 2025

Evaluating the Effectiveness and Scalability of LLM-Based Data Augmentation for Retrieval
Pranjal A. Chitale, Bishal Santra, Yashoteja Prabhu, Amit Sharma · 19 Sep 2025

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Tianqing Fang, H. Zhang, Haitao Mi, Dong Yu, Zhicheng Dou · 19 Sep 2025

Distribution-Aligned Decoding for Efficient LLM Task Adaptation
Senkang Hu, Xudong Han, Jinqi Jiang, Yihang Tao, Zihan Fang, Yong Dai, Sam Kwong, Yuguang Fang · 19 Sep 2025

Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research
Richard Diehl Martinez, David Demitri Africa, Yuval Weiss, Suchir Salhan, Ryan Daniels, P. Buttery · 19 Sep 2025

SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection
Maithili Joshi, Palash Nandi, Tanmoy Chakraborty · AAML, LLMSV · 19 Sep 2025

Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
Ping Guo, Y. Ren, Binbin Liu, Fengze Liu, Haobin Lin, Yifan Zhang, Bingni Zhang, Taifeng Wang, Yin Zheng · 19 Sep 2025

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning
Vaibhav Singh, Soumya Suvra Ghosal, Kapu Nirmal Joshua, Soumyabrata Pal, Sayak Ray Chowdhury · 19 Sep 2025

Fair-GPTQ: Bias-Aware Quantization for Large Language Models
Irina Proskurina, Guillaume Metzler, Julien Velcin · MQ · 18 Sep 2025

NIRVANA: Structured pruning reimagined for large language models compression
Mengting Ai, Tianxin Wei, Sirui Chen, Jingrui He · VLM · 17 Sep 2025

SBVR: Summation of BitVector Representation for Efficient LLM Quantization
Wonjun Bang, Jongseok Park, Hongseung Yu, Kyungmin Bin, Kyunghan Lee · MQ · 17 Sep 2025

ZERA: Zero-init Instruction Evolving Refinement Agent - From Zero Instructions to Structured Prompts via Principle-based Optimization
Seungyoun Yi, Minsoo Khang, Sungrae Park · LLMAG · 17 Sep 2025

DSFT: Inspiring Diffusion Large Language Models to Comprehend Mathematical and Logical Patterns
Ranfei Chen, Ming Chen · DiffM, AI4CE · 17 Sep 2025

Instance-level Randomization: Toward More Stable LLM Evaluations
Yiyang Li, Y. Wu, Ying Luo, Liangtai Sun, Zishu Qin, Lin Qiu, Xuezhi Cao, Xunliang Cai · 16 Sep 2025