HellaSwag: Can a Machine Really Finish Your Sentence?

Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi

Papers citing "HellaSwag: Can a Machine Really Finish Your Sentence?"

50 / 2,252 papers shown
RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems
Jingru Lin
Chen Zhang
Stephen Y. Liu
Haizhou Li
RALM
116
0
0
15 Oct 2025
CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
Dayin Gou
Sanghyun Byun
Nilesh Malpeddi
Gabrielle De Micheli
Prathamesh Vaste
Jacob Song
Woo Seong Chung
MQ
108
0
0
14 Oct 2025
OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning
Yifeng Xiong
Xiaohui Xie
CLL
476
2
0
14 Oct 2025
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
Huiyin Xue
Nafise Sadat Moosavi
Nikolaos Aletras
121
0
0
13 Oct 2025
Neural Weight Compression for Language Models
Jegwang Ryu
Minkyu Kim
Seungjun Shin
Hee Min Choi
Dokwan Oh
Jaeho Lee
133
0
0
13 Oct 2025
APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
Z. Li
Yuege Feng
Dandan Guo
Jinpeng Hu
Anningzhe Gao
Xiang Wan
120
1
0
13 Oct 2025
Balancing Synthetic Data and Replay for Enhancing Task-Specific Capabilities
Urs Spiegelhalter
Jorg K. H. Franke
Frank Hutter
CLL KELM
136
0
0
13 Oct 2025
Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
Andrey Veprikov
Arman Bolatov
Samuel Horváth
Aleksandr Beznosikov
Martin Takáč
Slavomír Hanzely
ODL
310
0
0
12 Oct 2025
Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
Bowei He
Lihao Yin
Huiling Zhen
Shuqi Liu
Han Wu
Xiaokun Zhang
Mingxuan Yuan
Chen Ma
104
0
0
12 Oct 2025
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
Zichun Yu
Chenyan Xiong
OnRL
228
0
0
12 Oct 2025
Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data?
Shaobo Wang
C. Wang
Wenjie Fu
Yue Min
Mingquan Feng
...
Kexin Yang
Xingzhang Ren
Fei Huang
Dayiheng Liu
Linfeng Zhang
146
0
0
12 Oct 2025
AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs
Gunho Park
Jeongin Bae
Beomseok Kwon
Byeongwook Kim
S. Kwon
Dongsoo Lee
MQ
165
1
0
12 Oct 2025
Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Tuowei Wang
Kun Li
Zixu Hao
Donglin Bai
Ju Ren
Yaoxue Zhang
Ting Cao
M. Yang
152
4
0
12 Oct 2025
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data
Jaap Jumelet
Abdellah Fourtassi
Akari Haga
Bastian Bunzeck
Bhargav Shandilya
...
Yurii Paniv
Ziyin Zhang
Arianna Bisazza
Alex Warstadt
Leshem Choshen
116
1
0
11 Oct 2025
CTR-LoRA: Curvature-Aware and Trust-Region Guided Low-Rank Adaptation for Large Language Models
Zhuxuanzi Wang
Mingqiao Mo
Xi Xiao
Chen Liu
Chenrui Ma
Yunbei Zhang
Xiao Wang
Smita Krishnaswamy
Tianyang Wang
127
0
0
11 Oct 2025
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou
Shuo Yin
Zehua Pei
Tsung-Yi Ho
Farzan Farnia
Bei Yu
84
0
0
11 Oct 2025
NarraBench: A Comprehensive Framework for Narrative Benchmarking
Sil Hamilton
Matthew Wilkens
Andrew Piper
190
0
0
10 Oct 2025
ProxRouter: Proximity-Weighted LLM Query Routing for Improved Robustness to Outliers
Shivam Patel
Neharika Jali
Ankur Mallick
Gauri Joshi
128
0
0
10 Oct 2025
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
Minsik Choi
Hyegang Son
Changhoon Kim
Young Geun Kim
AAML
116
0
0
10 Oct 2025
Hierarchical Scheduling for Multi-Vector Image Retrieval
Maoliang Li
K. Li
Yaoyang Liu
Jiayu Chen
Zihao Zheng
Yinjun Wu
Xiang Chen
112
0
0
10 Oct 2025
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
Yu-Chen Lu
Chong-Yan Chen
Chi-Chih Chang
Yu-Fang Hu
Kai-Chiang Wu
80
1
0
10 Oct 2025
SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher
Ali O. Polat
Ehsan Mohammady Ardehaly
Mehrdad Salehi
Zia Ghiasi
Prasanth Murali
Chen Chen
180
1
0
09 Oct 2025
DISCO: Diversifying Sample Condensation for Efficient Model Evaluation
Alexander Rubinstein
Benjamin Raible
Martin Gubri
Seong Joon Oh
ELM
375
0
1
09 Oct 2025
RCPU: Rotation-Constrained Error Compensation for Structured Pruning of a Large Language Model
Shuichiro Haruta
Kazunori Matsumoto
Zhi Li
Yanan Wang
Mori Kurokawa
126
0
0
09 Oct 2025
Fewer Weights, More Problems: A Practical Attack on LLM Pruning
Kazuki Egashira
Robin Staab
Thibaud Gloaguen
Mark Vero
Martin Vechev
AAML
191
1
0
09 Oct 2025
Weak Form Learning for Mean-Field Partial Differential Equations: an Application to Insect Movement
Seth Minor
Bret D. Elderd
Benjamin Van Allen
David M. Bortz
Vanja M. Dukic
119
0
0
09 Oct 2025
AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
Xiaoshuang Ji
Zhendong Zhao
Xiaoyan Gu
Xiaojun Chen
Xin Zhao
Zeyao Liu
123
0
0
09 Oct 2025
Contrastive Weak-to-strong Generalization
Houcheng Jiang
Junfeng Fang
Jiaxin Wu
T. Zhang
Chen Gao
Yong Li
X. Wang
Xiangnan He
Yang Deng
132
0
0
09 Oct 2025
Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training
Ruizhe Wang
Yucheng Ding
Xiao Liu
Yaoxiang Wang
Peng Cheng
Baining Guo
Zhengjun Zha
Yeyun Gong
136
0
0
09 Oct 2025
Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization
Qiaozhe Zhang
Jun Sun
Ruijie Zhang
Yingzhuang Liu
180
0
0
09 Oct 2025
POME: Post Optimization Model Edit via Muon-style Projection
Yong Liu
Di Fu
Yang Luo
Zirui Zhu
Minhao Cheng
Cho-Jui Hsieh
Yang You
96
0
0
08 Oct 2025
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
Cai Zhou
Chenyu Wang
Dinghuai Zhang
Shangyuan Tong
Yifei Wang
Stephen Bates
Tommi Jaakkola
140
0
0
08 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
150
1
0
08 Oct 2025
Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography
Jiuan Zhou
Yu Cheng
Yuan Xie
Z. Yin
106
3
0
08 Oct 2025
Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs
Wang Wei
Tiankai Yang
Hongjie Chen
Yue Zhao
Franck Dernoncourt
Ryan Rossi
Hoda Eldardiry
OffRL
88
0
0
08 Oct 2025
JAI-1: A Thai-Centric Large Language Model
Attapol T. Rutherford
Jullajak Karnjanaekarin
Narongkorn Panitsrisit
Pontakorn Trakuekul
Sumana Sumanakul
Natchanon Pollertlam
75
0
0
08 Oct 2025
Native Hybrid Attention for Efficient Sequence Modeling
Jusen Du
Jiaxi Hu
Tao Zhang
Weigao Sun
Yu Cheng
197
3
0
08 Oct 2025
PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch
Shangjian Yin
Shining Liang
Wenbiao Ding
Yuli Qian
Zhouxing Shi
Hongzhi Li
Yutao Xie
ALM
118
0
0
08 Oct 2025
Latent Representation Learning in Heavy-Ion Collisions with MaskPoint Transformer
Jing-Zong Zhang
Shuang Guo
Li-Lin Zhu
Lingxiao Wang
Guo-Liang Ma
140
10
0
08 Oct 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Alexandra Souly
Javier Rando
Ed Chapman
Xander Davies
Shae McFadden
...
Erik Jones
Chris Hicks
Nicholas Carlini
Y. Gal
Robert Kirk
AAML SILM
268
8
0
08 Oct 2025
ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
Rayyan Merchant
Kevin Tang
85
0
0
08 Oct 2025
Adaptive Stain Normalization for Cross-Domain Medical Histology
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Tianyue Xu
Yanlin Wu
Abhai K. Tripathi
Matthew M. Ippolito
Benjamin D. Haeffele
OOD MedIm
124
0
0
08 Oct 2025
Grouped Differential Attention
Junghwan Lim
S. W. Lee
Dongseok Kim
Wai Ting Cheung
Beomgyu Kim
Taehwan Kim
Haesol Lee
Junhyeok Lee
Dongpin Oh
Eunhwan Park
101
1
0
08 Oct 2025
lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
Haoxin Wang
Xiaolong Tu
Hongyu Ke
Huirong Chai
Dawei Chen
Kyungtae Han
107
1
0
07 Oct 2025
BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining
Jie Hao
Rui Yu
W. Zhang
Huixia Wang
Jie Xu
Mingrui Liu
256
0
0
07 Oct 2025
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Ryan Solgi
Parsa Madinei
Jiayi Tian
Rupak Vignesh Swaminathan
Jing Liu
Nathan Susanj
Zheng Zhang
90
1
0
07 Oct 2025
Fairness in Token Delegation: Mitigating Voting Power Concentration in DAOs
Johnnatan Messias
Ayae Ide
105
0
0
07 Oct 2025
Latent Speech-Text Transformer
Yen-Ju Lu
Yashesh Gaur
Wei Zhou
Benjamin Muller
Jesus Villalba
...
Luke Zettlemoyer
Gargi Ghosh
Mike Lewis
Srinivasan Iyer
Duc Le
VLM
124
0
0
07 Oct 2025
Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
Peter Ochieng
84
1
0
07 Oct 2025
ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Lawrence Liu
Alexander Liu
Mengdi Wang
T. Zhao
Lin F. Yang
120
0
0
07 Oct 2025