1905.07830
Cited By
HellaSwag: Can a Machine Really Finish Your Sentence?
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
Papers citing
"HellaSwag: Can a Machine Really Finish Your Sentence?"
50 / 2,253 papers shown
TwinBreak: Jailbreaking LLM Security Alignments based on Twin Prompts
T. Krauß
Hamid Dashtbani
Alexandra Dmitrienko
152
6
0
09 Jun 2025
ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
Yongkang Li
Kaixin Xiong
Xiangyu Guo
Fang Li
Sixu Yan
...
Guang Chen
Hangjun Ye
Wenyu Liu
Xinggang Wang
VLM
270
5
0
09 Jun 2025
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Davide Venditti
Fabio Massimo Zanzotto
KELM
PILM
291
5
0
09 Jun 2025
Learning Distribution-Wise Control in Representation Space for Language Models
Chunyuan Deng
Ruidi Chang
Hanjie Chen
268
2
0
07 Jun 2025
Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
J. Michaelov
Reeka Estacio
Zhien Zhang
Benjamin Bergen
ReLM
LRM
205
1
0
07 Jun 2025
Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Naibin Gu
Peng Fu
Xiyu Liu
Ke Ma
Zheng Lin
Weiping Wang
188
3
0
07 Jun 2025
dots.llm1 Technical Report
Bi Huo
Bin Tu
Cheng Qin
Da Zheng
Debing Zhang
...
Yuqiu Ji
Ze Wen
Zhenhai Liu
Zichao Li
Zilong Liao
MoE
198
3
0
06 Jun 2025
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
Yuanzhe Hu
Kinshuk Goel
Vlad Killiakov
Yaoqing Yang
310
3
0
06 Jun 2025
Text-to-LoRA: Instant Transformer Adaption
Rujikorn Charakorn
Edoardo Cetin
Yujin Tang
Robert Tjarko Lange
AI4CE
275
7
0
06 Jun 2025
Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation
Zhan Zhuang
Xiequn Wang
Wei Li
Yulong Zhang
Qiushi Huang
...
Yanbin Wei
Yuhe Nie
Kede Ma
Yu Zhang
Ying Wei
283
0
0
06 Jun 2025
DynamicMind: A Tri-Mode Thinking System for Large Language Models
Wei Li
Yanbin Wei
Qiushi Huang
Jiangyue Yan
Yang Chen
James T. Kwok
Yu Zhang
LLMAG
LRM
175
3
0
06 Jun 2025
Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
J. Zhang
Bei Li
Jun Bai
Rumei Li
Yanmeng Wang
Chenghua Lin
Wenge Rong
296
3
0
05 Jun 2025
FPTQuant: Function-Preserving Transforms for LLM Quantization
Boris van Breugel
Yelysei Bondarenko
Paul N. Whatmough
Markus Nagel
MQ
268
3
0
05 Jun 2025
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
J. Oswald
Nino Scherrer
Seijin Kobayashi
Luca Versari
Songlin Yang
...
Guillaume Lajoie
Charlotte Frenkel
Razvan Pascanu
Blaise Agüera y Arcas
João Sacramento
312
14
0
05 Jun 2025
Quantifying Cross-Modality Memorization in Vision-Language Models
Yuxin Wen
Yangsibo Huang
Tom Goldstein
Ravi Kumar
Badih Ghazi
Chiyuan Zhang
331
2
0
05 Jun 2025
Inference-Time Hyper-Scaling with KV Cache Compression
Adrian Łańcucki
Konrad Staniszewski
Piotr Nawrot
Edoardo Ponti
277
13
0
05 Jun 2025
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
Thao Nguyen
Yang Li
O. Yu. Golovneva
Luke Zettlemoyer
Sewoong Oh
Ludwig Schmidt
Xian Li
OnRL
425
11
0
05 Jun 2025
MANBench: Is Your Multimodal Model Smarter than Human?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Han Zhou
Qitong Xu
Yiheng Dong
Xin Yang
224
0
0
04 Jun 2025
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
Ruihan Jin
Pengpeng Shao
Zhengqi Wen
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Jianhua Tao
275
3
0
04 Jun 2025
SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling
Anhao Zhao
Fanghua Ye
Yingqi Fan
Junlong Tong
Zhiwei Fei
Hui Su
Xiaoyu Shen
249
3
0
04 Jun 2025
A Statistical Physics of Language Model Reasoning
Jack David Carson
Amir Reisizadeh
LRM
AI4CE
185
1
0
04 Jun 2025
TokAlign: Efficient Vocabulary Adaptation via Token Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chong Li
Jiajun Zhang
Chengqing Zong
VLM
207
2
0
04 Jun 2025
Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Seungcheol Park
Sojin Lee
Jongjin Kim
Jinsik Lee
Hyunjik Jo
U. Kang
276
3
0
04 Jun 2025
Backbone Augmented Training for Adaptations
Jae Wan Park
Junhyeok Kim
Youngjun Jun
Hyunah Ko
Seong Jae Hwang
206
0
0
04 Jun 2025
Adaptive Task Vectors for Large Language Models
Joonseong Kang
Soojeong Lee
Subeen Park
Sumin Park
Taero Kim
Jihee Kim
Ryunyi Lee
Kyungwoo Song
262
0
0
03 Jun 2025
PoLAR: Polar-Decomposed Low-Rank Adapter Representation
Kai Lion
Liang Zhang
Bingcong Li
Niao He
256
3
0
03 Jun 2025
ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ekaterina Grishina
Mikhail Gorbunov
Maxim Rakhuba
172
0
0
03 Jun 2025
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Jiajun Sun
Ming Zhang
Chenhao Huang
Jiayi Chen
F. Chen
...
Wei Chengzhi
Lin Yan
Qi Zhang
Xuanjing Huang
ELM
302
3
0
03 Jun 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jonas F. Lotz
António V. Lopes
Stephan Peitz
Hendra Setiawan
Leonardo Emili
278
3
0
03 Jun 2025
Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights
Jakub Krajewski
Marcin Chochowski
Daniel Korzekwa
MoE
ALM
207
0
0
03 Jun 2025
Cataloguing Hugging Face Models to Software Engineering Activities: Automation and Findings
Alexandra González
Xavier Franch
David Lo
Luís Cruz
VLM
284
2
0
03 Jun 2025
StochasTok: Improving Fine-Grained Subword Understanding in LLMs
Anya Sims
Thom Foster
Klara Kaleb
Tuan-Duy H. Nguyen
Joseph Lee
Jakob N. Foerster
Yee Whye Teh
Cong Lu
345
4
0
02 Jun 2025
Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
J. Mok
Ik-hwan Kim
Sangkwon Park
Sungroh Yoon
227
3
0
02 Jun 2025
Multilingual Definition Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Edison Marrese-Taylor
Erica K. Shimomoto
Alfredo Solano
Enrique Reid
216
0
0
02 Jun 2025
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu
Faisal Hamman
Sanghamitra Dutta
ALM
343
6
0
02 Jun 2025
TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network
Guangxin He
Yuan Cao
Yutong He
Tianyi Bai
Kun Yuan
Binhang Yuan
MQ
211
0
0
02 Jun 2025
Taming LLMs by Scaling Learning Rates with Gradient Grouping
Siyuan Li
Juanxi Tian
Zedong Wang
Xin Jin
Zicheng Liu
Wentao Zhang
Dan Xu
230
0
0
01 Jun 2025
Mamba Drafters for Speculative Decoding
Daewon Choi
Seunghyuk Oh
Saket Dingliwal
Jihoon Tack
Kyuyoung Kim
...
Insu Han
Jinwoo Shin
Aram Galstyan
Shubham Katiyar
S. Bodapati
290
0
0
01 Jun 2025
Data Swarms: Optimizable Generation of Synthetic Evaluation Data
Shangbin Feng
Yike Wang
Weijia Shi
Yulia Tsvetkov
358
0
0
31 May 2025
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie
Morris Yau
Samuel J. Gershman
221
6
0
31 May 2025
Recipes for Pre-training LLMs with MXFP8
Asit K. Mishra
Dusan Stosic
Simon Layton
Paulius Micikevicius
MQ
229
5
0
30 May 2025
Stepsize anything: A unified learning rate schedule for budgeted-iteration training
Anda Tang
Yiming Dong
Yutao Zeng
Zhou Xun
Zhouchen Lin
631
1
0
30 May 2025
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He
Rishabh Anand
Hiren Madhu
Ali Maatouk
Smita Krishnaswamy
Leandros Tassiulas
Menglin Yang
Rex Ying
229
7
0
30 May 2025
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Shelly Bensal
Umar Jamil
Christopher Bryant
M. Russak
Kiran Kamble
Dmytro Mozolevskyi
Muayad Ali
Waseem Alshikh
LLMAG
ReLM
LRM
197
14
0
30 May 2025
ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration
Xianglong Yan
Zhiteng Li
Tianao Zhang
Linghe Kong
Yulun Zhang
Yunbo Wang
457
4
0
30 May 2025
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee
Dongkyu Kim
Youngcheon You
Youngmin Kim
MQ
229
4
0
30 May 2025
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael
Guy Smorodinsky
Tom Tirer
Ofir Lindenbaum
180
5
0
30 May 2025
Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning
Wanyun Xie
F. Tonin
Volkan Cevher
169
2
0
30 May 2025
LoLA: Low-Rank Linear Attention With Sparse Caching
Luke McDermott
Robert W. Heath Jr.
Rahul Parhi
RALM
338
4
0
29 May 2025
DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration
Tianteng Gu
Bei Liu
Bo Xiao
Ke Zeng
Jiacheng Liu
Y. Qian
205
1
0
29 May 2025