1803.05457
Cited By
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
14 March 2018
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
Papers citing
"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge"
50 / 1,882 papers shown
Title
ProxRouter: Proximity-Weighted LLM Query Routing for Improved Robustness to Outliers
Shivam Patel
Neharika Jali
Ankur Mallick
Gauri Joshi
96
0
0
10 Oct 2025
Don't Throw Away Your Pretrained Model
Shangbin Feng
Wenhao Yu
Yike Wang
Hongming Zhang
Yulia Tsvetkov
Dong Yu
MoMe
162
0
0
10 Oct 2025
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
Minsik Choi
Hyegang Son
Changhoon Kim
Young Geun Kim
AAML
84
0
0
10 Oct 2025
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
Yu-Chen Lu
Chong-Yan Chen
Chi-Chih Chang
Yu-Fang Hu
Kai-Chiang Wu
64
1
0
10 Oct 2025
HINT: Helping Ineffective Rollouts Navigate Towards Effectiveness
X. Wang
Jinyi Han
Zishang Jiang
Tingyun Li
Jiaqing Liang
Sihang Jiang
Zhaoqian Dai
Shuguang Ma
Fei Yu
Yanghua Xiao
LRM
92
0
0
10 Oct 2025
Hierarchical Scheduling for Multi-Vector Image Retrieval
Maoliang Li
K. Li
Yaoyang Liu
Jiayu Chen
Zihao Zheng
Yinjun Wu
Xiang Chen
100
0
0
10 Oct 2025
Automatic Text Box Placement for Supporting Typographic Design
Jun Muraoka
Daichi Haraguchi
Naoto Inoue
Wataru Shimoda
Kota Yamaguchi
Seiichi Uchida
78
0
0
09 Oct 2025
Energy-Driven Steering: Reducing False Refusals in Large Language Models
Eric Hanchen Jiang
Weixuan Ou
Run Liu
Shengyuan Pang
Guancheng Wan
...
Wei Dong
Kai-Wei Chang
Xiaofeng Wang
Ying Nian Wu
Xinfeng Li
LLMSV
216
0
0
09 Oct 2025
Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training
Ruizhe Wang
Yucheng Ding
Xiao Liu
Yaoxiang Wang
Peng Cheng
Baining Guo
Zhengjun Zha
Yeyun Gong
128
0
0
09 Oct 2025
Weak Form Learning for Mean-Field Partial Differential Equations: an Application to Insect Movement
Seth Minor
Bret D. Elderd
Benjamin Van Allen
David M. Bortz
Vanja M. Dukic
108
0
0
09 Oct 2025
AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
Xiaoshuang Ji
Zhendong Zhao
Xiaoyan Gu
Xiaojun Chen
Xin Zhao
Zeyao Liu
92
0
0
09 Oct 2025
Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints
Zilin Kang
Chonghua Liao
Tingqiang Xu
Huazhe Xu
180
1
0
09 Oct 2025
Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization
Qiaozhe Zhang
Jun Sun
Ruijie Zhang
Yingzhuang Liu
156
0
0
09 Oct 2025
Fewer Weights, More Problems: A Practical Attack on LLM Pruning
Kazuki Egashira
Robin Staab
Thibaud Gloaguen
Mark Vero
Martin Vechev
AAML
167
0
0
09 Oct 2025
FedBook: A Unified Federated Graph Foundation Codebook with Intra-domain and Inter-domain Knowledge Modeling
Zhengyu Wu
Yinlin Zhu
Xunkai Li
Ziang Qiu
Rong-Hua Li
Guoren Wang
Chenghu Zhou
FedML
89
0
0
09 Oct 2025
Contrastive Weak-to-strong Generalization
Houcheng Jiang
Junfeng Fang
Jiaxin Wu
T. Zhang
Chen Gao
Yong Li
X. Wang
Xiangnan He
Yang Deng
120
0
0
09 Oct 2025
SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher
Ali O. Polat
Ehsan Mohammady Ardehaly
Mehrdad Salehi
Zia Ghiasi
Prasanth Murali
Chen Chen
162
1
0
09 Oct 2025
RCPU: Rotation-Constrained Error Compensation for Structured Pruning of a Large Language Model
Shuichiro Haruta
Kazunori Matsumoto
Zhi Li
Yanan Wang
Mori Kurokawa
94
0
0
09 Oct 2025
LLMs Show Surface-Form Brittleness Under Paraphrase Stress Tests
Juan Miguel Navarro Carranza
24
0
0
08 Oct 2025
Mid-Training of Large Language Models: A Survey
Kaixiang Mo
Yuxin Shi
Weiwei Weng
Zhiqiang Zhou
Shuman Liu
Haibo Zhang
Anxiang Zeng
LRM
119
0
0
08 Oct 2025
Grouped Differential Attention
Junghwan Lim
S. W. Lee
Dongseok Kim
Wai Ting Cheung
Beomgyu Kim
Taehwan Kim
Haesol Lee
Junhyeok Lee
Dongpin Oh
Eunhwan Park
69
1
0
08 Oct 2025
Validation of Various Normalization Methods for Brain Tumor Segmentation: Can Federated Learning Overcome This Heterogeneity?
Jan Fiszer
Dominika Ciupek
Maciej Malawski
FedML
180
1
0
08 Oct 2025
JAI-1: A Thai-Centric Large Language Model
Attapol T. Rutherford
Jullajak Karnjanaekarin
Narongkorn Panitsrisit
Pontakorn Trakuekul
Sumana Sumanakul
Natchanon Pollertlam
52
0
0
08 Oct 2025
Can Speech LLMs Think while Listening?
Yi-Jen Shih
Desh Raj
Chunyang Wu
Wei Zhou
SK Bong
Yashesh Gaur
Jay Mahadeokar
Ozlem Kalinli
M. Seltzer
LRM
123
1
0
08 Oct 2025
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
Cai Zhou
Chenyu Wang
Dinghuai Zhang
Shangyuan Tong
Yifei Wang
Stephen Bates
Tommi Jaakkola
116
0
0
08 Oct 2025
PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch
Shangjian Yin
Shining Liang
Wenbiao Ding
Yuli Qian
Zhouxing Shi
Hongzhi Li
Yutao Xie
ALM
110
0
0
08 Oct 2025
POME: Post Optimization Model Edit via Muon-style Projection
Yong Liu
Di Fu
Yang Luo
Zirui Zhu
Minhao Cheng
Cho-Jui Hsieh
Yang You
80
0
0
08 Oct 2025
Adaptive Stain Normalization for Cross-Domain Medical Histology
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Tianyue Xu
Yanlin Wu
Abhai K. Tripathi
Matthew M. Ippolito
Benjamin D. Haeffele
OOD
MedIm
112
0
0
08 Oct 2025
Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs
Wang Wei
Tiankai Yang
Hongjie Chen
Yue Zhao
Franck Dernoncourt
Ryan Rossi
Hoda Eldardiry
OffRL
60
0
0
08 Oct 2025
Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography
Jiuan Zhou
Yu Cheng
Yuan Xie
Z. Yin
98
2
0
08 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
106
2
0
08 Oct 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Alexandra Souly
Javier Rando
Ed Chapman
Xander Davies
Shae McFadden
...
Erik Jones
Chris Hicks
Nicholas Carlini
Y. Gal
Robert Kirk
AAML
SILM
224
8
0
08 Oct 2025
Latent Representation Learning in Heavy-Ion Collisions with MaskPoint Transformer
Jing-Zong Zhang
Shuang Guo
Li-Lin Zhu
Lingxiao Wang
Guo-Liang Ma
116
10
0
08 Oct 2025
Native Hybrid Attention for Efficient Sequence Modeling
Jusen Du
Jiaxi Hu
Tao Zhang
Weigao Sun
Yu Cheng
164
2
0
08 Oct 2025
BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining
Jie Hao
Rui Yu
W. Zhang
Huixia Wang
Jie Xu
Mingrui Liu
232
0
0
07 Oct 2025
Training Dynamics Impact Post-Training Quantization Robustness
Albert Catalan-Tatjer
Niccolò Ajroldi
Jonas Geiping
MQ
137
0
0
07 Oct 2025
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Ryan Solgi
Parsa Madinei
Jiayi Tian
Rupak Vignesh Swaminathan
Jing Liu
Nathan Susanj
Zheng Zhang
62
1
0
07 Oct 2025
Fairness in Token Delegation: Mitigating Voting Power Concentration in DAOs
Johnnatan Messias
Ayae Ide
78
0
0
07 Oct 2025
Provably Mitigating Corruption, Overoptimization, and Verbosity Simultaneously in Offline and Online RLHF/DPO Alignment
Ziyi Chen
Junyi Li
Qi He
Heng-Chiao Huang
136
0
0
07 Oct 2025
ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Lawrence Liu
Alexander Liu
Mengdi Wang
T. Zhao
Lin F. Yang
108
0
0
07 Oct 2025
Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin
Enrique Queipo-de-Llano
Alvaro Arroyo
Federico Barbero
Xiaowen Dong
Michael M. Bronstein
Yann LeCun
Ravid Shwartz-Ziv
102
1
0
07 Oct 2025
GraphGhost: Tracing Structures Behind Large Language Models
Xinnan Dai
Kai Guo
Chung-Hsiang Lo
Shenglai Zeng
Jiayuan Ding
Dongsheng Luo
Subhabrata Mukherjee
J. Tang
78
0
0
07 Oct 2025
lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
Haoxin Wang
Xiaolong Tu
Hongyu Ke
Huirong Chai
Dawei Chen
Kyungtae Han
99
1
0
07 Oct 2025
AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning
Yurun Song
Zhuoyi Yang
Ian G. Harris
Sangeetha Abdu Jyothi
MQ
137
0
0
07 Oct 2025
Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
Peter Ochieng
72
1
0
07 Oct 2025
The End of Transformers? On Challenging Attention and the Rise of Sub-Quadratic Architectures
Alexander Fichtl
Jeremias Bohn
Josefin Kelber
Edoardo Mosca
Georg Groh
96
0
0
06 Oct 2025
SpikingMamba: Towards Energy-Efficient Large Language Models via Knowledge Distillation from Mamba
Y. Huang
Jianxiong Tang
Chao Wang
Ziyi Wang
Jianguo Zhang
Zhichao Lu
Bojun Cheng
Luziwei Leng
Mamba
148
0
0
06 Oct 2025
Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts
Jihoon Lee
Hoyeon Moon
Kevin Zhai
Arun Kumar Chithanar
Anit Kumar Sahu
S. Kar
Chul Lee
Souradip Chakraborty
Amrit Singh Bedi
DiffM
180
0
0
06 Oct 2025
Boomerang Distillation Enables Zero-Shot Model Size Interpolation
Sara Kangaslahti
Nihal V. Nayak
Jonathan Geuter
Marco Fumero
Francesco Locatello
David Alvarez-Melis
136
0
0
06 Oct 2025
Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation
Devleena Das
Rajeev Patwari
Ashish Sirasao
89
0
0
06 Oct 2025