1803.05457
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
14 March 2018
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
Papers citing
"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge"
50 / 1,907 papers shown
ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
Lin Xv
Jingsheng Gao
Xian Gao
Ting Liu
Yuzhuo Fu
120
0
0
22 Oct 2025
Beyond Uniform SVD: Dual-Level Optimization across Columns and Modules for LLM Compression
Lin Xv
Jingsheng Gao
Xian Gao
Ting Li
72
0
0
22 Oct 2025
Latent Space Factorization in LoRA
Shashi Kumar
Yacouba Kaloga
John Mitros
P. Motlícek
Ina Kodrasi
108
0
0
22 Oct 2025
NeuroAda: Activating Each Neuron's Potential for Parameter-Efficient Fine-Tuning
Zhi Zhang
Yixian Shen
Congfeng Cao
Ekaterina Shutova
161
0
0
21 Oct 2025
ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning
Xiaohan Qin
Xiaoxing Wang
Ning Liao
Cancheng Zhang
Xiangdong Zhang
Mingquan Feng
Jingzhi Wang
Junchi Yan
138
0
0
21 Oct 2025
Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation
Giovanni De Muri
Mark Vero
Robin Staab
Martin Vechev
155
0
0
21 Oct 2025
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
S. Bian
Tao Yu
Shivaram Venkataraman
Youngsuk Park
118
0
0
21 Oct 2025
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection
Hongyi He
Xiao Liu
Zhenghao Lin
Mingni Tang
Y. Cheng
Jintao Wang
W. Li
Peng Cheng
Yeyun Gong
OODD
185
0
0
21 Oct 2025
Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
Kyo Kuroki
Yasuyuki Okoshi
Thiem Van Chu
Kazushi Kawamura
Masato Motomura
MQ
212
0
0
21 Oct 2025
ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters
Zhiwei Hao
Jianyuan Guo
Li Shen
Kai Han
Yehui Tang
Han Hu
Yunhe Wang
228
0
0
21 Oct 2025
ReXMoE: Reusing Experts with Minimal Overhead in Mixture-of-Experts
Zheyue Tan
Ruoyao Xiao
Tao Yuan
Dong Zhou
Weilin Liu
...
Haiyang Xu
Boxun Li
Guohao Dai
Bo Zhao
Yu Wang
MoE
192
0
0
20 Oct 2025
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
Yihong Tang
Kehai Chen
Liang Yue
Jinxin Fan
Caishen Zhou
...
Kaiyang Guo
Xingshan Zeng
Wenjing Cun
L. Shang
Min Zhang
LLMAG
158
0
0
20 Oct 2025
EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs
Numaan Naeem
Abdellah El Mekki
Muhammad Abdul-Mageed
AI4Ed
ELM
238
0
0
20 Oct 2025
Mapping Post-Training Forgetting in Language Models at Scale
Jackson Harmon
Andreas Hochlehnert
Matthias Bethge
Ameya Prabhu
CLL
KELM
153
0
0
20 Oct 2025
Unbiased Gradient Low-Rank Projection
Rui Pan
Yang Luo
Yuxing Liu
Yang You
Tong Zhang
148
0
0
20 Oct 2025
DynaKV: Enabling Accurate and Efficient Long-Sequence LLM Decoding on Smartphones
Tuowei Wang
Minxing Huang
Fengzu Li
Ligeng Chen
Jinrui Zhang
Ju Ren
186
1
0
20 Oct 2025
The Free Transformer
François Fleuret
64
0
0
20 Oct 2025
Learning from Generalization Patterns: An Evaluation-Driven Approach to Enhanced Data Augmentation for Fine-Tuning Small Language Models
Huan Song
Deeksha Razdan
Yiyue Qian
Arijit Ghosh Chowdhury
Parth Patwa
Aman Chadha
Shinan Zhang
Sharlina Keshava
Hannah R Marlowe
122
1
0
20 Oct 2025
From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models
Ziyan Wang
Enmao Diao
Qi Le
Pu Wang
Minwoo Lee
Shu-ping Yeh
Evgeny Stupachenko
Hao Feng
Li Yang
128
1
0
20 Oct 2025
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Minhua Lin
Zongyu Wu
Zhichao Xu
Hui Liu
Xianfeng Tang
Qi He
Charu C. Aggarwal
Hui Liu
Xiang Zhang
Suhang Wang
AI4TS
LRM
554
1
0
19 Oct 2025
DistilLock: Safeguarding LLMs from Unauthorized Knowledge Distillation on the Edge
Asmita Mohanty
Gezheng Kang
Lei Gao
M. Annavaram
126
0
0
19 Oct 2025
Vocab Diet: Reshaping the Vocabulary of LLMs with Vector Arithmetic
Yuval Reif
Guy Kaplan
Roy Schwartz
KELM
161
0
0
19 Oct 2025
Unleashing Diverse Thinking Modes in LLMs through Multi-Agent Collaboration
Zhixuan He
Yue Feng
LLMAG
AI4CE
LRM
72
0
0
18 Oct 2025
Expert Merging in Sparse Mixture of Experts with Nash Bargaining
Dung V. Nguyen
Anh T. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
Shiqi Jiang
Ethan Fetaya
Linh Duy Tran
Gal Chechik
T. Nguyen
MoMe
188
1
0
17 Oct 2025
Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation
Fei Wang
Li Shen
Liang Ding
Chao Xue
Ye Liu
Changxing Ding
148
0
0
17 Oct 2025
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
Heecheol Yun
Kwangmin Ki
J. H. Lee
Eunho Yang
147
0
0
17 Oct 2025
From Characters to Tokens: Dynamic Grouping with Hierarchical BPE
Rares Dolga
Lucas Maystre
Tudor Berariu
David Barber
104
0
0
17 Oct 2025
ReasonIF: Large Reasoning Models Fail to Follow Instructions During Reasoning
Yongchan Kwon
Shang Zhu
Federico Bianchi
Kaitlyn Zhou
James Y. Zou
LRM
144
1
0
17 Oct 2025
KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models
Dongjun Kim
Chanhee Park
Chanjun Park
Heuiseok Lim
ALM
ELM
142
0
0
17 Oct 2025
Planner and Executor: Collaboration between Discrete Diffusion And Autoregressive Models in Reasoning
Lina Berrayana
Ahmed Heakl
Muhammad Abdullah Sohail
Thomas Hofmann
Salman Khan
Wei Chen
172
0
0
17 Oct 2025
Predicting Task Performance with Context-aware Scaling Laws
Kyle Montgomery
David Park
Jianhong Tu
Michael Bendersky
Beliz Gunel
Dawn Song
Chenguang Wang
LRM
128
1
0
16 Oct 2025
Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries
Divyat Mahajan
Sachin Goyal
Badr Youbi Idrissi
Mohammad Pezeshki
Alexia Jolicoeur-Martineau
David Lopez-Paz
Kartik Ahuja
AI4TS
LRM
122
0
0
16 Oct 2025
MergeMoE: Efficient Compression of MoE Models via Expert Output Merging
Ruijie Miao
Yilun Yao
Zihan Wang
Z. Wang
Bairen Yi
LingJun Liu
Yikai Zhao
Tong Yang
MoMe
164
1
0
16 Oct 2025
RLSR: Reinforcement Learning with Supervised Reward Outperforms SFT in Instruction Following
Zhichao Wang
Andy Wong
Ruslan Belkin
ALM
LRM
111
0
0
16 Oct 2025
Kelle: Co-design KV Caching and eDRAM for Efficient LLM Serving in Edge Computing
Tianhua Xia
Sai Qian Zhang
88
1
0
16 Oct 2025
Tahakom LLM Guidelines and Recipes: From Pre-training Data to an Arabic LLM
Areej AlOtaibi
Lina Alyahya
Raghad Alshabanah
Shahad Alfawzan
Shuruq Alarefei
...
Waad Alahmed
Omar Talabay
Jalal Alowibdi
Salem Alelyani
Adel Bibi
193
0
0
15 Oct 2025
LLMs Can Get "Brain Rot"!
Shuo Xing
Junyuan Hong
Yifan Wang
Runjin Chen
Zhenyu Zhang
A. Grama
Zhengzhong Tu
Z. Wang
154
0
0
15 Oct 2025
Selective Adversarial Attacks on LLM Benchmarks
Ivan Dubrovsky
Anastasia Orlova
Illarion Iov
Nina Gubina
Irena Gureeva
Alexey Zaytsev
AAML
112
0
0
15 Oct 2025
End-to-End Multi-Modal Diffusion Mamba
Chunhao Lu
Qiang Lu
Meichen Dong
Jake Luo
130
3
0
15 Oct 2025
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression
Mike Lasby
Ivan Lazarevich
Nish Sinnadurai
Sean Lie
Yani Andrew Ioannou
Vithursan Thangarasa
120
1
0
15 Oct 2025
Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
Daniil Gurgurov
Josef van Genabith
Simon Ostermann
MoE
198
0
0
15 Oct 2025
Closing the Gap Between Text and Speech Understanding in LLMs
Santiago Cuervo
Skyler Seto
Maureen de Seyssel
Richard He Bai
Zijin Gu
Tatiana Likhomanenko
Navdeep Jaitly
Zakaria Aldeneh
160
2
0
15 Oct 2025
Dr.LLM: Dynamic Layer Routing in LLMs
Ahmed Heakl
Martin Gubri
Salman Khan
Sangdoo Yun
Seong Joon Oh
ReLM
335
1
1
14 Oct 2025
OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning
Yifeng Xiong
Xiaohui Xie
CLL
476
2
0
14 Oct 2025
CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
Dayin Gou
Sanghyun Byun
Nilesh Malpeddi
Gabrielle De Micheli
Prathamesh Vaste
Jacob Song
Woo Seong Chung
MQ
108
0
0
14 Oct 2025
Balancing Synthetic Data and Replay for Enhancing Task-Specific Capabilities
Urs Spiegelhalter
Jorg K. H. Franke
Frank Hutter
CLL
KELM
136
0
0
13 Oct 2025
Neural Weight Compression for Language Models
Jegwang Ryu
Minkyu Kim
Seungjun Shin
Hee Min Choi
Dokwan Oh
Jaeho Lee
132
0
0
13 Oct 2025
Direct Multi-Token Decoding
Xuan Luo
Weizhi Wang
Xifeng Yan
OffRL
96
0
0
13 Oct 2025
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
Huiyin Xue
Nafise Sadat Moosavi
Nikolaos Aletras
120
0
0
13 Oct 2025
ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing
Shivanshu Kumar
Gopalakrishnan Srinivasan
80
0
0
13 Oct 2025