Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1905.10044
Cited By
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
24 May 2019
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
Re-assign community
ArXiv
PDF
HTML
Papers citing
"BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions"
50 / 1,036 papers shown
Title
STADE: Standard Deviation as a Pruning Metric
Diego Coello de Portugal Mecke
Haya Alyoussef
Ilia Koloiarov
Maximilian Stubbemann
Lars Schmidt-Thieme
29
0
0
28 Mar 2025
UGen: Unified Autoregressive Multimodal Model with Progressive Vocabulary Learning
Hongxuan Tang
Hao Liu
Xinyan Xiao
45
1
0
27 Mar 2025
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer
Ivan Vulić
E. Ponti
134
0
0
25 Mar 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
85
30
0
25 Mar 2025
Maximum Redundancy Pruning: A Principle-Driven Layerwise Sparsity Allocation for LLMs
Chang Gao
Kang Zhao
J. Chen
Liping Jing
42
0
0
24 Mar 2025
ZeroLM: Data-Free Transformer Architecture Search for Language Models
Zhen-Song Chen
Hong-Wei Ding
Xian-Jia Wang
Witold Pedrycz
53
0
0
24 Mar 2025
Dynamic Task Vector Grouping for Efficient Multi-Task Prompt Tuning
Pieyi Zhang
Richong Zhang
Zhijie Nie
VLM
60
0
0
23 Mar 2025
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Ying Shen
Lifu Huang
47
1
0
20 Mar 2025
Mixture of Lookup Experts
Shibo Jie
Yehui Tang
Kai Han
Y. Li
Duyu Tang
Zhi-Hong Deng
Yunhe Wang
MoE
49
0
0
20 Mar 2025
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning
Baohao Liao
Christian Herold
Seyyed Hadi Hashemi
Stefan Vasilev
Shahram Khadivi
Christof Monz
MQ
44
0
0
17 Mar 2025
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
43
3
0
17 Mar 2025
Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process
Yuanze Li
Shihao Yuan
Haolin Wang
Qizhang Li
Ming-Yu Liu
Chen Xu
Guangming Shi
Wangmeng Zuo
56
0
0
17 Mar 2025
Pensez: Less Data, Better Reasoning -- Rethinking French LLM
Huy Hoang Ha
ReLM
LRM
66
1
0
17 Mar 2025
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory
Liangyu Wang
Jie Ren
Hang Xu
Junxiao Wang
Huanyi Xie
David E. Keyes
Di Wang
58
0
0
16 Mar 2025
Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques
Neusha Javidnia
B. Rouhani
F. Koushanfar
123
0
0
14 Mar 2025
Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity
Chi Xu
Gefei Zhang
Yantong Zhu
Luca Benini
Guosheng Hu
Yawei Li
Zhihong Zhang
29
0
0
14 Mar 2025
BiasEdit: Debiasing Stereotyped Language Models via Model Editing
Xin Xu
Wei Xu
N. Zhang
Julian McAuley
KELM
39
0
0
11 Mar 2025
Should VLMs be Pre-trained with Image Data?
Sedrick Scott Keh
Jean-Pierre Mercat
S. Gadre
Kushal Arora
Igor Vasiljevic
...
Shuran Song
Russ Tedrake
Thomas Kollar
Ludwig Schmidt
Achal Dave
VLM
49
0
0
10 Mar 2025
Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality
Alex Fang
Hadi Pouransari
Matt Jordan
Alexander Toshev
Vaishaal Shankar
Ludwig Schmidt
Tom Gunter
74
0
0
10 Mar 2025
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models
Xun Liang
Hanyu Wang
Huayi Lai
Simin Niu
Shichao Song
Jiawei Yang
Jihao Zhao
Feiyu Xiong
Bo Tang
Z. Li
VLM
45
0
0
10 Mar 2025
IteRABRe: Iterative Recovery-Aided Block Reduction
Haryo Akbarianto Wibowo
Haiyue Song
Hideki Tanaka
Masao Utiyama
Alham Fikri Aji
Raj Dabre
57
0
0
08 Mar 2025
Mitigating Memorization in LLMs using Activation Steering
Manan Suri
Nishit Anand
Amisha Bhaskar
LLMSV
50
2
0
08 Mar 2025
Sample-aware Adaptive Structured Pruning for Large Language Models
Jun Kong
Xinge Ma
Jin Wang
Xuejie Zhang
45
0
0
08 Mar 2025
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts
Shwai He
Weilin Cai
Jiayi Huang
Ang Li
MoE
34
1
0
07 Mar 2025
Generalized Interpolating Discrete Diffusion
Dimitri von Rutte
J. Fluri
Yuhui Ding
Antonio Orvieto
Bernhard Scholkopf
Thomas Hofmann
DiffM
62
0
0
06 Mar 2025
Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models
Benyamin Jamialahmadi
Parsa Kavehzadeh
Mehdi Rezagholizadeh
Parsa Farinneya
Hossein Rajabzadeh
A. Jafari
Boxing Chen
Marzieh S. Tahaei
42
0
0
06 Mar 2025
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
Zhijian Zhuo
Yutao Zeng
Ya Wang
Sijun Zhang
Jian Yang
Xiaoqing Li
Xun Zhou
Jinwen Ma
46
0
0
06 Mar 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Abdelrahman Abouelenin
Atabak Ashfaq
Adam Atkinson
Hany Awadalla
Nguyen Bach
...
Ishmam Zabir
Yunan Zhang
Li Zhang
Y. Zhang
Xiren Zhou
MoE
SyDa
68
23
0
03 Mar 2025
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Fajri Koto
Rituraj Joshi
Nurdaulet Mukhituly
Y. Wang
Zhuohan Xie
...
Avraham Sheinin
Natalia Vassilieva
Neha Sengupta
Larry Murray
Preslav Nakov
ALM
KELM
41
0
0
03 Mar 2025
Revisiting Large Language Model Pruning using Neuron Semantic Attribution
Yizhuo Ding
Xinwei Sun
Yanwei Fu
Guosheng Hu
61
0
0
03 Mar 2025
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction
Lu Dai
Yijie Xu
Jinhui Ye
Hao Liu
Hui Xiong
3DV
RALM
78
2
0
03 Mar 2025
KurTail : Kurtosis-based LLM Quantization
Mohammad Sadegh Akhondzadeh
Aleksandar Bojchevski
E. Eleftheriou
M. Dazzi
MQ
38
0
0
03 Mar 2025
FANformer: Improving Large Language Models Through Effective Periodicity Modeling
Yihong Dong
G. Li
Xue Jiang
Yongding Tao
Kechi Zhang
...
Huanyu Liu
Jiazheng Ding
Jia Li
Jinliang Deng
Hong Mei
AI4TS
41
0
0
28 Feb 2025
FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark
Zhangdie Yuan
Zifeng Ding
Andreas Vlachos
AI4TS
77
0
0
27 Feb 2025
Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs
Yiheng Yang
Yujie Wang
Chi Ma
Lei Yu
Emmanuele Chersoni
Chu-Ren Huang
74
0
0
26 Feb 2025
A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs
Xuan Ding
Rui Sun
Yunjian Zhang
Xiu Yan
Yueqi Zhou
Kaihao Huang
Suzhong Fu
Angelica I Aviles-Rivero
Chuanlong Xie
Yao Zhu
123
1
0
26 Feb 2025
Sliding Window Attention Training for Efficient Large Language Models
Zichuan Fu
Wentao Song
Y. Wang
X. Wu
Yefeng Zheng
Yingying Zhang
Derong Xu
Xuetao Wei
Tong Bill Xu
Xiangyu Zhao
76
1
0
26 Feb 2025
Compressing Language Models for Specialized Domains
Miles Williams
G. Chrysostomou
Vitor Jeronymo
Nikolaos Aletras
MQ
39
0
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
60
0
0
25 Feb 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Z. Li
Yu-Hu Li
38
0
0
25 Feb 2025
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan
Zhenyi Lu
Sichen Liu
Xiaoye Qu
Wei Wei
Chengfeng Gu
Yu-Xi Cheng
MoE
124
0
0
24 Feb 2025
NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models
Yibo Zhong
Haoxiang Jiang
Lincan Li
Ryumei Nakada
Tianci Liu
Linjun Zhang
Huaxiu Yao
Haoyu Wang
75
2
0
24 Feb 2025
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
Shahriar Kabir Nahin
R. N. Nandi
Sagor Sarker
Quazi Sarwar Muhtaseem
Md. Kowsher
Apu Chandraw Shill
Md Ibrahim
Mehadi Hasan Menon
Tareq Al Muntasir
Firoj Alam
66
0
0
24 Feb 2025
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer
Punit Singh Koura
Binh Tang
R. Subramanian
Aaditya K. Singh
...
Vedanuj Goswami
Sergey Edunov
Dieuwke Hupkes
Sanmi Koyejo
Sharan Narang
ALM
69
0
0
24 Feb 2025
Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning
Raghav Singhal
Kaustubh Ponkshe
Rohit Vartak
Lav R. Varshney
Praneeth Vepakomma
FedML
74
0
0
24 Feb 2025
Selective Prompt Anchoring for Code Generation
Yuan Tian
Tianyi Zhang
86
3
0
24 Feb 2025
Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
Yi-Kai Zhang
De-Chuan Zhan
Han-Jia Ye
ALM
ELM
LRM
36
1
0
24 Feb 2025
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
Qi Le
Enmao Diao
Ziyan Wang
Xinran Wang
Jie Ding
Li Yang
Ali Anwar
69
1
0
24 Feb 2025
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
Giulio Zizzo
Giandomenico Cornacchia
Kieran Fraser
Muhammad Zaid Hameed
Ambrish Rawat
Beat Buesser
Mark Purcell
Pin-Yu Chen
P. Sattigeri
Kush R. Varshney
AAML
43
1
0
24 Feb 2025
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
Simin Chen
Yiming Chen
Zexin Li
Yifan Jiang
Zhongwei Wan
...
Dezhi Ran
Tianle Gu
H. Li
Tao Xie
Baishakhi Ray
43
3
0
23 Feb 2025
Previous
1
2
3
4
5
...
19
20
21
Next