The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

9 March 2018

Jonathan Frankle

Michael Carbin

Papers citing "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks"

50 / 2,187 papers shown

DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Neural Information Processing Systems (NeurIPS), 2024

Ji Liu

...

168

22 Oct 2024

Influential Language Data Selection via Gradient Trajectory Pursuit

Zhiwei Deng

Tao Li

Yang Li

213

22 Oct 2024

Generalized Multimodal Fusion via Poisson-Nernst-Planck Equation

Jing Wang

189

20 Oct 2024

The Propensity for Density in Feed-forward Models

European Conference on Artificial Intelligence (ECAI), 2024

161

18 Oct 2024

Linguistically Grounded Analysis of Language Models using Shapley Head Values

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Marcell Richard Fekete

Johannes Bjerva

414

17 Oct 2024

The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse

Ekansh Sharma

Daniel M. Roy

Gintare Karolina Dziugaite

MoMe

276

16 Oct 2024

FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression

...

Bo Li

220

16 Oct 2024

Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey

Kyle Chard

205

16 Oct 2024

MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router

Zhi Zhang

Yanzhi Wang

230

15 Oct 2024

PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge

228

15 Oct 2024

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models

Neural Information Processing Systems (NeurIPS), 2024

Haiquan Lu

Yefan Zhou

Shiwei Liu

Zhangyang Wang

Michael W. Mahoney

Yaoqing Yang

146

14 Oct 2024

RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

358

14 Oct 2024

ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws

Hai Huang

Randall Balestriero

200

13 Oct 2024

Non-transferable Pruning

European Conference on Computer Vision (ECCV), 2024

192

10 Oct 2024

Neural Metamorphosis

European Conference on Computer Vision (ECCV), 2024

Xingyi Yang

Xinchao Wang

276

10 Oct 2024

Mitigating Gender Bias in Code Large Language Models via Model Editing

Haochuan Wang

Zhiying Tu

Dianbo Sui

199

10 Oct 2024

Growing Efficient Accurate and Robust Neural Networks on the Edge

Vignesh Sundaresha

Naresh Shanbhag

274

10 Oct 2024

Bilinear MLPs enable weight-based mechanistic interpretability

International Conference on Learning Representations (ICLR), 2024

233

10 Oct 2024

More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing

International Conference on Learning Representations (ICLR), 2024

Sagi Shaier

Francisco Pereira

Katharina von der Wense

Lawrence E Hunter

Matt Jones

MoE

698

10 Oct 2024

Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models

395

10 Oct 2024

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Abhinav Bandari

L. Yin

Cheng-Yu Hsieh

Ajay Kumar Jaiswal

Tianlong Chen

Li Shen

Ranjay Krishna

Shiwei Liu

193

09 Oct 2024

Convex Distillation: Efficient Compression of Deep Networks via Convex Optimization

Prateek Varshney

Mert Pilanci

373

09 Oct 2024

RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals

344

06 Oct 2024

Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks

378

04 Oct 2024

Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness

International Conference on Learning Representations (ICLR), 2024

Decebal Constantin Mocanu

Elena Mocanu

OOD 3DH

514

03 Oct 2024

Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement

International Conference on Learning Representations (ICLR), 2024

385

03 Oct 2024

FedPeWS: Personalized Warmup via Subnetworks for Enhanced Heterogeneous Federated Learning

434

03 Oct 2024

On the Geometry and Optimization of Polynomial Convolutional Networks

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

Vahid Shahverdi

Giovanni Luca Marchetti

Kathlén Kohn

259

01 Oct 2024

Do Influence Functions Work on Large Language Models?

Yige Li

228

30 Sep 2024

EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation

Neural Networks (NN), 2024

...

337

30 Sep 2024

Inferring Thunderstorm Occurrence from Vertical Profiles of Convection-Permitting Simulations: Physical Insights from a Physical Deep Learning Model

Artificial Intelligence for the Earth Systems (AI4ES), 2024

Kianusch Vahid Yousefnia

Tobias Bölle

Christoph Metzl

340

30 Sep 2024

Investigating the Effect of Network Pruning on Performance and Interpretability

Jonathan von Rad

Florian Seuffert

286

29 Sep 2024

Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training

Neural Information Processing Systems (NeurIPS), 2024

Longbo Huang

195

28 Sep 2024

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

Tiansheng Huang

Sihao Hu

Fatih Ilhan

Selim Furkan Tekin

Ling Liu

AAML

477

26 Sep 2024

AlterMOMA: Fusion Redundancy Pruning for Camera-LiDAR Fusion Models with Alternative Modality Masking

Neural Information Processing Systems (NeurIPS), 2024

Ying Zhang

219

26 Sep 2024

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Neural Information Processing Systems (NeurIPS), 2024

Hongxu Yin

Jan Kautz

Xinchao Wang

174

26 Sep 2024

Multiplicative Logit Adjustment Approximates Neural-Collapse-Aware Decision Boundary Adjustment

International Conference on Learning Representations (ICLR), 2024

Naoya Hasegawa

Issei Sato

408

26 Sep 2024

Training Neural Networks for Modularity aids Interpretability

Satvik Golechha

Dylan R. Cope

Nandi Schoots

241

24 Sep 2024

On Importance of Pruning and Distillation for Efficient Low Resource NLP

Raviraj Joshi

Geetanjali Kale

244

21 Sep 2024

CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information

International Conference on Computational Linguistics (COLING), 2024

Yuxin Wang

Zekun Wang

Qing Yang

Ming Liu

Bing Qin

175

20 Sep 2024

Hidden Activations Are Not Enough: A General Approach to Neural Network Predictions

Samuel Leblanc

Aiky Rasolomanana

Marco Armenta

228

20 Sep 2024

Cross-Domain Content Generation with Domain-Specific Small Language Models

Ankit Maloo

Abhinav Garg

CLL

214

19 Sep 2024

Monomial Matrix Group Equivariant Neural Functional Networks

Neural Information Processing Systems (NeurIPS), 2024

Hoang V. Tran

Thieu N. Vo

Tho H. Tran

An T. Nguyen

Tan M. Nguyen

474

18 Sep 2024

Evaluating the Impact of Compression Techniques on Task-Specific Performance of Large Language Models

Bishwash Khanal

Jeffery M. Capone

267

17 Sep 2024

Are Sparse Neural Networks Better Hard Sample Learners?

British Machine Vision Conference (BMVC), 2024

Q. Xiao

Boqian Wu

Lu Yin

Christopher Neil Gadzinski

Tianjin Huang

Mykola Pechenizkiy

Decebal Constantin Mocanu

215

13 Sep 2024

S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training

Neural Information Processing Systems (NeurIPS), 2024

Yuezhou Hu

Jun-Jie Zhu

Jianfei Chen

410

13 Sep 2024

A framework for measuring the training efficiency of a neural architecture

Artificial Intelligence Review (Artif Intell Rev), 2024

Eduardo Cueto-Mendoza

John D. Kelleher

255

12 Sep 2024

Self-Masking Networks for Unsupervised Adaptation

German Conference on Pattern Recognition (DAGM), 2024

Alfonso Taboada Warmerdam

Mathilde Caron

Yuki M. Asano

305

11 Sep 2024

HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning

446

11 Sep 2024

LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation

European Conference on Computer Vision (ECCV), 2024

Archana Swaminathan

Abhinav Shrivastava

229

10 Sep 2024