A Tighter Complexity Analysis of SparseGPT

22 August 2024

Papers citing "A Tighter Complexity Analysis of SparseGPT"

21 / 21 papers shown

Only Large Weights (And Not Skip Connections) Can Prevent the Perils of Rank Collapse

Josh Alman

Zhao Song

371

22 May 2025

Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform

Josh Alman

Zhao Song

349

17 May 2025

Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency

947

18 Mar 2025

Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows

285

12 Mar 2025

Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches

323

03 Mar 2025

When Can We Solve the Weighted Low Rank Approximation Problem in Truly Subquadratic Time?

International Conference on Artificial Intelligence and Statistics (AISTATS), 2025

242

24 Feb 2025

Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation

514

01 Feb 2025

Tensor Product Attention Is All You Need

787

11 Jan 2025

Theoretical Constraints on the Expressive Power of $\mathsf{RoPE}$-based Tensor Attention Transformers

606

23 Dec 2024

RoPE Attention Can Be Trained in Almost Linear Time

354

23 Dec 2024

Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study

231

15 Oct 2024

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

463

15 Oct 2024

HSR-Enhanced Sparse Attention Acceleration

818

14 Oct 2024

On Fine-Grained I/O Complexity of Attention Backward Passes

Jiahao Zhang

255

12 Oct 2024

Differentially Private Kernel Density Estimation

Erzhi Liu

Jerry Yao-Chieh Hu

Alex Reneau

Zhao Song

Han Liu

460

03 Sep 2024

Coupling without Communication and Drafter-Invariant Speculative Decoding

International Symposium on Information Theory (ISIT), 2024

Majid Daliri

Christopher Musco

A. Suresh

398

15 Aug 2024

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

736

29 May 2024

The Fine-Grained Complexity of Gradient Computation for Training Large Language Models

Josh Alman

Zhao Song

222

07 Feb 2024

Differentially Private Attention Computation

Yeqi Gao

Zhao Song

Xin Yang

229

08 May 2023

The Closeness of In-Context Learning and Weight Shifting for Softmax Regression

Neural Information Processing Systems (NeurIPS), 2023

Shuai Li

188

26 Apr 2023

Streaming Kernel PCA Algorithm With Small Space

344

08 Mar 2023