ResearchTrend.AI
Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension

10 April 2023
Yichuan Deng
Sridhar Mahadevan
Zhao Song

Papers citing "Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension"

11 / 11 papers shown
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
152
22
0
15 Oct 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao Song
71
0
0
09 May 2024
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
178
314
0
24 Jun 2023
Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis
Xiangyi Chen
Zhao Song
Baochen Sun
Junze Yin
Danyang Zhuo
88
3
0
06 Jun 2023
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models
Ritwik Sinha
Zhao Song
Dinesh Manocha
102
25
0
04 Jun 2023
Fast Submodular Function Maximization
Lianke Qin
Zhao Song
Yitan Wang
81
10
0
15 May 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao Song
Mingquan Ye
69
4
0
13 May 2023
Differentially Private Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
90
21
0
08 May 2023
Solving Tensor Low Cycle Rank Approximation
Yichuan Deng
Yeqi Gao
Zhao Song
80
6
0
13 Apr 2023
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao Song
Ruizhe Zhang
Danyang Zhuo
136
31
0
25 Nov 2022
Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory
Jiehao Liang
Zhao Song
Zhaozhuo Xu
Junze Yin
Danyang Zhuo
OOD
67
4
0
08 Aug 2022