Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
arXiv:2309.07988, 14 September 2023
Yang Li
Liangzhen Lai
Yuan Shangguan
Forrest N. Iandola
Zhaoheng Ni
Ernie Chang
Yangyang Shi
Vikas Chandra
Papers citing "Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition" (4 of 4 papers shown)
1. Model-free Speculative Decoding for Transformer-based ASR with Token Map Drafting. Tuan Vu Ho, Hiroaki Kokubo, Masaaki Yamamoto, Yohei Kawaguchi. 29 Jul 2025.
2. Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment. Aditya Chakravarty. 02 May 2024.
3. Fast Transformer Decoding: One Write-Head is All You Need. Noam M. Shazeer. 06 Nov 2019.
4. Xception: Deep Learning with Depthwise Separable Convolutions. Computer Vision and Pattern Recognition (CVPR), 2016. François Chollet. 07 Oct 2016.