Monotonic Chunkwise Attention

14 December 2017

Papers citing "Monotonic Chunkwise Attention"

50 / 53 papers shown

Title
Lightweight Transducer Based on Frame-Level Criterion Genshun Wan Mengzhi Wang Tingzhi Mao Hang Chen Z. Ye 44 1 0 05 Sep 2024
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation Zhengrui Ma Qingkai Fang Shaolei Zhang Shoutao Guo Yang Feng Min Zhang 53 9 0 11 Jun 2024
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR Keyu An Shiliang Zhang 31 4 0 26 Sep 2023
Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff Peter Polák Brian Yan Shinji Watanabe A. Waibel Ondrej Bojar 28 9 0 20 Sep 2023
TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model Patrick Kahardipraja Brielen Madureira David Schlangen CLL 34 9 0 18 May 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition Mohan Li R. Doddipatla Catalin Zorila 30 0 0 24 Apr 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR Xilai Li Goeric Huybrechts S. Ronanki Jeffrey J. Farris S. Bodapati 38 6 0 18 Apr 2023
Streaming Joint Speech Recognition and Disfluency Detection Hayato Futami E. Tsunoo Kentarou Shibata Yosuke Kashiwagi Takao Okuda Siddhant Arora Shinji Watanabe 42 6 0 16 Nov 2022
A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability Jian Xue Peidong Wang Jinyu Li Eric Sun 32 10 0 04 Nov 2022
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition Yusuke Shinohara Shinji Watanabe AI4TS 23 9 0 04 Nov 2022
Streaming Audio-Visual Speech Recognition with Alignment Regularization Pingchuan Ma Niko Moritz Stavros Petridis Christian Fuegen M. Pantic 37 2 0 03 Nov 2022
Monotonic segmental attention for automatic speech recognition Albert Zeyer Robin Schmitt Wei Zhou Ralf Schluter Hermann Ney 16 8 0 26 Oct 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition Martin H. Radfar Rohit Barnwal R. Swaminathan Feng-Ju Chang Grant P. Strimel Nathan Susanj Athanasios Mouchtaris 31 13 0 29 Sep 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers Jian Xue Peidong Wang Jinyu Li Matt Post Yashesh Gaur AI4TS 26 26 0 11 Apr 2022
End to End Lip Synchronization with a Temporal AutoEncoder Yoav Shalev Lior Wolf 12 7 0 30 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition Jinchuan Tian Jianwei Yu Chao Weng Yuexian Zou Dong Yu 26 8 0 29 Mar 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer J. Sun Guiping Zhong Dinghao Zhou Baoxiang Li 21 0 0 29 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention Mohan Li Shucong Zhang Catalin Zorila R. Doddipatla 24 9 0 11 Mar 2022
Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems Mohd Abbas Zaidi Beomseok Lee Sangha Kim Chanwoo Kim 22 5 0 13 Oct 2021
Translating Images into Maps Avishkar Saha Oscar Alejandro Mendez Maldonado Chris Russell Richard Bowden ViT 21 144 0 03 Oct 2021
Factorized Neural Transducer for Efficient Language Model Adaptation Xie Chen Zhong Meng S. Parthasarathy Jinyu Li 21 39 0 27 Sep 2021
Infusing Future Information into Monotonic Attention Through Language Models Mohd Abbas Zaidi S. Indurthi Beomseok Lee Nikhil Kumar Lakumarapu Sangha Kim 27 2 0 07 Sep 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context Kwangyoun Kim Felix Wu Prashant Sridhar Kyu Jeong Han Shinji Watanabe 30 9 0 17 Jun 2021
Streaming end-to-end speech recognition with jointly trained neural feature enhancement Chanwoo Kim Abhinav Garg Dhananjaya N. Gowda Seongkyu Mun C. Han AuLLM 26 6 0 04 May 2021
A study of latent monotonic attention variants Albert Zeyer Ralf Schluter Hermann Ney 24 5 0 30 Mar 2021
A review of on-device fully neural end-to-end automatic speech recognition algorithms Chanwoo Kim Dhananjaya N. Gowda Dongsoo Lee Jiyeon Kim Ankur Kumar Sungsoo Kim Abhinav Garg C. Han 24 27 0 14 Dec 2020
Block-Online Guided Source Separation Shota Horiguchi Yusuke Fujita Kenji Nagamatsu 17 4 0 16 Nov 2020
Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention Menglong Xu Shengqiang Li Xiao-Lei Zhang 27 31 0 23 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset Xie Chen Yu-Huan Wu Zhenghao Wang Shujie Liu Jinyu Li 22 169 0 22 Oct 2020
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition Wenyong Huang Wenchao Hu Y. Yeung Xiao Chen 22 50 0 13 Aug 2020
Class LM and word mapping for contextual biasing in End-to-End ASR Rongqing Huang Ossama Abdel-Hamid Xinwei Li G. Evermann 23 47 0 10 Jul 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search E. Tsunoo Yosuke Kashiwagi Shinji Watanabe 19 11 0 25 Jun 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition Linhao Dong Cheng Yi Jianzong Wang Shiyu Zhou Shuang Xu X. Jia Bo Xu 36 17 0 20 May 2020
Efficient Wait-k Models for Simultaneous Machine Translation Maha Elbayad Laurent Besacier Jakob Verbeek VLM 24 77 0 18 May 2020
Attention-based Transducer for Online Speech Recognition Bin Wang Yan Yin Hui-Ching Lin 18 4 0 18 May 2020
Speech Recognition and Multi-Speaker Diarization of Long Conversations H. H. Mao Shuyang Li Julian McAuley G. Cottrell VLM 22 40 0 16 May 2020
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence Xiaoyu Shen Ernie Chang Hui Su Jie Zhou Dietrich Klakow 34 49 0 03 May 2020
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition Hu Hu Rui Zhao Jinyu Li Liang Lu Jiawei Liu 19 27 0 01 May 2020
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR Hirofumi Inaguma Yashesh Gaur Liang Lu Jinyu Li Jiawei Liu AI4TS 27 46 0 10 Apr 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency Tara N. Sainath Yanzhang He Bo-wen Li A. Narayanan Ruoming Pang ... Trevor Strohman Mirkó Visontai Yonghui Wu Yu Zhang Ding Zhao 25 215 0 28 Mar 2020
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model Jinyu Li Rui Zhao Eric Sun J. H. M. Wong Amit Das Zhong Meng Jiawei Liu VLM 24 24 0 17 Mar 2020
Towards Online End-to-end Transformer Automatic Speech Recognition E. Tsunoo Yosuke Kashiwagi Toshiyuki Kumakura Shinji Watanabe 22 32 0 25 Oct 2019
Recognizing long-form speech using streaming end-to-end models A. Narayanan Rohit Prabhavalkar Chung-Cheng Chiu David Rybach Tara N. Sainath Trevor Strohman 23 129 0 24 Oct 2019
Transformer ASR with Contextual Block Processing E. Tsunoo Yosuke Kashiwagi Toshiyuki Kumakura Shinji Watanabe 59 64 0 16 Oct 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition Jinyu Li Rui Zhao Hu Hu Jiawei Liu 13 170 0 26 Sep 2019
Monotonic Multihead Attention Xutai Ma J. Pino James Cross Liezl Puzon Jiatao Gu 25 136 0 26 Sep 2019
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation N. Arivazhagan Colin Cherry Wolfgang Macherey Chung-Cheng Chiu Semih Yavuz Ruoming Pang Wei Li Colin Raffel CLL 11 190 0 12 Jun 2019
CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition Linhao Dong Bo Xu 27 125 0 27 May 2019
Generating Long Sequences with Sparse Transformers R. Child Scott Gray Alec Radford Ilya Sutskever 11 1,848 0 23 Apr 2019