ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2104.09426
  4. Cited By
Advanced Long-context End-to-end Speech Recognition Using
  Context-expanded Transformers

Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers

19 April 2021
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
ArXivPDFHTML

Papers citing "Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers"

24 / 24 papers shown
Title
Efficient Long-Form Speech Recognition for General Speech In-Context
  Learning
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
21
0
0
29 Sep 2024
An Efficient Self-Learning Framework For Interactive Spoken Dialog
  Systems
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Hitesh Tulsiani
David M. Chan
Shalini Ghosh
Garima Lalwani
Prabhat Pandey
Ankish Bansal
Sri Garimella
Ariya Rastrow
Björn Hoffmeister
26
0
0
16 Sep 2024
Advanced Long-Content Speech Recognition With Factorized Neural
  Transducer
Advanced Long-Content Speech Recognition With Factorized Neural Transducer
Xun Gong
Yu Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Yanmin Qian
16
6
0
20 Mar 2024
Using Large Language Model for End-to-End Chinese ASR and NER
Using Large Language Model for End-to-End Chinese ASR and NER
Yuang Li
Jiawei Yu
Min Zhang
Mengxin Ren
Yanqing Zhao
Xiaofeng Zhao
Miaomiao Ma
Chang Su
Hao-Yu Yang
29
7
0
21 Jan 2024
Promptformer: Prompted Conformer Transducer for ASR
Promptformer: Prompted Conformer Transducer for ASR
Sergio Duarte Torres
Arunasish Sen
Aman Rana
Lukas Drude
Alejandro Gomez-Alanis
Andreas Schwarz
Leif Rädel
Volker Leutnant
27
3
0
14 Jan 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary
  and Non-primary Speakers
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam
Shuo-yiin Chang
Tara N. Sainath
Rohit Prabhavalkar
Quan Wang
Shaan Bijwadia
14
3
0
18 Dec 2023
Generative Context-aware Fine-tuning of Self-supervised Speech Models
Generative Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Kwangyoun Kim
Prashant Sridhar
Yi-Te Hsu
Shinji Watanabe
Karen Livescu
12
2
0
15 Dec 2023
How Much Context Does My Attention-Based ASR System Need?
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
30
1
0
24 Oct 2023
Conversational Speech Recognition by Learning Audio-textual Cross-modal
  Contextual Representation
Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation
Kun Wei
Bei Li
Hang Lv
Quan Lu
Ning Jiang
Lei Xie
31
3
0
22 Oct 2023
Enhancing End-to-End Conversational Speech Translation Through Target
  Language Context Utilization
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
A. Hussein
Brian Yan
Antonios Anastasopoulos
Shinji Watanabe
Sanjeev Khudanpur
29
3
0
27 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
25
1
0
22 Sep 2023
Investigating End-to-End ASR Architectures for Long Form Audio
  Transcription
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
29
9
0
18 Sep 2023
BASS: Block-wise Adaptation for Speech Summarization
BASS: Block-wise Adaptation for Speech Summarization
Roshan S. Sharma
Kenneth Zheng
Siddhant Arora
Shinji Watanabe
Rita Singh
Bhiksha Raj
21
7
0
17 Jul 2023
Accelerating Transducers through Adjacent Token Merging
Accelerating Transducers through Adjacent Token Merging
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
17
4
0
28 Jun 2023
Towards Effective and Compact Contextual Representation for Conformer
  Transducer Speech Recognition Systems
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui
Jiawen Kang
Jiajun Deng
Xiaoyue Yin
Yutao Xie
Xie Chen
Xunying Liu
14
8
0
23 Jun 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
LongFNT: Long-form Speech Recognition with Factorized Neural Transducer
LongFNT: Long-form Speech Recognition with Factorized Neural Transducer
Xun Gong
Yu-Huan Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Y. Qian
RALM
11
10
0
17 Nov 2022
Contextual-Utterance Training for Automatic Speech Recognition
Contextual-Utterance Training for Automatic Speech Recognition
Alejandro Gomez-Alanis
Lukas Drude
A. Schwarz
R. Swaminathan
Simon Wiesler
21
1
0
27 Oct 2022
Leveraging Acoustic Contextual Representation by Audio-textual
  Cross-modal Learning for Conversational ASR
Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
16
9
0
03 Jul 2022
Conversational Speech Recognition By Learning Conversation-level
  Characteristics
Conversational Speech Recognition By Learning Conversation-level Characteristics
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
22
7
0
16 Feb 2022
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit
  Training for Phonetic-Reduction-Robust E2E Speech Recognition
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
19
4
0
13 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
13
362
0
02 Nov 2021
Speech Summarization using Restricted Self-Attention
Speech Summarization using Restricted Self-Attention
Roshan S. Sharma
Shruti Palaskar
A. Black
Florian Metze
17
33
0
12 Oct 2021
Input Length Matters: Improving RNN-T and MWER Training for Long-form
  Telephony Speech Recognition
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition
Zhiyun Lu
Yanwei Pan
Thibault Doutre
Parisa Haghani
Liangliang Cao
Rohit Prabhavalkar
C. Zhang
Trevor Strohman
AuLLM
72
14
0
08 Oct 2021
1