2401.16658
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
30 January 2024
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
Yui Sudo
Muhammad Shakeel
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
Papers citing
"OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer"
34 / 34 papers shown
Title
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
34
0
0
30 Mar 2025
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
G. Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
ObjD
90
0
0
26 Mar 2025
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Siddhant Arora
Yifan Peng
Jiatong Shi
Jinchuan Tian
William Chen
...
Yosuke Kashiwagi
E. Tsunoo
Shuichiro Shimizu
Vaibhav Srivastav
Shinji Watanabe
36
0
0
11 Mar 2025
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
Pengcheng Guo
Xuankai Chang
Hang Lv
Shinji Watanabe
Lei Xie
54
0
0
07 Dec 2024
End-to-End Transformer-based Automatic Speech Recognition for Northern Kurdish: A Pioneering Approach
Abdulhady Abas Abdullah
Shima Tabibian
H. Veisi
Aso Mahmudi
Tarik A. Rashid
15
0
0
19 Oct 2024
Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset
Farhan Samir
Emily P. Ahn
Shreya Prakash
Márton Soskuthy
Vered Shwartz
Jian Zhu
16
0
0
05 Oct 2024
Distilling an End-to-End Voice Assistant Without Instruction Training Data
William B. Held
Ella Li
Michael Joseph Ryan
Weiyan Shi
Yanzhe Zhang
Diyi Yang
AuLLM
29
8
0
03 Oct 2024
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Yihan Wu
Yifan Peng
Yichen Lu
Xuankai Chang
Ruihua Song
Shinji Watanabe
26
2
0
19 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
Jee-weon Jung
Yihan Wu
Xin Wang
Ji-Hoon Kim
Soumi Maiti
...
Joon Son Chung
Wangyou Zhang
Seyun Um
Shinnosuke Takamichi
Shinji Watanabe
46
1
0
18 Sep 2024
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration
Masao Someki
Kwanghee Choi
Siddhant Arora
William Chen
Samuele Cornell
Jionghao Han
Yifan Peng
Jiatong Shi
Vaibhav Srivastav
Shinji Watanabe
VLM
23
0
0
14 Sep 2024
Text-To-Speech Synthesis In The Wild
Jee-weon Jung
Wangyou Zhang
Soumi Maiti
Yihan Wu
Xin Wang
...
Hye-jin Shim
Nicholas W. D. Evans
Joon Son Chung
Shinnosuke Takamichi
Shinji Watanabe
19
1
0
13 Sep 2024
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR
Weiqing Wang
Kunal Dhawan
Taejin Park
Krishna C. Puvvada
Ivan Medennikov
Somshubra Majumdar
He Huang
Jagadeesh Balam
Boris Ginsburg
18
1
0
02 Sep 2024
Lyrics Transcription for Humans: A Readability-Aware Benchmark
Ondřej Cífka
Hendrik Schreiber
Luke Miner
Fabian-Robert Stöter
16
0
0
30 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
23
6
0
30 Jun 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Zhehuai Chen
He Huang
Oleksii Hrinchuk
Krishna C. Puvvada
Nithin Rao Koluguri
Piotr Żelasko
Jagadeesh Balam
Boris Ginsburg
AuLLM
RALM
29
10
0
28 Jun 2024
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Yosuke Kashiwagi
Hayato Futami
E. Tsunoo
Siddhant Arora
Shinji Watanabe
16
1
0
18 Jun 2024
On the Evaluation of Speech Foundation Models for Spoken Language Understanding
Siddhant Arora
Ankita Pasad
Chung-Ming Chien
Jionghao Han
Roshan S. Sharma
...
William Chen
Suwon Shon
Hung-yi Lee
Karen Livescu
Shinji Watanabe
ELM
27
4
0
14 Jun 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models
Jinchuan Tian
Yifan Peng
William Chen
Kwanghee Choi
Karen Livescu
Shinji Watanabe
19
5
0
13 Jun 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
70
5
0
12 Jun 2024
Neural Blind Source Separation and Diarization for Distant Speech Recognition
Yoshiaki Bando
Tomohiko Nakamura
Shinji Watanabe
BDL
19
5
0
12 Jun 2024
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
Wangyou Zhang
Robin Scheibler
Kohei Saijo
Samuele Cornell
Chenda Li
...
Jan Pirklbauer
Marvin Sach
Shinji Watanabe
Tim Fingscheidt
Yanmin Qian
VLM
24
1
0
07 Jun 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
Wangyou Zhang
Kohei Saijo
Jee-weon Jung
Chenda Li
Shinji Watanabe
Yanmin Qian
22
4
0
06 Jun 2024
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Brian Yan
Jiatong Shi
Yifan Peng
Shinji Watanabe
19
0
0
05 Jun 2024
YODAS: Youtube-Oriented Dataset for Audio and Speech
Xinjian Li
Shinnosuke Takamichi
Takaaki Saeki
William Chen
Sayaka Shiota
Shinji Watanabe
38
16
0
02 Jun 2024
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Chengwei Qin
Pin-Yu Chen
Chng Eng Siong
Chao Zhang
VLM
25
3
0
23 May 2024
Contextualized Automatic Speech Recognition with Dynamic Vocabulary
Yui Sudo
Yosuke Fukumoto
Muhammad Shakeel
Yifan Peng
Shinji Watanabe
16
0
0
22 May 2024
Wav2Gloss: Generating Interlinear Glossed Text from Speech
Taiqi He
Kwanghee Choi
Lindia Tjuatja
Nathaniel R. Robinson
Jiatong Shi
Shinji Watanabe
Graham Neubig
David R. Mortensen
Lori S. Levin
VLM
25
2
0
19 Mar 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
30
17
0
20 Feb 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
124
349
0
01 Feb 2024
Exploring the limits of decoder-only models trained on public speech recognition corpora
Ankit Gupta
G. Saon
Brian Kingsbury
OffRL
21
5
0
31 Jan 2024
SLM: Bridge the thin gap between speech and text foundation models
Mingqiu Wang
Wei Han
Izhak Shafran
Zelin Wu
Chung-Cheng Chiu
...
Zhong Meng
Golan Pundak
Nikhil Siddhartha
J. Schalkwyk
Yonghui Wu
AuLLM
37
56
0
30 Sep 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
77
249
0
02 Mar 2023
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
31
104
0
30 Sep 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
75
281
0
25 May 2022