2406.00899
Cited By
YODAS: Youtube-Oriented Dataset for Audio and Speech
2 June 2024
Xinjian Li
Shinnosuke Takamichi
Takaaki Saeki
William Chen
Sayaka Shiota
Shinji Watanabe
Papers citing "YODAS: Youtube-Oriented Dataset for Audio and Speech"
15 / 15 papers shown
GLAP: General contrastive audio-text pretraining across domains and languages
Heinrich Dinkel
Zhiyong Yan
Tianzi Wang
Yongqing Wang
Xingwei Sun
Yadong Niu
Jizhong Liu
Gang Li
Junbo Zhang
Jian Luan
CLIP
VLM
17
0
0
12 Jun 2025
Loquacious Set: 25,000 Hours of Transcribed and Diverse English Speech Recognition Data for Research and Commercial Use
Titouan Parcollet
Yuan Tseng
Shucong Zhang
Rogier van Dalen
24
1
0
27 May 2025
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead
Jesujoba Oluwadara Alabi
Michael A. Hedderich
David Ifeoluwa Adelani
Dietrich Klakow
103
0
0
27 May 2025
TEDI: Trustworthy and Ethical Dataset Indicators to Analyze and Compare Dataset Documentation
Wiebke Hutiri
Mircea Cimpoi
M. Scheuerman
Victoria Matthews
Alice Xiang
167
0
0
23 May 2025
Granary: Speech Recognition and Translation Dataset in 25 European Languages
Nithin Rao Koluguri
Monica Sekoyan
George Zelenfroynd
Sasha Meister
Shuoyang Ding
...
Yifan Peng
Sara Papi
Marco Gaido
Alessio Brutti
Boris Ginsburg
53
0
0
19 May 2025
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
G. Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
ObjD
140
0
0
26 Mar 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Yansen Wang
Kai Chen
Pengyuan Zhang
Zhikai Wu
AuLLM
139
5
0
28 Jan 2025
Distilling an End-to-End Voice Assistant Without Instruction Training Data
William B. Held
Ella Li
Michael Joseph Ryan
Weiyan Shi
Yanzhe Zhang
Diyi Yang
AuLLM
87
16
0
03 Oct 2024
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs
Hitoshi Suda
Shunsuke Yoshida
Tomohiko Nakamura
Satoru Fukayama
Jun Ogata
59
0
0
19 Sep 2024
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models
Potsawee Manakul
Guangzhi Sun
Warit Sirichotedumrong
Kasima Tharnpipitchai
Kunat Pipatanakul
AuLLM
118
7
0
17 Sep 2024
Enhancing Large Language Model-based Speech Recognition by Contextualization for Rare and Ambiguous Words
Kento Nozawa
Takashi Masuko
Toru Taniguchi
67
1
0
15 Aug 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre
Robert Mahari
Ariel N. Lee
Campbell Lund
Hamidah Oderinwale
...
Hanlin Li
Daphne Ippolito
Sara Hooker
Jad Kabbara
Sandy Pentland
123
42
0
20 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
132
19
0
30 Jun 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei Zhang
Guoguo Chen
Xie Chen
133
12
0
17 Jun 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
...
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
103
54
0
30 Jan 2024