Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.09602
Cited By
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
21 November 2019
David F. Harwath
Wei-Ning Hsu
James R. Glass
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"
15 / 15 papers shown
Title
A model of early word acquisition based on realistic-scale audiovisual naming events
Khazar Khorrami
Okko Rasanen
NAI
35
0
0
07 Jun 2024
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
Cheol Jun Cho
Abdelrahman Mohamed
Shang-Wen Li
Alan W. Black
Gopala K. Anumanchipalli
29
8
0
16 Oct 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David F. Harwath
SSL
VLM
26
7
0
19 May 2023
Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples
H. Ryu
Arda Senocak
In So Kweon
Joon Son Chung
VLM
19
8
0
30 Mar 2023
Towards visually prompted keyword localisation for zero-resource spoken languages
Leanne Nortje
Herman Kamper
11
6
0
12 Oct 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
124
348
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
25
36
0
17 May 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
11
19
0
27 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
11
6
0
29 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
26
106
0
02 Mar 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
19
7
0
02 Feb 2022
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
14
7
0
11 Nov 2020
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
28
36
0
22 Oct 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David F. Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
22
141
0
16 Jun 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
19
350
0
21 Apr 2020
1