ResearchTrend.AI

Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese
arXiv: 1902.03052

8 February 2019
William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

Papers citing "Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese" (13 papers)
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Interspeech, 2023
Puyuan Peng, Shang-Wen Li, Okko Räsänen, Abdel-rahman Mohamed, David Harwath
19 May 2023

M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-yi Lee, David Harwath
02 Nov 2022

Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe
21 May 2022

Learning English with Peppa Pig
Transactions of the Association for Computational Linguistics (TACL), 2022
Mitja Nikolaus, Afra Alishahi, Grzegorz Chrupała
25 Feb 2022

Keyword localisation in untranscribed speech using visually grounded speech models
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Kayode Olaleye, Dan Oneaţă, Herman Kamper
02 Feb 2022

Cascaded Multilingual Audio-Visual Learning from Videos
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, ..., Yikang Shen, Rogerio Feris, Brian Kingsbury, M. Picheny, James R. Glass
08 Nov 2021

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James R. Glass
14 Oct 2021

Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation
Khazar Khorrami, Okko Räsänen
29 Sep 2021

ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
Afra Alishahi, Grzegorz Chrupała, Alejandrina Cristià, Emmanuel Dupoux, Bertrand Higy, Marvin Lavechin, Okko Räsänen, Chen Yu
14 Jul 2021

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Wei-Ning Hsu, David Harwath, Christopher Song, James R. Glass
31 Dec 2020

Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech
William N. Havard, Jean-Pierre Chevrot, Laurent Besacier
15 Jun 2020

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
International Conference on Learning Representations (ICLR), 2019
David Harwath, Wei-Ning Hsu, James R. Glass
21 Nov 2019

Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech
Conference on Computational Natural Language Learning (CoNLL), 2019
William N. Havard, Jean-Pierre Chevrot, Laurent Besacier
18 Sep 2019