Multimodal One-Shot Learning of Speech and Images

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
9 November 2018
Ryan Eloff
H. Engelbrecht
Herman Kamper
SSL, VLM

Papers citing "Multimodal One-Shot Learning of Speech and Images"

13 / 13 papers shown
Towards visually prompted keyword localisation for zero-resource spoken languages
Spoken Language Technology Workshop (SLT), 2022
Leanne Nortje
Herman Kamper
151
6
0
12 Oct 2022
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding
Spoken Language Technology Workshop (SLT), 2022
Kayode Olaleye
Dan Oneaţă
Herman Kamper
ObjD
209
8
0
10 Oct 2022
Meta Learning for Natural Language Processing: A Survey
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Hung-yi Lee
Shang-Wen Li
Ngoc Thang Vu
343
51
0
03 May 2022
Keyword localisation in untranscribed speech using visually grounded speech models
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Kayode Olaleye
Dan Oneaţă
Herman Kamper
196
7
0
02 Feb 2022
Multimodality in Meta-Learning: A Comprehensive Survey
Yao Ma
Shilin Zhao
Weixiao Wang
Yaoman Li
Irwin King
252
71
0
28 Sep 2021
HetMAML: Task-Heterogeneous Model-Agnostic Meta-Learning for Few-Shot Learning Across Modalities
International Conference on Information and Knowledge Management (CIKM), 2021
Jiayi Chen
Aidong Zhang
162
16
0
17 May 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
186
74
0
31 Dec 2020
Direct multimodal few-shot learning of speech and images
Interspeech, 2020
Leanne Nortje
Herman Kamper
SSL
303
10
0
10 Dec 2020
A Survey on Machine Learning from Few Samples
Pattern Recognition (Pattern Recognit.), 2020
Jiang Lu
Pinghua Gong
Jieping Ye
Jianwei Zhang
Changshu Zhang
325
78
0
06 Sep 2020
Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images
Leanne Nortje
Herman Kamper
SSL
129
9
0
14 Aug 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
248
142
0
16 Jun 2020
Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data
EURASIP Journal on Audio, Speech, and Music Processing (JEASMP), 2020
Vincent Roger
Jérôme Farinas
J. Pinquier
115
31
0
09 Mar 2020
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
International Conference on Learning Representations (ICLR), 2019
David Harwath
Wei-Ning Hsu
James R. Glass
177
88
0
21 Nov 2019