Visually grounded few-shot word learning in low-resource settings. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023.
Visually grounded few-shot word acquisition with fewer shots. Interspeech, 2023.
Towards visually prompted keyword localisation for zero-resource spoken languages. IEEE Spoken Language Technology Workshop (SLT), 2022.
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding. IEEE Spoken Language Technology Workshop (SLT), 2022.
Keyword localisation in untranscribed speech using visually grounded speech models. IEEE Journal of Selected Topics in Signal Processing (JSTSP), 2022.
Direct multimodal few-shot learning of speech and images. Interspeech, 2020.