ResearchTrend.AI
arXiv:2309.04628

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning

8 September 2023
Saurabhchand Bhati
Jesús Villalba
Laureano Moro Velázquez
Thomas Thebaud
Najim Dehak
    CLIP

Papers citing "Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning"

3 / 3 papers shown
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Layne Berry
Hung-yi Lee
David F. Harwath
VLM
CLIP
38
32
0
03 Oct 2022
Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Laureano Moro Velázquez
Najim Dehak
SSL
53
22
0
05 Oct 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021