2112.08995
Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer
16 December 2021
Yanpeng Zhao
Jack Hessel
Youngjae Yu
Ximing Lu
Rowan Zellers
Yejin Choi
Papers citing "Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer"
8 / 8 papers shown
Title
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
74
2
0
10 Jan 2025
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti
Eleonora Grassucci
Luigi Sigillo
Danilo Comminiello
76
0
0
16 Dec 2024
Harvesting Event Schemas from Large Language Models
Jialong Tang
Hongyu Lin
Zhuoqun Li
Yaojie Lu
Xianpei Han
Le Sun
12
4
0
12 May 2023
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
8
21
0
14 Mar 2023
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
Yan-Bo Lin
Jie Lei
Mohit Bansal
Gedas Bertasius
23
39
0
06 Apr 2022
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
70
41
0
26 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
4,424
0
23 Jan 2020