1912.02615
Cited By
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
2 December 2019
Wim Boes
Hugo Van hamme
Papers citing
"Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events"
4 / 4 papers shown
Title
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
38 25 0
20 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu, Han Zhang, Xiaohu Zhu
30 0 0
19 Jul 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman, Mengyu Yang, Leonid Sigal
ViT
29 8 0
26 Oct 2021
A Transformer-based Audio Captioning Model with Keyword Estimation
Yuma Koizumi, Ryo Masumura, Kyosuke Nishida, Masahiro Yasuda, Shoichiro Saito
18 54 0
01 Jul 2020