ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1912.02615
  4. Cited By
Audiovisual Transformer Architectures for Large-Scale Classification and
  Synchronization of Weakly Labeled Audio Events

Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

2 December 2019
Wim Boes
Hugo Van hamme
ArXivPDFHTML

Papers citing "Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events"

4 / 4 papers shown
Title
Temporal and cross-modal attention for audio-visual zero-shot learning
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
38
25
0
20 Jul 2022
GAFX: A General Audio Feature eXtractor
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning
  for Visual Sound Separation
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
29
8
0
26 Oct 2021
A Transformer-based Audio Captioning Model with Keyword Estimation
A Transformer-based Audio Captioning Model with Keyword Estimation
Yuma Koizumi
Ryo Masumura
Kyosuke Nishida
Masahiro Yasuda
Shoichiro Saito
18
54
0
01 Jul 2020
1