1609.08675
YouTube-8M: A Large-Scale Video Classification Benchmark
27 September 2016
Sami Abu-El-Haija
Nisarg Kothari
Joonseok Lee
Apostol Natsev
G. Toderici
Balakrishnan Varadarajan
Sudheendra Vijayanarasimhan
VLM
Papers citing
"YouTube-8M: A Large-Scale Video Classification Benchmark"
50 / 170 papers shown
Title
Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs
Jingfei Xia
Mingchen Zhuge
Tiantian Geng
Shun Fan
Yuantai Wei
Zhenyu He
Feng Zheng
21
14
0
08 Mar 2022
Spatio-temporal Vision Transformer for Super-resolution Microscopy
Charles N Christensen
M. Lu
Edward N. Ward
Pietro Liò
C. Kaminski
19
8
0
28 Feb 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
18
3
0
16 Feb 2022
Boundary-aware Self-supervised Learning for Video Scene Segmentation
Jonghwan Mun
Minchul Shin
Gunsoo Han
Sangho Lee
S. Ha
Joonseok Lee
Eun-Sol Kim
SSL
44
20
0
14 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
26
207
0
07 Jan 2022
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition
A. Glavan
Estefanía Talavera
13
10
0
23 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
21
17
0
13 Dec 2021
Time-Equivariant Contrastive Video Representation Learning
Simon Jenni
Hailin Jin
SSL
AI4TS
135
60
0
07 Dec 2021
Self-supervised Video Transformer
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
F. Khan
Michael S. Ryoo
ViT
26
84
0
02 Dec 2021
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue
Tiankai Hang
Yanhong Zeng
Yuchong Sun
Bei Liu
Huan Yang
Jianlong Fu
B. Guo
AI4TS
VLM
27
189
0
19 Nov 2021
Video Background Music Generation with Controllable Music Transformer
Shangzhe Di
Jiang
Sihan Liu
Zhaokai Wang
Leyan Zhu
Zexin He
Hongming Liu
Shuicheng Yan
11
91
0
16 Nov 2021
Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge J. Belongie
Alan Yuille
Philip H. S. Torr
S. Bai
VOS
24
6
0
15 Nov 2021
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
11
29
0
01 Nov 2021
SERAB: A multi-lingual benchmark for speech emotion recognition
Neil Scheidwasser
M. Kegler
P. Beckmann
Milos Cernak
24
44
0
07 Oct 2021
Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
Shuangrui Ding
Maomao Li
Tianyu Yang
Rui Qian
Haohang Xu
Qingyi Chen
Jue Wang
Hongkai Xiong
SSL
18
49
0
30 Sep 2021
LIGAR: Lightweight General-purpose Action Recognition
Evgeny Izutov
10
3
0
30 Aug 2021
Weakly-supervised Joint Anomaly Detection and Classification
Snehashis Majhi
Srijan Das
F. Brémond
Ratnakar Dash
Pankaj K. Sa
19
19
0
20 Aug 2021
Cross-modal Spectrum Transformation Network For Acoustic Scene classification
Yang Liu
A. Neophytou
Sunando Sengupta
Eric Sommerlade
16
9
0
13 Aug 2021
FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset
Hasam Khalid
Shahroz Tariq
Minha Kim
Simon S. Woo
29
183
0
11 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
30
42
0
04 Aug 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
16
26
0
20 Jul 2021
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization
Yi Liu
Limin Wang
Yali Wang
Xiao Ma
Yu Qiao
17
56
0
24 May 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
19
97
0
16 May 2021
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
19
77
0
05 May 2021
Comparison and Analysis of Deep Audio Embeddings for Music Emotion Recognition
E. Koh
Shlomo Dubnov
24
38
0
13 Apr 2021
Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning
Soheyla Amirian
Khaled Rasheed
T. Taha
H. Arabnia
VLM
VGen
14
23
0
07 Apr 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
23
127
0
30 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
27
2
0
18 Mar 2021
RMS-Net: Regression and Masking for Soccer Event Spotting
Matteo Tomei
Lorenzo Baraldi
Simone Calderara
Simone Bronzin
Rita Cucchiara
27
28
0
15 Feb 2021
Learning to Anticipate Egocentric Actions by Imagination
Yu Wu
Linchao Zhu
Xiaohan Wang
Yi Yang
Fei Wu
EgoV
79
69
0
13 Jan 2021
Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
27
2
0
04 Jan 2021
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
28
53
0
28 Dec 2020
SMART Frame Selection for Action Recognition
Shreyank N. Gowda
Marcus Rohrbach
Laura Sevilla-Lara
15
141
0
19 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu
Yao Hu
S. Bai
Fei Ding
X. Bai
Philip H. S. Torr
36
81
0
17 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
30
184
0
11 Dec 2020
MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection
Kellie Corona
Katie Osterdahl
Roderic Collins
A. Hoogs
14
62
0
02 Dec 2020
Multi-Modal Detection of Alzheimer's Disease from Speech and Text
Amish Mittal
Sourav Sahoo
Arnhav Datar
Juned Kadiwala
H. Shalu
Jimson Mathew
4
20
0
30 Nov 2020
SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos
A. Deliège
A. Cioppa
Silvio Giancola
M. J. Seikavandi
J. Dueholm
Kamal Nasrollahi
Bernard Ghanem
T. Moeslund
Marc Van Droogenbroeck
13
151
0
26 Nov 2020
Video Big Data Analytics in the Cloud: A Reference Architecture, Survey, Opportunities, and Open Research Issues
A. Alam
I. Ullah
Young-Koo Lee
34
22
0
16 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Rameswar Panda
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
21
95
0
22 Oct 2020
Learning Visual Voice Activity Detection with an Automatically Annotated Dataset
Sylvain Guy
Stéphane Lathuilière
Pablo Mesejo
Radu Horaud
19
11
0
23 Sep 2020
DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement
S. Iizuka
E. Simo-Serra
13
39
0
18 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
22
79
0
17 Sep 2020
Real-Time Selfie Video Stabilization
Ji-yang Yu
R. Ramamoorthi
Ke-Li Cheng
M. Sarkis
N. Bi
19
22
0
04 Sep 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
14
106
0
13 Aug 2020
Pixel-wise Crowd Understanding via Synthetic Data
Wang Qi
Junyu Gao
Lin Wei
Yuan Yuan
25
117
0
30 Jul 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
24
48
0
29 Jul 2020
End-to-end Learning of Compressible Features
Saurabh Singh
Sami Abu-El-Haija
Nick Johnston
Johannes Ballé
Abhinav Shrivastava
G. Toderici
SSL
89
71
0
23 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
22
79
0
20 Jul 2020
On Robustness and Transferability of Convolutional Neural Networks
Josip Djolonga
Jessica Yung
Michael Tschannen
Rob Romijnders
Lucas Beyer
...
D. Moldovan
Sylvain Gelly
N. Houlsby
Xiaohua Zhai
Mario Lucic
OOD
8
153
0
16 Jul 2020