1602.04364
Cited By
Look, Listen and Learn - A Multimodal LSTM for Speaker Identification
13 February 2016
Jimmy S. J. Ren
Yongtao Hu
Yu-Wing Tai
Chuan Wang
Li Xu
Wenxiu Sun
Qiong Yan
Papers citing
"Look, Listen and Learn - A Multimodal LSTM for Speaker Identification"
14 / 14 papers shown
Title
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
14
3
0
09 Mar 2023
Online Multi-modal Person Search in Videos
J. Xia
Anyi Rao
Qingqiu Huang
Linning Xu
Jiangtao Wen
Dahua Lin
23
28
0
08 Aug 2020
Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning
Yuan Yao
Chang-rui Liu
Dezhao Luo
Yu Zhou
QiXiang Ye
18
169
0
20 Jun 2020
Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning
Dezhao Luo
Chang-rui Liu
Yu Zhou
Dongbao Yang
Can Ma
QiXiang Ye
Weiping Wang
SSL
14
160
0
02 Jan 2020
Content-Aware Unsupervised Deep Homography Estimation
Jirong Zhang
Chuan Wang
Shuaicheng Liu
Lanpeng Jia
Nianjin Ye
Jue Wang
Ji Zhou
Jian-jun Sun
113
149
0
12 Sep 2019
Frame-Recurrent Video Inpainting by Robust Optical Flow Inference
Yifan Ding
Chuan Wang
Haibin Huang
Jiaming Liu
Jue Wang
Liqiang Wang
17
12
0
08 May 2019
Two-phase Hair Image Synthesis by Self-Enhancing Generative Model
Haonan Qiu
Chuan Wang
Hang Zhu
Xiangyu Zhu
Jinjin Gu
Xiaoguang Han
3DH
GAN
11
12
0
28 Feb 2019
MGANet: A Robust Model for Quality Enhancement of Compressed Video
Xiandong Meng
Xuan Deng
Shuyuan Zhu
Shuaicheng Liu
Chuan Wang
Chen Chen
B. Zeng
GAN
16
19
0
22 Nov 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
29
65
0
05 Sep 2018
Video Inpainting by Jointly Learning Temporal Structure and Spatial Details
Chuan Wang
Haibin Huang
Xiaoguang Han
Jue Wang
24
154
0
22 Jun 2018
A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning
Shengdong Du
Tianrui Li
Xun Gong
S. Horng
AI4TS
16
150
0
06 Mar 2018
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
Kalin Stefanov
Jonas Beskow
G. Salvi
26
17
0
24 Nov 2017
Accurate Single Stage Detector Using Recurrent Rolling Convolution
Jimmy S. J. Ren
Xiaohao Chen
Jianbo Liu
Wenxiu Sun
Jiahao Pang
Qiong Yan
Yu-Wing Tai
Li Xu
ObjD
18
280
0
19 Apr 2017
Deep Multimodal Representation Learning from Temporal Data
Xitong Yang
Palghat Ramesh
Radha Chitta
S. Madhvanath
Edgar A. Bernal
Jiebo Luo
AI4TS
14
94
0
11 Apr 2017