Multimodal Deep Learning

International Conference on Machine Learning (ICML), 2011
12 January 2023
Cem Akkus
Jiquan Ngiam
Vladana Djakovic
Steffen Jauch-Walser
A. Khosla
Mingyu Kim
Christopher Marquardt
Marco Moldovan
Nadja Sauter
Juhan Nam
Rickmer Schulte
Karol Urbanczyk
Jann Goschenhofer
Honglak Lee
A. Ng
Daniel Schalk
Yi Men
arXiv: 2301.04856

Papers citing "Multimodal Deep Learning"

50 / 844 papers shown
Learning Factorized Multimodal Representations
Yifan Hao
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
Ruslan Salakhutdinov
DRL
276
500
0
16 Jun 2018
On Machine Learning and Structure for Mobile Robots
Markus Wulfmeier
130
5
0
15 Jun 2018
Deep Learning for Classification Tasks on Geospatial Vector Polygons
R. V. Veer
Peter Bloem
E. Folmer
196
20
0
11 Jun 2018
Learn to Combine Modalities in Multimodal Deep Learning
Kuan Liu
Yanen Li
N. Xu
Premkumar Natarajan
210
157
0
29 May 2018
More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch
Roberto Calandra
Andrew Owens
Dinesh Jayaraman
Justin Lin
Wenzhen Yuan
Jitendra Malik
Edward H. Adelson
Sergey Levine
331
373
0
28 May 2018
Unsupervised Learning for Trustworthy IoT
Nikhil Banerjee
Thanassis Giannetsos
E. Panaousis
C. C. Took
66
22
0
25 May 2018
Omega: An Architecture for AI Unification
Eray Özkural
AI4CE
134
1
0
16 May 2018
On Learning Associations of Faces and Voices
Changil Kim
Hijung Valentina Shin
Tae-Hyun Oh
Alexandre Kaspar
Mohamed A. Elgharib
Wojciech Matusik
CVBM
220
91
0
15 May 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
SSL
202
162
0
02 May 2018
Investigations on End-to-End Audiovisual Fusion
Michael Wand
Ngoc Thang Vu
J. Schmidhuber
96
28
0
30 Apr 2018
A Bimodal Learning Approach to Assist Multi-sensory Effects Synchronization
R. Abreu
J. Santos
Eduardo Bezerra
71
9
0
28 Apr 2018
Multi Layered-Parallel Graph Convolutional Network (ML-PGCN) for Disease Prediction
Anees Kazi
Shadi Albarqouni
K. Kortuem
Nassir Navab
MedIm
116
6
0
28 Apr 2018
Multi-Modal Coreference Resolution with the Correlation between Space Structures
Qibin Zheng
Xingchun Diao
Jianjun Cao
Xiaolei Zhou
Yi Liu
Hongmei Li
97
3
0
21 Apr 2018
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events
Sanjeel Parekh
S. Essid
A. Ozerov
Ngoc Q. K. Duong
P. Pérez
G. Richard
SSL
180
19
0
19 Apr 2018
Multi-view Hybrid Embedding: A Divide-and-Conquer Approach
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2018
Jiamiao Xu
Shujian Yu
Xinge You
Mengjun Leng
Xiaoyuan Jing
Chen Chen
169
15
0
19 Apr 2018
Deep Multimodal Subspace Clustering Networks
Mahdi Abavisani
Vishal M. Patel
304
180
0
17 Apr 2018
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
Xinze Wang
Yuan-fang Wang
William Yang Wang
139
80
0
15 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
590
796
0
10 Apr 2018
The Sound of Pixels
Hang Zhao
Chuang Gan
Andrew Rouditchenko
Carl Vondrick
Josh H. McDermott
Antonio Torralba
VLM
415
575
0
09 Apr 2018
Mix and match networks: encoder-decoder alignment for zero-pair image translation
Yaxing Wang
Joost van de Weijer
Luis Herranz
106
35
0
06 Apr 2018
Unsupervised Correlation Analysis
Yedid Hoshen
Lior Wolf
117
8
0
01 Apr 2018
Cross-modal Deep Variational Hand Pose Estimation
Adrian Spurr
Mingli Song
Seonwook Park
Otmar Hilliges
3DH
198
304
0
30 Mar 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
358
532
0
23 Mar 2018
Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings
Kevin Chen
Chris Choy
Manolis Savva
Angel X. Chang
Thomas Funkhouser
Silvio Savarese
3DV
198
270
0
22 Mar 2018
Acoustic feature learning using cross-domain articulatory measurements
Qingming Tang
Weiran Wang
Karen Livescu
126
2
0
19 Mar 2018
A Survey on Deep Learning Toolkits and Libraries for Intelligent User Interfaces
Jan Zacharias
Michael Barz
Daniel Sonntag
VLM
181
33
0
13 Mar 2018
Multimodal Recurrent Neural Networks with Information Transfer Layers for Indoor Scene Labeling
IEEE Transactions on Multimedia (TMM), 2018
Abrar H. Abdulnabi
Bing Shuai
Zhen Zuo
Lap-Pui Chau
G. Wang
141
25
0
13 Mar 2018
Deep Learning in Mobile and Wireless Networking: A Survey
IEEE Communications Surveys and Tutorials (COMST), 2018
Chaoyun Zhang
P. Patras
Hamed Haddadi
357
1,423
0
12 Mar 2018
A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning
Shengdong Du
Tianrui Li
Xun Gong
S. Horng
AI4TS
151
173
0
06 Mar 2018
Cross-Paced Representation Learning with Partial Curricula for Sketch-based Image Retrieval
Dan Xu
Xavier Alameda-Pineda
Jingkuan Song
Elisa Ricci
Andrii Zadaianchuk
SSL
108
22
0
05 Mar 2018
Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network
A. Bhunia
S. Mukherjee
Aneeshan Sain
A. Bhunia
P. Roy
Umapada Pal
215
35
0
23 Feb 2018
ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition
Shan Luo
Wenzhen Yuan
Edward H. Adelson
Anthony G. Cohn
R. Fuentes
198
153
0
21 Feb 2018
End-to-end Audiovisual Speech Recognition
Stavros Petridis
Themos Stafylakis
Pingchuan Ma
Feipeng Cai
Georgios Tzimiropoulos
Maja Pantic
218
276
0
18 Feb 2018
Exact and Consistent Interpretation for Piecewise Linear Neural Networks: A Closed Form Solution
Lingyang Chu
X. Hu
Juhua Hu
Lanjun Wang
Jian Pei
158
105
0
17 Feb 2018
Multimodal Generative Models for Scalable Weakly-Supervised Learning
Mike Wu
Noah D. Goodman
DRL
328
436
0
14 Feb 2018
Attention-Based Guided Structured Sparsity of Deep Neural Networks
A. Torfi
Rouzbeh A. Shirvani
Sobhan Soleymani
Nasser M. Nasrabadi
192
23
0
13 Feb 2018
Learning to score the figure skating sports videos
C. Xu
Yanwei Fu
Bing Zhang
Z. Chen
Yu-Gang Jiang
Xiangyang Xue
240
151
0
08 Feb 2018
Efficient Large-Scale Multi-Modal Classification
D. Kiela
Edouard Grave
Armand Joulin
Tomas Mikolov
175
174
0
06 Feb 2018
Personalized Machine Learning for Robot Perception of Affect and Engagement in Autism Therapy
Ognjen Rudovic
Jaeryoung Lee
Miles Dai
Bjorn Schuller
Rosalind W. Picard
153
289
0
04 Feb 2018
Real-world Multi-object, Multi-grasp Detection
Fu-Jen Chu
Ruinian Xu
Patricio A. Vela
166
35
0
01 Feb 2018
Deep Multi-view Learning to Rank
G. Cao
Alexandros Iosifidis
Moncef Gabbouj
Vijay V. Raghavan
Raju N. Gottumukkala
273
14
0
31 Jan 2018
Improving Bi-directional Generation between Different Modalities with Variational Autoencoders
Masahiro Suzuki
Kotaro Nakayama
Y. Matsuo
DRL
94
9
0
26 Jan 2018
PDNet: Semantic Segmentation integrated with a Primal-Dual Network for Document binarization
K. R. Ayyalasomayajula
F. Malmberg
Anders Brun
210
30
0
26 Jan 2018
Deep Canonically Correlated LSTMs
Neil Rohit Mallinar
Corbin Rosset
70
15
0
16 Jan 2018
Cross-modal Embeddings for Video and Audio Retrieval
Dídac Surís
A. Duarte
Amaia Salvador
Jordi Torres
Xavier Giró-i-Nieto
SSL
137
75
0
07 Jan 2018
An Order Preserving Bilinear Model for Person Detection in Multi-Modal Data
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2017
Oytun Ulutan
B. Riggan
Nasser M. Nasrabadi
B. S. Manjunath
144
3
0
20 Dec 2017
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning
International Journal of Computer Vision (IJCV), 2017
Andrew Owens
Jiajun Wu
Josh H. McDermott
William T. Freeman
Antonio Torralba
SSL
266
170
0
20 Dec 2017
A Survey on Multi-View Clustering
Guoqing Chao
Shiliang Sun
J. Bi
196
290
0
18 Dec 2017
Adversarial Attribute-Image Person Re-identification
Zhou Yin
Weishi Zheng
Ancong Wu
Hong-Xing Yu
Hai Wang
Xiaowei Guo
Feiyue Huang
Jianhuang Lai
GAN
201
3
0
05 Dec 2017
Multimodal Storytelling via Generative Adversarial Imitation Learning
Zhiqian Chen
Xuchao Zhang
Arnold P. Boedihardjo
Jing Dai
Chang-Tien Lu
155
13
0
05 Dec 2017