Multimodal Deep Learning


International Conference on Machine Learning (ICML), 2011
12 January 2023
Cem Akkus
Jiquan Ngiam
Vladana Djakovic
Steffen Jauch-Walser
A. Khosla
Mingyu Kim
Christopher Marquardt
Marco Moldovan
Nadja Sauter
Juhan Nam
Rickmer Schulte
Karol Urbanczyk
Jann Goschenhofer
Honglak Lee
A. Ng
Daniel Schalk
Yi Men
arXiv: 2301.04856

Papers citing "Multimodal Deep Learning"

50 / 844 papers shown
Deep Collective Matrix Factorization for Augmented Multi-View Learning
Ragunathan Mariappan
Vaibhav Rajan
150
15
0
28 Nov 2018
Uncertainty aware audiovisual activity recognition using deep Bayesian variational inference
Mahesh Subedar
R. Krishnan
P. López-Meyer
Omesh Tickoo
Jonathan Huang
BDL, EDL, UQCV
182
0
0
27 Nov 2018
Cross-domain Deep Feature Combination for Bird Species Classification with Audio-visual Data
B. Naranchimeg
Chao Zhang
T. Akashi
78
17
0
26 Nov 2018
A Novel Technique for Evidence based Conditional Inference in Deep Neural Networks via Latent Feature Perturbation
Dinesh Khandelwal
Suyash Agrawal
Parag Singla
Chetan Arora
206
1
0
24 Nov 2018
Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors
AAAI Conference on Artificial Intelligence (AAAI), 2018
Yansen Wang
Ying Shen
Zhun Liu
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
242
455
0
23 Nov 2018
Learning from Multiview Correlations in Open-Domain Videos
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Nils Holzenberger
Shruti Palaskar
Pranava Madhyastha
Florian Metze
R. Arora
SSL
134
11
0
21 Nov 2018
Visual-Texual Emotion Analysis with Deep Coupled Video and Danmu Neural Networks
IEEE Transactions on Multimedia (TMM), 2018
Chenchen Li
Jialin Wang
Hongwei Wang
Miao Zhao
Wenjie Li
Xiaotie Deng
116
24
0
19 Nov 2018
Multimodal Densenet
Faisal Mahmood
Ziyun Yang
Thomas Ashley
Nicholas J. Durr
159
13
0
18 Nov 2018
Semi-supervised Deep Representation Learning for Multi-View Problems
Vahid Noroozi
S. Bahaadini
Lei Zheng
Sihong Xie
Weixiang Shao
Philip S. Yu
149
17
0
11 Nov 2018
Multi-Source Neural Variational Inference
Richard Kurle
Stephan Günnemann
Patrick van der Smagt
BDL, SSL, DRL
179
27
0
11 Nov 2018
Cross and Learn: Cross-Modal Self-Supervision
German Conference on Pattern Recognition (DAGM), 2018
Nawid Sayed
Biagio Brattoli
Bjorn Ommer
SSL
250
83
0
09 Nov 2018
Multimodal One-Shot Learning of Speech and Images
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Ryan Eloff
H. Engelbrecht
Herman Kamper
SSL, VLM
161
36
0
09 Nov 2018
Y^2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
AAAI Conference on Artificial Intelligence (AAAI), 2018
Akila Pemasiri
Mingyang Shang
Sabesan Sivapalan
Yu-Shen Liu
Matthias Zwicker
3DV
109
58
0
07 Nov 2018
Cogni-Net: Cognitive Feature Learning through Deep Visual Perception
Pranay Mukherjee
Abhirup Das
A. Bhunia
P. Roy
184
19
0
01 Nov 2018
Software Engineering Challenges of Deep Learning
Anders Arpteg
B. Brinne
L. Crnkovic-Friis
J. Bosch
227
189
0
29 Oct 2018
Vehicle Tracking Using Surveillance with Multimodal Data Fusion
Yue Zhang
Bin Song
S. Lefebvre
Mohsen Guizani
85
60
0
29 Oct 2018
Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features
S. Palazzo
C. Spampinato
I. Kavasidis
D. Giordano
Joseph Schmidt
M. Shah
320
150
0
25 Oct 2018
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
K. Srinivasan
Parth Shah
Silvio Savarese
Li Fei-Fei
Animesh Garg
Jeannette Bohg
SSL
257
406
0
24 Oct 2018
Dense Multimodal Fusion for Hierarchically Joint Representation
Di Hu
Feiping Nie
Xuelong Li
183
47
0
08 Oct 2018
Image and Encoded Text Fusion for Multi-Modal Classification
I. Gallo
Alessandro Calefati
Shah Nawaz
Muhammad Kamran Janjua
89
48
0
03 Oct 2018
Pixel and Feature Level Based Domain Adaption for Object Detection in Autonomous Driving
Yuhu Shan
W. Lu
C. Chew
172
97
0
30 Sep 2018
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Stavros Petridis
Themos Stafylakis
Pingchuan Ma
Georgios Tzimiropoulos
Maja Pantic
175
151
0
28 Sep 2018
Vector Learning for Cross Domain Representations
International Conference on Artificial Intelligence and Pattern Recognition (AIPR), 2017
Shagan Sah
Chi Zhang
Thang Nguyen
D. Peri
Ameya Shringi
R. Ptucha
GAN
92
3
0
27 Sep 2018
Machine Learning for Forecasting Mid Price Movement using Limit Order Book Data
Paraskevi Nousi
Avraam Tsantekidis
Nikolaos Passalis
Adamantios Ntakaris
Juho Kanniainen
Anastasios Tefas
Moncef Gabbouj
Alexandros Iosifidis
201
57
0
19 Sep 2018
Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization
Jie Wen
Zheng Zhang
Yong-mei Xu
Zuofeng Zhong
109
82
0
17 Sep 2018
End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models
Fei Tao
John H. L. Hansen
152
35
0
12 Sep 2018
Implicit Analysis of Perceptual Multimedia Experience Based on Physiological Response: A Review
Seong-Eun Moon
Jong-Seok Lee
69
45
0
12 Sep 2018
Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge
Steven Derby
Paul Miller
B. Murphy
Barry Devereux
226
15
0
07 Sep 2018
Multi-view Factorization AutoEncoder with Network Constraints for Multi-omic Integrative Analysis
Tianle Ma
A. Zhang
106
26
0
06 Sep 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
294
70
0
05 Sep 2018
Multi-Adversarial Domain Adaptation
Zhongyi Pei
Zhangjie Cao
Mingsheng Long
Jianmin Wang
OOD, TTA
191
927
0
04 Sep 2018
Role of Intonation in Scoring Spoken English
Amber Nigam
Ishan Sodhi
Tuhinanksu Das
69
1
0
23 Aug 2018
LRMM: Learning to Recommend with Missing Modalities
Cheng Wang
Mathias Niepert
Hui Li
138
31
0
21 Aug 2018
Dynamic Temporal Alignment of Speech to Lips
Tavi Halperin
Ariel Ephrat
Shmuel Peleg
124
43
0
19 Aug 2018
Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction
Garrett B. Goh
Khushmeen Sakloth
Charles Siegel
Abhinav Vishnu
J. Pfaendtner
HAI
149
11
0
13 Aug 2018
Multimodal Language Analysis with Recurrent Multistage Fusion
Paul Pu Liang
Liu Ziyin
Amir Zadeh
Louis-Philippe Morency
220
216
0
12 Aug 2018
Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data
Changde Du
Changying Du
Hao Wang
Jinpeng Li
Wei-Long Zheng
Bao-Liang Lu
Huiguang He
179
79
0
27 Jul 2018
Visual Affordance and Function Understanding: A Survey
ACM Computing Surveys (CSUR), 2018
Mohammed Hassanin
Salman Khan
M. Tahtali
175
60
0
18 Jul 2018
Robust Deep Multi-modal Learning Based on Gated Information Fusion Network
Asian Conference on Computer Vision (ACCV), 2018
Jaekyum Kim
Junho Koh
Yecheol Kim
Jaehyung Choi
Youngbae Hwang
Jun-Won Choi
221
69
0
17 Jul 2018
A Multimodal Approach to Predict Social Media Popularity
Conference on Multimedia Information Processing and Retrieval (MIPR), 2018
Mayank Meghawat
Satyendra Yadav
Debanjan Mahata
Yifang Yin
R. Shah
Roger Zimmermann
137
52
0
16 Jul 2018
Object Detection with Deep Learning: A Review
Zhong-Qiu Zhao
Peng Zheng
Shou-tao Xu
Xindong Wu
ObjD
584
4,478
0
15 Jul 2018
3D Hand Pose Estimation using Simulation and Partial-Supervision with a Shared Latent Space
M. Abdi
Ehsan Abbasnejad
C. Lim
S. Nahavandi
3DH
77
14
0
14 Jul 2018
Large-Scale Visual Speech Recognition
Brendan Shillingford
Yannis Assael
Matthew W. Hoffman
T. Paine
Cían Hughes
...
Marie Mulville
Ben Coppin
Ben Laurie
A. Senior
Nando de Freitas
272
165
0
13 Jul 2018
Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis
Hai Pham
Thomas Manzini
Paul Pu Liang
Barnabás Póczós
169
67
0
11 Jul 2018
Fuzzy Logic Interpretation of Quadratic Networks
Fenglei Fan
Ge Wang
213
7
0
04 Jul 2018
Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
ACM Multimedia (ACM MM), 2018
Yaman Kumar Singla
Mayank Aggarwal
Pratham Nawal
Shin'ichi Satoh
R. Shah
Roger Zimmermann
126
26
0
02 Jul 2018
Learning Visually-Grounded Semantics from Contrastive Adversarial Samples
International Conference on Computational Linguistics (COLING), 2018
Freda Shi
Jiayuan Mao
Tete Xiao
Yuning Jiang
Jian Sun
ObjD
177
52
0
27 Jun 2018
Disentangled VAE Representations for Multi-Aspect and Missing Data
Samuel K. Ainsworth
N. Foti
E. Fox
DRL
164
15
0
24 Jun 2018
Learning Multimodal Representations for Unseen Activities
A. Piergiovanni
Michael S. Ryoo
SSL
218
4
0
21 Jun 2018
Multimodal Grounding for Language Processing
Lisa Beinborn
Teresa Botschen
Iryna Gurevych
159
37
0
17 Jun 2018