1705.09406
Multimodal Machine Learning: A Survey and Taxonomy
26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"
50 / 941 papers shown
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
252
7
0
25 Feb 2020
AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning
IEEE transactions on multimedia (TMM), 2020
Sanchita Ghose
John J. Prevost
VGen
171
50
0
21 Feb 2020
Stroke Constrained Attention Network for Online Handwritten Mathematical Expression Recognition
Pattern Recognition (Pattern Recognit.), 2020
Jiaming Wang
Jun Du
Jianshu Zhang
203
24
0
20 Feb 2020
Neural Attentive Multiview Machines
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Oren Barkan
Ori Katz
Noam Koenigstein
HAI
121
22
0
18 Feb 2020
Deep Robust Multilevel Semantic Cross-Modal Hashing
Ge Song
Jun Zhao
Xiaoyang Tan
111
0
0
07 Feb 2020
Multimodal Data Fusion based on the Global Workspace Theory
International Conference on Multimodal Interaction (ICMI), 2020
C. Bao
Zafeirios Fountas
Temitayo A. Olugbade
N. Bianchi-Berthouze
212
7
0
26 Jan 2020
Improved Robust ASR for Social Robots in Public Spaces
Charles Jankowski
Vishwas Mruthyunjaya
Ruixi Lin
VLM
79
3
0
14 Jan 2020
Think Locally, Act Globally: Federated Learning with Local and Global Representations
Paul Pu Liang
Terrance Liu
Liu Ziyin
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
FedML
624
679
0
06 Jan 2020
Improving Visual Recognition using Ambient Sound for Supervision
Rohan Mahadev
Hongyu Lu
67
1
0
25 Dec 2019
Multimodal Prediction based on Graph Representations
Í. C. Dourado
S. Tabbone
Ricardo da S. Torres
159
0
0
21 Dec 2019
Pathomic Fusion: An Integrated Framework for Fusing Histopathology and Genomic Features for Cancer Diagnosis and Prognosis
IEEE Transactions on Medical Imaging (TMI), 2019
Richard J. Chen
Ming Y. Lu
Jingwen Wang
Drew F. K. Williamson
S. Rodig
N. Lindeman
Faisal Mahmood
361
526
0
18 Dec 2019
Multimodal Self-Supervised Learning for Medical Image Analysis
Information Processing in Medical Imaging (IPMI), 2019
Aiham Taleb
Christoph Lippert
T. Klein
Moin Nabi
SSL
351
122
0
11 Dec 2019
Drug-Target Indication Prediction by Integrating End-to-End Learning and Fingerprints
Brighter Agyemang
Wei-Ping Wu
Michael Y. Kpiebaareh
Ebenezer Nanor
158
3
0
03 Dec 2019
See and Read: Detecting Depression Symptoms in Higher Education Students Using Multimodal Social Media Data
International Conference on Web and Social Media (ICWSM), 2019
Paulo Mann
A. Paes
Elton H. Matsushima
207
44
0
03 Dec 2019
Biometrics Recognition Using Deep Learning: A Survey
Shervin Minaee
AmirAli Abdolrashidi
Hang Su
Bennamoun
David C. Zhang
340
90
0
30 Nov 2019
Multimodal Machine Translation through Visuals and Speech
Machine Translation (MT), 2019
U. Sulubacak
Ozan Caglayan
Stig-Arne Gronroos
Aku Rouhe
Desmond Elliott
Lucia Specia
Jörg Tiedemann
201
88
0
28 Nov 2019
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir Zadeh
Chengfeng Mao
Kelly Shi
Yiwei Zhang
Paul Pu Liang
Soujanya Poria
Louis-Philippe Morency
138
51
0
22 Nov 2019
Modal-aware Features for Multimodal Hashing
Haien Zeng
Hanjiang Lai
Hanlu Chu
Yong Tang
Jian Yin
153
0
0
19 Nov 2019
Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion
AAAI Conference on Artificial Intelligence (AAAI), 2019
Sijie Mai
Haifeng Hu
Songlong Xing
GAN
385
222
0
18 Nov 2019
M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues
AAAI Conference on Artificial Intelligence (AAAI), 2019
Trisha Mittal
Uttaran Bhattacharya
Rohan Chandra
Aniket Bera
Tianyi Zhou
241
267
0
09 Nov 2019
Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning
Valentin Vielzeuf
Alexis Lechervy
S. Pateux
F. Jurie
CVBM
156
5
0
08 Nov 2019
A coupled autoencoder approach for multi-modal analysis of cell types
Neural Information Processing Systems (NeurIPS), 2019
Rohan Gala
N. Gouwens
Zizhen Yao
Agata Budzillo
Osnat Penn
Bosiljka Tasic
G. Murphy
Hongkui Zeng
U. Sümbül
76
32
0
06 Nov 2019
Picking groups instead of samples: A close look at Static Pool-based Meta-Active Learning
Ignasi Mas
J. Morros
Verónica Vilaplana
123
2
0
01 Nov 2019
SignCol: Open-Source Software for Collecting Sign Language Gestures
M. Eslami
Mahdiyeh Karami
Sedigheh Eslami
Solale Tabarestani
F. Torkamani-Azar
Christoph Meinel
SLR
82
2
0
31 Oct 2019
TCT: A Cross-supervised Learning Method for Multimodal Sequence Representation
Wubo Li
Wei Zou
Xiangang Li
ViT
105
0
0
23 Oct 2019
Cross-task pre-training for on-device acoustic scene classification
Ruixiong Zhang
Wei Zou
Xiangang Li
119
1
0
22 Oct 2019
A Survey and Taxonomy of Adversarial Neural Networks for Text-to-Image Synthesis
Jorge Agnese
Jonathan Herrera
Haicheng Tao
Xingquan Zhu
EGVM
169
114
0
21 Oct 2019
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
Tomasz Kornuta
129
3
0
18 Oct 2019
Seeing and Hearing Egocentric Actions: How Much Can We Learn?
Alejandro Cartas
Jordi Luque
Petia Radeva
Carlos Segura
Mariella Dimiccoli
EgoV
127
21
0
15 Oct 2019
To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations
International Conference on Multimodal Interaction (ICMI), 2019
Chaitanya Ahuja
Shugao Ma
Louis-Philippe Morency
Yaser Sheikh
160
62
0
05 Oct 2019
Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2019
Sicheng Zhao
Shangfei Wang
M. Soleymani
D. Joshi
Q. Ji
156
76
0
03 Oct 2019
Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction
W. Weng
Yuannan Cai
Angela Lin
Fraser Tan
Po-Hsuan Cameron Chen
135
22
0
17 Sep 2019
Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings
Shweta Mahajan
Teresa Botschen
Iryna Gurevych
Stefan Roth
96
8
0
14 Sep 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
327
295
0
06 Sep 2019
Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples
Habibeh Naderi
Behrouz Haji Soleimani
Stan Matwin
251
15
0
03 Sep 2019
Integrating Multimodal Information in Large Pretrained Transformers
Wasifur Rahman
M. Hasan
Sangwu Lee
Amir Zadeh
Chengfeng Mao
Louis-Philippe Morency
Ehsan Hoque
207
29
0
15 Aug 2019
Harmonized Multimodal Learning with Gaussian Process Latent Variable Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Guoli Song
Shuhui Wang
Qingming Huang
Q. Tian
175
24
0
14 Aug 2019
Multimodal Emotion Recognition Using Deep Canonical Correlation Analysis
Wei Liu
Jielin Qiu
Wei-Long Zheng
Bao-Liang Lu
153
80
0
13 Aug 2019
Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework
Deepan Das
Noor Mohammed Ghouse
Shashank Verma
Yin Li
117
0
0
08 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
405
142
0
22 Jul 2019
Canonical Correlation Analysis (CCA) Based Multi-View Learning: An Overview
Chenfeng Guo
Dongrui Wu
HAI
187
38
0
03 Jul 2019
Audio-Visual Kinship Verification
Xiaoting Wu
Eric Granger
Xiaoyi Feng
CVBM
91
4
0
24 Jun 2019
AI-enabled Blockchain: An Outlier-aware Consensus Protocol for Blockchain-based IoT Networks
Mehrdad Salimitari
M. Joneidi
M. Chatterjee
201
1
0
17 Jun 2019
Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach
International Conference on Multimodal Interaction (ICMI), 2019
Ognjen Rudovic
Meiru Zhang
Bjorn Schuller
Rosalind W. Picard
OffRL
171
51
0
07 Jun 2019
Cross-modal Variational Auto-encoder with Distributed Latent Spaces and Associators
D. Jo
Byeongju Lee
Jongwon Choi
Haanju Yoo
J. Choi
BDL
DRL
140
7
0
30 May 2019
What Makes Training Multi-Modal Classification Networks Hard?
Computer Vision and Pattern Recognition (CVPR), 2019
Weiyao Wang
Du Tran
Matt Feiszli
574
566
0
29 May 2019
Multi-Modal Graph Interaction for Multi-Graph Convolution Network in Urban Spatiotemporal Forecasting
Xu Geng
Xiyu Wu
Lingyu Zhang
Qiang Yang
Yan Liu
Jieping Ye
AI4TS
131
34
0
27 May 2019
Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network
Oscar Chang
Yuling Yao
David Williams-King
Hod Lipson
BDL
UQCV
188
8
0
23 May 2019
Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks
Karan Sikka
Lucas Van Bramer
Ajay Divakaran
243
2
0
17 May 2019
Strong and Simple Baselines for Multimodal Utterance Embeddings
North American Chapter of the Association for Computational Linguistics (NAACL), 2019
Paul Pu Liang
Y. Lim
Yifan Hao
Ruslan Salakhutdinov
Louis-Philippe Morency
SSL
165
31
0
14 May 2019