Multimodal Machine Learning: A Survey and Taxonomy

26 May 2017

T. Baltrušaitis

Chaitanya Ahuja

Louis-Philippe Morency

Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"

50 / 941 papers shown

What BERT Sees: Cross-Modal Transfer for Visual Question Generation

252

25 Feb 2020

AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning

IEEE Transactions on Multimedia (TMM), 2020

Sanchita Ghose

John J. Prevost

VGen

171

21 Feb 2020

Stroke Constrained Attention Network for Online Handwritten Mathematical Expression Recognition

Pattern Recognition (Pattern Recognit.), 2020

Jiaming Wang

Jun Du

Jianshu Zhang

203

20 Feb 2020

Neural Attentive Multiview Machines

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

121

18 Feb 2020

Deep Robust Multilevel Semantic Cross-Modal Hashing

Ge Song

Jun Zhao

Xiaoyang Tan

111

07 Feb 2020

Multimodal Data Fusion based on the Global Workspace Theory

International Conference on Multimodal Interaction (ICMI), 2020

212

26 Jan 2020

Improved Robust ASR for Social Robots in Public Spaces

14 Jan 2020

Think Locally, Act Globally: Federated Learning with Local and Global Representations

Louis-Philippe Morency

FedML

624

679

06 Jan 2020

Improving Visual Recognition using Ambient Sound for Supervision

Rohan Mahadev

Hongyu Lu

25 Dec 2019

Multimodal Prediction based on Graph Representations

Í. C. Dourado

S. Tabbone

Ricardo da S. Torres

159

21 Dec 2019

Pathomic Fusion: An Integrated Framework for Fusing Histopathology and Genomic Features for Cancer Diagnosis and Prognosis

IEEE Transactions on Medical Imaging (TMI), 2019

Richard J. Chen

Ming Y. Lu

Jingwen Wang

Drew F. K. Williamson

S. Rodig

N. Lindeman

Faisal Mahmood

361

526

18 Dec 2019

Multimodal Self-Supervised Learning for Medical Image Analysis

Information Processing in Medical Imaging (IPMI), 2019

351

122

11 Dec 2019

Drug-Target Indication Prediction by Integrating End-to-End Learning and Fingerprints

Brighter Agyemang

Wei-Ping Wu

Michael Y. Kpiebaareh

Ebenezer Nanor

158

03 Dec 2019

See and Read: Detecting Depression Symptoms in Higher Education Students Using Multimodal Social Media Data

International Conference on Web and Social Media (ICWSM), 2019

Paulo Mann

A. Paes

Elton H. Matsushima

207

03 Dec 2019

Biometrics Recognition Using Deep Learning: A Survey

340

30 Nov 2019

Multimodal Machine Translation through Visuals and Speech

Machine Translation (MT), 2019

201

28 Nov 2019

Factorized Multimodal Transformer for Multimodal Sequential Learning

Louis-Philippe Morency

138

22 Nov 2019

Modal-aware Features for Multimodal Hashing

153

19 Nov 2019

Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion

AAAI Conference on Artificial Intelligence (AAAI), 2019

385

222

18 Nov 2019

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

AAAI Conference on Artificial Intelligence (AAAI), 2019

241

267

09 Nov 2019

Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning

156

08 Nov 2019

A coupled autoencoder approach for multi-modal analysis of cell types

Neural Information Processing Systems (NeurIPS), 2019

06 Nov 2019

Picking groups instead of samples: A close look at Static Pool-based Meta-Active Learning

Ignasi Mas

J. Morros

Verónica Vilaplana

123

01 Nov 2019

SignCol: Open-Source Software for Collecting Sign Language Gestures

31 Oct 2019

TCT: A Cross-supervised Learning Method for Multimodal Sequence Representation

Wubo Li

Wei Zou

Xiangang Li

ViT

105

23 Oct 2019

Cross-task pre-training for on-device acoustic scene classification

Ruixiong Zhang

Wei Zou

Xiangang Li

119

22 Oct 2019

A Survey and Taxonomy of Adversarial Neural Networks for Text-to-Image Synthesis

169

114

21 Oct 2019

PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision

Tomasz Kornuta

129

18 Oct 2019

Seeing and Hearing Egocentric Actions: How Much Can We Learn?

127

15 Oct 2019

To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations

International Conference on Multimodal Interaction (ICMI), 2019

Chaitanya Ahuja

Shugao Ma

Louis-Philippe Morency

Yaser Sheikh

160

05 Oct 2019

Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) (TOMM), 2019

156

03 Oct 2019

Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction

Po-Hsuan Cameron Chen

135

17 Sep 2019

Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings

14 Sep 2019

Supervised Multimodal Bitransformers for Classifying Images and Text

Douwe Kiela

327

295

06 Sep 2019

Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples

Habibeh Naderi

Behrouz Haji Soleimani

Stan Matwin

251

03 Sep 2019

Integrating Multimodal Information in Large Pretrained Transformers

Louis-Philippe Morency

Ehsan Hoque

207

15 Aug 2019

Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019

175

14 Aug 2019

Multimodal Emotion Recognition Using Deep Canonical Correlation Analysis

153

13 Aug 2019

Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework

Deepan Das

Noor Mohammed Ghouse

Shashank Verma

Yin Li

117

08 Aug 2019

Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods

Journal of Artificial Intelligence Research (JAIR), 2019

405

142

22 Jul 2019

Canonical Correlation Analysis (CCA) Based Multi-View Learning: An Overview

Chenfeng Guo

Dongrui Wu

HAI

187

03 Jul 2019

Audio-Visual Kinship Verification

24 Jun 2019

AI-enabled Blockchain: An Outlier-aware Consensus Protocol for Blockchain-based IoT Networks

Mehrdad Salimitari

M. Joneidi

M. Chatterjee

201

17 Jun 2019

Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach

International Conference on Multimodal Interaction (ICMI), 2019

Bjorn Schuller

171

07 Jun 2019

Cross-modal Variational Auto-encoder with Distributed Latent Spaces and Associators

140

30 May 2019

What Makes Training Multi-Modal Classification Networks Hard?

Computer Vision and Pattern Recognition (CVPR), 2019

Weiyao Wang

Du Tran

Matt Feiszli

574

566

29 May 2019

Multi-Modal Graph Interaction for Multi-Graph Convolution Network in Urban Spatiotemporal Forecasting

131

27 May 2019

Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network

188

23 May 2019

Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks

Karan Sikka

Lucas Van Bramer

Ajay Divakaran

243

17 May 2019

Strong and Simple Baselines for Multimodal Utterance Embeddings

North American Chapter of the Association for Computational Linguistics (NAACL), 2019

Louis-Philippe Morency

SSL

165

14 May 2019