2209.03430
Cited By
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
7 September 2022
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
Papers citing
"Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions"
24 / 24 papers shown
Title
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction
Jay Lee
Hanqi Su
Dai-Yan Ji
Takanobu Minami
AI4CE
46
0
0
03 Apr 2025
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Wei Dai
Peilin Chen
Malinda Lu
Daniel Li
Haowen Wei
Hejie Cui
Paul Pu Liang
LM&MA
44
1
0
09 Mar 2025
TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration
Xin Zhang
Liangxiu Han
Stephen White
Saad Hassan
Philip A Kalra
James Ritchie
Carl Diver
Jennie Shorley
70
1
0
24 Feb 2025
Modality Interactive Mixture-of-Experts for Fake News Detection
Yifan Liu
Y. Liu
Z. Li
Ruichen Yao
Yang Zhang
Dong Wang
MoE
31
0
0
21 Jan 2025
MM-Path: Multi-modal, Multi-granularity Path Representation Learning -- Extended Version
Ronghui Xu
Hanyin Cheng
Chenjuan Guo
Hongfan Gao
J. Hu
Sean Bin Yang
Bin Yang
69
4
0
03 Jan 2025
Progressive Compositionality in Text-to-Image Generative Models
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
93
2
0
22 Oct 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
77
14
0
30 Jan 2024
MultiIoT: Benchmarking Machine Learning for the Internet of Things
Shentong Mo
Louis-Philippe Morency
Russ Salakhutdinov
Paul Pu Liang
15
1
0
10 Nov 2023
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Yadong Lu
Chunyuan Li
Haotian Liu
Jianwei Yang
Jianfeng Gao
Yelong Shen
MLLM
94
31
0
18 Sep 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language
Bing-Huang Su
Dazhao Du
Zhao-Qing Yang
Yujie Zhou
Jiangmeng Li
Anyi Rao
Haoran Sun
Zhiwu Lu
Ji-Rong Wen
40
107
0
12 Sep 2022
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
159
152
0
07 Aug 2021
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
M. Bronstein
Joan Bruna
Taco S. Cohen
Petar Veličković
GNN
163
1,095
0
27 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021
MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records
Zhen Xu
David R. So
Andrew M. Dai
Mamba
48
51
0
03 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
75
110
0
31 Jan 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
264
1,798
0
14 Dec 2020
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
198
607
0
03 Sep 2019
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva
Yoav Goldberg
Jonathan Berant
232
306
0
21 Aug 2019
Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features
S. Palazzo
C. Spampinato
I. Kavasidis
D. Giordano
Joseph Schmidt
M. Shah
94
109
0
25 Oct 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
141
1,458
0
06 Jun 2016