Vision+X: A Survey on Multimodal Learning in the Light of Data

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
5 October 2022
Ye Zhu
Yuehua Wu
Andrii Zadaianchuk
Yan Yan

Papers citing "Vision+X: A Survey on Multimodal Learning in the Light of Data"

15 / 15 papers shown
Caption Injection for Optimization in Generative Search Engine
Xiaolu Chen
Yong Liao
DiffM
88
0
0
06 Nov 2025
Mixup Helps Understanding Multimodal Video Better
Xiaoyu Ma
Ding Ding
Hao Chen
84
0
0
13 Oct 2025
AIM: Adaptive Intra-Network Modulation for Balanced Multimodal Learning
Shu Shen
Chao Chen
Tong Zhang
196
0
0
27 Aug 2025
Principled Multimodal Representation Learning
Xiaohao Liu
Xiaobo Xia
See-Kiong Ng
Tat-Seng Chua
203
6
0
23 Jul 2025
DaMO: A Data-Efficient Multimodal Orchestrator for Temporal Reasoning with Video LLMs
Bo-Cheng Chiu
Jen-Jee Chen
Yu-Chee Tseng
Feng-Chi Chen
261
0
0
13 Jun 2025
Improving Multimodal Learning Balance and Sufficiency through Data Remixing
Xiaoyu Ma
Hao Chen
Yongjian Deng
208
4
0
13 Jun 2025
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model
Jialong Zuo
Yongtai Deng
Mengdan Tan
Rui Jin
Dongyue Wu
Nong Sang
Liang Pan
Changxin Gao
199
0
0
11 Jun 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2025
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
274
3
0
02 Apr 2025
ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization
Neural Information Processing Systems (NeurIPS), 2024
Huayang Huang
Yu Wu
Qian Wang
DiffM, WIGM
407
24
0
06 Nov 2024
CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation
Fuxian Huang
Tao Gui
Shaopeng Zhai
Jie Wang
Tianyi Zhang
Haoran Zhang
Ming Zhou
Yu Liu
Yu Qiao
CLIP, AI4TS
185
0
0
24 Sep 2024
Deep Learning for Video Anomaly Detection: A Review
Peng Wu
Chengyu Pan
Yuting Yan
Guansong Pang
Peng Wang
Yanning Zhang
VLM, AI4TS
180
30
0
09 Sep 2024
A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications
Image and Vision Computing (IVC), 2024
V. Guarrasi
Fatih Aksu
Camillo Maria Caruso
Francesco Di Feola
Aurora Rofena
Filippo Ruffini
Paolo Soda
OffRL, MedIm, AI4CE
153
45
0
02 Aug 2024
Vision-Language Dataset Distillation
Xindi Wu
Byron Zhang
Zhiwei Deng
Olga Russakovsky
DD, VLM
387
14
0
15 Aug 2023
Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
International Conference on Learning Representations (ICLR), 2022
Ye Zhu
Yuehua Wu
Kyle Olszewski
Jian Ren
Sergey Tulyakov
Yan Yan
DiffM
358
56
0
15 Jun 2022
Learning Audio-Visual Correlations from Variational Cross-Modal Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ye Zhu
Yu Wu
Hugo Latapie
Yi Yang
Yan Yan
SSL
232
21
0
05 Feb 2021