ResearchTrend.AI
Efficient Multimodal Fusion via Interactive Prompting

13 April 2023
Yaowei Li
Ruijie Quan
Linchao Zhu
Yezhou Yang

Papers citing "Efficient Multimodal Fusion via Interactive Prompting"

29 / 29 papers shown
Decoupled Multimodal Prototypes for Visual Recognition with Missing Modalities
Jueqing Lu
Yuanyuan Qi
Xiaohao Yang
Shujie Zhou
Lan Du
14
0
0
13 May 2025
BrainGuard: Privacy-Preserving Multisubject Image Reconstructions from Brain Activities
Zhibo Tian
Ruijie Quan
Fan Ma
Kun Zhan
Yi Yang
29
1
0
24 Jan 2025
Vision Transformer Neural Architecture Search for Out-of-Distribution Generalization: Benchmark and Insights
Sy-Tuyen Ho
Tuan Van Vo
Somayeh Ebrahimkhani
Ngai-man Cheung
32
0
0
08 Jan 2025
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting
Yongqi Wang
Xinxiao Wu
Shuo Yang
Jiebo Luo
50
1
0
19 Sep 2024
FusionSAM: Latent Space driven Segment Anything Model for Multimodal Fusion and Segmentation
Daixun Li
Weiying Xie
Mingxiang Cao
Yunke Wang
Jiaqing Zhang
Yunsong Li
Leyuan Fang
Chang Xu
34
6
0
26 Aug 2024
Multi-modal Crowd Counting via Modal Emulation
Chenhao Wang
Xiaopeng Hong
Zhiheng Ma
Yupeng Wei
Yabin Wang
Xiaopeng Fan
25
1
0
28 Jul 2024
SDPT: Synchronous Dual Prompt Tuning for Fusion-based Visual-Language Pre-trained Models
Yang Zhou
Yongjian Wu
Jiya Saiyin
Bingzheng Wei
Maode Lai
Eric Chang
Yan Xu
VLM
30
0
0
16 Jul 2024
Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and Prompts
Xuyang Wu
Yuan Wang
Hsin-Tai Wu
Zhiqiang Tao
Yi Fang
VLM
32
7
0
25 Jun 2024
Robust Latent Representation Tuning for Image-text Classification
Hao Sun
Yu Song
VLM
39
0
0
10 Jun 2024
Enhancing Emotion Recognition in Conversation through Emotional Cross-Modal Fusion and Inter-class Contrastive Learning
Haoxiang Shi
Xulong Zhang
Ning Cheng
Yong Zhang
Jun Yu
Jing Xiao
Jianzong Wang
21
1
0
28 May 2024
Impact of Stickers on Multimodal Chat Sentiment Analysis and Intent Recognition: A New Task, Dataset and Baseline
Yuanchen Shi
Biao Ma
Fang Kong
16
0
0
14 May 2024
Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation
Yitian Tao
Liyan Ma
Jing Yu
Han Zhang
MedIm
18
5
0
31 Mar 2024
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity
Ruijie Quan
Wenguan Wang
Zhibo Tian
Fan Ma
Yi Yang
34
12
0
29 Mar 2024
Multimodal Infusion Tuning for Large Models
Hao Sun
Yu Song
Xinyao Yu
Jiaqing Liu
Yen-Wei Chen
Lanfen Lin
VLM
27
0
0
08 Mar 2024
LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding
Yuxuan Wang
Yueqian Wang
Pengfei Wu
Jianxin Liang
Dongyan Zhao
Zilong Zheng
VLM
21
9
0
25 Feb 2024
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
Xinyao Yu
Hao Sun
Ziwei Niu
Rui Qin
Zhenjia Bai
Yen-Wei Chen
Lanfen Lin
VLM
13
2
0
26 Jan 2024
Cascaded Cross-Modal Transformer for Audio-Textual Classification
Nicolae-Cătălin Ristea
Andrei Anghel
Radu Tudor Ionescu
17
2
0
15 Jan 2024
Conditional Prompt Tuning for Multimodal Fusion
Ruixia Jiang
Lingbo Liu
Changwen Chen
16
0
0
28 Nov 2023
Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning
Wenlong Deng
Christos Thrampoulidis
Xiaoxiao Li
19
12
0
27 Oct 2023
Text-driven Prompt Generation for Vision-Language Models in Federated Learning
Chen Qiu
Xingyu Li
Chaithanya Kumar Mummadi
Madan Ravi Ganesh
Zhenzhen Li
Lu Peng
Wan-Yi Lin
VLM
FedML
20
11
0
09 Oct 2023
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
12
2
0
08 Oct 2023
ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Yanfang Li
Huan Wang
Muxia Sun
LM&MA
AI4TS
AI4CE
19
44
0
10 May 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
186
521
0
06 Oct 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu
Kaixuan Ji
Yicheng Fu
Weng Lam Tam
Zhengxiao Du
Zhilin Yang
Jie Tang
VLM
236
780
0
14 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
57
238
0
06 Sep 2019