2203.15332
Cited By
Balanced Multimodal Learning via On-the-fly Gradient Modulation
Computer Vision and Pattern Recognition (CVPR), 2022
29 March 2022
Xiaokang Peng
Yake Wei
Andong Deng
Dong Wang
Di Hu
Github (274★)
Papers citing "Balanced Multimodal Learning via On-the-fly Gradient Modulation"
43 / 143 papers shown
Title
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
Tiantian Feng
Daniel Yang
Digbalay Bose
Shrikanth Narayanan
242
6
0
14 Feb 2024
Enhancing ID and Text Fusion via Alternative Training in Session-based Recommendation
Juanhui Li
Haoyu Han
Zhikai Chen
Harry Shomer
Wei Jin
Amin Javari
Shucheng Zhou
163
1
0
14 Feb 2024
Quantifying and Enhancing Multi-modal Robustness with Modality Preference
Zequn Yang
Yake Wei
Ce Liang
Di Hu
AAML
288
22
0
09 Feb 2024
Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues
Tianxiang Chen
Zhentao Tan
Tao Gong
Qi Chu
Yue-bo Wu
Bin Liu
Le Lu
Jieping Ye
Nenghai Yu
VOS
202
9
0
04 Feb 2024
Balanced Multi-modal Federated Learning via Cross-Modal Infiltration
Yunfeng Fan
Wenchao Xu
Yining Qi
Jiaqi Zhu
Song Guo
189
4
0
31 Dec 2023
RedCore: Relative Advantage Aware Cross-modal Representation Learning for Missing Modalities with Imbalanced Missing Rates
AAAI Conference on Artificial Intelligence (AAAI), 2023
Junyang Sun
Xinxin Zhang
Shoukang Han
Yunjie Ruan
Taihao Li
180
14
0
16 Dec 2023
More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory
Peiwen Sun
Yifan Zhang
Zishan Liu
Donghao Chen
Honggang Zhang
245
0
0
12 Dec 2023
Understanding Unimodal Bias in Multimodal Deep Linear Networks
International Conference on Machine Learning (ICML), 2023
Yedi Zhang
Peter E. Latham
Andrew Saxe
248
14
0
01 Dec 2023
Multimodal Representation Learning by Alternating Unimodal Adaptation
Xiaohui Zhang
Jaehong Yoon
Mohit Bansal
Huaxiu Yao
238
67
0
17 Nov 2023
UniCat: Crafting a Stronger Fusion Baseline for Multimodal Re-Identification
Jennifer Crawford
Haoli Yin
Luke McDermott
Daniel Cummings
264
21
0
28 Oct 2023
What Makes for Robust Multi-Modal Models in the Face of Missing Modalities?
Siting Li
Chenzhuang Du
Yue Zhao
Yu Huang
Hang Zhao
187
6
0
10 Oct 2023
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
218
2
0
08 Oct 2023
Enhancing multimodal cooperation via sample-level modality valuation
Computer Vision and Pattern Recognition (CVPR), 2023
Yake Wei
Ruoxuan Feng
Zihe Wang
Di Hu
409
48
0
12 Sep 2023
Decoupling Common and Unique Representations for Multimodal Self-supervised Learning
European Conference on Computer Vision (ECCV), 2023
Yi Wang
C. Albrecht
Nassim Ait Ali Braham
Chenying Liu
Zhitong Xiong
Xiaoxiang Zhu
SSL
283
33
0
11 Sep 2023
Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Zheng Gao
Chen Feng
Ioannis Patras
SSL
199
6
0
25 Aug 2023
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Hong Li
Xingyu Li
Pengbo Hu
Yinuo Lei
Chunxiao Li
Yi Zhou
235
63
0
15 Aug 2023
Progressive Spatio-temporal Perception for Audio-Visual Question Answering
ACM Multimedia (ACM MM), 2023
Guangyao Li
Wenxuan Hou
Di Hu
208
42
0
10 Aug 2023
Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition
ACM Multimedia (ACM MM), 2023
Bobo Li
Hao Fei
Lizi Liao
Yu Zhao
Chong Teng
Tat-Seng Chua
Donghong Ji
Fei Li
173
57
0
08 Aug 2023
FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration
IEEE International Conference on Computer Vision (ICCV), 2023
Zhiji Huang
Sihao Lin
Guiyu Liu
Mukun Luo
Chao Ye
Hang Xu
Xiaojun Chang
Xiaodan Liang
196
14
0
31 Jul 2023
Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation
Baihong Lin
Zengrong Lin
Yulan Guo
Yulan Zhang
Jianxiao Zou
Shicai Fan
178
9
0
17 Jul 2023
Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing
Jie Fu
Junyu Gao
Changsheng Xu
231
17
0
05 Jul 2023
Exploring the Role of Audio in Video Captioning
Yuhan Shen
Linjie Yang
Longyin Wen
Haichao Yu
Ehsan Elhamifar
Heng Wang
152
6
0
21 Jun 2023
Towards Balanced Active Learning for Multimodal Classification
ACM Multimedia (ACM MM), 2023
Meng Shen
Yizheng Huang
Jianxiong Yin
Heqing Zou
D. Rajan
Simon See
161
9
0
14 Jun 2023
Provable Dynamic Fusion for Low-Quality Multimodal Data
International Conference on Machine Learning (ICML), 2023
Qingyang Zhang
Haitao Wu
Changqing Zhang
Qinghua Hu
Huazhu Fu
Qiufeng Wang
Xi Peng
317
103
0
03 Jun 2023
Continual Multimodal Knowledge Graph Construction
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Xiang Chen
Jintian Zhang
Xiaohan Wang
Ningyu Zhang
Tongtong Wu
Luo Si
Yongheng Wang
Huajun Chen
KELM
CLL
205
21
0
15 May 2023
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis
International Journal of Computer Vision (IJCV), 2023
Jinsheng Zheng
Daqing Liu
Chaoyue Wang
Minghui Hu
Zuopeng Yang
Changxing Ding
Dacheng Tao
129
5
0
10 May 2023
Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retrieval
Zhitao Liu
Zengyu Liu
Jiwei Wei
Guan Wang
Zhenjiang Du
Ning Xie
Mengqi Li
111
2
0
07 May 2023
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
International Conference on Machine Learning (ICML), 2023
Chenzhuang Du
Jiaye Teng
Tingle Li
Yichen Liu
Tianyuan Yuan
Yue Wang
Yang Yuan
Hang Zhao
374
70
0
02 May 2023
Robust Cross-Modal Knowledge Distillation for Unconstrained Videos
Wenke Xia
Xingjian Li
Andong Deng
Haoyi Xiong
Dejing Dou
Di Hu
126
7
0
16 Apr 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
370
3
0
12 Apr 2023
Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Yuanhong Chen
Yuyuan Liu
Hu Wang
Fengbei Liu
Chong Wang
Helen Frazer
G. Carneiro
VOS
257
34
0
06 Apr 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
227
42
0
09 Mar 2023
Balanced Audiovisual Dataset for Imbalance Analysis
Wenke Xia
Xu Zhao
Xincheng Pang
Changqing Zhang
Di Hu
255
3
0
14 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
193
1
0
07 Feb 2023
MS-DETR: Multispectral Pedestrian Detection Transformer with Loosely Coupled Fusion and Modality-Balanced Optimization
Yinghui Xing
Song Wang
Shizhou Zhang
Guoqiang Liang
Xiuwei Zhang
Yanning Zhang
ViT
352
21
0
01 Feb 2023
IMKGA-SM: Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling
Yilin Wen
Biao Luo
Yuqian Zhao
231
2
0
06 Jan 2023
PMR: Prototypical Modal Rebalance for Multimodal Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Yunfeng Fan
Wenchao Xu
Yining Qi
Junxiao Wang
Song Guo
1.5K
145
0
14 Nov 2022
Play It Back: Iterative Attention for Audio Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Alexandros Stergiou
Dima Damen
177
5
0
20 Oct 2022
Multimodal Analogical Reasoning over Knowledge Graphs
International Conference on Learning Representations (ICLR), 2022
Ningyu Zhang
Lei Li
Xiang Chen
Xiaozhuan Liang
Shumin Deng
Huajun Chen
325
36
0
01 Oct 2022
Video-based Cross-modal Auxiliary Network for Multimodal Sentiment Analysis
Rongfei Chen
Wenju Zhou
Yang Li
Huiyu Zhou
161
29
0
30 Aug 2022
Make Acoustic and Visual Cues Matter: CH-SIMS v2.0 Dataset and AV-Mixup Consistent Module
International Conference on Multimodal Interaction (ICMI), 2022
Yih-Ling Liu
Ziqi Yuan
Huisheng Mao
Zhiyun Liang
Wanqiuyue Yang
Yuanzhe Qiu
Tie Cheng
Xiaoteng Li
Hua Xu
Kai Gao
160
69
0
22 Aug 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
264
68
0
20 Aug 2022
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
International Conference on Machine Learning (ICML), 2022
Yu Huang
Junyang Lin
Chang Zhou
Hongxia Yang
Longbo Huang
151
141
0
23 Mar 2022