A Multimodal-Multitask Framework with Cross-modal Relation and Hierarchical Interactive Attention for Semantic Comprehension | Information Fusion (Inf. Fusion), 2025 |
Multi-modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models | AAAI Conference on Artificial Intelligence (AAAI), 2023 |
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels | Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval | ACM Transactions on Knowledge Discovery from Data (TKDD), 2023 |