Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning | IEEE Transactions on Multimedia (IEEE TMM), 2023
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition | IEEE Transactions on Multimedia (IEEE TMM), 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning | Pattern Recognition (Pattern Recogn.), 2023
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval | IEEE Transactions on Multimedia (IEEE TMM), 2022
OSIC: A New One-Stage Image Captioner Coined | International Joint Conference on Artificial Intelligence (IJCAI), 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding | IEEE Transactions on Multimedia (IEEE TMM), 2022