VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents | Computer Vision and Pattern Recognition (CVPR), 2025 |
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings | Computer Vision and Pattern Recognition (CVPR), 2025 |
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Audio-Language Datasets of Scenes and Events: A Survey | IEEE Access, 2024 |
R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
RECAP: Retrieval-Augmented Audio Captioning | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
Training Audio Captioning Models without Audio | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
Enhance Temporal Relations in Audio Captioning with Sound Event Detection | Interspeech (Interspeech), 2023 |
Listen, Think, and Understand | International Conference on Learning Representations (ICLR), 2023 |
Prefix Tuning for Automated Audio Captioning | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
eP-ALM: Efficient Perceptual Augmentation of Language Models | IEEE International Conference on Computer Vision (ICCV), 2023 |
Retrieving Multimodal Information for Augmented Generation: A Survey | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data | ACM Multimedia (ACM MM), 2023 |
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022 |
Automated Audio Captioning: An Overview of Recent Progress and New Challenges | EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process.), 2022 |
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning | IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 |
Leveraging Pre-trained BERT for Audio Captioning | European Signal Processing Conference (EUSIPCO), 2022 |
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 |
Audio Captioning Transformer | Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021 |
Audio Retrieval with Natural Language Queries | Interspeech (Interspeech), 2021 |
MusCaps: Generating Captions for Music Audio | IEEE International Joint Conference on Neural Networks (IJCNN), 2021 |