Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition. Spoken Language Technology Workshop (SLT), 2024
Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition. Interspeech, 2024