arXiv:2311.11375
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
19 November 2023
Xuxin Cheng
Bowen Cao
Qichen Ye
Zhihong Zhu
Hongxiang Li
Yuexian Zou
Papers citing "ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding"
20 / 20 papers shown
Title
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
Dongchao Yang
Dingdong Wang
Haohan Guo
Xueyuan Chen
Xixin Wu
Helen M. Meng
44
24
0
04 Jun 2024
Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
Xuxin Cheng
Wanshi Xu
Zhihong Zhu
Hongxiang Li
Yuexian Zou
50
13
0
31 May 2024
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition
Zuan Gao
Yuxin Wang
Yadong Qu
Boqiang Zhang
Zixiao Wang
Jianjun Xu
Hongtao Xie
ViT
40
9
0
09 May 2024
V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection
Xuanyu Zhang
You-song Xu
Runyi Li
Jiwen Yu
Weiqi Li
Zhipei Xu
Jian Andrew Zhang
VGen
25
16
0
25 Apr 2024
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
Hang Zhao
Yifei Xin
Zhesong Yu
Bilei Zhu
Lu Lu
Zejun Ma
AuLLM
23
4
0
12 Feb 2024
A Frustratingly Easy Plug-and-Play Detection-and-Reasoning Module for Chinese Spelling Check
Haojing Huang
Jingheng Ye
Qingyu Zhou
Yinghui Li
Yangning Li
Feng Zhou
Hai-Tao Zheng
LRM
18
13
0
13 Oct 2023
I²KD-SLU: An Intra-Inter Knowledge Distillation Framework for Zero-Shot Cross-Lingual Spoken Language Understanding
Tianjun Mao
Chenghong Zhang
VLM
17
0
0
04 Oct 2023
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM
CLIP
22
13
0
25 Aug 2023
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
Yifei Xin
Yuexian Zou
36
9
0
28 Jul 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
Hongxiang Li
Meng Cao
Xuxin Cheng
Yaowei Li
Zhihong Zhu
Yuexian Zou
11
20
0
26 Jul 2023
CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography
Jiwen Yu
Xuanyu Zhang
You-song Xu
Jian Zhang
DiffM
22
44
0
26 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
15
4
0
16 May 2023
Improving Weakly Supervised Sound Event Detection with Causal Intervention
Yifei Xin
Dongchao Yang
Fan Cui
Yujun Wang
Yuexian Zou
CML
46
8
0
10 Mar 2023
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Dongchao Yang
Songxiang Liu
Jianwei Yu
Helin Wang
Chao Weng
Yuexian Zou
DiffM
VLM
18
18
0
04 Nov 2022
Error Correction in ASR using Sequence-to-Sequence Models
S. Dutta
Shreyansh Jain
Ayush Maheshwari
Souvik Pal
Ganesh Ramakrishnan
P. Jyothi
KELM
31
29
0
02 Feb 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
111
262
0
02 Feb 2022
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
47
96
0
07 Nov 2021
Towards Joint Intent Detection and Slot Filling via Higher-order Attention
Dongsheng Chen
Zhiqi Huang
Xian Wu
Shen Ge
Yuexian Zou
14
20
0
18 Sep 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
229
3,029
0
09 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
922
0
24 Sep 2019