Title |
---|
![]() WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Shengpeng Ji Ziyue Jiang Xize Cheng Yifu Chen Minghui Fang ...Rongjie Huang Yidi Jiang Qian Chen Zhou Zhao Zhou Zhao |
![]() Siamese Vision Transformers are Scalable Audio-visual Learners Yan-Bo Lin Gedas Bertasius |