2007.11154
Rethinking CNN Models for Audio Classification
22 July 2020
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
Papers citing
"Rethinking CNN Models for Audio Classification"
48 / 48 papers shown
Title
When Vision Models Meet Parameter Efficient Look-Aside Adapters Without Large-Scale Audio Pretraining
Juan Yeo
Jinkwan Jang
Kyubyung Chae
Seongkyu Mun
Taesup Kim
VLM
57
0
0
08 Dec 2024
Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano
Zhenyi Hou
Xu Zhao
Kejie Ye
Xinyu Sheng
Shanggerile Jiang
...
Jiaxing Chen
Yan Zou
Yuchao Feng
Guangyu Fan
Xin Yuan
DiffM
32
1
0
30 Oct 2024
Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers
Qian Wang
Zhaoyang Bu
Jiaxuan Mao
Wenyu Zhu
Jingya Zhao
Wei Du
Guochao Shi
Min Zhou
Si Chen
Jieming Qu
MedIm
39
0
0
28 Aug 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
34
1
0
16 Jun 2024
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
21
1
0
11 Jun 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
48
9
0
20 May 2024
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu
Mikihiro Tanaka
Kent Fujiwara
ViT
34
2
0
08 May 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
16
1
0
07 Feb 2024
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
DiffM
34
16
0
07 Dec 2023
A Holistic Evaluation of Piano Sound Quality
Monan Zhou
Shangda Wu
Shaohua Ji
Zijin Li
Wei Li
21
0
0
07 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
27
201
0
03 Oct 2023
Real-Time Emergency Vehicle Detection using Mel Spectrograms and Regular Expressions
Alberto Pacheco-Gonzalez
Raymund J. Torres
R. Chacon
Isidro Robledo
16
1
0
25 Sep 2023
A Large-scale Dataset for Audio-Language Representation Learning
Luoyi Sun
Xuenan Xu
Mengyue Wu
Weidi Xie
21
20
0
20 Sep 2023
AudRandAug: Random Image Augmentations for Audio Classification
Teerath Kumar
Muhammad Turab
Alessandra Mileo
Malika Bendechache
Takfarinas Saber
8
7
0
09 Sep 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
16
4
0
23 Aug 2023
Improving Primate Sounds Classification using Binary Presorting for Deep Learning
Michael Kolle
Steffen Illium
Maximilian Zorn
Jonas Nusslein
Patrick Suchostawski
Claudia Linnhoff-Popien
25
1
0
28 Jun 2023
Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
Benjamin Walker
Felix H. Krones
Ivan Kiskin
Guy Parsons
Terry Lyons
Adam Mahdi
4
15
0
26 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
28
4
0
22 May 2023
Towards Controllable Audio Texture Morphing
Chitralekha Gupta
Purnima Kamath
Yize Wei
Zhuoyao Li
Suranga Nanayakkara
L. Wyse
21
4
0
23 Apr 2023
Detection and classification of vocal productions in large scale audio recordings
Guillem Bonafos
Pierre Pudlo
Jean-Marc Freyermuth
T. Legou
J. Fagot
Samuel Tronçon
Arnaud Rey
AI4TS
11
1
0
14 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
17
1
0
07 Feb 2023
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
28
0
0
19 Jul 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
9
9
0
27 Jun 2022
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Byeonggeun Kim
Seunghan Yang
Jangho Kim
Hyunsin Park
Juntae Lee
Simyung Chang
36
28
0
24 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
17
9
0
23 Jun 2022
Investigating Multi-Feature Selection and Ensembling for Audio Classification
Muhammad Turab
Teerath Kumar
Malika Bendechache
Takfarinas Saber
25
41
0
15 Jun 2022
Deep Learning-based automated classification of Chinese Speech Sound Disorders
Yao-Ming Kuo
S. Ruan
Yu-Chin Chen
Ya-Wen Tu
14
6
0
24 May 2022
UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio
Jiangeng Chang
Y. Ruan
Shaoze Cui
John Soong Tshon Yit
Mengling Feng
24
6
0
16 Apr 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
25
29
0
13 Mar 2022
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study
Daniel C. Tompkins
Kshitiz Kumar
Jian Wu
15
5
0
07 Feb 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
14
11
0
17 Jan 2022
Deep Learning for Enhanced Scratch Input
Aman Bhargava
Alice Zhou
Adam Carnaffan
Steve Mann
10
0
0
30 Nov 2021
NeuroView: Explainable Deep Network Decision Making
C. Barberan
Randall Balestriero
Richard G. Baraniuk
FAtt
8
2
0
15 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
41
27
0
14 Oct 2021
Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee
Kyogu Lee
18
3
0
29 Sep 2021
A Visual Domain Transfer Learning Approach for Heartbeat Sound Classification
Uddipan Mukherjee
Sidharth Pancholi
24
0
0
28 Jul 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
25
359
0
24 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
14
3
0
21 Jun 2021
Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead
Arian Bakhtiarnia
Qi Zhang
Alexandros Iosifidis
25
35
0
19 May 2021
ESResNe(X)t-fbsp: Learning Robust Time-Frequency Transformation of Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
11
38
0
23 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
24
12
0
20 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
17
829
0
05 Apr 2021
SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification
Alireza Nasiri
Jianjun Hu
14
17
0
02 Mar 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
99
144
0
02 Feb 2021
Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems
Ruan van der Merwe
20
7
0
03 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
268
5,660
0
05 Dec 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
437
15,631
0
02 Nov 2015