ResearchTrend.AI
Rethinking CNN Models for Audio Classification

22 July 2020
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
    SSL
arXiv:2007.11154

Papers citing "Rethinking CNN Models for Audio Classification"

48 / 48 papers shown
Title
When Vision Models Meet Parameter Efficient Look-Aside Adapters Without Large-Scale Audio Pretraining
Juan Yeo
Jinkwan Jang
Kyubyung Chae
Seongkyu Mun
Taesup Kim
VLM
57
0
0
08 Dec 2024
Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano
Zhenyi Hou
Xu Zhao
Kejie Ye
Xinyu Sheng
Shanggerile Jiang
...
Jiaxing Chen
Yan Zou
Yuchao Feng
Guangyu Fan
Xin Yuan
DiffM
30
1
0
30 Oct 2024
Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers
Qian Wang
Zhaoyang Bu
Jiaxuan Mao
Wenyu Zhu
Jingya Zhao
Wei Du
Guochao Shi
Min Zhou
Si Chen
Jieming Qu
MedIm
37
0
0
28 Aug 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
34
1
0
16 Jun 2024
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
19
1
0
11 Jun 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
48
9
0
20 May 2024
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu
Mikihiro Tanaka
Kent Fujiwara
ViT
34
2
0
08 May 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
16
1
0
07 Feb 2024
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
DiffM
34
16
0
07 Dec 2023
A Holistic Evaluation of Piano Sound Quality
Monan Zhou
Shangda Wu
Shaohua Ji
Zijin Li
Wei Li
21
0
0
07 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
27
201
0
03 Oct 2023
Real-Time Emergency Vehicle Detection using Mel Spectrograms and Regular Expressions
Alberto Pacheco-Gonzalez
Raymund J. Torres
R. Chacon
Isidro Robledo
14
1
0
25 Sep 2023
A Large-scale Dataset for Audio-Language Representation Learning
Luoyi Sun
Xuenan Xu
Mengyue Wu
Weidi Xie
21
20
0
20 Sep 2023
AudRandAug: Random Image Augmentations for Audio Classification
Teerath Kumar
Muhammad Turab
Alessandra Mileo
Malika Bendechache
Takfarinas Saber
8
7
0
09 Sep 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
16
4
0
23 Aug 2023
Improving Primate Sounds Classification using Binary Presorting for Deep Learning
Michael Kölle
Steffen Illium
Maximilian Zorn
Jonas Nüßlein
Patrick Suchostawski
Claudia Linnhoff-Popien
25
1
0
28 Jun 2023
Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
Benjamin Walker
Felix H. Krones
Ivan Kiskin
Guy Parsons
Terry Lyons
Adam Mahdi
4
15
0
26 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
28
4
0
22 May 2023
Towards Controllable Audio Texture Morphing
Chitralekha Gupta
Purnima Kamath
Yize Wei
Zhuoyao Li
Suranga Nanayakkara
L. Wyse
19
4
0
23 Apr 2023
Detection and classification of vocal productions in large scale audio recordings
Guillem Bonafos
Pierre Pudlo
Jean-Marc Freyermuth
T. Legou
J. Fagot
Samuel Tronçon
Arnaud Rey
AI4TS
11
1
0
14 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
17
1
0
07 Feb 2023
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
28
0
0
19 Jul 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
9
9
0
27 Jun 2022
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Byeonggeun Kim
Seunghan Yang
Jangho Kim
Hyunsin Park
Juntae Lee
Simyung Chang
36
28
0
24 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
15
9
0
23 Jun 2022
Investigating Multi-Feature Selection and Ensembling for Audio Classification
Muhammad Turab
Teerath Kumar
Malika Bendechache
Takfarinas Saber
25
41
0
15 Jun 2022
Deep Learning-based automated classification of Chinese Speech Sound Disorders
Yao-Ming Kuo
S. Ruan
Yu-Chin Chen
Ya-Wen Tu
14
6
0
24 May 2022
UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio
Jiangeng Chang
Y. Ruan
Shaoze Cui
John Soong Tshon Yit
Mengling Feng
24
6
0
16 Apr 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
25
29
0
13 Mar 2022
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study
Daniel C. Tompkins
Kshitiz Kumar
Jian Wu
13
5
0
07 Feb 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
14
11
0
17 Jan 2022
Deep Learning for Enhanced Scratch Input
Aman Bhargava
Alice Zhou
Adam Carnaffan
Steve Mann
10
0
0
30 Nov 2021
NeuroView: Explainable Deep Network Decision Making
C. Barberan
Randall Balestriero
Richard G. Baraniuk
FAtt
6
2
0
15 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
41
27
0
14 Oct 2021
Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee
Kyogu Lee
18
3
0
29 Sep 2021
A Visual Domain Transfer Learning Approach for Heartbeat Sound Classification
Uddipan Mukherjee
Sidharth Pancholi
22
0
0
28 Jul 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
25
359
0
24 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
14
3
0
21 Jun 2021
Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead
Arian Bakhtiarnia
Qi Zhang
Alexandros Iosifidis
25
35
0
19 May 2021
ESResNe(X)t-fbsp: Learning Robust Time-Frequency Transformation of Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
11
38
0
23 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
24
12
0
20 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
14
829
0
05 Apr 2021
SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification
Alireza Nasiri
Jianjun Hu
12
17
0
02 Mar 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
99
144
0
02 Feb 2021
Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems
Ruan van der Merwe
13
7
0
03 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
268
5,660
0
05 Dec 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
435
15,631
0
02 Nov 2015