
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

21 December 2019

Yuxuan Wang


Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"

50 / 545 papers shown

Title
Interpretability Analysis of Deep Models for COVID-19 Detection Daniel Peixoto Pinto da Silva Edresson Casanova L. Gris A. Júnior Marcelo Finger ... Beatriz Raposo Marcus Martins S. Aluísio L. Berti João Paulo Teixeira 60 3 0 25 Nov 2022
Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers Khaled Koutini Shahed Masoudian Florian Schmid Hamid Eghbalzadeh Jan Schluter Gerhard Widmer 131 6 0 25 Nov 2022
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification Sara Atito Muhammad Awais Wenwu Wang Mark D. Plumbley J. Kittler ViT 64 11 0 23 Nov 2022
Ontology-aware Learning and Evaluation for Audio Tagging Haohe Liu Qiuqiang Kong Xubo Liu Xinhao Mei Wenwu Wang Mark D. Plumbley 40 4 0 22 Nov 2022
Impact of visual assistance for automated audio captioning Wim Boes Hugo Van hamme 52 1 0 18 Nov 2022
SpectNet: End-to-End Audio Signal Classification Using Learnable Spectrograms Md. Istiaq Ansari Taufiq Hasan 38 5 0 17 Nov 2022
Music Instrument Classification Reprogrammed Hsin-Hung Chen Alexander Lerch 83 4 0 15 Nov 2022
Describing emotions with acoustic property prompts for speech emotion recognition Hira Dhamyal Benjamin Elizalde Soham Deshmukh Huaming Wang Bhiksha Raj Rita Singh 57 10 0 14 Nov 2022
The Birds Need Attention Too: Analysing usage of Self Attention in identifying bird calls in soundscapes Chandra Kanth Nagesh Abhishek Purushothama 48 2 0 14 Nov 2022
Is my automatic audio captioning system so bad? spider-max: a metric to consider several caption candidates Etienne Labbé Thomas Pellegrini J. Pinquier 32 4 0 14 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Yusong Wu Kai Chen Tianyu Zhang Yuchen Hui Marianna Nezhurina Taylor Berg-Kirkpatrick Shlomo Dubnov CLIP 190 544 0 12 Nov 2022
Investigations in Audio Captioning: Addressing Vocabulary Imbalance and Evaluating Suitability of Language-Centric Performance Metrics Sandeep Reddy Kothinti Dimitra Emmanouilidou 30 3 0 12 Nov 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation Florian Schmid Khaled Koutini Gerhard Widmer ViT 86 60 0 09 Nov 2022
On Negative Sampling for Contrastive Audio-Text Retrieval Huang Xie Okko Räsänen Tuomas Virtanen 55 7 0 08 Nov 2022
Exploring Train and Test-Time Augmentations for Audio-Language Learning Eungbeom Kim Jinhee Kim Yoori Oh Kyungsu Kim Minju Park Jaeheon Sim J. Lee Kyogu Lee 40 12 0 31 Oct 2022
Introducing topography in convolutional neural networks Maxime Poli Emmanuel Dupoux Rachid Riad 64 0 0 28 Oct 2022
Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning Ilyass Moummad Nicolas Farrugia 97 23 0 27 Oct 2022
Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification Yuanbo Hou Siyang Song Chuan Yu Yuxin Song Wenwu Wang Dick Botteldooren 44 3 0 27 Oct 2022
Pretrained audio neural networks for Speech emotion recognition in Portuguese M. Gauy Marcelo Finger 26 4 0 26 Oct 2022
Neural Sound Field Decomposition with Super-resolution of Sound Direction Qiuqiang Kong Shilei Liu Junjie Shi Xuzhou Ye Yin Cao Qiaoxi Zhu Yong-mei Xu Yuxuan Wang 48 0 0 22 Oct 2022
Play It Back: Iterative Attention for Audio Recognition Alexandros Stergiou Dima Damen 83 4 0 20 Oct 2022
Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing Georgios Rizos J. Lawson Simon Mitchell Pranay Shah Xin Wen Cristina Banks‐Leite R. Ewers Bjoern W. Schuller UQCV 57 2 0 19 Oct 2022
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context L. D. Pham Dusan Salovic Anahid N. Jalali Alexander Schindler Khoa Tran H. Vu Phu X. Nguyen 57 5 0 16 Oct 2022
Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system Francesca Ronchini Samuele Cornell Romain Serizel Nicolas Turpault Eduardo Fonseca D. Ellis 49 14 0 14 Oct 2022
Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation Bagus Tris Atmaja Zanjabila Suyanto A. Sasou 70 1 0 12 Oct 2022
Supervised and Unsupervised Learning of Audio Representations for Music Understanding Matthew C. McCallum Filip Korzeniowski Sergio Oramas F. Gouyon Andreas F. Ehmann SSL 135 40 0 07 Oct 2022
Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval Benno Weck Miguel Pérez Fernández Holger Kirchhoff Xavier Serra 58 3 0 06 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification Haohe Liu Xubo Liu Qiuqiang Kong Wenwu Wang Mark D. Plumbley 75 7 0 04 Oct 2022
Simple Pooling Front-ends For Efficient Audio Classification Xubo Liu Haohe Liu Qiuqiang Kong Xinhao Mei Mark D. Plumbley Wenwu Wang 102 17 0 03 Oct 2022
Contrastive Audio-Visual Masked Autoencoder Yuan Gong Andrew Rouditchenko Alexander H. Liu David Harwath Leonid Karlinsky Hilde Kuehne James R. Glass 122 128 0 02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations Heinrich Dinkel Zhiyong Yan Yongqing Wang Junbo Zhang Yujun Wang 60 1 0 30 Sep 2022
Audio Retrieval with WavText5K and CLAP Training Soham Deshmukh Benjamin Elizalde Huaming Wang 3DV CLIP 181 53 0 28 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks Andrés Vasco-Carofilis Laura Fernández-Robles Enrique Alegre Eduardo Fidalgo 78 3 0 28 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations Tung-Yu Wu Chen-An Li Tzu-Han Lin Tsung-Yuan Hsu Hung-yi Lee 64 5 0 26 Sep 2022
UniKW-AT: Unified Keyword Spotting and Audio Tagging Heinrich Dinkel Yongqing Wang Zhiyong Yan Junbo Zhang Yujun Wang 60 3 0 23 Sep 2022
Language-based Audio Retrieval Task in DCASE 2022 Challenge Huang Xie Samuel Lipping Tuomas Virtanen 115 18 0 20 Sep 2022
Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations Paul Primus Gerhard Widmer 51 6 0 24 Aug 2022
Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers Paul Primus Gerhard Widmer VLM 110 5 0 24 Aug 2022
Fall Detection from Audios with Audio Transformers Prabhjot Kaur Qifan Wang Weisong Shi 42 18 0 23 Aug 2022
Pathway to Future Symbiotic Creativity Yi-Ting Guo Qi-fei Liu Jie Chen Wei Xue Jie Fu ... Fernando Rosas Jeffrey Shaw Xing Wu Jiji Zhang Jianliang Xu 66 0 0 18 Aug 2022
An investigation on selecting audio pre-trained models for audio captioning Peiran Yan Sheng-Wei Li 58 0 0 12 Aug 2022
Seeing your sleep stage: cross-modal distillation from EEG to infrared video Jianan Han Shenmin Zhang Aidong Men Yang Liu Z. Yao Yan-Tao Yan Qingchao Chen 68 4 0 11 Aug 2022
Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis Jia Li Ziyang Zhang Jun Lang Yueqi Jiang Liuwei An ... Sheng Gao Jie Lin Chunxiao Fan Xiao Sun Meng Wang 101 32 0 05 Aug 2022
Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning Haohe Liu Xubo Liu Xinhao Mei Qiuqiang Kong Wenwu Wang Mark D. Plumbley 64 9 0 21 Jul 2022
Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval Daiki Takeuchi Yasunori Ohishi Daisuke Niizumi Noboru Harada K. Kashino 112 2 0 20 Jul 2022
GAFX: A General Audio Feature eXtractor Zhaoyang Bu Han Zhang Xiaohu Zhu 54 0 0 19 Jul 2022
Visually-aware Acoustic Event Detection using Heterogeneous Graphs A. Shirian Krishna Somandepalli Victor Sanchez T. Guha 61 3 0 16 Jul 2022
Segment-level Metric Learning for Few-shot Bioacoustic Event Detection Haohe Liu Xubo Liu Xinhao Mei Qiuqiang Kong Wenwu Wang Mark D. Plumbley 74 8 0 15 Jul 2022
Masked Autoencoders that Listen Po-Yao (Bernie) Huang Hu Xu Juncheng Billy Li Alexei Baevski Michael Auli Wojciech Galuba Florian Metze Christoph Feichtenhofer 142 290 0 13 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use Jan Schluter Gerald Gutenbrunner VLM 58 13 0 12 Jul 2022