ResearchTrend.AI
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

21 December 2019
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM, SSL

Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"

50 / 545 papers shown
Interpretability Analysis of Deep Models for COVID-19 Detection
Daniel Peixoto Pinto da Silva
Edresson Casanova
L. Gris
A. Júnior
Marcelo Finger
...
Beatriz Raposo
Marcus Martins
S. Aluísio
L. Berti
João Paulo Teixeira
60
3
0
25 Nov 2022
Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers
Khaled Koutini
Shahed Masoudian
Florian Schmid
Hamid Eghbalzadeh
Jan Schluter
Gerhard Widmer
131
6
0
25 Nov 2022
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Sara Atito
Muhammad Awais
Wenwu Wang
Mark D. Plumbley
J. Kittler
ViT
64
11
0
23 Nov 2022
Ontology-aware Learning and Evaluation for Audio Tagging
Haohe Liu
Qiuqiang Kong
Xubo Liu
Xinhao Mei
Wenwu Wang
Mark D. Plumbley
40
4
0
22 Nov 2022
Impact of visual assistance for automated audio captioning
Wim Boes
Hugo Van hamme
52
1
0
18 Nov 2022
SpectNet : End-to-End Audio Signal Classification Using Learnable Spectrograms
Md. Istiaq Ansari
Taufiq Hasan
38
5
0
17 Nov 2022
Music Instrument Classification Reprogrammed
Hsin-Hung Chen
Alexander Lerch
83
4
0
15 Nov 2022
Describing emotions with acoustic property prompts for speech emotion recognition
Hira Dhamyal
Benjamin Elizalde
Soham Deshmukh
Huaming Wang
Bhiksha Raj
Rita Singh
57
10
0
14 Nov 2022
The Birds Need Attention Too: Analysing usage of Self Attention in identifying bird calls in soundscapes
Chandra Kanth Nagesh
Abhishek Purushothama
48
2
0
14 Nov 2022
Is my automatic audio captioning system so bad? spider-max: a metric to consider several caption candidates
Etienne Labbé
Thomas Pellegrini
J. Pinquier
32
4
0
14 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
190
544
0
12 Nov 2022
Investigations in Audio Captioning: Addressing Vocabulary Imbalance and Evaluating Suitability of Language-Centric Performance Metrics
Sandeep Reddy Kothinti
Dimitra Emmanouilidou
30
3
0
12 Nov 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid
Khaled Koutini
Gerhard Widmer
ViT
86
60
0
09 Nov 2022
On Negative Sampling for Contrastive Audio-Text Retrieval
Huang Xie
Okko Räsänen
Tuomas Virtanen
55
7
0
08 Nov 2022
Exploring Train and Test-Time Augmentations for Audio-Language Learning
Eungbeom Kim
Jinhee Kim
Yoori Oh
Kyungsu Kim
Minju Park
Jaeheon Sim
J. Lee
Kyogu Lee
40
12
0
31 Oct 2022
Introducing topography in convolutional neural networks
Maxime Poli
Emmanuel Dupoux
Rachid Riad
64
0
0
28 Oct 2022
Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning
Ilyass Moummad
Nicolas Farrugia
97
23
0
27 Oct 2022
Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification
Yuanbo Hou
Siyang Song
Chuan Yu
Yuxin Song
Wenwu Wang
Dick Botteldooren
44
3
0
27 Oct 2022
Pretrained audio neural networks for Speech emotion recognition in Portuguese
M. Gauy
Marcelo Finger
26
4
0
26 Oct 2022
Neural Sound Field Decomposition with Super-resolution of Sound Direction
Qiuqiang Kong
Shilei Liu
Junjie Shi
Xuzhou Ye
Yin Cao
Qiaoxi Zhu
Yong-mei Xu
Yuxuan Wang
48
0
0
22 Oct 2022
Play It Back: Iterative Attention for Audio Recognition
Alexandros Stergiou
Dima Damen
83
4
0
20 Oct 2022
Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing
Georgios Rizos
J. Lawson
Simon Mitchell
Pranay Shah
Xin Wen
Cristina Banks‐Leite
R. Ewers
Bjoern W. Schuller
UQCV
57
2
0
19 Oct 2022
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
L. D. Pham
Dusan Salovic
Anahid N. Jalali
Alexander Schindler
Khoa Tran
H. Vu
Phu X. Nguyen
57
5
0
16 Oct 2022
Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system
Francesca Ronchini
Samuele Cornell
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
D. Ellis
49
14
0
14 Oct 2022
Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation
Bagus Tris Atmaja
Zanjabila
Suyanto
A. Sasou
70
1
0
12 Oct 2022
Supervised and Unsupervised Learning of Audio Representations for Music Understanding
Matthew C. McCallum
Filip Korzeniowski
Sergio Oramas
F. Gouyon
Andreas F. Ehmann
SSL
135
40
0
07 Oct 2022
Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval
Benno Weck
Miguel Pérez Fernández
Holger Kirchhoff
Xavier Serra
58
3
0
06 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
75
7
0
04 Oct 2022
Simple Pooling Front-ends For Efficient Audio Classification
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Mark D. Plumbley
Wenwu Wang
102
17
0
03 Oct 2022
Contrastive Audio-Visual Masked Autoencoder
Yuan Gong
Andrew Rouditchenko
Alexander H. Liu
David Harwath
Leonid Karlinsky
Hilde Kuehne
James R. Glass
122
128
0
02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
60
1
0
30 Sep 2022
Audio Retrieval with WavText5K and CLAP Training
Soham Deshmukh
Benjamin Elizalde
Huaming Wang
3DV, CLIP
181
53
0
28 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo FIDALGO
78
3
0
28 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
64
5
0
26 Sep 2022
UniKW-AT: Unified Keyword Spotting and Audio Tagging
Heinrich Dinkel
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Yujun Wang
60
3
0
23 Sep 2022
Language-based Audio Retrieval Task in DCASE 2022 Challenge
Huang Xie
Samuel Lipping
Tuomas Virtanen
115
18
0
20 Sep 2022
Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations
Paul Primus
Gerhard Widmer
51
6
0
24 Aug 2022
Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers
Paul Primus
Gerhard Widmer
VLM
110
5
0
24 Aug 2022
Fall Detection from Audios with Audio Transformers
Prabhjot Kaur
Qifan Wang
Weisong Shi
42
18
0
23 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
66
0
0
18 Aug 2022
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
58
0
0
12 Aug 2022
Seeing your sleep stage: cross-modal distillation from EEG to infrared video
Jianan Han
Shenmin Zhang
Aidong Men
Yang Liu
Z. Yao
Yan-Tao Yan
Qingchao Chen
68
4
0
11 Aug 2022
Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis
Jia Li
Ziyang Zhang
Jun Lang
Yueqi Jiang
Liuwei An
...
Sheng Gao
Jie Lin
Chunxiao Fan
Xiao Sun
Meng Wang
101
32
0
05 Aug 2022
Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning
Haohe Liu
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
64
9
0
21 Jul 2022
Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
112
2
0
20 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
54
0
0
19 Jul 2022
Visually-aware Acoustic Event Detection using Heterogeneous Graphs
A. Shirian
Krishna Somandepalli
Victor Sanchez
T. Guha
61
3
0
16 Jul 2022
Segment-level Metric Learning for Few-shot Bioacoustic Event Detection
Haohe Liu
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
74
8
0
15 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
142
290
0
13 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use
Jan Schluter
Gerald Gutenbrunner
VLM
58
13
0
12 Jul 2022