1912.10211
Cited By
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
21 December 2019
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
Github (1475★)
Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"
50 / 545 papers shown
Title
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
169
277
0
02 Feb 2022
Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data
A. Shirian
Krishna Somandepalli
T. Guha
SSL
132
10
0
31 Jan 2022
Anomalous Sound Detection using Spectral-Temporal Information Fusion
Youde Liu
Jian Guan
Qiaoxi Zhu
Wenwu Wang
67
58
0
14 Jan 2022
Local Information Assisted Attention-free Decoder for Audio Captioning
Feiyang Xiao
Jian Guan
Haiyan Lan
Qiaoxi Zhu
Wenwu Wang
98
11
0
10 Jan 2022
An Ensemble of Deep Learning Frameworks Applied For Predicting Respiratory Anomalies
L. D. Pham
Dat Ngo
T. Hoang
Alexander Schindler
Ian Mcloughlin
69
5
0
09 Jan 2022
Detect what you want: Target Sound Detection
Dongchao Yang
Helin Wang
Yuexian Zou
Fan Cui
Chao Weng
95
7
0
19 Dec 2021
Audio Retrieval with Natural Language Queries: A Benchmark Study
A. Sophia Koepke
Andreea-Maria Oncescu
João F. Henriques
Zeynep Akata
Samuel Albanie
76
102
0
17 Dec 2021
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
Yusong Wu
Ethan Manilow
Yi Deng
Rigel Swavely
Kyle Kastner
Tim Cooijmans
Aaron Courville
Cheng-Zhi Anna Huang
Jesse Engel
87
45
0
17 Dec 2021
Chimpanzee voice prints? Insights from transfer learning experiments from human voices
Maël Leroux
Orestes Uxio Gutierrez Al-Khudhairy
N. Perony
S. Townsend
16
7
0
15 Dec 2021
Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
109
46
0
15 Dec 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
124
71
0
23 Nov 2021
Effect of noise suppression losses on speech distortion and ASR performance
Sebastian Braun
H. Gamper
56
21
0
23 Nov 2021
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays
Thi Ngoc Tho Nguyen
Douglas L. Jones
Karn N. Watcharasupat
Huy P Phan
W. Gan
70
37
0
16 Nov 2021
Who calls the shots? Rethinking Few-Shot Learning for Audio
Yu Wang
Nicholas J. Bryan
Justin Salamon
M. Cartwright
J. P. Bello
VLM
130
25
0
18 Oct 2021
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks
Sangeeta Srivastava
Yun Wang
Andros Tjandra
Anurag Kumar
Chunxi Liu
Kritika Singh
Yatharth Saraf
SSL
99
25
0
14 Oct 2021
Diverse Audio Captioning via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
GAN
102
28
0
13 Oct 2021
Multistage linguistic conditioning of convolutional layers for speech emotion recognition
Andreas Triantafyllopoulos
U. Reichel
Shuo Liu
Simon Huber
F. Eyben
Björn W. Schuller
90
11
0
13 Oct 2021
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Zhongjie Ye
Helin Wang
Dongchao Yang
Yuexian Zou
98
28
0
12 Oct 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
100
86
0
11 Oct 2021
Efficient Training of Audio Transformers with Patchout
Khaled Koutini
Jan Schluter
Hamid Eghbalzadeh
Gerhard Widmer
ViT
167
262
0
11 Oct 2021
Can Audio Captions Be Evaluated with Image Caption Metrics?
Zelin Zhou
Zhiling Zhang
Xuenan Xu
Zeyu Xie
Mengyue Wu
Kenny Q. Zhu
68
46
0
10 Oct 2021
A Mutual learning framework for Few-shot Sound Event Detection
Dongchao Yang
Helin Wang
Yuexian Zou
Zhongjie Ye
Wenwu Wang
145
26
0
09 Oct 2021
MusicNet: Compact Convolutional Neural Network for Real-time Background Music Detection
Chandan K. A. Reddy
Vishak Gopa
Harishchandra Dubey
Sergiy Matusevych
Ross Cutler
R. Aichner
42
0
0
08 Oct 2021
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
Ge Zhu
Frank Cwitkowitz
Z. Duan
55
2
0
08 Oct 2021
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
Dawei Liang
Yangyang Shi
Yun Wang
Nayan Singhal
Alex Xiao
Jonathan Shaw
Edison Thomaz
Ozlem Kalinli
M. Seltzer
37
4
0
07 Oct 2021
Fairness and underspecification in acoustic scene classification: The case for disaggregated evaluations
Andreas Triantafyllopoulos
M. Milling
Konstantinos Drossos
Björn W. Schuller
60
7
0
04 Oct 2021
Enriching Ontology with Temporal Commonsense for Low-Resource Audio Tagging
Zhiling Zhang
Zelin Zhou
Haifeng Tang
Guangwei Li
Mengyue Wu
Kenny Q. Zhu
119
4
0
03 Oct 2021
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
Thi Ngoc Tho Nguyen
Karn N. Watcharasupat
Ngoc Khanh Nguyen
Douglas L. Jones
W. Gan
62
49
0
01 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain
Pranay Reddy Samala
Deepak Mittal
Preethi Jyothi
M. Singh
130
11
0
30 Sep 2021
Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
46
5
0
06 Sep 2021
Parsing Birdsong with Deep Audio Embeddings
Irina Tolkova
Brian Chu
Marcel Hedman
Stefan Kahl
Holger Klinck
56
11
0
20 Aug 2021
RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform
Youxuan Ma
Zongze Ren
Shugong Xu
83
40
0
12 Aug 2021
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
68
20
0
10 Aug 2021
The EIHW-GLAM Deep Attentive Multi-model Fusion System for Cough-based COVID-19 Recognition in the DiCOVA 2021 Challenge
Zhao Ren
Yi Chang
Björn W. Schuller
59
0
0
06 Aug 2021
An Encoder-Decoder Based Audio Captioning System With Transfer and Reinforcement Learning
Xinhao Mei
Qiushi Huang
Xubo Liu
Gengyun Chen
Jingqian Wu
...
Tom Ko
H. Tang
Xingkun Shao
Mark D. Plumbley
Wenwu Wang
91
54
0
05 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
74
9
0
03 Aug 2021
Improving Polyphonic Sound Event Detection on Multichannel Recordings with the Sørensen-Dice Coefficient Loss and Transfer Learning
Karn N. Watcharasupat
Thi Ngoc Tho Nguyen
Ngoc Khanh Nguyen
Zhen Jian Lee
Douglas L. Jones
W. Gan
116
0
0
22 Jul 2021
Audio Captioning Transformer
Xinhao Mei
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
ViT
94
78
0
21 Jul 2021
A Multimodal Machine Learning Framework for Teacher Vocal Delivery Evaluation
Hang Li
Yunxing Kang
Y. Hao
Wenbiao Ding
Zhongqin Wu
Zitao Liu
40
4
0
15 Jul 2021
Weakly-Supervised Classification and Detection of Bird Sounds in the Wild. A BirdCLEF 2021 Solution
Marcos V. Conde
K. Shubham
Prateek Agnihotri
N. D. Movva
S. Bessenyei
22
15
0
10 Jul 2021
Multi-modal Affect Analysis using standardized data within subjects in the Wild
Sachihiro Youoku
Takahisa Yamamoto
Junya Saito
A. Uchida
Xiaoyue Mi
Ziqiang Shi
Liu Liu
Zhongling Liu
Osafumi Nakayama
Kentaro Murase
CVBM
58
6
0
07 Jul 2021
TENET: A Time-reversal Enhancement Network for Noise-robust ASR
Fu-An Chao
Shao-Wei Fan-Jiang
Bi-Cheng Yan
J. Hung
Berlin Chen
60
13
0
04 Jul 2021
Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks
Eduardo Fonseca
Andrés Ferraro
Xavier Serra
AI4TS
131
9
0
01 Jul 2021
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection
Thi Ngoc Tho Nguyen
Karn N. Watcharasupat
Ngoc Khanh Nguyen
Douglas L. Jones
W. Gan
70
16
0
29 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
76
3
0
21 Jun 2021
Deep Learning Frameworks Applied For Audio-Visual Scene Classification
L. D. Pham
Alexander Schindler
Mina Schütz
Jasmin Lampert
S. Schlarb
Ross King
57
9
0
12 Jun 2021
Anomalous Sound Detection Using a Binary Classification Model and Class Centroids
Ibuki Kuroyanagi
Tomoki Hayashi
K. Takeda
Tomoki Toda
36
8
0
11 Jun 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
162
78
0
10 Jun 2021
ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition
S. Verbitskiy
Vladimir Berikov
Viacheslav Vyshegorodtsev
109
75
0
03 Jun 2021
Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification
Dongchao Yang
Helin Wang
Yuexian Zou
26
5
0
21 May 2021