1912.10211
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
21 December 2019
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
Github (1475★)
Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"
50 / 545 papers shown
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss
Andrew Koh
Chng Eng Siong
144
1
0
29 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
57
10
0
23 Jun 2022
Few-shot Long-Tailed Bird Audio Recognition
Marcos V. Conde
Ui-Jin Choi
43
8
0
22 Jun 2022
Probing Visual-Audio Representation for Video Highlight Detection via Hard-Pairs Guided Contrastive Learning
Shuaicheng Li
Feng Zhang
Kunlin Yang
Lin-Na Liu
Shinan Liu
Jun Hou
Shuai Yi
100
9
0
21 Jun 2022
Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression
Xin Jing
Meishu Song
Andreas Triantafyllopoulos
Zijiang Yang
Björn W. Schuller
32
8
0
18 Jun 2022
It's Time for Artistic Correspondence in Music and Video
Dídac Surís
Carl Vondrick
Bryan C. Russell
Justin Salamon
64
37
0
14 Jun 2022
Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction
Andreas Triantafyllopoulos
Meishu Song
Zijiang Yang
Xin Jing
Björn W. Schuller
45
9
0
14 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
69
28
0
20 May 2022
The AI Mechanic: Acoustic Vehicle Characterization Neural Networks
Adam M. Terwilliger
J. Siegel
62
2
0
19 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
69
6
0
17 May 2022
Noise-Tolerant Learning for Audio-Visual Action Recognition
Haocheng Han
Qinghua Zheng
Minnan Luo
Kaiyao Miao
Feng Tian
Yuanchun Chen
NoLa
100
9
0
16 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Zhepei Wang
Cem Subakan
Xilin Jiang
Junkai Wu
Efthymios Tzinis
Mirco Ravanelli
Paris Smaragdis
CLL
SSL
123
19
0
15 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
104
44
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
84
16
0
11 May 2022
Fatigue Prediction in Outdoor Running Conditions using Audio Data
Andreas Triantafyllopoulos
Sandra Ottl
Alexander Gebhard
Esther Rituerto-González
Mirko Jaumann
...
P. Schneeweiss
I. Krauss
Maurice Gerczuk
Shahin Amiriparian
Björn W. Schuller
55
9
0
09 May 2022
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition
Yuan Gong
Jingbo Yu
James R. Glass
89
42
0
06 May 2022
Robustness of Neural Architectures for Audio Event Detection
Juncheng Billy Li
Zheng Wang
Shuhui Qu
Florian Metze
28
1
0
06 May 2022
Relation-guided acoustic scene classification aided with event embeddings
Yuanbo Hou
Bo Kang
Wout Van Hauwermeiren
Dick Botteldooren
52
16
0
01 May 2022
Autonomous In-Situ Soundscape Augmentation via Joint Selection of Masker and Gain
Karn N. Watcharasupat
Kenneth Ooi
Bhan Lam
Trevor Wong
Zhen-Ting Ong
W. Gan
52
8
0
29 Apr 2022
Pseudo strong labels for large scale weakly supervised audio tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
61
6
0
28 Apr 2022
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Dading Chong
Helin Wang
Peilin Zhou
Qingcheng Zeng
79
68
0
27 Apr 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
94
69
0
26 Apr 2022
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li
Xiaofei Li
ViT
58
22
0
26 Apr 2022
Caption Feature Space Regularization for Audio Captioning
Yiming Zhang
Hong Yu
Ruoyi Du
Zhanyu Ma
Yuan Dong
122
1
0
18 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
100
59
0
15 Apr 2022
On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice
Ankit Parag Shah
Hira Dhamyal
Yang Gao
Daniel Arancibia
Mario Arancibia
Bhiksha Raj
Rita Singh
79
5
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
83
34
0
08 Apr 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection
Dongchao Yang
Helin Wang
Zhongjie Ye
Yuexian Zou
Wenwu Wang
57
0
0
05 Apr 2022
A Mixed supervised Learning Framework for Target Sound Detection
Dongchao Yang
Helin Wang
Yuexian Zou
Wenwu Wang
49
0
0
05 Apr 2022
Improving Target Sound Extraction with Timestamp Information
Helin Wang
Dongchao Yang
Chao Weng
Jianwei Yu
Yuexian Zou
64
10
0
02 Apr 2022
A Temporal-oriented Broadcast ResNet for COVID-19 Detection
Xin Jing
Shuo Liu
Emilia Parada-Cabaleiro
Andreas Triantafyllopoulos
Meishu Song
Zijiang Yang
Björn W. Schuller
84
2
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
131
30
0
31 Mar 2022
A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification
Arshdeep Singh
Mark D. Plumbley
3DPC
50
14
0
29 Mar 2022
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
84
21
0
29 Mar 2022
Audio-text Retrieval in Context
Siyu Lou
Xuenan Xu
Mengyue Wu
K. Yu
75
30
0
25 Mar 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Billy Li
Shuhui Qu
Po-Yao (Bernie) Huang
Florian Metze
VLM
100
9
0
25 Mar 2022
Movie Genre Classification by Language Augmentation and Shot Sampling
Zhongping Zhang
Yiwen Gu
Bryan A. Plummer
Xin Miao
Jiayi Liu
Huayan Wang
VLM
CLIP
61
1
0
24 Mar 2022
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
Ye Liu
Siyuan Li
Yang Wu
C. Chen
Ying Shan
Xiaohu Qie
ViT
104
151
0
23 Mar 2022
On Adversarial Robustness of Large-scale Audio Visual Learning
Juncheng Billy Li
Shuhui Qu
Xinjian Li
Po-Yao (Bernie) Huang
Florian Metze
AAML
37
7
0
23 Mar 2022
Learning Audio Representations with MLPs
Mashrur M. Morshed
Ahmad Omar Ahsan
H. Mahmud
Md. Kamrul Hasan
72
4
0
16 Mar 2022
A Squeeze-and-Excitation and Transformer based Cross-task System for Environmental Sound Recognition
Jisheng Bai
Jianfeng Chen
Mou Wang
Muhammad Saad Ayub
56
9
0
16 Mar 2022
Dawn of the transformer era in speech emotion recognition: closing the valence gap
Johannes Wagner
Andreas Triantafyllopoulos
H. Wierstorf
Maximilian Schmitt
Felix Burkhardt
F. Eyben
Björn W. Schuller
96
306
0
14 Mar 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
73
29
0
13 Mar 2022
A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification
Qing Wang
Jun Du
Siyuan Zheng
Yunqing Li
Yajian Wang
...
Hu Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Yannan Wang
Chin-Hui Lee
45
2
0
07 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
135
108
0
06 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
115
30
0
06 Mar 2022
A Summary of the ComParE COVID-19 Challenges
H. Coppock
Ali Akman
Christian Bergler
Maurice Gerczuk
Chloë Brown
...
Sandra Ottl
Panagiotis Tzirakis
A. Batliner
Cecilia Mascolo
Björn W. Schuller
70
9
0
17 Feb 2022
Audio-Based Deep Learning Frameworks for Detecting COVID-19
Dat Ngo
L. D. Pham
Hoang Van Truong
Ş. Kolozali
D. Jarchi
83
5
0
10 Feb 2022
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study
Daniel C. Tompkins
Kshitiz Kumar
Jian Wu
46
5
0
07 Feb 2022
Learning strides in convolutional neural networks
Rachid Riad
O. Teboul
David Grangier
Neil Zeghidour
82
42
0
03 Feb 2022