1912.10211
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
21 December 2019
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
Github (1475★)
Papers citing
"PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"
50 / 545 papers shown
Title
RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
June-Woo Kim
Miika Toikkanen
Sangmin Bae
Minseok Kim
Ho-Young Jung
81
7
0
05 May 2024
A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)
Lam Pham
Phat Lam
Tin Nguyen
Hieu Tang
Alexander Schindler
34
1
0
02 May 2024
Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol
Konstantinos Apostolidis
Jakob Abesser
Luca Cuccovillo
Vasileios Mezaris
20
1
0
01 May 2024
Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
MedIm
54
2
0
26 Apr 2024
MIMOSA: Human-AI Co-Creation of Computational Spatial Audio Effects on Videos
Zheng Ning
Zheng Zhang
Jerrick Ban
Kaiwen Jiang
Ruohong Gan
Yapeng Tian
Tao Li
VGen
49
6
0
23 Apr 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
85
1
0
21 Apr 2024
Music Consistency Models
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
97
5
0
20 Apr 2024
Track Role Prediction of Single-Instrumental Sequences
Changheon Han
Suhyun Lee
Minsam Ko
49
0
0
20 Apr 2024
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
105
15
0
09 Apr 2024
R²-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Ye Liu
Jixuan He
Wanhua Li
Junsik Kim
D. Wei
Hanspeter Pfister
Chang Wen Chen
93
14
0
31 Mar 2024
ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds
Gijs Wijngaard
Elia Formisano
Bruno L. Giordano
M. Dumontier
91
3
0
27 Mar 2024
Detection of Deepfake Environmental Audio
Hafsa Ouajdi
Oussama Hadder
Modan Tailleur
Mathieu Lagrange
Laurie M. Heller
67
5
0
26 Mar 2024
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
Modan Tailleur
Junwon Lee
Mathieu Lagrange
Keunwoo Choi
Laurie M. Heller
Keisuke Imoto
Yuki Okamoto
97
10
0
26 Mar 2024
Distributed collaborative anomalous sound detection by embedding sharing
Kota Dohi
Yohei Kawaguchi
FedML
28
2
0
25 Mar 2024
Listenable Maps for Audio Classifiers
Francesco Paissan
Mirco Ravanelli
Cem Subakan
63
10
0
19 Mar 2024
From Weak to Strong Sound Event Labels using Adaptive Change-Point Detection and Active Learning
John Martinsson
Olof Mogren
Maria Sandsten
Tuomas Virtanen
47
1
0
13 Mar 2024
Answering Diverse Questions via Text Attached with Key Audio-Visual Clues
Qilang Ye
Zitong Yu
Xin Liu
83
2
0
11 Mar 2024
EDTC: enhance depth of text comprehension in automated audio captioning
Liwen Tan
Yin Cao
Yi Zhou
77
0
0
27 Feb 2024
What Do Language Models Hear? Probing for Auditory Representations in Language Models
Jerry Ngo
Yoon Kim
AuLLM
MILM
58
8
0
26 Feb 2024
Phonetic and Lexical Discovery of a Canine Language using HuBERT
Xingyuan Li
Sinong Wang
Zeyu Xie
Mengyue Wu
Ke Zhu
55
0
0
25 Feb 2024
Cacophony: An Improved Contrastive Audio-Text Model
Ge Zhu
Jordan Darefsky
Zhiyao Duan
AuLLM
92
12
0
10 Feb 2024
Embedding Compression for Teacher-to-Student Knowledge Transfer
Yiwei Ding
Alexander Lerch
48
1
0
09 Feb 2024
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Qinliang Lin
Cheng Luo
Zenghao Niu
Xilin He
Weicheng Xie
Yuanbo Hou
Linlin Shen
Siyang Song
AAML
103
13
0
06 Feb 2024
Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift
Jisheng Bai
Mou Wang
Haohe Liu
Han Yin
Yafei Jia
...
Woon-Seng Gan
Mark D. Plumbley
S. Rahardja
Bin Xiang
Jianfeng Chen
51
7
0
05 Feb 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Ming-Yu Liu
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
163
94
0
02 Feb 2024
Bass Accompaniment Generation via Latent Diffusion
Marco Pasini
M. Grachten
Stefan Lattner
100
12
0
02 Feb 2024
PAM: Prompting Audio-Language Models for Audio Quality Assessment
Soham Deshmukh
Dareen Alharthi
Benjamin Elizalde
Hannes Gamper
Mahmoud Al Ismail
Rita Singh
Bhiksha Raj
Huaming Wang
96
13
0
01 Feb 2024
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Jinjoo Lee
Sang Hoon Woo
CLIP
VLM
72
25
0
31 Jan 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
112
8
0
29 Jan 2024
Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting
Hounsu Kim
Soonbeom Choi
Juhan Nam
57
3
0
24 Jan 2024
AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks
Yun Liang
Hai Lin
Shaojian Qiu
Yihang Zhang
35
1
0
19 Jan 2024
T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis
Yoonjin Chung
Junwon Lee
Juhan Nam
99
15
0
17 Jan 2024
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
Jiu Feng
Mehmet Hamza Erol
Joon Son Chung
Arda Senocak
55
2
0
16 Jan 2024
Cascaded Cross-Modal Transformer for Audio-Textual Classification
Nicolae-Cătălin Ristea
Andrei Anghel
Radu Tudor Ionescu
96
2
0
15 Jan 2024
HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
Guoying Zhao
Zheng Lian
Bin Liu
Jianhua Tao
108
32
0
11 Jan 2024
VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition
John Fischer
Marko Orescanin
Eric Eckstrand
UQCV
BDL
91
4
0
10 Jan 2024
Learning Audio Concepts from Counterfactual Natural Language
Ali Vosoughi
Luca Bondi
Ho-Hsiang Wu
Chenliang Xu
CML
91
5
0
10 Jan 2024
Class-Incremental Learning for Multi-Label Audio Classification
Manjunath Mulimani
A. Mesaros
CLL
65
12
0
09 Jan 2024
DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation
Haojie Wei
Xueke Cao
Wenbo Xu
Tangpeng Dan
Yueguo Chen
VLM
52
2
0
08 Jan 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
Wenxi Chen
Yuzhe Liang
Ziyang Ma
Zhisheng Zheng
Xie Chen
ViT
107
22
0
07 Jan 2024
Towards Weakly Supervised Text-to-Audio Grounding
Xuenan Xu
Ziyang Ma
Mengyue Wu
Kai Yu
AI4TS
69
9
0
05 Jan 2024
TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection
Hao Sun
Mingyao Zhou
Wenjing Chen
Wei Xie
PINN
3DGS
ViT
63
38
0
04 Jan 2024
PosCUDA: Position based Convolution for Unlearnable Audio Datasets
V. Gokul
Shlomo Dubnov
SSL
75
3
0
04 Jan 2024
Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation
Jinlong Xue
Yayue Deng
Yingming Gao
Ya Li
DiffM
98
36
0
02 Jan 2024
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection
Jinbo Hu
Yin Cao
Ming Wu
Qiuqiang Kong
Feiran Yang
Mark D. Plumbley
Jun Yang
78
1
0
27 Dec 2023
The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge
Meng Ge
Yizhou Peng
Yidi Jiang
Jingru Lin
Junyi Ao
Mehmet Sinan Yildirim
Shuai Wang
Haizhou Li
Mengling Feng
49
0
0
26 Dec 2023
Self-Supervised Learning for Few-Shot Bird Sound Classification
Ilyass Moummad
Romain Serizel
Nicolas Farrugia
SSL
91
10
0
25 Dec 2023
Audiobox: Unified Audio Generation with Natural Language Prompts
Apoorv Vyas
Bowen Shi
Matt Le
Andros Tjandra
Yi-Chiao Wu
...
Chris Summers
Carleigh Wood
Joshua Lane
Mary Williamson
Wei-Ning Hsu
129
94
0
25 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
66
0
0
24 Dec 2023
Consistent and Relevant: Rethink the Query Embedding in General Sound Separation
Yuanyuan Wang
Hangting Chen
Dongchao Yang
Jianwei Yu
Chao Weng
Zhiyong Wu
Helen M. Meng
51
6
0
24 Dec 2023