arXiv: 2203.03022
HEAR: Holistic Evaluation of Audio Representations

6 March 2022
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
C. Steinmetz
C. Malloy
George Tzanetakis
Gissel Velarde
K. McNally
Max Henry
Nicolas Pinto
Camille Noufi
Christian Clough
Dorien Herremans
Eduardo Fonseca
Jesse Engel
Justin Salamon
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk

Papers citing "HEAR: Holistic Evaluation of Audio Representations"

31 / 31 papers shown
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
19
0
0
10 May 2025
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
65
1
0
09 May 2025
BLAB: Brutally Long Audio Bench
Orevaoghene Ahia
Martijn Bartelds
Kabir Ahuja
Hila Gonen
Valentin Hofmann
...
Noah Bennett
Shinji Watanabe
Noah A. Smith
Yulia Tsvetkov
Sachin Kumar
AuLLM
LM&MA
VLM
53
0
0
05 May 2025
Can Masked Autoencoders Also Listen to Birds?
Lukas Rauch
Ilyass Moummad
René Heinrich
Alexis Joly
Bernhard Sick
Christoph Scholz
27
0
0
17 Apr 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
79
2
0
10 Jan 2025
Bird Vocalization Embedding Extraction Using Self-Supervised Disentangled Representation Learning
Runwu Shi
Katsutoshi Itoyama
K. Nakadai
SSL
DRL
32
1
0
31 Dec 2024
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
Alain Riou
Antonin Gagnere
Gaëtan Hadjeres
Stefan Lattner
Geoffroy Peeters
86
0
0
29 Nov 2024
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schlüter
Paul Primus
Gerhard Widmer
ViT
23
2
0
14 Sep 2024
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
Paul Primus
Gerhard Widmer
45
3
0
22 Jun 2024
Predicting Heart Activity from Speech using Data-driven and Knowledge-based features
Gasser Elbanna
Z. Mostaani
Mathew Magimai.-Doss
SSL
30
0
0
10 Jun 2024
Multi-label Open-set Audio Classification
Sripathi Sridhar
Mark Cartwright
VLM
27
3
0
20 Oct 2023
Efficient Supervised Training of Audio Transformers for Music Representation Learning
Pablo Alonso-Jiménez
Xavier Serra
Dmitry Bogdanov
ViT
19
3
0
28 Sep 2023
Joint Audio and Speech Understanding
Yuan Gong
Alexander H. Liu
Hongyin Luo
Leonid Karlinsky
James R. Glass
AuLLM
21
66
0
25 Sep 2023
Advancing Natural-Language Based Audio Retrieval with PaSST and Large Audio-Caption Data Sets
Paul Primus
Khaled Koutini
Gerhard Widmer
19
13
0
08 Aug 2023
Speaker Embeddings as Individuality Proxy for Voice Stress Detection
Zihan Wu
Neil Scheidwasser
Karl El Hajal
Milos Cernak
24
3
0
09 Jun 2023
Pengi: An Audio Language Model for Audio Tasks
Soham Deshmukh
Benjamin Elizalde
Rita Singh
Huaming Wang
MLLM
AuLLM
30
156
0
19 May 2023
Leveraging Neural Representations for Audio Manipulation
Scott H. Hawley
C. Steinmetz
19
2
0
10 Apr 2023
Low-Complexity Audio Embedding Extractors
Florian Schmid
Khaled Koutini
Gerhard Widmer
11
4
0
03 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
25
1
0
01 Mar 2023
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Björn W. Schuller
BDL
SSL
24
8
0
28 Sep 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
24
21
0
24 Jun 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
124
344
0
21 May 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
19
65
0
26 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
29
53
0
15 Apr 2022
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
70
41
0
26 Apr 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
99
144
0
02 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
174
336
0
01 Feb 2021
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
74
141
0
21 Jan 2021
DDSP: Differentiable Digital Signal Processing
Jesse Engel
Lamtharn Hantrakul
Chenjie Gu
Adam Roberts
DiffM
83
371
0
14 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,471
0
17 Apr 2017