ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.00830
  4. Cited By
AUDIT: Audio Editing by Following Instructions with Latent Diffusion
  Models

AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models

3 April 2023
Yuancheng Wang
Zeqian Ju
Xuejiao Tan
Lei He
Zhizheng Wu
Jiang Bian
Sheng Zhao
    DiffM
ArXivPDFHTML

Papers citing "AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models"

44 / 44 papers shown
Title
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
W. Dong
Changsheng Xu
57
0
0
17 Apr 2025
Audio Texture Manipulation by Exemplar-Based Analogy
Audio Texture Manipulation by Exemplar-Based Analogy
Kan Jen Cheng
Tingle Li
Gopala Anumanchipalli
DiffM
33
0
0
21 Jan 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
52
3
0
10 Jan 2025
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
Ruben Ciranni
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Giorgio Fabbro
Emanuele Rodolà
Luca Cosmo
59
7
0
10 Jan 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
70
1
0
16 Dec 2024
The Evolution and Future Perspectives of Artificial Intelligence
  Generated Content
The Evolution and Future Perspectives of Artificial Intelligence Generated Content
Chengzhang Zhu
Luobin Cui
Ying Tang
Jiacun Wang
89
1
0
02 Dec 2024
ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time
  Optimization
ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization
C. Steinmetz
Shubhr Singh
Marco Comunità
Ilias Ibnyahya
Shanxin Yuan
Emmanouil Benetos
Joshua Reiss
26
6
0
28 Oct 2024
Sound Check: Auditing Audio Datasets
Sound Check: Auditing Audio Datasets
William Agnew
Julia Barnett
Annie Chu
Rachel Hong
Michael Feffer
Robin Netzorg
Harry H. Jiang
Ezra Awumey
Sauvik Das
39
1
0
17 Oct 2024
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
Saksham Singh Kushwaha
Jianbo Ma
Mark R. P. Thomas
Yapeng Tian
Avery Bruni
27
1
0
15 Oct 2024
Did You Hear That? Introducing AADG: A Framework for Generating
  Benchmark Data in Audio Anomaly Detection
Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection
Ksheeraja Raghavan
Samiran Gode
Ankit Parag Shah
Surabhi Raghavan
Wolfram Burgard
Bhiksha Raj
Rita Singh
25
0
0
04 Oct 2024
Self-Supervised Audio-Visual Soundscape Stylization
Self-Supervised Audio-Visual Soundscape Stylization
Tingle Li
Renhao Wang
Po-Yao Huang
Andrew Owens
Gopala Anumanchipalli
DiffM
SSL
35
4
0
22 Sep 2024
AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework
AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework
Yuhang Jia
Yang Chen
Jinghua Zhao
Shiwan Zhao
Wenjia Zeng
Yong Chen
Yong Qin
DiffM
29
1
0
19 Sep 2024
Seed-Music: A Unified Framework for High Quality and Controlled Music
  Generation
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Ye Bai
Haonan Chen
Jitong Chen
Zhuo Chen
Yi Deng
...
Hang Zhao
Ziyi Zhao
Dejian Zhong
Shicen Zhou
Pei Zou
DiffM
58
6
0
13 Sep 2024
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image
  Diffusion Models
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models
Rohit Jena
Ali Taghibakhshi
Sahil Jain
Gerald Shen
Nima Tajbakhsh
Arash Vahdat
38
3
0
09 Sep 2024
Futga: Towards Fine-grained Music Understanding through
  Temporally-enhanced Generative Augmentation
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu
Zachary Novack
Amit Namburi
Jiaheng Dai
Hao-Wen Dong
Zhouhang Xie
Carol Chen
Julian McAuley
38
1
0
29 Jul 2024
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of
  Audio Events in Text-to-audio Generation
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
Zeyu Xie
Xuenan Xu
Zhizheng Wu
Mengyue Wu
35
8
0
03 Jul 2024
Blind Inversion using Latent Diffusion Priors
Blind Inversion using Latent Diffusion Priors
Weimin Bai
Siyi Chen
Wenzheng Chen
He Sun
DiffM
26
4
0
01 Jul 2024
MeLFusion: Synthesizing Music from Image and Language Cues using
  Diffusion Models
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Sanjoy Chowdhury
Sayan Nag
K. J. Joseph
Balaji Vasan Srinivasan
Dinesh Manocha
DiffM
46
7
0
07 Jun 2024
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language
  Models via Instruction Tuning
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Yixiao Zhang
Yukara Ikemiya
Woosung Choi
Naoki Murata
Marco A. Martínez Ramírez
Liwei Lin
Gus Xia
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
55
10
0
28 May 2024
AudioSetMix: Enhancing Audio-Language Datasets with LLM-Assisted
  Augmentations
AudioSetMix: Enhancing Audio-Language Datasets with LLM-Assisted Augmentations
David Xu
21
2
0
17 May 2024
Prompt-guided Precise Audio Editing with Diffusion Models
Prompt-guided Precise Audio Editing with Diffusion Models
Manjie Xu
Chenxing Li
Duzhen Zhang
Dan Su
Weihan Liang
Dong Yu
DiffM
36
4
0
11 May 2024
Generalized Multi-Source Inference for Text Conditioned Music Diffusion
  Models
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
Emilian Postolache
Giorgio Mariani
Luca Cosmo
Emmanouil Benetos
Emanuele Rodolà
DiffM
35
9
0
18 Mar 2024
MusicHiFi: Fast High-Fidelity Stereo Vocoding
MusicHiFi: Fast High-Fidelity Stereo Vocoding
Ge Zhu
Juan-Pablo Caceres
Zhiyao Duan
Nicholas J. Bryan
DiffM
24
4
0
15 Mar 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice
  Conversion
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
24
5
0
20 Feb 2024
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Hila Manor
T. Michaeli
DiffM
21
25
0
15 Feb 2024
Diffusion Models for Audio Restoration
Diffusion Models for Audio Restoration
Jean-Marie Lemercier
Julius Richter
Simon Welker
Eloi Moliner
Vesa Valimaki
Timo Gerkmann
25
14
0
15 Feb 2024
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
Yixiao Zhang
Yukara Ikemiya
Gus Xia
Naoki Murata
Marco A. Martínez Ramírez
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
42
20
0
09 Feb 2024
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced
  Auditory Experience
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience
Xilin Jiang
Cong Han
Yinghao Aaron Li
N. Mesgarani
KELM
26
4
0
06 Feb 2024
Audiobox: Unified Audio Generation with Natural Language Prompts
Audiobox: Unified Audio Generation with Natural Language Prompts
Apoorv Vyas
Bowen Shi
Matt Le
Andros Tjandra
Yi-Chiao Wu
...
Chris Summers
Carleigh Wood
Joshua Lane
Mary Williamson
Wei-Ning Hsu
42
75
0
25 Dec 2023
Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation
Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation
Bingzhi Liu
Yin Cao
Haohe Liu
Yi Zhou
DiffM
8
0
0
25 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
27
26
0
15 Dec 2023
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion
  Models
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models
Robin Netzorg
A. Jalal
Luna McNulty
Gopala Krishna Anumanchipalli
DiffM
17
1
0
13 Dec 2023
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
Zineng Tang
Ziyi Yang
Mahmoud Khademi
Yang Liu
Chenguang Zhu
Mohit Bansal
LRM
MLLM
AuLLM
52
44
0
30 Nov 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and
  High-Quality Audio Synthesis
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Ge Zhu
Yutong Wen
M. Carbonneau
Zhiyao Duan
DiffM
43
7
0
15 Nov 2023
Audio Editing with Non-Rigid Text Prompts
Audio Editing with Non-Rigid Text Prompts
Francesco Paissan
Luca Della Libera
Zhepei Wang
Mirco Ravanelli
Paris Smaragdis
Cem Subakan
DiffM
32
5
0
19 Oct 2023
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative
  Editing
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing
Yixiao Zhang
Akira Maezawa
Gus Xia
Kazuhiko Yamamoto
Simon Dixon
44
17
0
19 Oct 2023
uSee: Unified Speech Enhancement and Editing with Conditional Diffusion
  Models
uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models
Muqiao Yang
Chunlei Zhang
Yong-mei Xu
Zhongweiyang Xu
Heming Wang
Bhiksha Raj
Dong Yu
DiffM
23
4
0
02 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM
AuLLM
22
114
0
01 Oct 2023
InstructME: An Instruction Guided Music Edit And Remix Framework with
  Latent Diffusion Models
InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models
Bing Han
Junyu Dai
Weituo Hao
Xinyan He
Dong Guo
Jitong Chen
Yuxuan Wang
Y. Qian
Xuchen Song
DiffM
19
15
0
28 Aug 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion
  Models
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
140
315
0
30 Jan 2023
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
F. I. F. Richard Yu
Radu Timofte
Luc Van Gool
DiffM
213
1,354
0
24 Jan 2022
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
325
1,586
0
10 Nov 2021
Play as You Like: Timbre-enhanced Multi-modal Music Style Transfer
Play as You Like: Timbre-enhanced Multi-modal Music Style Transfer
Chien-Yu Lu
Min-Xin Xue
Chia-Che Chang
Che-Rung Lee
Li Su
29
33
0
28 Nov 2018
Image-to-Image Translation with Conditional Adversarial Networks
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
212
19,444
0
21 Nov 2016
1