ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.16429
  4. Cited By
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model
  Adaptation

Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

28 September 2023
Guy Yariv
Itai Gat
Sagie Benaim
Lior Wolf
Idan Schwartz
Yossi Adi
    DiffM
    VGen
ArXivPDFHTML

Papers citing "Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation"

9 / 9 papers shown
Title
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
56
4
0
26 Sep 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
Yong Ren
Chenxing Li
Manjie Xu
Wei Liang
Yu Gu
Rilin Chen
Dong Yu
VGen
DiffM
38
6
0
13 Sep 2024
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation
Liu He
Yizhi Song
Hejun Huang
Pinxin Liu
Yunlong Tang
Daniel G. Aliaga
Xin Zhou
DiffM
VGen
90
3
0
19 Aug 2024
Video-to-Audio Generation with Hidden Alignment
Video-to-Audio Generation with Hidden Alignment
Manjie Xu
Chenxing Li
Yong Ren
Rilin Chen
Yu Gu
Yu Gu
Dong Yu
Dong Yu
DiffM
VGen
43
11
0
10 Jul 2024
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Video and Text Matching with Conditioned Embeddings
Video and Text Matching with Conditioned Embeddings
Ameen Ali
Idan Schwartz
Tamir Hazan
Lior Wolf
41
13
0
21 Oct 2021
Audio-to-Image Cross-Modal Generation
Audio-to-Image Cross-Modal Generation
Maciej Żelaszczyk
Jacek Mañdziuk
DiffM
46
12
0
27 Sep 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Sound2Sight: Generating Visual Dynamics from Sound and Context
Sound2Sight: Generating Visual Dynamics from Sound and Context
A. Cherian
Moitreya Chatterjee
N. Ahuja
VGen
52
35
0
23 Jul 2020
1