Direct Speech-to-image Translation

IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2020

7 April 2020

Papers citing "Direct Speech-to-image Translation"

Semi-supervised reference-based sketch extraction using a contrastive learning framework

271

19 Jul 2024

Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation

Rafael Redondo

210

23 Jun 2024

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

361

25 Feb 2024

Vision + Language Applications: A Survey

Yutong Zhou

N. Shimada

VLM

322

24 May 2023

Fusion-S2iGan: An Efficient and Effective Single-Stage Framework for Speech-to-Image Generation

Zhenxing Zhang

Lambert Schomaker

241

17 May 2023

Street-View Image Generation from a Bird's-Eye View Layout

IEEE Robotics and Automation Letters (RA-L), 2023

Alexander Swerdlow

Runsheng Xu

Bolei Zhou

528

111

11 Jan 2023

YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding

Spoken Language Technology Workshop (SLT), 2022

237

10 Oct 2022

Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question

The VLDB journal (VLDBJ), 2022

Wailing Ng

Raymond Chi-Wing Wong

Xuefang Zhao

Chen Zhang

225

04 Jan 2022

Multimodal Image Synthesis and Editing: The Generative AI Era

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

657

27 Dec 2021

FA-GAN: Feature-Aware GAN for Text to Image Synthesis

129

02 Sep 2021

OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation

Jing Liu

...

319

01 Jul 2021

Conditional Frechet Inception Distance

235

21 Mar 2021

S2IGAN: Speech-to-Image Generation via Adversarial Learning

Alan Hanjalic

247

14 May 2020

Deep Audio-Visual Learning: A Survey

International Journal of Automation and Computing (IJAC), 2020

233

183

14 Jan 2020