Direct Speech-to-image Translation

IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2020
7 April 2020
Jiguo Li
Xinfeng Zhang
Chuanmin Jia
Jizheng Xu
Li Zhang
Y. Wang
Siwei Ma
Wen Gao

Papers citing "Direct Speech-to-image Translation"

14 / 14 papers shown
Semi-supervised reference-based sketch extraction using a contrastive learning framework
Chang Wook Seo
Amirsaman Ashtari
Jun-yong Noh
SSL
271
16
0
19 Jul 2024
Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation
Rafael Redondo
210
0
0
23 Jun 2024
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Minsu Kim
Jee-weon Jung
Hyeongseop Rha
Soumi Maiti
Siddhant Arora
Xuankai Chang
Shinji Watanabe
Y. Ro
361
8
0
25 Feb 2024
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
322
16
0
24 May 2023
Fusion-S2iGan: An Efficient and Effective Single-Stage Framework for Speech-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
241
7
0
17 May 2023
Street-View Image Generation from a Bird's-Eye View Layout
IEEE Robotics and Automation Letters (RA-L), 2023
Alexander Swerdlow
Runsheng Xu
Bolei Zhou
528
111
0
11 Jan 2023
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding
Spoken Language Technology Workshop (SLT), 2022
Kayode Olaleye
Dan Oneaţă
Herman Kamper
ObjD
237
8
0
10 Oct 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
The VLDB Journal (VLDBJ), 2022
Wailing Ng
Raymond Chi-Wing Wong
Xuefang Zhao
Chen Zhang
225
19
0
04 Jan 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
657
90
0
27 Dec 2021
FA-GAN: Feature-Aware GAN for Text to Image Synthesis
E. Jeon
Kunhee Kim
Daijin Kim
GAN
129
11
0
02 Sep 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
319
41
0
01 Jul 2021
Conditional Frechet Inception Distance
Michael Soloveitchik
Tzvi Diskin
E. Morin
A. Wiesel
EGVM
235
45
0
21 Mar 2021
S2IGAN: Speech-to-Image Generation via Adversarial Learning
Xinsheng Wang
Tingting Qiao
Jihua Zhu
Alan Hanjalic
O. Scharenborg
VLM
GAN
247
21
0
14 May 2020
Deep Audio-Visual Learning: A Survey
International Journal of Automation and Computing (IJAC), 2020
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
Ran He
233
183
0
14 Jan 2020