Generative Semantic Communication via Textual Prompts: Latency Performance Tradeoffs

Abstract

This paper develops an edge-device collaborative Generative Semantic Communication (Gen SemCom) framework leveraging pre-trained Multi-modal/Vision Language Models (M/VLMs) for ultra-low-rate semantic communication via textual prompts. The proposed framework optimizes the use of M/VLMs on the wireless edge/device to generate high-fidelity textual prompts through visual captioning/question answering, which are then transmitted over a wireless channel for SemCom. Specifically, we develop a multi-user Gen SemCom framework using pre-trained M/VLMs, and formulate a joint optimization problem over prompt generation offloading and communication/computation resource allocation to minimize latency and maximize the resulting semantic quality. Due to the nonconvex nature of the problem, with highly coupled discrete and continuous variables, we decompose it into a two-level problem and propose a low-complexity swap/leaving/joining (SLJ)-based matching algorithm. Simulation results demonstrate significant performance improvements over conventional semantic-unaware/non-collaborative offloading benchmarks.
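
For intuition, the SLJ-based matching can be read as local search over a user-to-server offloading assignment, where swap, leaving, and joining moves are accepted whenever they improve the objective. Below is a minimal Python sketch of that idea under loose assumptions: the utility callback is a placeholder for the joint latency/semantic-quality objective (with the lower-level communication/computation allocation assumed to be folded inside it), and the user/server indexing, initialization, and stopping rule are illustrative, not the paper's actual formulation.

import itertools

def slj_matching(num_users, num_servers, utility, max_iters=1000):
    """Local-search matching of users to edge servers (None = on-device).

    `utility(matching)` is an assumed black-box scoring of a full
    assignment; it stands in for the paper's joint objective.
    """
    # Illustrative initialization: every user generates its prompt locally.
    matching = {u: None for u in range(num_users)}
    best = utility(matching)

    for _ in range(max_iters):
        improved = False
        # Joining / leaving: move one user to a server, or detach it (None).
        for u in range(num_users):
            for s in list(range(num_servers)) + [None]:
                if s == matching[u]:
                    continue
                candidate = dict(matching)
                candidate[u] = s
                val = utility(candidate)
                if val > best:
                    matching, best, improved = candidate, val, True
        # Swapping: exchange the assignments of two users.
        for u, v in itertools.combinations(range(num_users), 2):
            if matching[u] == matching[v]:
                continue
            candidate = dict(matching)
            candidate[u], candidate[v] = matching[v], matching[u]
            val = utility(candidate)
            if val > best:
                matching, best, improved = candidate, val, True
        if not improved:
            # Exchange-stable: no single SLJ move improves the objective.
            break
    return matching, best

Each pass evaluates O(U x S) single-user moves plus O(U^2) swaps; in the paper's two-level decomposition, evaluating a candidate matching would additionally involve solving the continuous resource-allocation subproblem at the lower level.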

@article{ren2025_2409.09715,
  title={Generative Semantic Communication via Textual Prompts: Latency Performance Tradeoffs},
  author={Mengmeng Ren and Li Qiao and Long Yang and Zhen Gao and Jian Chen and Mahdi Boloursaz Mashhadi and Pei Xiao and Rahim Tafazolli and Mehdi Bennis},
  journal={arXiv preprint arXiv:2409.09715},
  year={2025}
}