GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion

2 June 2025

Main: 5 Pages

11 Figures

Bibliography: 1 Page

19 Tables

Appendix: 13 Pages

Abstract

Generative recommendation is an emerging paradigm that leverages the extensive knowledge of large language models (LLMs) by formulating recommendation as a text-to-text generation task. However, existing studies face two key limitations: (i) incorporating implicit item relationships and (ii) utilizing rich yet lengthy item information. To address these challenges, we propose a Generative Recommender via semantic-Aware Multi-granular late fusion (GRAM), which introduces two synergistic innovations. First, we design semantic-to-lexical translation to encode implicit hierarchical and collaborative item relationships into the vocabulary space of LLMs. Second, we present multi-granular late fusion to integrate rich semantics efficiently with minimal information loss. It employs separate encoders for multi-granular prompts and delays fusion until the decoding stage. Experiments on four benchmark datasets show that GRAM outperforms eight state-of-the-art generative recommendation models, achieving significant improvements of 11.5-16.0% in Recall@5 and 5.3-13.6% in NDCG@5. The source code is available at this https URL.
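The late-fusion idea in the abstract (separate encoders per prompt, with fusion deferred to decoding) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy `encode` function, the random weights, the prompt token IDs, and the single-query cross-attention are all hypothetical stand-ins, chosen only to show that each prompt is encoded without seeing the others and that the decoder attends over their concatenated outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 50, 8
EMBED = rng.normal(size=(VOCAB, DIM))               # toy token embeddings (assumption)
W_ENC = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)  # toy encoder weights (assumption)


def encode(token_ids):
    """Toy stand-in for an encoder: processes one prompt independently of the others."""
    return np.tanh(EMBED[np.asarray(token_ids)] @ W_ENC)  # shape: (len, DIM)


def late_fusion_cross_attention(query, prompts):
    """Encode each prompt separately, then fuse only in the decoder's cross-attention."""
    encoded = [encode(p) for p in prompts]         # no interaction between prompts here
    memory = np.concatenate(encoded, axis=0)       # fusion point: (total_len, DIM)
    scores = memory @ query / np.sqrt(DIM)         # scaled dot-product attention scores
    weights = np.exp(scores - scores.max())        # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory, weights               # attended context and attention weights


# Hypothetical prompts: one coarse-grained prompt and two fine-grained prompts.
coarse_prompt = [1, 2, 3]
fine_prompts = [[4, 5, 6, 7], [8, 9]]
query = rng.normal(size=DIM)                       # stand-in for a decoder hidden state

context, weights = late_fusion_cross_attention(query, [coarse_prompt] + fine_prompts)
```

Because each prompt is encoded on its own, encoder self-attention cost scales with the length of the individual prompts rather than with the length of their concatenation. This is one plausible reading of how the approach handles lengthy item information efficiently.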


@article{lee2025_2506.01673,
  title={GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion},
  author={Sunkyung Lee and Minjin Choi and Eunseong Choi and Hye-young Kim and Jongwuk Lee},
  journal={arXiv preprint arXiv:2506.01673},
  year={2025}
}
