A Multi-Stage Workflow for the Review of Marketing Content with Reasoning Large Language Models

19 December 2025

Alberto Purpura

Emily Chen

Swapnil Shinde

Main: 8 pages; bibliography: 1 page; 4 figures; 5 tables

Abstract

Reasoning Large Language Models (LLMs) have shown promising results when tasked with solving complex problems. In this paper, we propose and evaluate a multi-stage workflow that leverages fine-tuned reasoning LLMs to assist in the review of marketing content, checking that it complies with a given list of requirements. The contributions of this paper are the following: (i) we present a novel approach, one that does not rely on any external knowledge representation, for the automatic identification of compliance issues in textual content; (ii) we compare the effectiveness of different fine-tuning strategies, such as Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO), in training models to solve this problem; (iii) we evaluate the effectiveness of training small LLMs to generate reasoning tokens before providing their final response; (iv) we evaluate how the choice and combination of different reward functions affect the performance of a model trained with GRPO.
