Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation

Abstract

As machine learning models increase in scale and complexity, obtaining sufficient training data has become a critical bottleneck due to acquisition costs, privacy constraints, and data scarcity in specialised domains. While synthetic data generation has emerged as a promising alternative, a notable performance gap remains compared to models trained on real data, particularly as task complexity grows. Concurrently, Neuro-Symbolic methods, which combine neural networks' learning strengths with symbolic reasoning's structured representations, have demonstrated significant potential across various cognitive tasks. This paper explores the utility of Neuro-Symbolic conditioning for synthetic image dataset generation, focusing specifically on improving the performance of Scene Graph Generation models. The research investigates whether structured symbolic representations in the form of scene graphs can enhance synthetic data quality through explicit encoding of relational constraints. The results demonstrate that Neuro-Symbolic conditioning yields significant improvements of up to +2.59% in standard Recall metrics and +2.83% in No Graph Constraint Recall metrics when used for dataset augmentation. These findings establish that merging Neuro-Symbolic and generative approaches produces synthetic data with complementary structural information that enhances model performance when combined with real data, providing a novel approach to overcome data scarcity limitations even for complex visual reasoning tasks.

@article{savazzi2025_2503.17224,
  title={Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation},
  author={Giacomo Savazzi and Eugenio Lomurno and Cristian Sbrolli and Agnese Chiatti and Matteo Matteucci},
  journal={arXiv preprint arXiv:2503.17224},
  year={2025}
}