A Property Encoder for Graph Neural Networks

Graph machine learning, particularly using graph neural networks, fundamentally relies on node features. Nevertheless, numerous real-world systems, such as social and biological networks, often lack node features due to various reasons, including privacy concerns, incomplete or missing data, and limitations in data collection. In such scenarios, researchers typically resort to methods like structural and positional encoding to construct node features. However, the length of such features is contingent on the maximum value within the property being encoded, for example, the highest node degree, which can be exceedingly large in applications like scale-free networks. Furthermore, these encoding schemes are limited to categorical data and might not be able to encode metrics returning other type of values. In this paper, we introduce a novel, universally applicable encoder, termed \emph{PropEnc}, which constructs expressive node embedding from any given graph metric. \emph{PropEnc} leverages histogram construction combined with reversed index encoding, offering a flexible method for node features initialization. It supports flexible encoding in terms of both dimensionality and type of input, demonstrating its effectiveness across diverse applications. \emph{PropEnc} allows encoding metrics in low-dimensional space which effectively address the sparsity challenge and enhances the efficiency of the models. We show that \emph{PropEnc} can construct node features that either exactly replicate one-hot encoding or closely approximate indices under various settings. Our extensive evaluations in graph classification setting across multiple social networks that lack node features support our hypothesis. The empirical results conclusively demonstrate that \emph{PropEnc} is both an efficient and effective mechanism for constructing node features from diverse set of graph metrics.
View on arXiv