
Traceable Black-box Watermarks for Federated Learning

Main: 6 pages · Appendix: 4 pages · Bibliography: 3 pages
6 figures · 8 tables
Abstract

Due to the distributed nature of Federated Learning (FL) systems, each local client has access to the global model, posing a critical risk of model leakage. Existing works have explored injecting watermarks into local models to enable intellectual property protection. However, these methods focus either on non-traceable watermarks or on traceable but white-box watermarks. We identify a gap in the literature regarding the formal definition of traceable black-box watermarking and the formulation of the problem of injecting such watermarks into FL systems. In this work, we first formalize the problem of injecting traceable black-box watermarks into FL. Building on this formulation, we propose a novel server-side watermarking method, TraMark, which creates a traceable watermarked model for each client, enabling verification of model leakage in black-box settings. To achieve this, TraMark partitions the model parameter space into two distinct regions: the main task region and the watermarking region. A personalized global model is then constructed for each client by aggregating only the main task region while preserving the watermarking region. Each model subsequently learns a unique watermark exclusively within the watermarking region, using a distinct watermark dataset, before being sent back to the local client. Extensive results across various FL systems demonstrate that TraMark ensures the traceability of all watermarked models while preserving their main task performance.
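The per-client aggregation step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random-mask partition, the 5% watermark fraction, and the flat parameter vectors are all assumptions made for the example.

```python
import numpy as np

def partition_mask(n_params, wm_fraction=0.05, seed=0):
    # Hypothetical partition: mark a small random fraction of parameters
    # as the watermarking region (1 = watermark, 0 = main task).
    rng = np.random.default_rng(seed)
    return (rng.random(n_params) < wm_fraction).astype(np.float64)

def personalized_aggregate(client_params, wm_mask):
    # client_params: list of flat parameter vectors, one per client.
    # For each client, average the main-task region across all clients
    # while keeping that client's own watermarking region untouched,
    # yielding one personalized global model per client.
    avg = np.mean(client_params, axis=0)
    main_mask = 1.0 - wm_mask
    return [main_mask * avg + wm_mask * p for p in client_params]
```

After this step, each personalized model would be fine-tuned on its client-specific watermark dataset, updating only the watermarking region, before being returned to the client.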
