With the development of deep learning, high-value, high-cost models have become important assets, and techniques for protecting them as intellectual property have attracted growing attention. However, existing model watermarking work for black-box scenarios derives mainly from training-based backdoor methods, which can degrade performance on the primary task. To address this, we propose a model watermarking protocol based on a branch backdoor to protect model intellectual property. After a comparative analysis of secure cryptographic primitives, we adopt a construction based on a message authentication scheme as the branch indicator. We prove by reduction that the protocol is lossless with respect to primary task performance. In addition, we analyze potential threats to the protocol and provide a secure and feasible watermarking instance for language models.
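To make the idea of a branch indicator concrete, the following is a minimal sketch, not the construction from the paper. It assumes the indicator is an HMAC-SHA256 check on the query, and every name in it (`make_trigger`, `is_trigger`, `watermarked_model`, the `|` separator, the 128-bit truncated tag) is a hypothetical choice made for illustration. Queries that carry a valid tag are routed to a watermark branch; all other queries pass to the original model unchanged, which is the intuition behind a lossless guarantee that reduces to the unforgeability of the message authentication scheme.

```python
import hashlib
import hmac

TAG_LEN = 32  # hex characters kept from the HMAC-SHA256 tag (128 bits); an assumed choice


def _tag(key: bytes, message: str) -> str:
    """Compute the truncated HMAC-SHA256 tag of a message, as a hex string."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()[:TAG_LEN]


def make_trigger(key: bytes, message: str) -> str:
    """Owner side: build a trigger query by appending the tag to the message."""
    return f"{message}|{_tag(key, message)}"


def is_trigger(key: bytes, query: str) -> bool:
    """Branch indicator: True only if the query ends with a valid tag."""
    message, sep, tag = query.rpartition("|")
    if not sep or len(tag) != TAG_LEN:
        return False
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    return hmac.compare_digest(tag.encode("utf-8"), _tag(key, message).encode("utf-8"))


def watermarked_model(key, base_model, watermark_branch, query: str):
    """Route verified triggers to the watermark branch; leave all other queries untouched."""
    if is_trigger(key, query):
        return watermark_branch(query)
    return base_model(query)


if __name__ == "__main__":
    key = b"k" * 32  # placeholder key; a real key would be generated at random
    base = lambda q: "base:" + q          # stand-in for the original model
    branch = lambda q: "WATERMARK"        # stand-in for the ownership response
    print(watermarked_model(key, base, branch, "hello"))
    print(watermarked_model(key, base, branch, make_trigger(key, "hello")))
```

In this sketch no training is involved, so the base model's behavior on ordinary inputs is unchanged by design; an input can reach the watermark branch only if it carries a tag that verifies under the owner's key.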