Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification
Main: 2 Pages
6 Figures
1 Table
Appendix: 10 Pages
Abstract
As Large Language Models (LLMs) become integral software components in modern applications, unauthorized model derivations through fine-tuning, merging, and redistribution have emerged as critical software engineering challenges. Unlike traditional software, where clone detection and license compliance are well established, the LLM ecosystem lacks effective mechanisms to detect model lineage and enforce licensing agreements. This gap is particularly problematic when creators of open-source models, such as Meta with LLaMA, require derivative works to maintain naming conventions for attribution, yet no technical means exist to verify compliance.
