You Still See Me: How Data Protection Supports the Architecture of ML Surveillance
Data forms the backbone of machine learning. Thus, data protection law has a strong bearing on how ML systems are governed. Because most legal requirements attach to the processing of personal data, organizations have an incentive to keep their data out of legal scope. Privacy-preserving techniques incentivized by data protection law -- data protection techniques -- constitute an important strategy for ML development because they are used to distill data until it potentially falls outside the scope of data protection laws. In this paper, we examine the impact of a rhetoric that deems data wrapped in privacy-preserving techniques as data that is "good-to-go". We show how the application of data protection techniques in the development of ML systems -- from private set intersection as part of dataset curation, to homomorphic encryption and federated learning as part of model computation, to the framing of the privacy-utility trade-off as part of model updating -- can further support individual monitoring and data consolidation. With data accumulation at the core of how the ML pipeline is configured, we argue that data protection techniques are often instrumentalized in ways that support infrastructures of surveillance rather than protecting the individuals associated with the data. Finally, we propose technology and policy strategies to evaluate data protection techniques in light of the protections they actually confer. We conclude by highlighting the role that security technologists might play in devising policies that combat surveillance ML technologies, and we recommend applying the adversarial mindset inherent to the profession to more precisely articulate and prevent the use of "privacy-preserving" scaffoldings that support surveillance.