Controlled Data Sharing for Collaborative Predictive Blacklisting

Although sharing data across organizational boundaries has often been advocated as a promising way to enhance security, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions about whether to share and how much. Using appropriate cryptographic tools, entities can estimate the benefits of collaborating and agree on what to share in a privacy-preserving way, without having to disclose their entire datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based not only on an organization's own logs but also on logs contributed by other organizations, and study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing improves accuracy by up to 105% on average, while also reducing the false positive rate.
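
To make the idea of estimating the benefit of collaboration before sharing concrete, the following is a minimal plaintext sketch. It is not the paper's protocol: the abstract does not name a similarity metric or a sharing rule, so the Jaccard index, the `threshold` parameter, the choice to share only the intersection, and the function names `jaccard` and `decide_sharing` are all assumptions made for illustration. In the approach described above, such an estimate would be computed with cryptographic tools so that neither party sees the other's full set; this sketch omits that step entirely.

```python
from typing import Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets of suspicious IPs (assumed benefit metric)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def decide_sharing(
    mine: Set[str], theirs: Set[str], threshold: float = 0.2
) -> Tuple[float, Set[str]]:
    """Hypothetical sharing rule: collaborate only if the estimated benefit
    reaches the threshold, and then share only the common attackers.

    Plaintext illustration only; a real deployment would compute the score
    and the shared set in a privacy-preserving way.
    """
    score = jaccard(mine, theirs)
    if score < threshold:
        return score, set()  # estimated benefit too low: share nothing
    return score, mine & theirs  # assumed strategy: share the intersection


if __name__ == "__main__":
    org_a = {"198.51.100.7", "203.0.113.9", "192.0.2.44"}
    org_b = {"203.0.113.9", "192.0.2.44", "198.51.100.200"}
    score, shared = decide_sharing(org_a, org_b)
    print(f"estimated benefit: {score:.2f}, entries shared: {sorted(shared)}")
```

Other sharing strategies (for example, sharing everything once the threshold is met) can be swapped in by changing the return value of `decide_sharing`.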