
Towards a Rigorous Statistical Analysis of Empirical Password Datasets

IEEE Symposium on Security and Privacy (IEEE S&P), 2021
Abstract

In this paper we consider the following problem: given $N$ independent samples from an unknown distribution $\mathcal{P}$ over passwords $pwd_1, pwd_2, \ldots$, can we generate high-confidence upper/lower bounds on the guessing curve $\lambda_G \doteq \sum_{i=1}^G p_i$, where $p_i = \Pr[pwd_i]$ and the passwords are ordered such that $p_i \geq p_{i+1}$? Intuitively, $\lambda_G$ represents the probability that an attacker who knows the distribution $\mathcal{P}$ can guess a random password $pwd \leftarrow \mathcal{P}$ within $G$ guesses. Understanding how $\lambda_G$ increases with the number of guesses $G$ can help quantify the damage of a password cracking attack and inform password policies. Despite an abundance of large (breached) password datasets, upper/lower bounding $\lambda_G$ remains a challenging problem. We introduce several statistical techniques to derive tighter upper/lower bounds on the guessing curve $\lambda_G$ which hold with high confidence. We apply our techniques to analyze $9$ large password datasets, finding that our new lower bounds dramatically improve upon prior work. Our empirical analysis shows that even state-of-the-art password cracking models are significantly less guess efficient than an attacker who knows the distribution. When $G$ is not too large, i.e., $G \ll N$, we find that our upper/lower bounds on $\lambda_G$ are both very close to the empirical distribution, which justifies using the empirical distribution to approximate $\lambda_G$ in that regime. The analysis also highlights regions of the curve where we can, with high confidence, conclude that the empirical distribution significantly overestimates $\lambda_G$. Our new statistical techniques yield substantially tighter upper/lower bounds on $\lambda_G$, though there are still regions of the curve where the best upper/lower bounds diverge significantly.
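
To make the definition of the guessing curve concrete, the following minimal Python sketch computes the empirical guessing curve from a list of sampled passwords: it sorts the observed frequencies in non-increasing order and accumulates them, mirroring $\lambda_G = \sum_{i=1}^G p_i$ with the unknown $p_i$ replaced by sample frequencies. This is only the naive empirical estimate that the abstract compares against, not the paper's upper/lower bounding techniques. The function name `empirical_guessing_curve` and the toy sample are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def empirical_guessing_curve(samples: Iterable[str]) -> List[float]:
    """Return the empirical guessing curve as a list.

    Entry G-1 of the result is the fraction of samples covered by the
    G most frequent distinct passwords in the sample. This plugs sample
    frequencies into lambda_G = sum_{i=1}^G p_i; it is NOT a
    high-confidence bound on the true lambda_G.
    """
    counts = Counter(samples)
    n = sum(counts.values())
    if n == 0:
        return []

    # Order observed frequencies so that f_1 >= f_2 >= ...
    freqs = sorted(counts.values(), reverse=True)

    curve = []
    covered = 0
    for f in freqs:
        covered += f
        curve.append(covered / n)
    return curve


if __name__ == "__main__":
    # Toy sample (hypothetical data, N = 6).
    toy = ["123456"] * 3 + ["password"] * 2 + ["qwerty"]
    for g, value in enumerate(empirical_guessing_curve(toy), start=1):
        print(f"G={g}: empirical lambda_G = {value:.3f}")
```

Note that on any finite sample this curve reaches 1 once $G$ equals the number of distinct observed passwords, even though unseen passwords may carry probability mass under $\mathcal{P}$. This is one intuition for why the empirical distribution can overestimate $\lambda_G$ when $G$ is not small relative to $N$.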
