arXiv:2105.14170

Towards a Rigorous Statistical Analysis of Empirical Password Datasets

29 May 2021
Jeremiah Blocki
Peiyuan Liu
Abstract

A central challenge in password security is to characterize the attacker's guessing curve, i.e., the probability that the attacker will crack a random user's password within the first $G$ guesses. A key difficulty is that the guessing curve depends on the attacker's guessing strategy and on the distribution of user passwords, both of which are unknown to us. In this work we aim to follow Kerckhoffs' principle and analyze the performance of an optimal attacker who knows the password distribution. Let $\lambda_G$ denote the probability that such an attacker can crack a random user's password within $G$ guesses. We develop several statistically rigorous techniques to upper and lower bound $\lambda_G$ given $N$ independent samples from the unknown distribution. We show that our bounds hold with high confidence and apply our techniques to analyze eight password datasets. Our empirical analysis shows that even state-of-the-art password cracking models are often significantly less guess efficient than an attacker who can optimize its attack based on its (partial) knowledge of the password distribution. We also apply our techniques to re-examine the empirical password distribution and Zipf's Law. We find that the empirical distribution closely matches our bounds on $\lambda_G$ when $G$ is not too large, i.e., $G \ll N$. However, for larger values of $G$, our empirical analysis rigorously demonstrates that the empirical distribution (resp. Zipf's Law) overestimates the attacker's success rate. We apply our techniques to upper/lower bound the effectiveness of password throttling mechanisms (key-stretching), which are used to reduce the number of attacker guesses $G$. Finally, if we make an additional assumption about the way users respond to password restrictions, we can use our techniques to evaluate the effectiveness of password composition policies, which restrict the passwords users may select.
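
To make the quantities in the abstract concrete: if the password probabilities are sorted as $p_1 \ge p_2 \ge \dots$, an attacker who knows the distribution guesses in that order, so $\lambda_G = \sum_{i=1}^{G} p_i$. The sketch below is illustrative only and is not the paper's bounding technique. It computes the naive empirical estimate of $\lambda_G$ from $N$ samples (the share of samples covered by the $G$ most frequent passwords) next to a simple hold-out estimate, in which the top-$G$ dictionary is built from one part of the sample and scored on the rest. The function names `empirical_lambda` and `holdout_lambda`, the 50/50 split, and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def empirical_lambda(samples, G):
    """Naive estimate: fraction of samples covered by the G most frequent passwords."""
    if not samples:
        raise ValueError("samples must be non-empty")
    counts = Counter(samples)
    # most_common(G) returns at most G (password, count) pairs, most frequent first.
    return sum(count for _, count in counts.most_common(G)) / len(samples)


def holdout_lambda(samples, G, holdout_fraction=0.5, seed=0):
    """Hold-out estimate (illustrative assumption, not the paper's method).

    Build a G-guess dictionary from one part of the sample and measure
    how many passwords in the remaining part it cracks.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    train, held_out = shuffled[:cut], shuffled[cut:]
    if not train or not held_out:
        raise ValueError("both parts of the split must be non-empty")
    guesses = {pw for pw, _ in Counter(train).most_common(G)}
    return sum(pw in guesses for pw in held_out) / len(held_out)


if __name__ == "__main__":
    # Toy sample (hypothetical data): three popular passwords.
    skewed = ["123456"] * 5 + ["password"] * 3 + ["qwerty"] * 2
    print(empirical_lambda(skewed, 1), empirical_lambda(skewed, 2))

    # Every sampled password is distinct, so with G = N the naive estimate
    # claims full coverage while the hold-out dictionary cracks nothing.
    unique = [f"pw{i}" for i in range(10)]
    print(empirical_lambda(unique, 10), holdout_lambda(unique, 10))
```

The second toy sample illustrates the abstract's warning for large $G$: when $G$ approaches $N$, the empirical distribution counts every observed password as cracked, even those seen only once, whereas a dictionary built from independent samples cannot do so. The hold-out figure here is a plain point estimate; it carries none of the high-confidence guarantees the abstract describes.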
