Rigging Research Results by Manipulating Top Websites Rankings

4 June 2018

Victor Le Pochat

Tom van Goethem

Samaneh Tajalizadehkhoob

Maciej Korczyński

Wouter Joosen


Abstract

To evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: the conclusions drawn in these studies can be affected by the inherent properties of these rankings. We show for four main rankings how the choice of a list affects which domains are included, whether these are representative of real sites, and whether they are malicious. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that each of the lists can be manipulated, in certain instances with as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. Finally, to overcome the limitations of such rankings, we propose improvements that reduce the fluctuations in list composition and provide better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide Tranco, an online service where these improved rankings can be accessed.
