Detecting Active and Stealthy Typosquatting Threats in Package Registries

Main: 14 Pages
5 Figures
Bibliography: 4 Pages
6 Tables
Abstract

Typosquatting attacks, also known as package confusion attacks, threaten software supply chains. Attackers publish packages whose names resemble legitimate ones, tricking engineers into installing malware. Prior work has developed defenses against typosquatting in some software package registries, notably npm and PyPI, but gaps remain: reducing high false-positive rates, generalizing to more software package ecosystems, and gaining insight from real-world deployment. In this work, we introduce TypoSmart, a system designed to address these gaps. We begin with a novel analysis of typosquatting data to gain deeper insight into attack patterns and engineering practices. Building on state-of-the-art approaches, we extend support to six software package registries using embedding-based similarity search, achieving a 73%-91% improvement in speed. Our approach also reduces false positives by 70.4% compared to prior work. TypoSmart is in production use at our industry partner and contributed to the removal of 3,658 typosquatting packages in one month. We share lessons learned from the production deployment.
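
To make the idea of similarity search over package names concrete, the following is a minimal sketch and not TypoSmart's implementation. The abstract does not specify the embedding model, index, or thresholds, so everything here is an assumption for illustration: a sparse character-bigram vector stands in for a learned embedding, a linear scan stands in for a nearest-neighbor index, and the names `ngram_embedding`, `cosine`, `find_lookalikes`, and the 0.6 threshold are hypothetical.

```python
import math
from collections import Counter


def ngram_embedding(name: str, n: int = 2) -> Counter:
    """Map a package name to a sparse bag of character n-grams.

    Assumption: a stand-in for a learned embedding. '^' and '$' mark
    the start and end of the name so that edge characters contribute.
    """
    padded = f"^{name.lower()}$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_lookalikes(candidate: str, popular: list[str], threshold: float = 0.6):
    """Return (name, score) pairs for popular names that resemble `candidate`.

    Assumption: the threshold is arbitrary; an exact name match is skipped
    because it is the legitimate package itself, not a lookalike.
    """
    cand_vec = ngram_embedding(candidate)
    hits = []
    for name in popular:
        if name == candidate:
            continue
        score = cosine(cand_vec, ngram_embedding(name))
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first.
    return sorted(hits, key=lambda pair: -pair[1])


if __name__ == "__main__":
    for name, score in find_lookalikes("reqeusts", ["requests", "numpy", "flask"]):
        print(f"{name}: {score:.3f}")
```

Name similarity alone flags many benign packages, which is consistent with the false-positive problem the abstract highlights; a production system would need further signals beyond this sketch.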