
Quantifying the relation between performance and success in soccer

Abstract

The availability of massive data about sports activities now offers the opportunity to quantify the relation between performance and success. In this work, we analyze more than 6,000 games and 10 million events in the top six European leagues and investigate this relation in soccer competitions. We discover that a team's position in the final ranking of a national competition is significantly related to its typical performance, as described by a set of technical features extracted from the data. Moreover, we observe that while victories and defeats can be explained by a team's performance during a game, draws are difficult to describe with a machine learning approach. We then simulate an entire season of the six leagues, replacing the outcome of every game with a synthetic outcome (victory, defeat, or draw) produced by a machine learning model trained on the previous seasons. We find that the final rankings in the simulated tournaments are close to the actual rankings in the real tournaments, suggesting that a complex systems view of soccer has the potential to reveal hidden patterns in the relation between performance and success.
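
The comparison between simulated and real tournaments can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say how closeness between rankings is measured, so the Spearman rank correlation used here is an assumption, as are the function names, the outcome labels, and the alphabetical tie-break. The only domain fact relied on is the standard league scoring of 3 points for a win, 1 for a draw, and 0 for a defeat. The synthetic outcomes are hard-coded where the paper would use the predictions of its trained model.

```python
from collections import defaultdict

# Standard league scoring: (home points, away points) per outcome.
POINTS = {"home_win": (3, 0), "draw": (1, 1), "away_win": (0, 3)}


def final_ranking(games):
    """Build a final table from (home, away, outcome) tuples.

    Ties on points are broken alphabetically, which is a simplifying
    assumption (real leagues use goal difference and other criteria).
    """
    points = defaultdict(int)
    for home, away, outcome in games:
        home_pts, away_pts = POINTS[outcome]
        points[home] += home_pts
        points[away] += away_pts
    return sorted(points, key=lambda team: (-points[team], team))


def spearman(ranking_a, ranking_b):
    """Spearman rank correlation between two rankings of the same teams."""
    n = len(ranking_a)
    position_b = {team: i for i, team in enumerate(ranking_b)}
    d_squared = sum((i - position_b[team]) ** 2 for i, team in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Toy three-team season: real outcomes versus hypothetical model outcomes.
real_games = [("A", "B", "home_win"), ("B", "C", "home_win"), ("C", "A", "away_win")]
simulated_games = [("A", "B", "home_win"), ("B", "C", "draw"), ("C", "A", "draw")]

real_table = final_ranking(real_games)
simulated_table = final_ranking(simulated_games)
print(real_table, simulated_table, spearman(real_table, simulated_table))
```

A correlation of 1 means the simulated table reproduces the real order exactly, while values near 0 indicate little agreement.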
