Investigating the heterogenous effects of a massive content moderation
intervention via Difference-in-Differences
In today's online environments, users encounter harm and abuse on a daily basis. Therefore, content moderation is crucial to ensure their safety and well-being. However, the effectiveness of many moderation interventions is still uncertain. Here, we apply a causal inference approach to shed light on the effectiveness of The Great Ban, a massive social media deplatforming intervention. We analyze 53M comments shared by nearly 34K users, providing in-depth results on both the intended and unintended consequences of the ban. Our causal analyses reveal that 15.6% of the moderated users abandoned the platform while the remaining ones decreased their overall toxicity by 4.1%. Nonetheless, a subset of those users increased their toxicity by 70% after the intervention. However, the increases in toxicity did not lead to marked increases in activity or engagement, meaning that the most toxic users had an overall limited impact. Our findings bring to light new insights on the effectiveness of deplatforming moderation interventions. Furthermore, they also contribute to informing future content moderation strategies.
View on arXiv