POLAR: A Benchmark for Multilingual, Multicultural, and Multi-Event Online Polarization

Oleg Rogov
Aung Kyaw Htet
Xintong Wang
Surendrabikram Thapa
Kritesh Rauniyar
Tanmoy Chakraborty
Arfeen Zeeshan
Dheeraj Kodati
Satya Keerthi
Sahar Moradizeyveh
Firoj Alam
Arid Hasan
Syed Ishtiaque Ahmed
Ye Kyaw Thu
Shantipriya Parida
Ihsan Ayyub Qazi
Lilian Wanzare
Nelson Odhiambo Onyango
Clemencia Siro
Jane Wanjiru Kimani
Ibrahim Said Ahmad
Adem Chanie Ali
Martin Semmann
Chris Biemann
Shamsuddeen Hassan Muhammad
Seid Muhie Yimam
Main: 8 Pages
4 Figures
Bibliography: 3 Pages
11 Tables
Appendix: 10 Pages
Abstract

Online polarization poses a growing challenge for democratic discourse, yet most computational social science research remains monolingual, culturally narrow, or event-specific. We introduce POLAR, a multilingual, multicultural, and multi-event dataset with over 110K instances in 22 languages drawn from diverse online platforms and real-world events. Polarization is annotated along three axes, namely detection, type, and manifestation, using a variety of annotation platforms adapted to each cultural context. We conduct two main experiments: (1) fine-tuning six pretrained small language models; and (2) evaluating a range of open and closed large language models in few-shot and zero-shot settings. The results show that, while most models perform well in binary polarization detection, they achieve substantially lower performance when predicting polarization types and manifestations. These findings highlight the complex, highly contextual nature of polarization and demonstrate the need for robust, adaptable approaches in NLP and computational social science. All resources will be released to support further research and effective mitigation of digital polarization globally.
