LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection
Communication Systems and Applications (CSA), 2023
Main: 18 pages, 3 figures, 7 tables
Abstract
This paper compares pre-trained and fine-tuned large language models (LLMs) for hate speech detection. Our research underscores challenges in LLMs' cross-domain validity and their risks of overfitting. Our evaluations highlight the need for fine-tuned models trained with greater label heterogeneity so that they grasp the nuances of hate speech. We conclude with a vision for the future of hate speech detection, emphasizing cross-domain generalizability and appropriate benchmarking practices.
