Challenges for Toxic Comment Classification: An In-Depth Error Analysis

20 September 2018

Papers citing "Challenges for Toxic Comment Classification: An In-Depth Error Analysis"

Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models Paloma Piot Patricia Martín-Rodilla Javier Parapar 50 0 0 04 May 2025
DefVerify: Do Hate Speech Models Reflect Their Dataset's Definition? Urja Khurana Eric T. Nalisnick Antske Fokkens 64 1 0 21 Oct 2024
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Huanqian Wang Yang Yue Rui Lu Jingxin Shi Andrew Zhao Shenzhi Wang Shiji Song Gao Huang LM&Ro KELM 55 6 0 11 Jul 2024
DISCERN: Designing Decision Support Interfaces to Investigate the Complexities of Workplace Social Decision-Making With Line Managers Pranav Khadpe Lindy Le Kate Nowak Shamsi T. Iqbal Jina Suh 32 7 0 29 Feb 2024
Adding guardrails to advanced chatbots Yanchen Wang Lisa Singh AI4MH 23 7 0 13 Jun 2023
ToxBuster: In-game Chat Toxicity Buster with BERT Zachary Yang Yasmine Maricar M. Davari Nicolas Grenon-Godbout Reihaneh Rabbany 30 3 0 21 May 2023
Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection Elisa Sanchez-Bayona Rodrigo Agerri 14 10 0 19 Oct 2022
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset Peter Henderson M. Krass Lucia Zheng Neel Guha Christopher D. Manning Dan Jurafsky Daniel E. Ho AILaw ELM 141 98 0 01 Jul 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models Paul Röttger Haitham Seelawi Debora Nozza Zeerak Talat Bertie Vidgen 30 65 0 20 Jun 2022
Hidden behind the obvious: misleading keywords and implicitly abusive language on social media Wenjie Yin A. Zubiaga 33 27 0 03 May 2022
Automated Identification of Toxic Code Reviews Using ToxiCR Jaydeb Sarker Asif Kamal Turzo Mingyou Dong Amiangshu Bosu 27 31 0 26 Feb 2022
Jury Learning: Integrating Dissenting Voices into Machine Learning Models Mitchell L. Gordon Michelle S. Lam J. Park Kayur Patel Jeffrey T. Hancock Tatsunori Hashimoto Michael S. Bernstein 29 146 0 07 Feb 2022
SS-BERT: Mitigating Identity Terms Bias in Toxic Comment Classification by Utilising the Notion of "Subjectivity" and "Identity Terms" Zhixue Zhao Ziqi Zhang F. Hopfgartner 24 5 0 06 Sep 2021
Limitations of machine learning for building energy prediction: ASHRAE Great Energy Predictor III Kaggle competition error analysis Clayton Miller Bianca Picchetti Chunlei Fu Jovan Pantelic AI4CE 17 24 0 25 Jun 2021
Data Expansion using Back Translation and Paraphrasing for Hate Speech Detection D. Beddiar Md Saroar Jahan Mourad Oussalah 27 83 0 25 May 2021
Towards generalisable hate speech detection: a review on obstacles and solutions Wenjie Yin A. Zubiaga 117 164 0 17 Feb 2021
HateCheck: Functional Tests for Hate Speech Detection Models Paul Röttger B. Vidgen Dong Nguyen Zeerak Talat Helen Z. Margetts J. Pierrehumbert 31 260 0 31 Dec 2020
Towards Ethics by Design in Online Abusive Content Detection S. Kiritchenko I. Nejadgholi 29 13 0 28 Oct 2020
Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology Yangjun Zhang Pengjie Ren Maarten de Rijke 26 11 0 21 Aug 2020
Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness Jeremiah Zhe Liu Zi Lin Shreyas Padhy Dustin Tran Tania Bedrax-Weiss Balaji Lakshminarayanan UQCV BDL 37 437 0 17 Jun 2020
Toxicity Detection: Does Context Really Matter? John Pavlopoulos Jeffrey Scott Sorensen Lucas Dixon Nithum Thain Ion Androutsopoulos 18 158 0 01 Jun 2020
Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation Ze Yang Can Xu Wei Wu Zhoujun Li 3DV 23 29 0 26 Sep 2019
Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection Ameya Vaidya Feng Mai Yue Ning 115 21 0 21 Sep 2019
Convolutional Neural Networks for Sentence Classification Yoon Kim AILaw VLM 312 13,377 0 25 Aug 2014