COLD: A Benchmark for Chinese Offensive Language Detection

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

16 January 2022

Papers citing "COLD: A Benchmark for Chinese Offensive Language Detection"

Investigating the Impact of Rationales for LLMs on Natural Language Understanding

19 Oct 2025

From Ground Trust to Truth: Disparities in Offensive Language Judgments on Contemporary Korean Political Discourse

134

18 Sep 2025

Social Bias in Multilingual Language Models: A Survey

Lance Calvin Lim Gamboa

Yue Feng

Mark Lee

252

27 Aug 2025

Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech

110

06 Aug 2025

MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations

133

01 Aug 2025

Culture Matters in Toxic Language Detection in Persian

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Zahra Bokaei

Walid Magdy

Bonnie Webber

140

03 Jun 2025

Unified Game Moderation: Soft-Prompting and LLM-Assisted Label Transfer for Resource-Efficient Toxicity Detection

Zachary Yang

Domenico Tullo

Reihaneh Rabbany

01 Jun 2025

The Hidden Language of Harm: Examining the Role of Emojis in Harmful Online Communication and Content Moderation

181

31 May 2025

Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

178

30 May 2025

Chinese Cyberbullying Detection: Dataset, Method, and Validation

Yi Zhu

Xin Zou

Xindong Wu

235

27 May 2025

Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites

177

21 May 2025

LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation

247

10 Mar 2025

U-Sticker: A Large-Scale Multi-Domain User Sticker Dataset for Retrieval and Personalization

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025

321

26 Feb 2025

SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks

...

490

16 Feb 2025

SCCD: A Session-based Dataset for Chinese Cyberbullying Detection

International Conference on Computational Linguistics (COLING), 2025

292

28 Jan 2025

ChineseWebText 2.0: Large-Scale High-quality Chinese Web Text with Multi-dimensional and fine-grained information

258

29 Nov 2024

LongSafety: Enhance Safety for Long-Context LLMs

...

292

11 Nov 2024

DeMod: A Holistic Tool with Explainable Detection and Personalized Modification for Toxicity Censorship

224

04 Nov 2024

PclGPT: A Large Language Model for Patronizing and Condescending Language Detection

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Junyu Lu

Hongfei Lin

150

01 Oct 2024

Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models

The Web Conference (WWW), 2024

Yazhou Zhang

373

19 Sep 2024

MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili

ACM Multimedia (MM), 2024

267

28 Jul 2024

Purple-teaming LLMs with Adversarial Defender Training

Jingyan Zhou

Kun Li

Junan Li

Jiawen Kang

Minda Hu

Xixin Wu

Helen Meng

AAML

225

01 Jul 2024

Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective

520

20 Jun 2024

Quite Good, but Not Enough: Nationality Bias in Large Language Models -- A Case Study of ChatGPT

Shucheng Zhu

Weikang Wang

Ying Liu

263

11 May 2024

SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore

Ming Shan Hee

217

03 May 2024

Chinese Offensive Language Detection: Current Status and Future Directions

Yunze Xiao

Houda Bouamor

Wajdi Zaghouani

373

27 Mar 2024

OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety

...

332

18 Mar 2024

Collaborative decoding of critical tokens for boosting factuality of large language models

Linfeng Song

Dong Yu

154

28 Feb 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

...

Lei Sha

Zhifang Sui

Hongning Wang

Shiyu Huang

129

26 Feb 2024

Social Orientation: A New Feature for Dialogue Analysis

Kathleen McKeown

253

26 Feb 2024

Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges

Aiqi Jiang

A. Zubiaga

AAML

303

17 Jan 2024

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

...

Qi Li

321

11 Jan 2024

A Survey of the Evolution of Language Model-Based Dialogue Systems: Data, Task and Models

452

28 Nov 2023

Can Large Language Models Understand Content and Propagation for Misinformation Detection: An Empirical Study

136

21 Nov 2023

Flames: Benchmarking Value Alignment of LLMs in Chinese

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Xiangyang Liu

Tianxiang Sun

...

Xipeng Qiu

Dahua Lin

412

12 Nov 2023

Self-Guard: Empower the LLM to Safeguard Itself

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Zezhong Wang

270

24 Oct 2023

The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Chiyu Zhang

Khai Duy Doan

Qisheng Liao

Muhammad Abdul-Mageed

247

23 Oct 2023

Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Li Zhou

Antonia Karamolegkou

Wenyu Chen

Daniel Hershcovich

225

10 Oct 2023

Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning

336

05 Oct 2023

Large Language Model Alignment: A Survey

359

282

26 Sep 2023

SafetyBench: Evaluating the Safety of Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Xiao Liu

304

169

13 Sep 2023

Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Jose Camacho-Collados

Juho Kim

Alice Oh

314

31 Aug 2023

Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

Irwin King

282

29 Aug 2023

Enhancing Psychological Counseling with Large Language Model: A Multifaceted Decision-Support System for Non-Professionals

Guanghui Fu

Qing Zhao

Jianqiang Li

Dan Luo

Changwei Song

...

233

29 Aug 2023

CLEVA: Chinese Language Models EVAluation Platform

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Zhi Chen

...

Michael R. Lyu

322

09 Aug 2023

Classifying Crime Types using Judgment Documents from Social Media

269

29 Jun 2023

CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models

International Conference on Language Resources and Evaluation (LREC), 2023

Yufei Huang

Deyi Xiong

ALM

302

28 Jun 2023

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

361

28 May 2023

Improved Instruction Ordering in Recipe-Grounded Conversation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Duong Minh Le

Ruohao Guo

Wei Xu

Alan Ritter

225

26 May 2023

Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Junyu Lu

165

08 May 2023