People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection

2 November 2023

Wil M.P. van der Aalst

Claudia Wagner

Papers citing "People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection"

Title
Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection Myrthe Reuver Indira Sen Matteo Melis Gabriella Lapesa 20 0 0 21 Apr 2025
Interpreting Language Reward Models via Contrastive Explanations Junqi Jiang Tom Bewley Saumitra Mishra Freddy Lecue Manuela Veloso 74 0 0 25 Nov 2024
DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships Zhang Wan Sheng Tang Jiawei Wei Ruize Zhang Juan Cao VGen 19 2 0 14 Oct 2024
Re-examining Sexism and Misogyny Classification with Annotator Attitudes Aiqi Jiang Nikolas Vitsakis Tanvi Dinkar Gavin Abercrombie Ioannis Konstas 37 1 0 04 Oct 2024
Decoding Hate: Exploring Language Models' Reactions to Hate Speech Paloma Piot Javier Parapar 43 1 0 01 Oct 2024
A Few Hypocrites: Few-Shot Learning and Subtype Definitions for Detecting Hypocrisy Accusations in Online Climate Change Debates Paulina Garcia Corral Avishai Green Hendrik Meyer Anke Stoll Xiaoyue Yan Myrthe Reuver 22 0 0 25 Sep 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information Zheng Hui Zhaoxiao Guo Hang Zhao Juanyong Duan Congrui Huang 25 6 0 23 Sep 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 32 10 0 27 Jul 2024
Golden-Retriever: High-Fidelity Agentic Retrieval Augmented Generation for Industrial Knowledge Base Zhiyu An Xianzhong Ding Yen-Chun Fu Cheng-Chung Chu Yan Li Wan Du RALM 25 5 0 20 Jul 2024
A Survey on Natural Language Counterfactual Generation Yongjie Wang Xiaoqi Qiu Yu Yue Xu Guo Zhiwei Zeng Yuhong Feng Zhiqi Shen 31 5 0 04 Jul 2024
GUARD-D-LLM: An LLM-Based Risk Assessment Engine for the Downstream uses of LLMs Sundaraparipurnan Narayanan Sandeep Vishwakarma 34 3 0 02 Apr 2024
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias Yuemei Xu Ling Hu Jiayi Zhao Zihan Qiu Yuqi Ye Hanwen Gu LRM 19 36 0 01 Apr 2024
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset Janis Goldzycher Paul Röttger Gerold Schneider AAML 29 8 0 28 Mar 2024
Hatred Stems from Ignorance! Distillation of the Persuasion Modes in Countering Conversational Hate Speech Ghadi Alyahya Abeer Aldayel 38 2 0 18 Mar 2024
What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection Shangbin Feng Herun Wan Ningnan Wang Zhaoxuan Tan Minnan Luo Yulia Tsvetkov AAML DeLMO 14 16 0 01 Feb 2024
Large Scale Foundation Models for Intelligent Manufacturing Applications: A Survey Haotian Zhang S. D. Semujju Zhicheng Wang Xianwei Lv Kang Xu ... Jing Wu Zhuo Long Wensheng Liang Xiaoguang Ma Ruiyan Zhuang UQCV AI4TS AI4CE 27 4 0 11 Dec 2023
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 Kent K. Chang Mackenzie Cramer Sandeep Soni David Bamman RALM 140 110 0 28 Apr 2023
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks Anders Giovanni Møller Jacob Aarup Dalsgaard Arianna Pera L. Aiello 67 34 0 26 Apr 2023
Can we trust the evaluation on ChatGPT? Rachith Aiyappa Jisun An Haewoon Kwak Yong-Yeol Ahn ELM ALM LLMAG AI4MH LRM 106 87 0 22 Mar 2023
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction Martin Josifoski Marija Sakota Maxime Peyrard Robert West SyDa 56 78 0 07 Mar 2023
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation Phillip Howard Gadi Singer Vasudev Lal Yejin Choi Swabha Swayamdipta CML 48 25 0 22 Oct 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 303 11,881 0 04 Mar 2022
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information Kawin Ethayarajh Yejin Choi Swabha Swayamdipta 157 157 0 16 Oct 2021
Tailor: Generating and Perturbing Text with Semantic Controls Alexis Ross Tongshuang Wu Hao Peng Matthew E. Peters Matt Gardner 134 77 0 15 Jul 2021
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets Chuanrong Li Lin Shengshuo Leo Z. Liu Xinyi Wu Xuhui Zhou Shane Steinert-Threlkeld VLM 128 38 0 16 Oct 2020