arXiv:2304.12397
On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research
24 April 2023
Luiza Amador Pozzobon
B. Ermiş
Patrick Lewis
Sara Hooker
Papers citing "On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research"
13 / 13 papers shown
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
100
0
0
17 Feb 2025
Leveraging Open-Source Large Language Models for Native Language Identification
Yee Man Ng
Ilia Markov
30
0
0
15 Sep 2024
Diffusion Guided Language Modeling
Justin Lovelace
Varsha Kishore
Yiwei Chen
Kilian Q. Weinberger
36
6
0
08 Aug 2024
Exploring Human-LLM Conversations: Mental Models and the Originator of Toxicity
Johannes Schneider
Arianna Casanova Flores
Anne-Catherine Kranz
39
2
0
08 Jul 2024
FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating Toxicity in French Texts
Caroline Brun
Vassilina Nikoulina
34
1
0
25 Jun 2024
Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals
Phillip Howard
Kathleen C. Fraser
Anahita Bhiwandiwalla
S. Kiritchenko
48
9
0
30 May 2024
Hatred Stems from Ignorance! Distillation of the Persuasion Modes in Countering Conversational Hate Speech
Ghadi Alyahya
Abeer Aldayel
38
2
0
18 Mar 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
B. Ermiş
36
7
0
06 Mar 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
13
76
0
25 Jan 2024
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Luiza Amador Pozzobon
B. Ermiş
Patrick Lewis
Sara Hooker
28
20
0
11 Oct 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
58
1,138
0
17 May 2023
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
35
27
0
20 Sep 2022
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
193
0
15 Sep 2021