2306.01985
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
3 June 2023
Xuhui Zhou
Haojie Zhu
Akhila Yerukola
Thomas Davidson
Jena D. Hwang
Swabha Swayamdipta
Maarten Sap
Papers citing
"COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements"
28 / 28 papers shown
Title
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Priyanshu Kumar
Devansh Jain
Akhila Yerukola
Liwei Jiang
Himanshu Beniwal
Thomas Hartvigsen
Maarten Sap
52
0
0
06 Apr 2025
Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English
Runtao Zhou
Guangya Wan
Saadia Gabriel
Sheng R. Li
Alexander J Gates
Maarten Sap
Thomas Hartvigsen
LRM
57
0
0
06 Mar 2025
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection
Maximilian Spliethöver
Tim Knebler
Fabian Fumagalli
Maximilian Muschalik
Barbara Hammer
Eyke Hüllermeier
Henning Wachsmuth
97
1
0
10 Feb 2025
Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events
Aditya Chinchure
Sahithya Ravi
R. Ng
Vered Shwartz
Boyang Albert Li
Leonid Sigal
ReLM
LRM
VLM
77
2
0
07 Dec 2024
Scalable Frame-based Construction of Sociocultural NormBases for Socially-Aware Dialogues
Shilin Qu
Weiqing Wang
Xin Zhou
Haolan Zhan
Zhuang Li
Lizhen Qu
Linhao Luo
Yuan-Fang Li
Gholamreza Haffari
23
0
0
04 Oct 2024
Incorporating Procedural Fairness in Flag Submissions on Social Media Platforms
Yunhee Shim
Shagun Jhaver
29
0
0
13 Sep 2024
Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations
Ritam Dutt
Zhen Wu
Kelly Shi
Divyanshu Sheth
Prakhar Gupta
Carolyn Rose
24
2
0
27 Jun 2024
Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles
Julia Kruk
Michela Marchini
Rijul Magu
Caleb Ziems
D. Muchlinski
Diyi Yang
30
1
0
10 Jun 2024
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
Neemesh Yadav
Sarah Masud
Vikram Goyal
Md. Shad Akhtar
Tanmoy Chakraborty
21
3
0
06 Jun 2024
Promoting Constructive Deliberation: Reframing for Receptiveness
Gauri Kambhatla
Matthew Lease
Ashwin Rajadesingan
33
2
0
23 May 2024
Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias
Rebecca Dorn
Lee Kezar
Fred Morstatter
Kristina Lerman
27
7
0
23 May 2024
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
Akhila Yerukola
Saujas Vaduguru
Daniel Fried
Maarten Sap
29
1
0
14 May 2024
The Constant in HATE: Analyzing Toxicity in Reddit across Topics and Languages
Wondimagegnhue Tufa
Ilia Markov
Piek Vossen
11
0
0
29 Apr 2024
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
Minzhi Li
Weiyan Shi
Caleb Ziems
Diyi Yang
28
8
0
28 Feb 2024
COBIAS: Contextual Reliability in Bias Assessment
Priyanshul Govil
Hemang Jain
Vamshi Bonagiri
Aman Chadha
Ponnurangam Kumaraguru
Manas Gaur
S. Dey
38
2
0
22 Feb 2024
Understanding News Creation Intents: Frame, Dataset, and Method
Zhengjia Wang
Danding Wang
Qiang Sheng
Juan Cao
Silong Su
Yifan Sun
Beizhe Hu
Siyuan Ma
10
4
0
27 Dec 2023
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah
Hyunwoo J. Kim
Xuhui Zhou
Yulia Tsvetkov
Maarten Sap
Reza Shokri
Yejin Choi
PILM
22
73
0
27 Oct 2023
Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning
Ananth Balashankar
Xiao Ma
Aradhana Sinha
Ahmad Beirami
Yao Qin
Jilin Chen
Alex Beutel
16
2
0
25 Oct 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
Kavel Rao
Liwei Jiang
Valentina Pyatkin
Yuling Gu
Niket Tandon
Nouha Dziri
Faeze Brahman
Yejin Choi
16
15
0
24 Oct 2023
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Taylor Sorensen
Liwei Jiang
Jena D. Hwang
Sydney Levine
Valentina Pyatkin
...
Kavel Rao
Chandra Bhagavatula
Maarten Sap
J. Tasioulas
Yejin Choi
SLR
11
50
0
02 Sep 2023
From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
Julia Mendelsohn
Ronan Le Bras
Yejin Choi
Maarten Sap
13
25
0
26 May 2023
Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola
Xuhui Zhou
Elizabeth Clark
Maarten Sap
12
6
0
24 May 2023
BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases
Yiming Zhang
Sravani Nanduri
Liwei Jiang
Tongshuang Wu
Maarten Sap
22
7
0
23 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Can Machines Learn Morality? The Delphi Experiment
Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny T Liang
...
Yulia Tsvetkov
Oren Etzioni
Maarten Sap
Regina A. Rini
Yejin Choi
FaML
117
110
0
14 Oct 2021
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief
Caleb Ziems
D. Muchlinski
Vaishnavi Anupindi
Jordyn Seybolt
M. D. Choudhury
Diyi Yang
92
235
0
11 Sep 2021
Measuring Association Between Labels and Free-Text Rationales
Sarah Wiegreffe
Ana Marasović
Noah A. Smith
274
170
0
24 Oct 2020
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
Chuanrong Li
Lin Shengshuo
Leo Z. Liu
Xinyi Wu
Xuhui Zhou
Shane Steinert-Threlkeld
VLM
128
38
0
16 Oct 2020