2106.07393
Cited By
Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability
11 June 2021
KayYen Wong
Praveen K. Paritosh
Lora Aroyo
Papers citing "Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability"
16 / 16 papers shown
Title
Automating eHMI Action Design with LLMs for Automated Vehicle Communication
Ding Xia
Xinyue Gui
Fan Gao
Dongyuan Li
Mark Colley
Takeo Igarashi
22
0
0
27 May 2025
Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation
Cheng Charles Ma
Kevin Hyekang Joo
Alexandria K. Vail
Sunreeta Bhattacharya
Álvaro Fernández García
Kailana Baker-Matsuoka
Sheryl Mathew
Lori L. Holt
Fernando De la Torre
72
4
0
13 Sep 2024
Rater Cohesion and Quality from a Vicarious Perspective
Deepak Pandita
Tharindu Cyril Weerasooriya
Sujan Dutta
Sarah K. K. Luger
Tharindu Ranasinghe
Ashiqur R. KhudaBukhsh
Marcos Zampieri
Christopher M. Homan
61
1
0
15 Aug 2024
Localizing and Mitigating Errors in Long-form Question Answering
Rachneet Sachdeva
Yixiao Song
Mohit Iyyer
Iryna Gurevych
HILM
78
1
0
16 Jul 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAG
HILM
88
13
0
28 May 2024
D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
Aida Mostafazadeh Davani
Mark Díaz
Dylan K. Baker
Vinodkumar Prabhakaran
74
10
0
16 Apr 2024
Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback
Jiashuo Wang
Chunpu Xu
Chak Tou Leong
Wenjie Li
Jing Li
105
2
0
11 Jan 2024
Evolving Domain Adaptation of Pretrained Language Models for Text Classification
Yun-Shiuan Chuang
Yi Wu
Dhruv Gupta
Rheeya Uppaal
Ananya Kumar
Luhang Sun
Makesh Narsimhan Sreedhar
Sijia Yang
Timothy T. Rogers
Junjie Hu
VLM
115
4
0
16 Nov 2023
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives
Vinodkumar Prabhakaran
Christopher Homan
Lora Aroyo
Aida Mostafazadeh Davani
Alicia Parrish
Alex S. Taylor
Mark Díaz
Ding Wang
Greg Serapio-García
99
9
0
09 Nov 2023
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities
Senjuti Dutta
Sid Mittal
Sherol Chen
Deepak Ramachandran
Ravi Rajakumar
Ian D Kivlichan
Sunny Mak
Alena Butryna
Praveen Paritosh
109
7
0
01 Nov 2023
How We Define Harm Impacts Data Annotations: Explaining How Annotators Distinguish Hateful, Offensive, and Toxic Comments
Angela M. Schöpke-Gonzalez
Siqi Wu
Sagar Kumar
Paul Resnick
Libby Hemphill
41
2
0
12 Sep 2023
Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection
Oana Inel
Tim Draws
Lora Aroyo
110
6
0
22 Aug 2023
Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce
Serge Gladkoff
Lifeng Han
Goran Nenadic
77
5
0
08 Mar 2023
Undesirable Biases in NLP: Addressing Challenges of Measurement
Oskar van der Wal
Dominik Bachmann
Alina Leidinger
L. Maanen
Willem H. Zuidema
K. Schulz
84
7
0
24 Nov 2022
Measuring and Improving Semantic Diversity of Dialogue Generation
Seungju Han
Beomsu Kim
Buru Chang
85
15
0
11 Oct 2022
k-Rater Reliability: The Correct Unit of Reliability for Aggregated Human Annotations
KayYen Wong
Praveen K. Paritosh
HILM
44
7
0
24 Mar 2022