arXiv:2304.09991
Supporting Human-AI Collaboration in Auditing LLMs with LLMs
19 April 2023
Charvi Rastogi
Marco Tulio Ribeiro
Nicholas King
Harsha Nori
Saleema Amershi
ALM
Papers citing "Supporting Human-AI Collaboration in Auditing LLMs with LLMs"
11 / 11 papers shown
Title
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
1
0
26 Apr 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
57
3
0
21 Feb 2025
Evaluating Human-AI Collaboration: A Review and Methodological Framework
George Fragiadakis
Christos Diou
George Kousiouris
Mara Nikolaidou
57
11
0
09 Jul 2024
Enhancing user experience in large language models through human-centered design: Integrating theoretical insights with an experimental study to meet diverse software learning needs with a single document knowledge base
Yuchen Wang
Yin-Shan Lin
Ruixin Huang
Jinyin Wang
Sensen Liu
21
7
0
19 May 2024
Navigating LLM Ethics: Advancements, Challenges, and Future Directions
Junfeng Jiao
S. Afroogh
Yiming Xu
Connor Phillips
AILaw
60
19
0
14 May 2024
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
Jessica Quaye
Alicia Parrish
Oana Inel
Charvi Rastogi
Hannah Rose Kirk
...
Nathan Clement
Rafael Mosquera
Juan Ciro
Vijay Janapa Reddi
Lora Aroyo
29
7
0
14 Feb 2024
Rocks Coding, Not Development--A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks
Wei Wang
Huilong Ning
Gaowei Zhang
Libo Liu
Yi Wang
26
11
0
08 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
55
29
0
02 Feb 2024
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
Chenyang Yang
Rishabh Rustogi
Rachel A. Brower-Sinning
Grace A. Lewis
Christian Kastner
Tongshuang Wu
KELM
30
11
0
14 Oct 2023
"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
Eric Michael Smith
Melissa Hall
Melanie Kambadur
Eleonora Presani
Adina Williams
65
129
0
18 May 2022
Discovering and Validating AI Errors With Crowdsourced Failure Reports
Ángel Alexander Cabrera
Abraham J. Druck
Jason I. Hong
Adam Perer
HAI
48
54
0
23 Sep 2021