2310.09668
Cited By
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
14 October 2023
Chenyang Yang
Rishabh Rustogi
Rachel A. Brower-Sinning
Grace A. Lewis
Christian Kastner
Tongshuang Wu
KELM
Papers citing "Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs"
16 / 16 papers shown
Title
SPHERE: An Evaluation Card for Human-AI Systems
Qianou Ma
Dora Zhao
Xinran Zhao
Chenglei Si
Chenyang Yang
Ryan Louie
Ehud Reiter
Diyi Yang
Tongshuang Wu
ALM
46
0
0
24 Mar 2025
Orbit: A Framework for Designing and Evaluating Multi-objective Rankers
Chenyang Yang
Tesi Xiao
Michael Shavlovsky
Christian Kastner
Tongshuang Wu
27
0
0
07 Nov 2024
What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing
Chenyang Yang
Yining Hong
Grace A. Lewis
Tongshuang Wu
Christian Kastner
36
1
0
14 Sep 2024
MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models
Yuchen Dong
Xiaoxiang Fang
Yuchen Hu
Renshuang Jiang
Zhe Jiang
36
0
0
07 Aug 2024
Navigating LLM Ethics: Advancements, Challenges, and Future Directions
Junfeng Jiao
S. Afroogh
Yiming Xu
Connor Phillips
AILaw
53
19
0
14 May 2024
Clarify: Improving Model Robustness With Natural Language Corrections
Yoonho Lee
Michelle S. Lam
Helena Vasconcelos
Michael S. Bernstein
Chelsea Finn
10
6
0
06 Feb 2024
LLMs for Test Input Generation for Semantic Caches
Zafaryab Rasool
Scott Barnett
David Willie
Stefanus Kurniawan
Sherwin Balugo
Srikanth Thudumu
Mohamed Abdelrazek
13
1
0
16 Jan 2024
(Why) Is My Prompt Getting Worse? Rethinking Regression Testing for Evolving LLM APIs
Wanqin Ma
Chenyang Yang
Christian Kastner
11
7
0
18 Nov 2023
Selenite: Scaffolding Online Sensemaking with Comprehensive Overviews Elicited from Large Language Models
Michael Xieyang Liu
Tongshuang Wu
Tianying Chen
Franklin Mingzhe Li
A. Kittur
Brad A. Myers
LRM
RALM
14
20
0
03 Oct 2023
Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design
Michelle S. Lam
Zixian Ma
Anne Li
Izequiel Freitas
Dakuo Wang
James A. Landay
Michael S. Bernstein
150
21
0
06 Mar 2023
Crawling the Internal Knowledge-Base of Language Models
Roi Cohen
Mor Geva
Jonathan Berant
Amir Globerson
170
74
0
30 Jan 2023
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
58
59
0
14 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
153
86
0
06 Dec 2021
Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel
Nazneen Rajani
Jesse Vig
Samson Tan
Jason M. Wu
Stephan Zheng
Caiming Xiong
Mohit Bansal
Christopher Ré
AAML
OffRL
OOD
133
136
0
13 Jan 2021
Improving fairness in machine learning systems: What do industry practitioners need?
Kenneth Holstein
Jennifer Wortman Vaughan
Hal Daumé
Miroslav Dudík
Hanna M. Wallach
FaML
HAI
181
730
0
13 Dec 2018