ResearchTrend.AI
Zeno: An Interactive Framework for Behavioral Evaluation of Machine Learning

arXiv: 2302.04732

9 February 2023
Ángel Alexander Cabrera
Erica Fu
Donald Bertucci
Kenneth Holstein
Ameet Talwalkar
Jason I. Hong
Adam Perer

Papers citing "Zeno: An Interactive Framework for Behavioral Evaluation of Machine Learning"

8 / 8 papers shown
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
Michael A. Hedderich
Anyi Wang
Raoyuan Zhao
Florian Eichin
Barbara Plank
30
0
0
22 Apr 2025
Orbit: A Framework for Designing and Evaluating Multi-objective Rankers
Chenyang Yang
Tesi Xiao
Michael Shavlovsky
Christian Kästner
Tongshuang Wu
37
0
0
07 Nov 2024
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments
Angie Boggust
Venkatesh Sivaraman
Yannick Assogba
Donghao Ren
Dominik Moritz
Fred Hohman
VLM
50
3
0
06 Aug 2024
Canvil: Designerly Adaptation for LLM-Powered User Experiences
K. J. Kevin Feng
Q. V. Liao
Ziang Xiao
Jennifer Wortman Vaughan
Amy X. Zhang
David W. McDonald
31
16
0
17 Jan 2024
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
Chenyang Yang
Rishabh Rustogi
Rachel A. Brower-Sinning
Grace A. Lewis
Christian Kästner
Tongshuang Wu
KELM
30
11
0
14 Oct 2023
Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits
Wesley Hanwen Deng
Manish Nagireddy
M. S. Lee
Jatinder Singh
Zhiwei Steven Wu
Kenneth Holstein
Haiyi Zhu
36
88
0
13 May 2022
Discovering and Validating AI Errors With Crowdsourced Failure Reports
Ángel Alexander Cabrera
Abraham J. Druck
Jason I. Hong
Adam Perer
HAI
45
54
0
23 Sep 2021
Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel
Nazneen Rajani
Jesse Vig
Samson Tan
Jason M. Wu
Stephan Zheng
Caiming Xiong
Mohit Bansal
Christopher Ré
AAML
OffRL
OOD
146
136
0
13 Jan 2021