2206.00759
Interpretability Guarantees with Merlin-Arthur Classifiers
1 June 2022
S. Wäldchen
Kartikey Sharma
Berkant Turan
Max Zimmer
Sebastian Pokutta
FAtt
Papers citing "Interpretability Guarantees with Merlin-Arthur Classifiers"
5 / 5 papers shown
Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
S. Wäldchen
Felix Huber
Sebastian Pokutta
FAtt
28
8
0
23 Feb 2022
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings
Jan Macdonald
Mathieu Besançon
Sebastian Pokutta
32
11
0
15 Oct 2021
Invariant Rationalization
Shiyu Chang
Yang Zhang
Mo Yu
Tommi Jaakkola
179
201
0
22 Mar 2020
A Survey on Bias and Fairness in Machine Learning
Ninareh Mehrabi
Fred Morstatter
N. Saxena
Kristina Lerman
Aram Galstyan
SyDa
FaML
323
4,212
0
23 Aug 2019
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
204
200
0
02 May 2018