Counterfactual harm

27 April 2022

Papers citing "Counterfactual harm"


Title | Authors | Tags | Counts | Date
Measuring Goal-Directedness | Matt MacDermott, James Fox, Francesco Belardinelli, Tom Everitt | | 88 1 0 | 06 Dec 2024
From Imitation to Introspection: Probing Self-Consciousness in Language Models | Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu | MILM, LRM | 30 1 0 | 24 Oct 2024
Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making | Stelios Triantafyllou, A. Sukovic, Yasaman Zolfimoselo, Goran Radanović | CML | 35 0 0 | 16 Oct 2024
Can a Bayesian Oracle Prevent Harm from an Agent? | Yoshua Bengio, Michael K. Cohen, Nikolay Malkin, Matt MacDermott, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar | | 37 4 0 | 09 Aug 2024
Teleporter Theory: A General and Simple Approach for Modeling Cross-World Counterfactual Causality | Jiangmeng Li, Bin Qin, Qirui Ji, Yi Li, Wenwen Qiang, Jianwen Cao, Fanjiang Xu | | 44 0 0 | 17 Jun 2024
Harm Mitigation in Recommender Systems under User Preference Dynamics | Jerry Chee, Shankar Kalyanaraman, S. Ernala, Udi Weinsberg, Sarah Dean, Stratis Ioannidis | | 35 4 0 | 14 Jun 2024
Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets | Eleni Straitouri, Suhas Thejaswi, Manuel Gomez Rodriguez | | 36 1 0 | 10 Jun 2024
Matchings, Predictions and Counterfactual Harm in Refugee Resettlement Processes | Seungeon Lee, N. C. Benz, Suhas Thejaswi, Manuel Gomez Rodriguez | | 27 0 0 | 24 May 2024
Do No Harm: A Counterfactual Approach to Safe Reinforcement Learning | Sean Vaskov, Wilko Schwarting, Chris Baker | | 17 1 0 | 19 May 2024
Robust agents learn causal world models | Jonathan G. Richens, Tom Everitt | OOD | 114 36 0 | 16 Feb 2024
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs | Stelios Triantafyllou, A. Sukovic, Debmalya Mandal, Goran Radanović | | 14 0 0 | 17 Oct 2023
Analyzing Intentional Behavior in Autonomous Agents under Uncertainty | Filip Cano Córdoba, Samuel Judson, Timos Antonopoulos, Katrine Bjørner, Nicholas Shoemaker, Scott J. Shapiro, R. Piskac, Bettina Könighofer | | 13 3 0 | 04 Jul 2023
Simulating counterfactuals | J. Karvanen, S. Tikka, M. Vihola | | 16 0 0 | 27 Jun 2023
Causal Fairness for Outcome Control | Drago Plečko, Elias Bareinboim | | 17 5 0 | 08 Jun 2023
Finding Counterfactually Optimal Action Sequences in Continuous State Spaces | Stratis Tsirtsis, Manuel Gomez Rodriguez | CML, OffRL | 22 9 0 | 06 Jun 2023
Partial Counterfactual Identification of Continuous Outcomes with a Curvature Sensitivity Model | Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel | | 13 11 0 | 02 Jun 2023
Human Control: Definitions and Algorithms | Ryan Carey, Tom Everitt | | 11 6 0 | 31 May 2023
Counterfactual Identifiability of Bijective Causal Models | Arash Nasr-Esfahany, MohammadIman Alizadeh, Devavrat Shah | CML, BDL | 22 26 0 | 04 Feb 2023
Personalised Decision-Making without Counterfactuals | A. Dawid, S. Senn | OffRL | 9 4 0 | 27 Jan 2023
A Causal Analysis of Harm | Sander Beckers, Hana Chockler, J. Halpern | AILaw, ELM | 16 17 0 | 11 Oct 2022
Quantifying Harm | Sander Beckers, Hana Chockler, J. Halpern | | 41 9 0 | 29 Sep 2022
Discovering Agents | Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan G. Richens, Matt MacDermott, Tom Everitt | CML | 26 31 0 | 17 Aug 2022
A Survey on Bias and Fairness in Machine Learning | Ninareh Mehrabi, Fred Morstatter, N. Saxena, Kristina Lerman, Aram Galstyan | SyDa, FaML | 294 4,187 0 | 23 Aug 2019