Attention Meets Post-hoc Interpretability: A Mathematical Perspective. International Conference on Machine Learning (ICML), 2024 |
Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 |
Feature Interactions Reveal Linguistic Structure in Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Quantifying Context Mixing in Transformers. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 |
Semantic match: Debugging feature attribution methods in XAI for healthcare. ACM Conference on Health, Inference, and Learning (CHIL), 2023 |