FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

31 December 2020

Papers citing "FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging"

31 / 81 papers shown

Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs Kelvin Guu Albert Webson Ellie Pavlick Lucas Dixon Ian Tenney Tolga Bolukbasi TDI 66 33 0 14 Mar 2023
Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets Irina Bejan Artem Sokolov Katja Filippova TDI 19 8 0 27 Feb 2023
Which Experiences Are Influential for Your Agent? Policy Iteration with Turn-over Dropout Takuya Hiraoka Takashi Onishi Yoshimasa Tsuruoka OffRL 19 0 0 26 Jan 2023
Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation Julius Adebayo M. Muelly H. Abelson Been Kim 16 86 0 09 Dec 2022
Training Data Influence Analysis and Estimation: A Survey Zayd Hammoudeh Daniel Lowd TDI 29 82 0 09 Dec 2022
Influence Functions for Sequence Tagging Models Sarthak Jain Varun Manjunatha Byron C. Wallace A. Nenkova TDI 27 8 0 25 Oct 2022
Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation Tsz Kin Lam Eva Hasler F. Hieber TDI 27 4 0 24 Oct 2022
Understanding Influence Functions and Datamodels via Harmonic Analysis Nikunj Saunshi Arushi Gupta M. Braverman Sanjeev Arora TDI 60 17 0 03 Oct 2022
If Influence Functions are the Answer, Then What is the Question? Juhan Bae Nathan Ng Alston Lo Marzyeh Ghassemi Roger C. Grosse TDI 16 88 0 12 Sep 2022
Causality-Inspired Taxonomy for Explainable Artificial Intelligence Pedro C. Neto Tiago B. Gonçalves João Ribeiro Pinto W. Silva Ana F. Sequeira Arun Ross Jaime S. Cardoso XAI 28 12 0 19 Aug 2022
Leveraging Explanations in Interactive Machine Learning: An Overview Stefano Teso Öznur Alkan Wolfgang Stammer Elizabeth M. Daly XAI FAtt LRM 26 62 0 29 Jul 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data Xiaochuang Han Yulia Tsvetkov 19 27 0 25 May 2022
Towards Tracing Factual Knowledge in Language Models Back to the Training Data Ekin Akyürek Tolga Bolukbasi Frederick Liu Binbin Xiong Ian Tenney Jacob Andreas Kelvin Guu HILM 11 11 0 23 May 2022
Unintended memorisation of unique features in neural networks J. Hartley Sotirios A. Tsaftaris 30 1 0 20 May 2022
An Empirical Study of Memorization in NLP Xiaosen Zheng Jing Jiang TDI 17 24 1 23 Mar 2022
Human-Centered Concept Explanations for Neural Networks Chih-Kuan Yeh Been Kim Pradeep Ravikumar FAtt 27 25 0 25 Feb 2022
Measuring Unintended Memorisation of Unique Private Features in Neural Networks J. Hartley Sotirios A. Tsaftaris 19 7 0 16 Feb 2022
Identifying a Training-Set Attack's Target Using Renormalized Influence Estimation Zayd Hammoudeh Daniel Lowd TDI 18 28 0 25 Jan 2022
FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes Haonan Wang Ziwei Wu Jingrui He 19 12 0 15 Jan 2022
Scaling Up Influence Functions Andrea Schioppa Polina Zablotskaia David Vilar Artem Sokolov TDI 25 90 0 06 Dec 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review Xiaofei Sun Diyi Yang Xiaoya Li Tianwei Zhang Yuxian Meng Han Qiu Guoyin Wang Eduard H. Hovy Jiwei Li 17 44 0 20 Oct 2021
Machine Unlearning of Features and Labels Alexander Warnecke Lukas Pirch Christian Wressnegger Konrad Rieck MU 6 171 0 26 Aug 2021
Post-hoc Interpretability for Neural NLP: A Survey Andreas Madsen Siva Reddy A. Chandar XAI 19 222 0 10 Aug 2021
A Source-Criticism Debiasing Method for GloVe Embeddings Hope McGovern 17 3 0 25 Jun 2021
Interactive Label Cleaning with Example-based Explanations Stefano Teso A. Bontempelli Fausto Giunchiglia Andrea Passerini 30 45 0 07 Jun 2021
Explanation-Based Human Debugging of NLP Models: A Survey Piyawat Lertvittayakumjorn Francesca Toni LRM 35 79 0 30 Apr 2021
Extracting Training Data from Large Language Models Nicholas Carlini Florian Tramèr Eric Wallace Matthew Jagielski Ariel Herbert-Voss ... Tom B. Brown D. Song Ulfar Erlingsson Alina Oprea Colin Raffel MLAU SILM 290 1,814 0 14 Dec 2020
Data Appraisal Without Data Sharing Mimee Xu Laurens van der Maaten Awni Y. Hannun TDI 26 6 0 11 Dec 2020
Measuring Association Between Labels and Free-Text Rationales Sarah Wiegreffe Ana Marasović Noah A. Smith 276 170 0 24 Oct 2020
e-SNLI: Natural Language Inference with Natural Language Explanations Oana-Maria Camburu Tim Rocktäschel Thomas Lukasiewicz Phil Blunsom LRM 255 620 0 04 Dec 2018
Towards A Rigorous Science of Interpretable Machine Learning Finale Doshi-Velez Been Kim XAI FaML 251 3,683 0 28 Feb 2017