The POLAR Framework: Polar Opposites Enable Interpretability of
Pre-Trained Word Embeddings

27 January 2020

Sandipan Sikdar

Florian Lemmerich

Papers citing "The POLAR Framework: Polar Opposites Enable Interpretability of Pre-Trained Word Embeddings"

Title
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings Carolin M. Schuster Maria-Alexandra Dinisor Shashwat Ghatiwala Georg Groh 164 2 0 25 Nov 2024
Rethinking Node Representation Interpretation through Relation Coherence Ying-Chun Lin Jennifer Neville Cassiano Becker Purvanshi Metha Nabiha Asghar Vipul Agarwal 66 0 0 01 Nov 2024
Disentangling Hate Across Target Identities Yiping Jin Leo Wanner Aneesh Moideen Koya 50 0 0 14 Oct 2024
Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings Hiroaki Yamagiwa Yusuke Takase Hidetoshi Shimodaira 62 2 0 11 Jan 2024
Discovering Universal Geometry in Embeddings with ICA Hiroaki Yamagiwa Momose Oyama Hidetoshi Shimodaira 58 15 0 22 May 2023
Similarity of Neural Network Models: A Survey of Functional and Representational Measures Max Klabunde Tobias Schumacher M. Strohmaier Florian Lemmerich 187 75 0 10 May 2023
SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings Jan Engler Sandipan Sikdar Marlene Lutz M. Strohmaier 85 7 0 11 Jan 2023
Explainability of Text Processing and Retrieval Methods: A Critical Survey Sourav Saha Debapriyo Majumdar Mandar Mitra 96 5 0 14 Dec 2022
Discovering Differences in the Representation of People using Contextualized Semantic Axes L. Lucy Divya Tadimeti David Bamman 88 11 0 21 Oct 2022
Lex2Sent: A bagging approach to unsupervised sentiment analysis Kai-Robin Lange Jonas Rieger Carsten Jentsch SSL 36 2 0 26 Sep 2022
Interpreting Embedding Spaces by Conceptualization Adi Simhi Shaul Markovitch 97 7 0 22 Aug 2022
The Need for Interpretable Features: Motivation and Taxonomy Alexandra Zytek Ignacio Arnaldo Dongyu Liu Laure Berti-Equille K. Veeramachaneni FAtt XAI 84 14 0 23 Feb 2022
Interpretable contrastive word mover's embedding Ruijie Jiang J. Gouvea Eric L. Miller David M. Hammer Shuchin Aeron 43 2 0 01 Nov 2021
Understanding and Countering Stereotypes: A Computational Approach to the Stereotype Content Model Kathleen C. Fraser I. Nejadgholi S. Kiritchenko 74 41 0 04 Jun 2021
Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability Ninghao Liu Yunsong Meng Helen Zhou Tie Wang Bo Long XAI FAtt 79 7 0 16 Sep 2020
Adversarial Attacks and Defenses: An Interpretation Perspective Ninghao Liu Mengnan Du Ruocheng Guo Huan Liu Helen Zhou AAML 63 8 0 23 Apr 2020
Word Equations: Inherently Interpretable Sparse Word Embeddings through Sparse Coding Adly Templeton 36 7 0 08 Apr 2020
FrameAxis: Characterizing Microframe Bias and Intensity with Word Embedding Haewoon Kwak Jisun An Elise Jing Yong-Yeol Ahn 78 44 0 20 Feb 2020