Mechanistic Interpretability of Emotion Inference in Large Language Models | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Gumbel Counterfactual Generation From Language Models | International Conference on Learning Representations (ICLR), 2024 |
Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Can Language Models Induce Grammatical Knowledge from Indirect Evidence? | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Mechanistic? | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024 |
How Language Models Prioritize Contextual Grammatical Cues? | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024 |
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models | International Conference on Computational Linguistics (COLING), 2024 |
Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024 |
Grammatical information in BERT sentence embeddings as two-dimensional arrays | Workshop on Representation Learning for NLP (RepL4NLP), 2023 |
Codebook Features: Sparse and Discrete Interpretability for Neural Networks | International Conference on Machine Learning (ICML), 2023 |
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Unnatural language processing: How do language models handle machine-generated prompts? | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023 |
Emergent Linear Representations in World Models of Self-Supervised Sequence Models | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023 |
Operationalising Representation in Natural Language Processing | British Journal for the Philosophy of Science (BJPS), 2023 |
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model | Neural Information Processing Systems (NeurIPS), 2023 |
Interventional Probing in High Dimensions: An NLI Case Study | Findings (Findings), 2023 |
An Overview on Language Models: Recent Developments and Outlook | APSIPA Transactions on Signal and Information Processing (TASIP), 2023 |
Reconstruction Probing | Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement | Transactions of the Association for Computational Linguistics (TACL), 2022 |
Understanding Domain Learning in Language Models Through Subpopulation Analysis | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 |
Probing with Noise: Unpicking the Warp and Weft of Embeddings | BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 |
State-of-the-art generalisation research in NLP: A taxonomy and review | Nature Machine Intelligence (Nat. Mach. Intell.), 2022 |
Probing via Prompting | North American Chapter of the Association for Computational Linguistics (NAACL), 2022 |
Naturalistic Causal Probing for Morpho-Syntax | Transactions of the Association for Computational Linguistics (TACL), 2022 |
When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes | North American Chapter of the Association for Computational Linguistics (NAACL), 2022 |