The cell as a token: high-dimensional geometry in language models and cell embeddings. Bioinformatics, 2025
Stable Anisotropic Regularization. International Conference on Learning Representations (ICLR), 2023
Reliable Measures of Spread in High Dimensional Latent Spaces. International Conference on Machine Learning (ICML), 2022
On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies. North American Chapter of the Association for Computational Linguistics (NAACL), 2021. Tianyi Zhang, Tatsunori Hashimoto
Do We Need Online NLU Tools? Language Resources and Evaluation (LRE), 2020
The Spectral Underpinning of word2vec. Frontiers in Applied Mathematics and Statistics (FAMS), 2020
Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning. IEEE Symposium on Security and Privacy (S&P), 2020
A Generative Word Embedding Model and its Low Rank Positive Semidefinite Solution. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015
WordRank: Learning Word Embeddings via Robust Ranking. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015