arXiv:2104.03776

Statistically significant detection of semantic shifts using contextual word embeddings

8 April 2021
Yang Liu
A. Medlar
D. Głowacka
Abstract

Detecting lexical semantic shifts in smaller data sets, e.g., in historical linguistics and digital humanities, is challenging due to a lack of statistical power. This issue is exacerbated by non-contextual word embeddings, which produce one embedding per word and therefore mask the variability present in the data. In this article, we propose an approach to estimate semantic shifts by combining contextual word embeddings with permutation-based statistical tests. Multiple comparisons are addressed using a false discovery rate procedure. We demonstrate the performance of this approach in simulation, achieving consistently high precision by suppressing false positives. We additionally analyze real-world data from SemEval-2020 Task 1 and the Liverpool FC subreddit corpus. We show that by taking sample variation into account, we can improve the robustness of individual semantic shift estimates without degrading overall performance.
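The combination of permutation tests and false discovery rate control described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the test statistic or the FDR procedure, so the choices here are assumptions made for illustration only: the statistic is the cosine distance between the mean contextual embeddings of a word in two corpora, and the correction is Benjamini-Hochberg. The function names (`shift_statistic`, `permutation_pvalue`, `benjamini_hochberg`) are hypothetical and are not taken from the paper. Each input array is assumed to hold one contextual embedding per occurrence of the target word, with shape `(n_occurrences, dim)`.

```python
import numpy as np


def shift_statistic(a, b):
    """Cosine distance between the mean embeddings of two samples.

    Assumed statistic for illustration; the paper may use a different one.
    """
    u, v = a.mean(axis=0), b.mean(axis=0)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def permutation_pvalue(a, b, n_perm=999, seed=None):
    """One-sided permutation p-value for the shift between samples a and b.

    Under the null hypothesis of no semantic shift, corpus labels are
    exchangeable, so we shuffle occurrences across the two corpora and
    recompute the statistic to build its null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = shift_statistic(a, b)
    pooled = np.vstack([a, b])
    n_a = len(a)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = shift_statistic(pooled[idx[:n_a]], pooled[idx[n_a:]])
        if stat >= observed:
            exceed += 1
    # The +1 terms count the observed labelling itself, so p is never 0.
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Boolean mask of hypotheses rejected at FDR level alpha (BH step-up)."""
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject
```

In use, one p-value would be computed per target word with `permutation_pvalue`, and the full list passed to `benjamini_hochberg` to decide which words are reported as shifted. Because the test operates on per-occurrence embeddings, words with few or highly variable occurrences tend to receive larger p-values, which is one way a procedure like this can suppress false positives in small data sets.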
