
Dynamic Embedded Topic Models: properties and recommendations based on diverse corpora

Abstract

We measure the effects of several implementation choices for the Dynamic Embedded Topic Model, as applied to five distinct diachronic corpora, with the goal of isolating important decisions for its use and further development. We identify priorities that will maximize utility in applied scholarship, including the practical scalability of vocabulary size to best exploit the strengths of embedded representations, and more flexible modeling of intervals to accommodate the uneven temporal distributions of historical writing. Of similar importance, we find performance is not significantly or consistently affected by several aspects that otherwise limit the model's application or might consume the resources of a grid search.

@article{fittschen2025_2504.19209,
  title={Dynamic Embedded Topic Models: properties and recommendations based on diverse corpora},
  author={Elisabeth Fittschen and Bella Xia and Leib Celnik and Paul Dilley and Tom Lippincott},
  journal={arXiv preprint arXiv:2504.19209},
  year={2025}
}