Small-Text: Active Learning for Text Classification in Python
We present small-text, an easy-to-use active learning library that offers pool-based active learning for single- and multi-label text classification in Python. It features many pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow a variety of classifiers, query strategies, and stopping criteria to be combined, facilitating quick mix-and-match and enabling rapid development of both active learning experiments and applications. To make various classifiers and query strategies accessible for active learning, small-text integrates several well-known machine learning libraries, namely scikit-learn, PyTorch, and Hugging Face transformers. The PyTorch and transformers integrations are optionally installable extensions, so GPUs can be used but are not required. The library is publicly available under the MIT License at https://github.com/webis-de/small-text, in version 1.1.1 at the time of writing.
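The pool-based loop described in the abstract, with interchangeable classifier, query strategy, and stopping criterion, can be illustrated with a minimal standard-library sketch. This is not small-text's API: the `NaiveBayes` class, the `least_confidence` and `active_learning_loop` functions, and the toy data are all hypothetical names and values chosen for illustration, and the least-confidence strategy and label-budget stopping rule are simple stand-ins for the library's pre-implemented components.

```python
import math
from collections import Counter

# Hypothetical toy pool; LABELS plays the role of the human annotator (oracle).
TEXTS = [
    "great film loved the acting", "terrible plot boring acting",
    "loved it great fun", "boring and terrible",
    "great fun film", "awful boring plot",
    "wonderful acting loved it", "awful film terrible",
]
LABELS = [1, 0, 1, 0, 1, 0, 1, 0]


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        scores = []
        for c in self.classes:
            # Laplace smoothing; raw class counts serve as priors because
            # the final normalization removes the shared constant.
            total = sum(self.counts[c].values()) + len(self.vocab)
            log_p = math.log(self.prior[c])
            for w in text.split():
                if w in self.vocab:
                    log_p += math.log((self.counts[c][w] + 1) / total)
            scores.append(log_p)
        top = max(scores)
        exp = [math.exp(s - top) for s in scores]
        return [e / sum(exp) for e in exp]


def least_confidence(clf, texts, unlabeled, n):
    """Query strategy: the n pool items with the lowest top-class probability."""
    return sorted(unlabeled, key=lambda i: max(clf.predict_proba(texts[i])))[:n]


def active_learning_loop(texts, oracle, initial, query, should_stop, batch_size=2):
    """Pool-based loop: train, check the stopping criterion, query, label, repeat."""
    labeled = {i: oracle(i) for i in initial}
    unlabeled = [i for i in range(len(texts)) if i not in labeled]
    clf = NaiveBayes()
    while True:
        clf.fit([texts[i] for i in labeled], list(labeled.values()))
        if not unlabeled or should_stop(labeled):
            return clf, labeled
        for i in query(clf, texts, unlabeled, batch_size):
            labeled[i] = oracle(i)
            unlabeled.remove(i)


# Stopping criterion (assumption): stop once a budget of 6 labels is spent.
clf, labeled = active_learning_loop(
    TEXTS,
    oracle=lambda i: LABELS[i],
    initial=[0, 1],  # one seed example per class
    query=least_confidence,
    should_stop=lambda labeled: len(labeled) >= 6,
)
```

Because the query strategy and stopping criterion are passed in as plain callables, either can be swapped without touching the loop, which is the kind of mix-and-match the abstract attributes to the library's standardized interfaces.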