Combining Counting Processes and Classification Improves a Stopping Rule for Technology Assisted Review
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Abstract
Technology Assisted Review (TAR) stopping rules aim to reduce the cost of manually assessing documents for relevance by minimising the number of documents that need to be examined to ensure a desired level of recall. This paper extends an effective stopping rule using information derived from a text classifier that can be trained without the need for any additional annotation. Experiments on multiple data sets (CLEF e-Health, TREC Total Recall, TREC Legal and RCV1) showed that the proposed approach consistently improves performance and outperforms several alternative methods.
