DeepDelveAI: Identifying AI Related Documents in Large Scale Literature Data

Journal of Social Computing (JSC), 2024
Zhou Xiaochen
Liang Xingzhou
Qu Jingjing
Main: 18 Pages
4 Figures
3 Tables
Abstract

This paper presents DeepDelveAI, a dataset of AI-related research papers identified within a large-scale academic literature database. The dataset was created using a Long Short-Term Memory (LSTM) model trained on a binary classification task: distinguishing AI-related papers from non-AI-related ones. The model was trained and validated on a large dataset and achieved high accuracy, precision, recall, and F1-score. The resulting DeepDelveAI dataset comprises over 9.4 million AI-related papers published between the 1956 Dartmouth Conference and 2024, providing a resource for analyzing trends, thematic developments, and the evolution of AI research across disciplines.
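
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes labels are encoded as 1 for AI-related and 0 for non-AI-related. The function name `binary_metrics` and the toy labels are hypothetical, made up for illustration; they are not the paper's data, results, or code.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: 1 = AI-related paper, 0 = non-AI-related paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration only (not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for a task like this one, since AI-related papers are likely a minority of a general literature database and accuracy alone can look high for a classifier that rarely predicts the positive class.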
