Large Language Models (LLMs) are transforming information extraction from academic literature, offering new possibilities for knowledge management. This study presents an LLM-based system designed to extract detailed information about research instruments used in the education field, including their names, types, target respondents, measured constructs, and outcomes. Using multi-step prompting and a domain-specific data schema, the system generates structured outputs optimized for educational research. Our evaluation shows that the system significantly outperforms other approaches, particularly in identifying instrument names and extracting detailed information about each instrument. These results demonstrate the potential of LLM-powered information extraction in educational contexts and offer a systematic way to organize research instrument information. Aggregating such information at scale makes it more accessible to researchers and education leaders, facilitating informed decision-making in educational research and policy.
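As an illustration only, the structured output described above could be represented roughly as follows. This is a minimal sketch, not the authors' schema: the field names simply mirror the five categories listed in the abstract, and the class name `ResearchInstrument`, the helper `parse_instruments`, the field types, and the JSON layout are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ResearchInstrument:
    """Hypothetical record; fields mirror the categories named in the abstract."""
    name: str
    instrument_type: str = ""
    target_respondents: List[str] = field(default_factory=list)
    measured_constructs: List[str] = field(default_factory=list)
    outcomes: List[str] = field(default_factory=list)


def parse_instruments(raw_json: str) -> List[ResearchInstrument]:
    """Parse a JSON array (assumed to be an LLM's final output) into records.

    Entries without a non-empty "name" are skipped, since the instrument
    name is treated here as the identifying field (an assumption).
    """
    records = []
    for item in json.loads(raw_json):
        if not item.get("name"):
            continue
        records.append(
            ResearchInstrument(
                name=item["name"],
                instrument_type=item.get("instrument_type", ""),
                target_respondents=list(item.get("target_respondents", [])),
                measured_constructs=list(item.get("measured_constructs", [])),
                outcomes=list(item.get("outcomes", [])),
            )
        )
    return records
```

In a multi-step setup, one prompt might first list instrument names and later prompts might fill in the remaining fields; a fixed record type like this gives each step the same target shape and lets `asdict` serialize the result for aggregation.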
@article{yoo2025_2505.21855,
  title={Extracting Research Instruments from Educational Literature Using LLMs},
  author={Jiseung Yoo and Curran Mahowald and Meiyu Li and Wei Ai},
  journal={arXiv preprint arXiv:2505.21855},
  year={2025}
}