Entity set expansion, taxonomy expansion, and seed-guided taxonomy construction are three representative tasks that aim to automatically populate an existing taxonomy with new concepts. Previous studies treat them as separate tasks, and the proposed methods usually work for only one of them, so they lack generalizability and a holistic perspective across tasks. In this paper, we seek a unified solution to all three. Specifically, we identify two skills common to entity set expansion, taxonomy expansion, and seed-guided taxonomy construction: finding "siblings" and finding "parents". We introduce a taxonomy-guided instruction tuning framework that teaches a large language model to generate siblings and parents for query entities, where the joint pre-training process facilitates the mutual enhancement of these two skills. Extensive experiments on multiple benchmark datasets demonstrate the efficacy of the proposed TaxoInstruct framework, which outperforms task-specific baselines across all three tasks.
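To make the two skills concrete, the following minimal sketch shows one way sibling-finding and parent-finding training examples could be derived from an existing taxonomy. It is an illustration under stated assumptions, not the TaxoInstruct implementation: the toy taxonomy, the instruction wording, the record fields, and the helper name `build_examples` are all hypothetical.

```python
from collections import defaultdict

# Toy taxonomy as child -> parent edges (hypothetical data, not from the paper).
TOY_EDGES = {
    "machine learning": "computer science",
    "databases": "computer science",
    "deep learning": "machine learning",
    "reinforcement learning": "machine learning",
}


def build_examples(edges):
    """Turn child -> parent edges into instruction records for both skills.

    The instruction strings and record fields are assumptions made for
    illustration; the paper's actual prompt format may differ.
    """
    # Group children under each parent so siblings can be looked up.
    children = defaultdict(list)
    for child, parent in edges.items():
        children[parent].append(child)

    examples = []

    # Skill 1: finding "parents" (one record per edge).
    for child, parent in sorted(edges.items()):
        examples.append({
            "skill": "parent",
            "instruction": "Name the parent concept of the query entity.",
            "input": child,
            "output": parent,
        })

    # Skill 2: finding "siblings" (entities that share the same parent).
    for parent, kids in sorted(children.items()):
        kids = sorted(kids)
        for i, query in enumerate(kids):
            siblings = kids[:i] + kids[i + 1:]
            if siblings:  # skip only children, which have no siblings
                examples.append({
                    "skill": "sibling",
                    "instruction": "List entities that share a parent with the query entity.",
                    "input": query,
                    "output": ", ".join(siblings),
                })

    return examples


if __name__ == "__main__":
    for record in build_examples(TOY_EDGES):
        print(record["skill"], "|", record["input"], "->", record["output"])
```

Mixing both record types in a single training set is one plausible reading of the joint training described above; how the examples are actually sampled and formatted is not specified here.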