Unsupervised Feature Transformation via In-context Generation, Generator-critic LLM Agents, and Duet-play Teaming

30 April 2025

Abstract

Feature transformation generates a new set of features from the original dataset to enhance the data's utility. In domains such as material performance screening, dimensionality is high and collecting labels is expensive and slow. This creates a strong need to transform feature spaces efficiently and without supervision, so as to improve data readiness and AI utility. However, existing methods cannot efficiently navigate the vast space of feature combinations, and most are designed for supervised settings. To fill this gap, we leverage a generator-critic duet-play teaming framework that uses LLM agents and in-context learning to derive pseudo-supervision from unsupervised data. The framework consists of three interconnected steps: (1) a critic agent diagnoses the data to generate actionable advice, (2) a generator agent produces tokenized feature transformations guided by the critic's advice, and (3) iterative refinement ensures continuous improvement through feedback between the agents. The generator-critic framework can be generalized to human-agent collaborative generation by replacing the critic agent with human experts. Extensive experiments demonstrate that the proposed framework outperforms even supervised baselines in feature transformation efficiency, robustness, and practical applicability across diverse datasets.
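The three steps in the abstract can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the LLM agents are replaced by rule-based stand-ins, the skewness heuristic is an illustrative choice, and every name (`critic`, `generator`, `apply_tokens`, `duet_play`) is hypothetical.

```python
import math
from statistics import mean, pstdev


def skewness(col):
    """Population skewness of a list of numbers (0.0 for constant columns)."""
    mu, sd = mean(col), pstdev(col)
    if sd == 0:
        return 0.0
    return mean([((x - mu) / sd) ** 3 for x in col])


def critic(columns):
    """Stand-in for the critic agent (assumption: a rule, not an LLM).

    Diagnoses each feature without labels and returns textual advice.
    """
    advice = []
    for name, col in columns.items():
        if min(col) >= 0 and abs(skewness(col)) > 1.0:
            advice.append(f"log {name}")
    return advice


def generator(advice):
    """Stand-in for the generator agent (assumption: a rule, not an LLM).

    Turns each piece of advice into a postfix token sequence,
    e.g. "log f0" -> ["f0", "log"].
    """
    return [item.split()[::-1] for item in advice]


# Hypothetical operator vocabulary for the token sequences.
UNARY_OPS = {"log": math.log1p, "sqrt": math.sqrt}


def apply_tokens(tokens, columns):
    """Evaluate a two-token postfix sequence [feature, op] on the data."""
    name, op = tokens
    return f"{op}({name})", [UNARY_OPS[op](x) for x in columns[name]]


def duet_play(columns, max_rounds=3):
    """Alternate critic and generator until no new feature is produced."""
    columns = dict(columns)  # do not mutate the caller's data
    for _ in range(max_rounds):
        added = False
        for tokens in generator(critic(columns)):
            new_name, new_col = apply_tokens(tokens, columns)
            if new_name not in columns:
                columns[new_name] = new_col
                added = True
        if not added:  # the critic has nothing new to suggest
            break
    return columns


data = {
    "f0": [1, 2, 4, 8, 16, 32, 64, 128],  # heavily right-skewed
    "f1": [1, 2, 3, 4, 5, 6, 7, 8],       # symmetric
}
out = duet_play(data)
print(sorted(out))
```

In this toy run the critic flags only the skewed feature, the generator emits the token sequence for its log transform, and the loop stops once a round adds nothing new. Replacing `critic` with a function that asks a human expert for advice gives the human-agent variant mentioned in the abstract.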

@article{gong2025_2504.21304,
  title={Unsupervised Feature Transformation via In-context Generation, Generator-critic LLM Agents, and Duet-play Teaming},
  author={Nanxu Gong and Xinyuan Wang and Wangyang Ying and Haoyue Bai and Sixun Dong and Haifeng Chen and Yanjie Fu},
  journal={arXiv preprint arXiv:2504.21304},
  year={2025}
}
