30DayGen: Leveraging LLMs to Create a Content Corpus for Habit Formation

Abstract
In this paper, we present 30 Day Me, a habit formation application that leverages Large Language Models (LLMs) to help users break down their goals into manageable, actionable steps and track their progress. Central to the app is the 30DayGen system, which generates 3,531 unique 30-day challenges sourced from over 15K webpages and enables runtime search of challenge ideas aligned with user-defined goals. We showcase how LLMs can be harnessed to rapidly construct domain-specific content corpora for behavioral and educational purposes, and we propose a practical pipeline that incorporates effective LLM-enhanced approaches for content generation and semantic deduplication.
@article{zhang2025_2505.02851,
  title={30DayGen: Leveraging LLMs to Create a Content Corpus for Habit Formation},
  author={Franklin Zhang and Sonya Zhang and Alon Halevy},
  journal={arXiv preprint arXiv:2505.02851},
  year={2025}
}