We argue that immature data pipelines are preventing a large portion of industry practitioners from leveraging the latest research on recommender systems. We propose our template data stack for machine learning at "reasonable scale", and show how many challenges are solved by embracing a serverless paradigm. Leveraging our experience, we detail how modern open source can provide a pipeline processing terabytes of data with limited infrastructure work.
View on arXiv