Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information

Abstract

We consider assortment optimization for a product with an attribute that can be adjusted continuously. Examples include the duration of a loan, the data limit of a cell phone subscription, and the greenness of paint. We represent the collection of all product variants as the unit interval and ask which subset of products a retailer should offer to customers in order to maximize profit. We model customer choice behavior by a continuous extension of the multinomial logit model and allow for a capacity constraint on the offered assortment. We study this problem under incomplete information, which makes it an instance of a continuous combinatorial multi-armed bandit problem. The unknown quantities in the model are estimated by kernel density estimation with Legendre kernels and bounded support, for which we derive new convergence rates. We present an explore-then-exploit policy and show that it incurs regret of order $T^{2/3}$ (neglecting logarithmic factors). Also, by showing that any policy must, in the worst case, incur regret of order at least $T^{2/3}$, we conclude that our policy is asymptotically optimal.
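To make the objective concrete, the following is a minimal sketch under assumptions that are not stated in the abstract: that the continuous logit model gives an offered set S an expected profit of the form (integral over S of v(x) w(x) dx) / (1 + integral over S of v(x) dx), where v is a preference (attraction) function and w a marginal profit function on the unit interval. The names `expected_profit`, `v`, `w`, and `grid` are hypothetical, the integrals are approximated by a midpoint Riemann sum, and the capacity constraint is ignored.

```python
import numpy as np

def expected_profit(offered, v, w, grid):
    """Approximate expected profit per customer for an offered assortment.

    offered: boolean mask over `grid` marking which variants are offered.
    v, w:    assumed attraction and marginal-profit functions on [0, 1].
    grid:    equally spaced midpoints of a partition of [0, 1].
    """
    dx = grid[1] - grid[0]
    attraction = np.sum(v(grid)[offered]) * dx           # approx. integral of v over S
    revenue = np.sum((v(grid) * w(grid))[offered]) * dx  # approx. integral of v*w over S
    return revenue / (1.0 + attraction)                  # the "1" is the no-purchase option

# Illustrative (made-up) inputs: uniform attraction, profit increasing in x.
grid = (np.arange(1000) + 0.5) / 1000
v = lambda x: np.ones_like(x)
w = lambda x: x

full = np.ones_like(grid, dtype=bool)   # offer every variant
threshold = grid >= 0.268               # offer only the more profitable variants

profit_full = expected_profit(full, v, w, grid)
profit_threshold = expected_profit(threshold, v, w, grid)
```

In this toy instance, offering only the variants above a profit threshold yields more than offering everything, which illustrates why the choice of subset matters even before any capacity constraint or learning is involved.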
