An Online Learning Theory of Brokerage

We investigate brokerage between traders from an online learning perspective. At any round $t$, two traders arrive with their private valuations, and the broker proposes a trading price. Unlike other bilateral trade problems already studied in the online learning literature, we focus on the case where there are no designated buyer and seller roles: each trader will attempt to either buy or sell depending on the current price of the good. We assume the agents' valuations are drawn i.i.d. from a fixed but unknown distribution. If the distribution admits a density bounded by some constant $M$, then, for any time horizon $T$: (i) if the agents' valuations are revealed after each interaction, we provide an algorithm achieving regret $M \log T$ and show this rate is optimal, up to constant factors; (ii) if only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving regret $\sqrt{MT}$ and show this rate is optimal, up to constant factors. Finally, if we drop the bounded density assumption, we show that the optimal rate degrades to $\sqrt{T}$ in the first case, and the problem becomes unlearnable in the second.
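
The interaction protocol described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes that a trade occurs whenever the posted price lies between the two valuations, that the gain from a trade equals the gap between the valuations, that valuations are uniform on [0, 1], and that the broker posts the running mean of previously revealed valuations. All function names and the pricing rule are hypothetical choices made for illustration.

```python
import random


def trade_outcome(price, v, w):
    """Return (traded, gain) for one round with valuations v and w.

    Assumption: the trader with the higher valuation buys from the other
    whenever the price lies between the two valuations, and the gain from
    trade is the gap between the valuations.
    """
    low, high = min(v, w), max(v, w)
    traded = low <= price <= high
    return traded, (high - low) if traded else 0.0


def two_bit_feedback(price, v, w):
    """Assumed limited feedback: whether each trader would buy at this price."""
    return v >= price, w >= price


def simulate_full_feedback(horizon, seed=0):
    """Simulate `horizon` rounds in which valuations are revealed afterwards.

    The pricing rule (running mean of past valuations) is a hypothetical
    stand-in, not the algorithm analysed in the paper.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0      # running sum and count of revealed valuations
    cumulative_gain = 0.0
    for _ in range(horizon):
        price = total / count if count else 0.5
        v, w = rng.random(), rng.random()   # i.i.d. uniform valuations (assumed)
        _, gain = trade_outcome(price, v, w)
        cumulative_gain += gain
        total += v + w                      # full feedback: both valuations seen
        count += 2
    return cumulative_gain
```

Calling `simulate_full_feedback(1000)` returns the cumulative gain from trade over 1000 rounds under these assumptions. A learner restricted to the second feedback model would see only the output of `two_bit_feedback` in each round.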