
A Bayesian latent class reinforcement learning framework to capture adaptive, feedback-driven travel behaviour

Georges Sfeir
Stephane Hess
Thomas O. Hancock
Filipe Rodrigues
Jamal Amani Rad
Michiel Bliemer
Matthew Beck
Fayyaz Khan
Main: 21 pages
9 figures
7 tables
Appendix: 11 pages
Abstract

Many travel decisions involve a degree of experience formation, where individuals learn their preferences over time. At the same time, there is extensive scope for heterogeneity across individual travellers, both in their underlying preferences and in how these evolve. The present paper puts forward a Latent Class Reinforcement Learning (LCRL) model that allows analysts to capture both of these phenomena. We apply the model to a driving simulator dataset and estimate the parameters through Variational Bayes. We identify three distinct classes of individuals that differ markedly in how they adapt their preferences: the first displays context-dependent preferences with context-specific exploitative tendencies; the second follows a persistent exploitative strategy regardless of context; and the third engages in an exploratory strategy combined with context-specific preferences.
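To make the modelling idea concrete, the sketch below illustrates the core of a latent class reinforcement learning likelihood: within each class, alternative values are updated through a simple delta-rule learning mechanism and choices follow a softmax rule whose inverse temperature governs the exploitation-exploration balance, while class membership is marginalised out as a probability-weighted mixture. This is a minimal illustration under assumed functional forms, not the paper's actual specification (which includes context-specific preferences and Variational Bayes estimation); all parameter values and function names here are hypothetical.

```python
import numpy as np

def class_log_likelihood(choices, rewards, alpha, beta, n_alts):
    """Log-likelihood of one traveller's choice sequence under a single
    latent class: delta-rule (Rescorla-Wagner) value updates combined
    with a softmax choice rule. alpha is the learning rate; beta is the
    inverse temperature (higher beta = more exploitative behaviour)."""
    q = np.zeros(n_alts)  # initial value estimates for each alternative
    ll = 0.0
    for choice, reward in zip(choices, rewards):
        # softmax choice probabilities, stabilised by subtracting the max
        p = np.exp(beta * q - np.max(beta * q))
        p /= p.sum()
        ll += np.log(p[choice])
        # feedback-driven update: move the chosen alternative's value
        # towards the experienced reward
        q[choice] += alpha * (reward - q[choice])
    return ll

def mixture_log_likelihood(choices, rewards, class_shares, alphas, betas, n_alts):
    """Latent-class likelihood: class membership is unobserved, so the
    individual's likelihood is a share-weighted mixture of the
    class-conditional likelihoods."""
    class_lls = np.array([
        class_log_likelihood(choices, rewards, a, b, n_alts)
        for a, b in zip(alphas, betas)
    ])
    # log-sum-exp over classes for numerical stability
    m = class_lls.max()
    return m + np.log(np.sum(class_shares * np.exp(class_lls - m)))

# Toy usage with three hypothetical classes (shares, learning rates and
# inverse temperatures are made up for illustration only).
rng = np.random.default_rng(0)
choices = rng.integers(0, 3, size=20)   # sequence of chosen alternatives
rewards = rng.normal(size=20)           # experienced feedback
print(mixture_log_likelihood(choices, rewards,
                             class_shares=np.array([0.4, 0.35, 0.25]),
                             alphas=[0.1, 0.05, 0.3],
                             betas=[5.0, 8.0, 1.0],
                             n_alts=3))
```

In this toy setup, a class with a high beta and low alpha behaves like the paper's persistently exploitative class, while a low beta yields the more exploratory pattern; the actual model estimates such parameters jointly with class membership via Variational Bayes.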
