Deep learning framework for action prediction reveals multi-timescale locomotor control

Modeling human movement in real-world tasks is a fundamental goal for motor control, biomechanics, and rehabilitation engineering. However, existing models of essential tasks like locomotion do not generalize across varying terrain, mechanical conditions, and sensory contexts. This is due, at least in part, to simplifying assumptions such as linear, fixed-timescale mappings between inputs and future actions, which may not hold broadly. Here, we develop a deep learning-based framework for action prediction that outperforms traditional models across multiple contexts (walking and running, treadmill and overground, varying terrains) and input modalities (multiple body states, visual gaze). We find that neural network architectures with flexible input history-dependence, such as GRU and Transformer models, combined with architecture-dependent trial embeddings, perform best overall. By quantifying the model's predictions relative to an autoregressive baseline, we identify context- and modality-dependent timescales. These analyses reveal greater reliance on fast-timescale predictions in complex terrain, that gaze predicts future foot placement before body states do, and that predictions from the full-body state precede those from the center-of-mass state. This deep learning framework for human action prediction provides quantifiable insights into the control of real-world locomotion and can be extended to other actions, contexts, and populations.
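A minimal sketch of the kind of model the abstract describes, assuming a PyTorch implementation: a GRU encoder over a history of input states (e.g., body kinematics or gaze) combined with a learned per-trial embedding, producing a prediction of a future action such as foot placement. Module names, dimensions, and the readout design are illustrative assumptions, not the authors' released code; the Transformer variant and the autoregressive baseline comparison are not shown.

import torch
import torch.nn as nn

class GRUActionPredictor(nn.Module):
    def __init__(self, input_dim, action_dim, num_trials, hidden_dim=128, embed_dim=16):
        super().__init__()
        # Learned trial embedding (hypothetical stand-in for the paper's
        # architecture-dependent trial embeddings)
        self.trial_embedding = nn.Embedding(num_trials, embed_dim)
        # GRU provides flexible dependence on the input history
        self.encoder = nn.GRU(input_dim + embed_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, action_dim)

    def forward(self, inputs, trial_ids):
        # inputs: (batch, time, input_dim) history of body states / gaze
        # trial_ids: (batch,) integer index identifying each trial
        embed = self.trial_embedding(trial_ids)                     # (batch, embed_dim)
        embed = embed.unsqueeze(1).expand(-1, inputs.shape[1], -1)  # broadcast over time
        hidden, _ = self.encoder(torch.cat([inputs, embed], dim=-1))
        return self.readout(hidden[:, -1])                          # predicted future action

# Example usage with random data (8 sequences, 100 time steps, 12 input channels)
model = GRUActionPredictor(input_dim=12, action_dim=2, num_trials=50)
x = torch.randn(8, 100, 12)
trial_ids = torch.randint(0, 50, (8,))
pred = model(x, trial_ids)   # (8, 2), e.g. predicted foot-placement coordinates
print(pred.shape)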
@article{wang2025_2503.16340,
  title   = {Deep learning framework for action prediction reveals multi-timescale locomotor control},
  author  = {Wei-Chen Wang and Antoine De Comite and Alexandra Voloshina and Monica Daley and Nidhi Seethapathi},
  journal = {arXiv preprint arXiv:2503.16340},
  year    = {2025}
}