Short-term plasticity as cause-effect hypothesis testing in distal reward learning
Asynchrony in sensory-motor signals and variable delays between causes and effects introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause-effect relationships from coincidental occurrences. In the model proposed here, a form of short-term plasticity generates dynamics that test, approve, or reject hypotheses about cause-effect relationships. Short-term weights represent hypotheses that are consolidated in long-term memory only when they consistently predict future rewards. Short-term plasticity speeds up learning by biasing the exploration of the stimulus-response space towards actions that preceded rewards in the past. The transition to long-term plasticity indicates the conditions under which beliefs can be consolidated in long-term memory, which also suggests a solution to the plasticity-stability dilemma.
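The interplay between short-term and long-term weights described above can be illustrated with a toy sketch. This is not the model's actual update rule, which the abstract does not give: the `Synapse` class, the eligibility trace, the `step` and `run` functions, and every constant (decay rates, threshold, consolidation rate) are assumptions chosen only to show how a decaying short-term weight might act as a hypothesis that is consolidated only after repeated delayed rewards.

```python
from dataclasses import dataclass


@dataclass
class Synapse:
    short_term: float = 0.0  # transient hypothesis strength (assumed name)
    long_term: float = 0.0   # consolidated weight (assumed name)
    trace: float = 0.0       # eligibility trace of recent activity (assumed)


def step(syn, active, reward, *, trace_decay=0.9, st_decay=0.95,
         lr=1.0, threshold=2.0, consolidation=0.1):
    """Advance one time step; all constants are illustrative assumptions."""
    # Recent stimulus-action activity leaves a decaying eligibility trace.
    syn.trace = trace_decay * syn.trace + (1.0 if active else 0.0)
    # The short-term weight decays, so an unconfirmed hypothesis fades,
    # and grows when a reward arrives while the trace is still non-zero.
    syn.short_term = st_decay * syn.short_term + lr * reward * syn.trace
    # Consolidate only the part of the hypothesis above the threshold.
    if syn.short_term > threshold:
        syn.long_term += consolidation * (syn.short_term - threshold)
    return syn


def run(episodes, delay=3, gap=5):
    """Pair activity with a reward `delay` steps later, `episodes` times."""
    syn = Synapse()
    for _ in range(episodes):
        for t in range(delay + gap):
            step(syn, active=(t == 0), reward=1.0 if t == delay else 0.0)
    return syn


if __name__ == "__main__":
    for n in (1, 3, 10):
        s = run(n)
        print(n, round(s.short_term, 3), round(s.long_term, 3))
```

With these assumed constants, a single rewarded episode raises only the short-term weight, which then decays; the long-term weight starts to grow only once several episodes in a row push the short-term weight over the threshold. A coincidental one-off pairing therefore leaves no lasting change in this sketch.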