
$\pi2\text{vec}$: Policy Representations with Successor Features

International Conference on Learning Representations (ICLR), 2023

16 June 2023

Abstract

This paper describes $\pi2\text{vec}$, a method for representing the behaviors of black-box policies as feature vectors. The policy representations capture, in a task-agnostic way, how the statistics of foundation model features change in response to the policy's behavior. They can be trained from offline data, which allows them to be used for offline policy selection. This work provides a key piece of a recipe for fusing together three modern lines of research: offline policy evaluation as a counterpart to offline RL, foundation models as generic and powerful state representations, and efficient policy selection in resource-constrained environments.
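To make the central quantity concrete, the following is a minimal sketch of a successor-feature style policy embedding: the discounted sum of per-step feature vectors along each trajectory, averaged over trajectories. It is an illustration only and not the paper's training procedure. It assumes the per-step feature vectors (stand-ins for foundation model features) are already available as plain lists, uses a simple Monte Carlo sum over recorded trajectories rather than a learned estimator, and the names `successor_features` and `policy_embedding` are hypothetical.

```python
from typing import List, Sequence


def successor_features(
    features: Sequence[Sequence[float]], gamma: float = 0.9
) -> List[float]:
    """Discounted sum of per-step feature vectors along one trajectory.

    `features[t]` is the (assumed precomputed) feature vector of the state
    visited at step t. Returns sum_t gamma**t * features[t].
    """
    dim = len(features[0])
    psi = [0.0] * dim
    discount = 1.0
    for phi in features:
        for i in range(dim):
            psi[i] += discount * phi[i]
        discount *= gamma
    return psi


def policy_embedding(
    trajectories: Sequence[Sequence[Sequence[float]]], gamma: float = 0.9
) -> List[float]:
    """Average the per-trajectory successor features into a single vector."""
    per_traj = [successor_features(traj, gamma) for traj in trajectories]
    dim = len(per_traj[0])
    return [sum(p[i] for p in per_traj) / len(per_traj) for i in range(dim)]


if __name__ == "__main__":
    # Two toy trajectories with 2-dimensional features (made-up values).
    toy = [
        [[1.0, 0.0], [1.0, 0.0]],
        [[0.0, 1.0], [0.0, 1.0]],
    ]
    print(policy_embedding(toy, gamma=0.5))
```

Two policies that visit states with different feature statistics produce different vectors under this sketch, which is the property a downstream policy selection step would rely on.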
