Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage. Neural Information Processing Systems (NeurIPS), 2023
Evaluating the Robustness of Off-Policy Evaluation. ACM Conference on Recommender Systems (RecSys), 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings. Neural Information Processing Systems (NeurIPS), 2021. Ming Yin Yu Wang
Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation. International Conference on Machine Learning (ICML), 2020