Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems

24 July 2018
Timothy A. Mann
Sven Gowal
András Gyorgy
Ray Jiang
Huiyi Hu
Balaji Lakshminarayanan
Prav Srinivasan
AI4TS

Papers citing "Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems"

8 / 8 papers shown
Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application
Computer Communications (Comput. Commun.), 2022
Jing Tan
R. Khalili
Holger Karl
A. Hecker
OffRL
182
5
0
29 Jul 2022
Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games
Jing Tan
R. Khalili
Holger Karl
114
3
0
05 Apr 2022
A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review
Adam Bignold
Francisco Cruz
Matthew E. Taylor
Tim Brys
Richard Dazeley
Peter Vamplew
Cameron Foale
405
33
0
03 Jul 2020
Stochastic bandits with arm-dependent delays
Anne Gael Manegueu
Claire Vernade
Alexandra Carpentier
Michal Valko
279
49
0
18 Jun 2020
An empirical investigation of the challenges of real-world reinforcement learning
Gabriel Dulac-Arnold
Nir Levine
D. Mankowitz
Jerry Li
Cosmin Paduraru
Sven Gowal
Todd Hester
OffRL
464
130
0
24 Mar 2020
Nonstochastic Multiarmed Bandits with Unrestricted Delays
Neural Information Processing Systems (NeurIPS), 2019
Tobias Sommer Thune
Nicolò Cesa-Bianchi
Yevgeny Seldin
395
59
0
03 Jun 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
431
640
0
29 Apr 2019
Linear Bandits with Stochastic Delayed Feedback
International Conference on Machine Learning (ICML), 2018
Claire Vernade
Alexandra Carpentier
Tor Lattimore
Giovanni Zappella
Beyza Ermis
M. Brueckner
409
73
0
05 Jul 2018