ResearchTrend.AI
Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View

20 January 2024
Raj Ghugare
Matthieu Geist
Glen Berseth
Benjamin Eysenbach
    OffRL

Papers citing "Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View"

9 / 9 papers shown
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
28
0
0
06 May 2025
Generative Trajectory Stitching through Diffusion Composition
Yunhao Luo
Utkarsh Aashu Mishra
Yilun Du
Danfei Xu
59
0
0
07 Mar 2025
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Vivek Myers
Bill Chunyuan Zheng
Anca Dragan
Kuan Fang
Sergey Levine
56
0
0
08 Feb 2025
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
38
8
0
26 Oct 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers
Chongyi Zheng
Anca Dragan
Sergey Levine
Benjamin Eysenbach
OffRL
33
7
0
24 Jun 2024
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM
LRM
ReLM
116
270
0
03 Oct 2022
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
Taku Yamagata
Ahmed Khalil
Raúl Santos-Rodríguez
OffRL
142
70
0
08 Sep 2022
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
203
627
0
12 Oct 2021
The Distracting Control Suite -- A Challenging Benchmark for Reinforcement Learning from Pixels
Austin Stone
Oscar Ramirez
K. Konolige
Rico Jonschkowski
127
101
0
07 Jan 2021