- RUDDER -- Reinforcement Learning algorithm that is "exponentially faster than TD, MC, and MC Tree Search (MCTS)" https://arxiv.org/abs/1806.07857 5 comments reinforcementlearning
Paper title: RUDDER: Return Decomposition for Delayed Rewards (arXiv:1806.07857)