[2006.05990] What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study - discu.eu

Reddit

"What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study", Andrychowicz et al 2020 {GB} [training 250k PG agents like PPO to ablate implementation details] https://arxiv.org/abs/2006.05990 4 comments, 11/6/2020, r/reinforcementlearning

Linking pages

GitHub - google-research/seed_rl: SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture. https://github.com/google-research/seed_rl 20 comments
The 32 Implementation Details of Proximal Policy Optimization (PPO) Algorithm https://costa.sh/blog-the-32-implementation-details-of-ppo.html 9 comments
GitHub - SimonHashtag/EconRL: A collection of economics and finance papers that adopt reinforcement learning as a solution method. https://github.com/SimonHashtag/EconRL 1 comment
Google Research: Looking Back at 2020, and Forward to 2021 – Google AI Blog https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html 0 comments
Direct Preference Optimization Explained In-depth https://www.tylerromero.com/posts/2024-04-dpo/ 0 comments
