- "What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study", Andrychowicz et al 2020 {GB} [training 250k PG agents like PPO to ablate implementation details] https://arxiv.org/abs/2006.05990 4 comments reinforcementlearning
Linking pages
- GitHub - google-research/seed_rl: SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture. https://github.com/google-research/seed_rl 20 comments
- The 32 Implementation Details of Proximal Policy Optimization (PPO) Algorithm https://costa.sh/blog-the-32-implementation-details-of-ppo.html 9 comments
- GitHub - SimonHashtag/EconRL: A collection of economics and finance papers that adopt reinforcement learning as a solution method. https://github.com/SimonHashtag/EconRL 1 comment
- Google Research: Looking Back at 2020, and Forward to 2021 – Google AI Blog https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html 0 comments
- Direct Preference Optimization Explained In-depth https://www.tylerromero.com/posts/2024-04-dpo/ 0 comments