- OpenAI: Proximal Policy Optimization variant on TRPO for continuous actions (ALE, Roboschool) https://blog.openai.com/openai-baselines-ppo/ 5 comments reinforcementlearning
Linking pages
- Learning Dexterity https://blog.openai.com/learning-dexterity/ 160 comments
- Learning Montezuma's Revenge from a Single Demonstration https://blog.openai.com/learning-montezumas-revenge-from-a-single-demonstration/ 48 comments
- Reinforcement Learning with Prediction-Based Rewards https://blog.openai.com/reinforcement-learning-with-prediction-based-rewards/ 38 comments
- GitHub - higgsfield/RL-Adventure-2: PyTorch0.4 implementation of: actor critic / proximal policy optimization / acer / ddpg / twin dueling ddpg / soft actor critic / generative adversarial imitation learning / hindsight experience replay https://github.com/higgsfield/RL-Adventure-2 20 comments
- baselines/baselines/ppo2 at master · openai/baselines · GitHub https://github.com/openai/baselines/tree/master/baselines/ppo2 12 comments
- Ingredients for Robotics Research https://openai.com/blog/ingredients-for-robotics-research/ 8 comments
- GitHub - ikostrikov/pytorch-a2c-ppo-acktr-gail: PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL). https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail 7 comments
- Our NIPS 2017: Learning to Run approach | by Adam Stelmaszczyk | ML Review https://medium.com/@stelmaszczykadam/our-nips-2017-learning-to-run-approach-b80a295d3bb5 4 comments
- Evolved Policy Gradients https://blog.openai.com/evolved-policy-gradients/ 4 comments
- Reinforcement Learning with Prediction-Based Rewards https://openai.com/blog/reinforcement-learning-with-prediction-based-rewards/ 3 comments
- Quantifying Generalization in Reinforcement Learning https://blog.openai.com/quantifying-generalization-in-reinforcement-learning/ 0 comments
- Berkeley Deep RL Bootcamp http://planspace.org/20170830-berkeley_deep_rl_bootcamp/ 0 comments
- The Last 5 Years In Deep Learning – Adit Deshpande – Engineering at Forward | UCLA CS '19 https://adeshpande3.github.io/The-Last-5-Years-in-Deep-Learning 0 comments
- Ingredients for Robotics Research https://blog.openai.com/ingredients-for-robotics-research/ 0 comments
- Learning Dexterity https://openai.com/blog/learning-dexterity/ 0 comments
- GitHub - lefnire/tforce_btc_trader: TensorForce Bitcoin Trading Bot https://github.com/lefnire/tforce_btc_trader 0 comments
- Reinforcement-Learning/Week5 at master · andri27-ts/Reinforcement-Learning · GitHub https://github.com/andri27-ts/60_Days_RL_Challenge/tree/master/Week5 0 comments
- Better Exploration with Parameter Noise https://openai.com/blog/better-exploration-with-parameter-noise/ 0 comments
- Better Exploration with Parameter Noise https://blog.openai.com/better-exploration-with-parameter-noise/ 0 comments
- GitHub - ikostrikov/pytorch-a2c-ppo-acktr-gail: PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL). https://github.com/ikostrikov/pytorch-a2c-ppo-acktr 0 comments
Linked pages
- ChatGPT https://chat.openai.com/ 752 comments
- Kullback–Leibler divergence - Wikipedia http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence 74 comments
- Human-level control through deep reinforcement learning | Nature http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html 59 comments
- Roboschool https://blog.openai.com/roboschool/ 26 comments
- Deep Reinforcement Learning: Pong from Pixels https://karpathy.github.io/2016/05/31/rl/ 16 comments
- Mastering the game of Go with deep neural networks and tree search | Nature http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html 6 comments
- [1611.01224] Sample Efficient Actor-Critic with Experience Replay https://arxiv.org/abs/1611.01224 4 comments
- [1707.06347] Proximal Policy Optimization Algorithms https://arxiv.org/abs/1707.06347 3 comments
- GitHub - openai/baselines: OpenAI Baselines: high-quality implementations of reinforcement learning algorithms https://github.com/openai/baselines 3 comments
- [1506.02438] High-Dimensional Continuous Control Using Generalized Advantage Estimation https://arxiv.org/abs/1506.02438 3 comments