Hacker News
- Learning ‘Montezuma’s Revenge’ from a single demonstration https://blog.openai.com/learning-montezumas-revenge-from-a-single-demonstration/ 45 comments
- "Learning Montezuma's Revenge from a Single Demonstration", Salimans & Chen {OA} [PPO with backward chaining from reward state for curriculum learning] https://blog.openai.com/learning-montezumas-revenge-from-a-single-demonstration/ 3 comments reinforcementlearning
Linking pages
- Reinforcement learning’s foundational flaw https://thegradient.pub/why-rl-is-flawed/ 55 comments
- Reinforcement Learning with Prediction-Based Rewards https://blog.openai.com/reinforcement-learning-with-prediction-based-rewards/ 38 comments
- What is artificial intelligence? Your AI questions, answered. - Vox https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment 8 comments
- Reinforcement Learning with Prediction-Based Rewards https://openai.com/blog/reinforcement-learning-with-prediction-based-rewards/ 3 comments
- On “solving” Montezuma’s Revenge. Looking beyond the hype of recent Deep… | by Arthur Juliani | Medium https://medium.com/@awjuliani/on-solving-montezumas-revenge-2146d83f0bc3 0 comments
Linked pages
- ChatGPT https://chat.openai.com/ 752 comments
- OpenAI Five https://blog.openai.com/openai-five/ 272 comments
- Proximal Policy Optimization https://blog.openai.com/openai-baselines-ppo/ 5 comments
- [1804.02717] DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills https://arxiv.org/abs/1804.02717 0 comments