Linking pages
- Learning from Human Preferences https://blog.openai.com/deep-reinforcement-learning-from-human-preferences/ 7 comments
- DeepMind now learns from human preferences – just like a toddler | New Scientist https://www.newscientist.com/article/2134740-deepmind-now-learns-from-human-preferences-just-like-a-toddler/ 0 comments
- Scalable agent alignment via reward modeling | by DeepMind Safety Research | Medium https://medium.com/@deepmindsafetyresearch/scalable-agent-alignment-via-reward-modeling-bf4ab06dfd84 0 comments
- GitHub - opendilab/awesome-RLHF: A curated list of reinforcement learning with human feedback resources (continually updated) https://github.com/opendilab/awesome-RLHF 0 comments
- GitHub - CambioML/pykoi: Active learning in one unified interface https://github.com/CambioML/pykoi 0 comments