Hacker News
- Reinforcement Learning as a fine-tuning paradigm https://ankeshanand.com/blog/2022/01/08/rl-fine-tuning.html 7 comments
Linked pages
- [2009.01325] Learning to summarize from human feedback https://arxiv.org/abs/2009.01325 12 comments
- Just Ask for Generalization | Eric Jang https://evjang.com/2021/10/23/generalization.html 6 comments
- WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing https://openai.com/blog/improving-factual-accuracy/ 5 comments
- CLIPort https://cliport.github.io/ 1 comment
- Why Tool AIs Want to Be Agent AIs · Gwern.net https://www.gwern.net/Tool-AI 1 comment