Reinforcement Learning as a fine-tuning paradigm | Ankesh Anand - discu.eu

Hacker News

Reinforcement Learning as a fine-tuning paradigm https://ankeshanand.com/blog/2022/01/08/rl-fine-tuning.html 7 comments 11/1/2022

Linked pages

[2009.01325] Learning to summarize from human feedback https://arxiv.org/abs/2009.01325 12 comments
Just Ask for Generalization | Eric Jang https://evjang.com/2021/10/23/generalization.html 6 comments
WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing https://openai.com/blog/improving-factual-accuracy/ 5 comments
Why Tool AIs Want to Be Agent AIs · Gwern.net https://www.gwern.net/Tool-AI 2 comments
CLIPort https://cliport.github.io/ 1 comment

