Linked pages
- Goodhart's law - Wikipedia http://en.wikipedia.org/wiki/Goodhart%27s_law 221 comments
- Stanford CRFM https://crfm.stanford.edu/2023/05/22/alpaca-farm.html 41 comments
- How RLHF actually works - by Nathan Lambert - Interconnects https://www.interconnects.ai/p/how-rlhf-works 32 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- LLAMA 2: an incredible open-source LLM - by Nathan Lambert https://www.interconnects.ai/p/llama-2-from-meta 5 comments
- [2210.10760] Scaling Laws for Reward Model Overoptimization https://arxiv.org/abs/2210.10760 0 comments
- Beyond human data: RLAIF needs a rebrand https://www.interconnects.ai/p/beyond-human-data-rlaif 0 comments
- [2307.15217] Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback https://arxiv.org/abs/2307.15217 0 comments
Related searches:
Search whole site: site:www.interconnects.ai
Search title: Specifying objectives in RLHF - by Nathan Lambert