Specifying objectives in RLHF - by Nathan Lambert - discu.eu

Linking pages

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination https://www.interconnects.ai/p/rlhf-progress-scaling-dpo-to-70b 0 comments
RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments

Linked pages

Goodhart's law - Wikipedia http://en.wikipedia.org/wiki/Goodhart%27s_law 221 comments
Stanford CRFM https://crfm.stanford.edu/2023/05/22/alpaca-farm.html 41 comments
How RLHF actually works - by Nathan Lambert - Interconnects https://www.interconnects.ai/p/how-rlhf-works 32 comments
[2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
LLAMA 2: an incredible open-source LLM - by Nathan Lambert https://www.interconnects.ai/p/llama-2-from-meta 5 comments
[2210.10760] Scaling Laws for Reward Model Overoptimization https://arxiv.org/abs/2210.10760 0 comments
Beyond human data: RLAIF needs a rebrand https://www.interconnects.ai/p/beyond-human-data-rlaif 0 comments
[2307.15217] Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback https://arxiv.org/abs/2307.15217 0 comments
