Linking pages
- RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination https://www.interconnects.ai/p/rlhf-progress-scaling-dpo-to-70b
- RLHF 201 - with Nathan Lambert of AI2 and Interconnects https://www.latent.space/p/rlhf-201
- RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources