allenai/tulu-2-dpo-70b · Hugging Face - discu.eu

Linking pages

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination (https://www.interconnects.ai/p/rlhf-progress-scaling-dpo-to-70b), 0 comments
RLHF 201 - with Nathan Lambert of AI2 and Interconnects (https://www.latent.space/p/rlhf-201), 0 comments
RLHF learning resources in 2024 - by Nathan Lambert (https://www.interconnects.ai/p/rlhf-resources), 0 comments