Hacker News
Linking pages
- Specifying objectives in RLHF - by Nathan Lambert https://www.interconnects.ai/p/specifying-objectives-in-rlhf 0 comments
- RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments
- Why reward models are key for alignment - by Nathan Lambert https://www.interconnects.ai/p/why-reward-models-matter 0 comments
Linked pages
- Inside the AI Factory: the humans that make tech seem human - The Verge https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots 17 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- [2112.09332] WebGPT: Browser-assisted question-answering with human feedback https://arxiv.org/abs/2112.09332 3 comments
- John Schulman - Reinforcement Learning from Human Feedback: Progress and Challenges - YouTube https://www.youtube.com/watch?v=hhiLw5Q_UFg 3 comments
- [2303.08774] GPT-4 Technical Report https://arxiv.org/abs/2303.08774 1 comment
- [2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback https://arxiv.org/abs/2204.05862 1 comment
- [2203.02155] Training language models to follow instructions with human feedback https://arxiv.org/abs/2203.02155 0 comments
- [2112.00861] A General Language Assistant as a Laboratory for Alignment https://arxiv.org/abs/2112.00861 0 comments
- [2302.07459] The Capacity for Moral Self-Correction in Large Language Models https://arxiv.org/abs/2302.07459 0 comments
- Specifying hallucinations - by Nathan Lambert https://www.interconnects.ai/p/specifying-hallucinations-llms 0 comments