Linking pages
- The Neural Net Tank Urban Legend · Gwern.net https://gwern.net/tank 48 comments
- The implicit dynamics of optimizing costs vs. rewards vs. preferences https://robotic.substack.com/p/costs-v-rewards-v-preferences 3 comments
- OpenAI API https://openai-downloads.com 2 comments
- GitHub - opendilab/awesome-RLHF: A curated list of reinforcement learning with human feedback resources (continually updated) https://github.com/opendilab/awesome-RLHF 0 comments
- GitHub - Mooler0410/LLMsPracticalGuide: A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers) https://github.com/Mooler0410/LLMsPracticalGuide 0 comments
- FAQ on Catastrophic AI Risks - Yoshua Bengio https://yoshuabengio.org/2023/06/24/faq-on-catastrophic-ai-risks/ 0 comments
- GitHub - RUCAIBox/LLMSurvey: The official GitHub page for the survey paper "A Survey of Large Language Models". https://github.com/RUCAIBox/LLMSurvey 0 comments
- Specifying objectives in RLHF - by Nathan Lambert https://www.interconnects.ai/p/specifying-objectives-in-rlhf 0 comments
- RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments
- Why reward models are key for alignment - by Nathan Lambert https://www.interconnects.ai/p/why-reward-models-matter 0 comments
Search title: [2210.10760] Scaling Laws for Reward Model Overoptimization