[2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Linking pages

What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
Large language models propagate race-based medicine | npj Digital Medicine https://www.nature.com/articles/s41746-023-00939-z 160 comments
How Do AIs' Political Opinions Change As They Get Smarter And Better-Trained? https://astralcodexten.substack.com/p/how-do-ais-political-opinions-change 102 comments
How RLHF actually works - by Nathan Lambert - Interconnects https://www.interconnects.ai/p/how-rlhf-works 32 comments
LLM Training: RLHF and Its Alternatives https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives 14 comments
AI-Written Critiques Help Humans Notice Flaws https://openai.com/blog/critiques/ 6 comments
Unpacking the HF in RLHF - by Justin Cranshaw https://maestroai.substack.com/p/unpacking-the-hf-in-rlhf 3 comments
GitHub - yaodongC/awesome-instruction-dataset: A collection of open-source dataset to train instruction-following LLMs (ChatGPT,LLaMA,Alpaca) https://github.com/yaodongC/awesome-instruction-dataset 2 comments
GitHub - inverse-scaling/prize: A prize for finding tasks that cause large language models to show inverse scaling https://github.com/inverse-scaling/prize 1 comment
RLHF: Reinforcement Learning from Human Feedback https://huyenchip.com/2023/05/02/rlhf.html 1 comment
Reward Modeling for Large language models (with code) https://explodinggradients.com/reward-modeling-for-large-language-models-with-code 1 comment
GitHub - tomohideshibata/BERT-related-papers: BERT-related papers https://github.com/tomohideshibata/BERT-related-papers 0 comments
RLHF, online ML systems, and RL going mainstream https://robotic.substack.com/p/rlhf-2022 0 comments
Data is a Public Good | Zack Witten https://zswitten.github.io/2022/04/24/data-public-good.html 0 comments
GitHub - opendilab/awesome-RLHF: A curated list of reinforcement learning with human feedback resources (continually updated) https://github.com/opendilab/awesome-RLHF 0 comments
Putting the human touch on LLMs - Molly Welch's Newsletter https://mewelch.substack.com/p/putting-the-human-touch-on-llms 0 comments
GitHub - Mooler0410/LLMsPracticalGuide: A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers) https://github.com/Mooler0410/LLMsPracticalGuide 0 comments
Factorize your language models · Vadim Liventsev https://vadim.me/publications/factorize/ 0 comments
Unfortunately, OpenAI and Google have moats https://www.interconnects.ai/p/openai-google-llm-moats 0 comments
GitHub - RUCAIBox/LLMSurvey: The official GitHub page for the survey paper "A Survey of Large Language Models". https://github.com/RUCAIBox/LLMSurvey 0 comments