Reinforcement Learning from Human Feedback: When the Math Ain't Enough

Linking pages

Chasing the Numbers: The Puzzle of AI Benchmarks https://evalovernite.substack.com/p/ai-benchmarks-puzzle 0 comments

Linked pages

No Vehicles In The Park https://novehiclesinthepark.com/ 1186 comments
Open Assistant https://open-assistant.io/ 311 comments
Stanford CRFM https://crfm.stanford.edu/2023/03/13/alpaca.html 298 comments
Common Crawl - Open Repository of Web Crawl Data https://commoncrawl.org/ 86 comments
[2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM - The Databricks Blog https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm 54 comments
[2305.11206] LIMA: Less Is More for Alignment https://arxiv.org/abs/2305.11206 44 comments
ShareGPT: Share your wildest ChatGPT conversations with one click. https://sharegpt.com/ 41 comments
Prodigy · An annotation tool for AI, Machine Learning & NLP https://prodi.gy 13 comments
Scale AI: The Data Platform for AI https://scale.com/ 9 comments
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality | LMSYS Org https://lmsys.org/blog/2023-03-30-vicuna/ 7 comments
Open Source Data Labeling | Label Studio https://labelstud.io/ 6 comments
Reinforcement Learning without Reward Engineering | by Nikita Pavlichenko | Toloka Tech | Medium https://medium.com/p/60c63402c59f 1 comment
[2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback https://arxiv.org/abs/2204.05862 1 comment
[2203.02155] Training language models to follow instructions with human feedback https://arxiv.org/abs/2203.02155 0 comments
[2304.12244] WizardLM: Empowering Large Language Models to Follow Complex Instructions https://arxiv.org/abs/2304.12244 0 comments
[2306.01116] The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only https://arxiv.org/abs/2306.01116 0 comments
Llama 2: Open Foundation and Fine-Tuned Chat Models | Meta AI Research https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/ 0 comments
Large language model - Wikipedia https://en.wikipedia.org/wiki/Large_language_model 0 comments
Amazon Mechanical Turk https://www.mturk.com/ 0 comments