Illustrating Reinforcement Learning from Human Feedback (RLHF) - discu.eu

Reddit

[R] Illustrating Reinforcement Learning from Human Feedback (RLHF) https://huggingface.co/blog/rlhf 12 comments 9/12/2022 machinelearning

Linking pages

We come to bury ChatGPT, not to praise it. https://www.danmcquillan.org/chatgpt.html 1328 comments
GitHub - antimatter15/alpaca.cpp: Locally run an Instruction-Tuned Chat-Style LLM https://github.com/antimatter15/alpaca.cpp 287 comments
What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
GitHub - mikeroyal/Self-Hosting-Guide: Self-Hosting Guide. Learn all about locally hosting (on premises & private web servers) and managing software applications by yourself or your organization. Including Cloud, LLMs, WireGuard, Automation, Home Assistant, and Networking. https://github.com/mikeroyal/Self-Hosting-Guide 108 comments
Normcore LLM Reads · GitHub https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e 54 comments
2022 Top Papers in AI — A Year of Generative Models | Medium https://chuanenlin.medium.com/2022-top-ai-papers-a-year-of-generative-models-a7dcd9109e39 26 comments
How to run your own LLM (GPT) https://blog.rfox.eu/en/Programming/How_to_run_your_own_LLM_GPT.html 26 comments
GitHub - lucidrains/PaLM-rlhf-pytorch: Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM https://github.com/lucidrains/PaLM-rlhf-pytorch 15 comments
AI-powered Bing Chat gains three distinct personalities | Ars Technica https://arstechnica.com/information-technology/2023/03/microsoft-equips-bing-chat-with-multiple-personalities-creative-balanced-precise/ 11 comments
GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
Breaking: Google Invests in AnthropicAI and Claude with $300 Million Round for 10 Percent of the A.I. Lab valued at $5 Billion https://aisupremacy.substack.com/p/breaking-google-invests-in-anthropicai 7 comments
The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks - PMC https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10858891/ 6 comments
Artificial neurons considered harmful | Better without AI https://betterwithout.ai/artificial-neurons-considered-harmful 5 comments
Old Advocacy, New Algorithms: How 16th century "Devil's Advocates" Shaped AI Red Teaming https://royapakzad.substack.com/p/old-advocacy-new-algorithms 5 comments
Unpacking the HF in RLHF - by Justin Cranshaw https://maestroai.substack.com/p/unpacking-the-hf-in-rlhf 3 comments
Why are we using LLMs as calculators? • Buttondown https://newsletter.vickiboykis.com/archive/why-are-we-using-llms-as-calculators/ 3 comments
Yet Another ChatGPT Jailbreak https://www.ishan.coffee/notes/ChatGPTJailbreak/ 2 comments
Artificial Disinformation: Can Chatbots Destroy Trust on the Internet? https://nabilalouani.substack.com/p/artificial-disinformation-can-chatbots 2 comments
GitHub - rupeshs/alpaca.cpp at linux-android-build-support https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support 2 comments
Fine Tuning LLMs - learnings from the DeepLearning SF Meetup https://www.anti-vc.com/p/fine-tuning-llms-learnings-from-the 2 comments
