Linking pages
- Why StackOverflow usage is down 50% — with David Hsu of Retool https://www.latent.space/p/retool 1 comment
- Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between https://www.interconnects.ai/p/rlhf-interview-1-louis 1 comment
- Emulating Humans with NSFW Chatbots - with Jesse Silver https://www.latent.space/p/nsfw-chatbots 1 comment
- Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI https://www.latent.space/p/llama-3 1 comment
- RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments
- How to train your own Large Multimodal Model — with Hugo Laurençon & Leo Tronchon of HuggingFace M4 Research https://www.latent.space/p/idefics 0 comments
- Cloud Intelligence at the speed of 5000 tok/s - with Ce Zhang and Vipul Ved Prakash of Together AI https://www.latent.space/p/together 0 comments
- Worthwhile Research for building SOTA LLMs (Jan 2024 Recap) https://www.latent.space/p/jan-2024 0 comments
- One standard to deploy them all - with Ben Firshman of Replicate https://www.latent.space/p/replicate 0 comments
- Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI https://www.latent.space/p/soumith 0 comments
Linked pages
- Mixtral of experts | Mistral AI | Open source models https://mistral.ai/news/mixtral-of-experts/ 300 comments
- Weak-to-strong generalization https://openai.com/research/weak-to-strong-generalization 201 comments
- Kullback–Leibler divergence - Wikipedia http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence 74 comments
- The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI https://www.latent.space/p/transformers-math#details 66 comments
- ByteDance is secretly using OpenAI’s tech to build a competitor - The Verge https://www.theverge.com/2023/12/15/24003151/bytedance-china-openai-microsoft-competitor-llm 58 comments
- GitHub - LAION-AI/Open-Assistant: OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. https://github.com/LAION-AI/Open-Assistant 36 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- The Accidental AI Canvas - with Steve Ruiz of tldraw https://www.latent.space/p/tldraw 2 comments
- [2212.10560] Self-Instruct: Aligning Language Model with Self Generated Instructions https://arxiv.org/abs/2212.10560 1 comment
- AI Fundamentals: Benchmarks 101 https://www.latent.space/p/benchmarks-101 1 comment
- AI Fundamentals: Datasets 101 - Latent Space https://www.latent.space/p/datasets-101 1 comment
- Aligning language models to follow instructions https://openai.com/research/instruction-following 1 comment
- Jim Fan on X: "Can GPT-4 teach a robot hand to do pen spinning tricks better than you do? I'm excited to announce Eureka, an open-ended agent that designs reward functions for robot dexterity at super-human level. It’s like Voyager in the space of a physics simulator API! Eureka bridges the… https://t.co/Nubq8vSZr1" / X https://twitter.com/DrJimFan/status/1715397393842401440 1 comment
- [1706.03741] Deep reinforcement learning from human preferences https://arxiv.org/abs/1706.03741 0 comments
- Von Neumann–Morgenstern utility theorem - Wikipedia https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem 0 comments
- Latent Space | swyx | Substack https://www.latent.space/ 0 comments
- Anthropic \ Constitutional AI: Harmlessness from AI Feedback https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback 0 comments
- Intel/neural-chat-7b-v3-1 · Hugging Face https://huggingface.co/Intel/neural-chat-7b-v3-1 0 comments
- allenai/tulu-2-dpo-70b · Hugging Face https://huggingface.co/allenai/tulu-2-dpo-70b 0 comments
- The Busy Person's Intro to Finetuning & Open Source AI - Wing Lian, Axolotl https://www.latent.space/p/axolotl 0 comments
Related searches:
Search title: RLHF 201 - with Nathan Lambert of AI2 and Interconnects