Hacker News
- The Illustrated DeepSeek-R1 https://newsletter.languagemodels.co/p/the-illustrated-deepseek-r1 118 comments
Linked pages
- How GPT3 Works - Visualizations and Animations – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/how-gpt3-works-visualizations-animations/ 109 comments
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/ 36 comments
- The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time. http://jalammar.github.io/illustrated-gpt2/ 8 comments
- [2401.06066] DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models https://arxiv.org/abs/2401.06066 0 comments
- Hands-On Large Language Models https://www.llm-book.com/ 0 comments
Related searches:
Search whole site: site:newsletter.languagemodels.co
Search title: The Illustrated DeepSeek-R1 - by Jay Alammar