Hacker News
- Mixtral 8x7B: A sparse Mixture of Experts language model https://arxiv.org/abs/2401.04088 151 comments
Linking pages
- MOIRAI: Salesforce's Foundation Transformer For Time-Series Forecasting https://aihorizonforecast.substack.com/p/moirai-salesforces-foundation-transformer 49 comments
- GitHub - AviSoori1x/makeMoE: From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) https://github.com/AviSoori1x/makeMoE 14 comments
- GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
- Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken 3 comments
- Model merging lessons in The Waifu Research Department https://www.interconnects.ai/p/model-merging 0 comments
- Research Papers in January 2024 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/research-papers-in-january-2024 0 comments
- Mixture-of-Experts (MoE): The Birth and Rise of Conditional Computation https://cameronrwolfe.substack.com/p/conditional-computation-the-birth 0 comments
- Accelerating MoE model inference with Locality-Aware Kernel Design | PyTorch https://pytorch.org/blog/accelerating-moe-model/ 0 comments
- GitHub - elicit/machine-learning-list https://github.com/elicit/machine-learning-list 0 comments
Related searches:
Search whole site: site:arxiv.org
Search title: [2401.04088] Mixtral of Experts