[2401.04088] Mixtral of Experts - discu.eu

Hacker News

Mixtral 8x7B: A sparse Mixture of Experts language model https://arxiv.org/abs/2401.04088 151 comments 9/1/2024

Linking pages

MOIRAI: Salesforce's Foundation Transformer For Time-Series Forecasting https://aihorizonforecast.substack.com/p/moirai-salesforces-foundation-transformer 49 comments
GitHub - AviSoori1x/makeMoE: From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) https://github.com/AviSoori1x/makeMoE 14 comments
GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken 3 comments
Model merging lessons in The Waifu Research Department https://www.interconnects.ai/p/model-merging 0 comments
Research Papers in January 2024 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/research-papers-in-january-2024 0 comments
Mixture-of-Experts (MoE): The Birth and Rise of Conditional Computation https://cameronrwolfe.substack.com/p/conditional-computation-the-birth 0 comments
Accelerating MoE model inference with Locality-Aware Kernel Design | PyTorch https://pytorch.org/blog/accelerating-moe-model/ 0 comments
GitHub - elicit/machine-learning-list https://github.com/elicit/machine-learning-list 0 comments

