Hacker News
- Implementation of a mixture-of-experts language model in a single file of PyTorch https://github.com/AviSoori1x/makeMoE 14 comments
Linked pages
- [2401.04088] Mixtral of Experts https://arxiv.org/abs/2401.04088 151 comments
- [1701.06538] Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538 125 comments
- GitHub - karpathy/makemore: An autoregressive character-level language model for making more things https://github.com/karpathy/makemore 1 comment
- makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch https://huggingface.co/blog/AviSoori1x/makemoe-from-scratch 1 comment
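For context on what the top link implements: a sparse mixture-of-experts block replaces the transformer's feed-forward sublayer with several expert networks plus a learned router that sends each token only to its top-k experts (the approach of the sparsely-gated MoE paper above; Mixtral uses 8 experts with k=2). Below is a minimal PyTorch sketch of that routing step; the class names, sizes, and per-expert loop are illustrative assumptions, not code from the makeMoE repo.

```python
# A minimal, illustrative sparse MoE feed-forward block in PyTorch.
# Hyperparameters, class names, and the routing loop are assumptions
# for this sketch; they are not taken from the makeMoE repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    """One expert = an ordinary transformer feed-forward network."""
    def __init__(self, n_embd: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)

class SparseMoE(nn.Module):
    """Routes each token to its top-k experts and mixes their outputs."""
    def __init__(self, n_embd: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(Expert(n_embd) for _ in range(num_experts))
        self.router = nn.Linear(n_embd, num_experts)  # learned gating network
        self.top_k = top_k

    def forward(self, x):                       # x: (B, T, n_embd)
        B, T, C = x.shape
        logits = self.router(x)                 # (B, T, num_experts)
        topk_logits, topk_idx = logits.topk(self.top_k, dim=-1)
        # Renormalize gate weights over only the k selected experts.
        weights = F.softmax(topk_logits, dim=-1)
        flat_x = x.reshape(-1, C)
        flat_idx = topk_idx.reshape(-1, self.top_k)
        flat_w = weights.reshape(-1, self.top_k)
        out = torch.zeros_like(flat_x)
        for e, expert in enumerate(self.experts):
            # Which tokens picked expert e, and in which top-k slot.
            rows, slots = (flat_idx == e).nonzero(as_tuple=True)
            if rows.numel():
                out[rows] += flat_w[rows, slots].unsqueeze(-1) * expert(flat_x[rows])
        return out.view(B, T, C)

x = torch.randn(4, 16, 64)            # (batch, seq, n_embd)
print(SparseMoE(n_embd=64)(x).shape)  # torch.Size([4, 16, 64])
```

Production implementations replace the Python per-expert loop with batched dispatch/combine kernels and add an auxiliary load-balancing loss so tokens spread evenly across experts; the loop is kept here for readability.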