Hacker News
- Implementation of a mixture-of-experts language model in a single file of PyTorch https://github.com/AviSoori1x/makeMoE 14 comments
Linked pages
- [2401.04088] Mixtral of Experts https://arxiv.org/abs/2401.04088 151 comments
- [1701.06538] Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538 125 comments
- GitHub - karpathy/makemore: An autoregressive character-level language model for making more things https://github.com/karpathy/makemore 1 comment
- makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch https://huggingface.co/blog/AviSoori1x/makemoe-from-scratch 1 comment
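For context on what the top link implements: a sparse mixture-of-experts block replaces the transformer's feed-forward sublayer with several expert networks plus a learned router that sends each token only to its top-k experts (the approach of the sparsely-gated MoE paper above; Mixtral uses 8 experts with k=2). Below is a minimal PyTorch sketch of that routing step; the class names, sizes, and per-expert loop are illustrative assumptions, not code from the makeMoE repo.

```python
# A minimal, illustrative sparse MoE feed-forward block in PyTorch.
# Hyperparameters, class names, and the routing loop are assumptions
# for this sketch; they are not taken from the makeMoE repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    """One expert = an ordinary transformer feed-forward network."""
    def __init__(self, n_embd: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)

class SparseMoE(nn.Module):
    """Routes each token to its top-k experts and mixes their outputs."""
    def __init__(self, n_embd: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(Expert(n_embd) for _ in range(num_experts))
        self.router = nn.Linear(n_embd, num_experts)  # learned gating network
        self.top_k = top_k

    def forward(self, x):                       # x: (B, T, n_embd)
        B, T, C = x.shape
        logits = self.router(x)                 # (B, T, num_experts)
        topk_logits, topk_idx = logits.topk(self.top_k, dim=-1)
        # Renormalize gate weights over only the k selected experts.
        weights = F.softmax(topk_logits, dim=-1)
        flat_x = x.reshape(-1, C)
        flat_idx = topk_idx.reshape(-1, self.top_k)
        flat_w = weights.reshape(-1, self.top_k)
        out = torch.zeros_like(flat_x)
        for e, expert in enumerate(self.experts):
            # Which tokens picked expert e, and in which top-k slot.
            rows, slots = (flat_idx == e).nonzero(as_tuple=True)
            if rows.numel():
                out[rows] += flat_w[rows, slots].unsqueeze(-1) * expert(flat_x[rows])
        return out.view(B, T, C)

x = torch.randn(4, 16, 64)            # (batch, seq, n_embd)
print(SparseMoE(n_embd=64)(x).shape)  # torch.Size([4, 16, 64])
```

Production implementations replace the Python per-expert loop with batched dispatch/combine kernels and add an auxiliary load-balancing loss so tokens spread evenly across experts; the loop is kept here for readability.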