Hacker News
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts (2017) https://arxiv.org/abs/1701.06538 10 comments
- Outrageously Large Neural Nets: Sparsely-Gated Mixture-of-Experts Layer (2017) https://arxiv.org/abs/1701.06538 33 comments
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-Of-Experts Layer https://arxiv.org/abs/1701.06538 81 comments
Linking pages
- Introducing Gemini 1.5, Google's next-generation AI model https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/ 715 comments
- GitHub - EleutherAI/gpt-neo: An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. https://github.com/EleutherAI/gpt-neo/ 127 comments
- Google Gemini Eats The World – Gemini Smashes GPT-4 By 5X, The GPU-Poors https://www.semianalysis.com/p/google-gemini-eats-the-world-gemini 113 comments
- Google Research: Themes from 2021 and Beyond – Google AI Blog https://ai.googleblog.com/2022/01/google-research-themes-from-2021-and.html 52 comments
- How to Train Really Large Models on Many GPUs? | Lil'Log https://lilianweng.github.io/posts/2021-09-25-train-large/ 33 comments
- 10 Noteworthy AI Research Papers of 2023 https://magazine.sebastianraschka.com/p/10-ai-research-papers-2023 24 comments
- Techniques for Training Large Neural Networks https://openai.com/blog/techniques-for-training-large-neural-networks/ 23 comments
- GitHub - arpita8/Awesome-Mixture-of-Experts-Papers: Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. https://github.com/arpita8/Awesome-Mixture-of-Experts-Papers 17 comments
- Google Brain’s new super fast and highly accurate AI: the Mixture of Experts Layer. | by Théo Szymkowiak | Medium https://medium.com/@thoszymkowiak/google-brains-new-super-fast-and-highly-accurate-ai-the-mixture-of-experts-layer-dd3972c25663 15 comments
- GitHub - AviSoori1x/makeMoE: From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) https://github.com/AviSoori1x/makeMoE 14 comments
- Uber's Billion Dollar Problem: Predicting ETAs Reliably And Quickly https://open.substack.com/pub/codecompass00/p/uber-billion-dollar-problem-predicting-eta?r=rcorn 12 comments
- GitHub - JUSTSUJAY/ML-Research-Papers https://github.com/JUSTSUJAY/ML-Research-Papers 10 comments
- The Google Brain Team — Looking Back on 2017 (Part 1 of 2) – Google AI Blog https://research.googleblog.com/2018/01/the-google-brain-team-looking-back-on.html 6 comments
- GitHub - pjlab-sys4nlp/llama-moe: ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training https://github.com/pjlab-sys4nlp/llama-moe 6 comments
- A Visual Guide to Mixture of Experts (MoE) https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts 6 comments
- Mixtures of Experts - Javid Lakha https://blog.javid.io/p/mixtures-of-experts 2 comments
- Alpa: Automated Model-Parallel Deep Learning – Google AI Blog https://ai.googleblog.com/2022/05/alpa-automated-model-parallel-deep.html 1 comment
- Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 - 152334H https://152334h.github.io/blog/knowing-enough-about-moe/ 1 comment
- GitHub - amrzv/awesome-colab-notebooks: Collection of google colaboratory notebooks for fast and easy experiments https://github.com/amrzv/awesome-colab-notebooks 0 comments
- Core Modeling at Instagram. At Instagram we have many Machine… | by Thomas Bredillet | Instagram Engineering https://instagram-engineering.com/core-modeling-at-instagram-a51e0158aa48 0 comments