Linking pages
- Large AI models could soon become even larger much faster https://the-decoder.com/large-ai-models-could-soon-become-even-larger-much-faster/ 4 comments
- Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 - 152334H https://152334h.github.io/blog/knowing-enough-about-moe/ 1 comment
- Google AI Introduces a Novel MoE Routing Algorithm Called Expert Choice (EC) That can Achieve Optimal Load Balancing in an MoE System While Allowing Heterogeneity in Token-to-Expert Mapping - MarkTechPost https://www.marktechpost.com/2022/11/23/google-ai-introduces-a-novel-moe-routing-algorithm-called-expert-choice-ec-that-can-achieve-optimal-load-balancing-in-an-moe-system-while-allowing-heterogeneity-in-token-to-expert-mapping/ 0 comments (see the routing sketch after this list)
- Google Research, 2022 & beyond: Algorithms for efficient deep learning – Google AI Blog https://ai.googleblog.com/2023/02/google-research-2022-beyond-algorithms.html 0 comments
- GitHub - koayon/awesome-adaptive-computation: A curated reading list of research in Adaptive Computation (AC). https://github.com/koayon/awesome-adaptive-computation 0 comments
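The Expert Choice (EC) routing idea summarized in the MarkTechPost link above inverts ordinary token-choice routing: instead of each token picking its top-k experts, each expert picks a fixed number of tokens, so load balancing holds by construction while a token may be served by zero, one, or several experts. Below is a minimal NumPy sketch of that selection step; the function name `expert_choice_route`, the shapes, and the fixed per-expert capacity `k` are illustrative assumptions, not taken from the linked post or paper.

```python
# A minimal sketch of Expert Choice (EC) routing for a single MoE layer.
# Assumption: capacity k tokens per expert, dense router scores, no batching.
import numpy as np

def expert_choice_route(token_embeds, router_weights, k):
    """token_embeds: (n_tokens, d_model); router_weights: (d_model, n_experts)."""
    # Router scores: affinity of every token for every expert.
    logits = token_embeds @ router_weights                    # (n_tokens, n_experts)
    # Softmax over experts gives per-token routing probabilities.
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # Each expert picks its k highest-scoring tokens (top-k down each column),
    # so every expert processes exactly k tokens: perfect load balance.
    token_idx = np.argsort(-probs, axis=0)[:k].T              # (n_experts, k)
    # Gating weight each expert applies to its chosen tokens' outputs.
    gates = np.take_along_axis(probs.T, token_idx, axis=1)    # (n_experts, k)
    return token_idx, gates

# Usage: 16 tokens, 4 experts, capacity k = 4 tokens per expert.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 64))   # 16 tokens, d_model = 64
w = rng.normal(size=(64, 4))    # 4 experts
idx, gates = expert_choice_route(x, w, k=4)
print(idx.shape, gates.shape)   # (4, 4) (4, 4): every expert gets exactly 4 tokens
```

Because the top-k is taken per expert over tokens, rather than per token over experts, the number of experts assigned to any one token varies; that is the "heterogeneity in token-to-expert mapping" the link title refers to.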
Linked pages
- [1701.06538] Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538 125 comments
- [2006.16668] GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding https://arxiv.org/abs/2006.16668 35 comments
- Introducing Pathways: A next-generation AI architecture https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/ 33 comments
- [2101.03961] Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity https://arxiv.org/abs/2101.03961 4 comments
- Regularization (mathematics) - Wikipedia https://en.wikipedia.org/wiki/Regularization_(mathematics) 2 comments
- [2106.05974] Scaling Vision with Sparse Mixture of Experts https://arxiv.org/abs/2106.05974 2 comments
- [2112.06905] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts https://arxiv.org/abs/2112.06905 1 comment
- Google AI Blog: Scaling Vision with Sparse Mixture of Experts https://ai.googleblog.com/2022/01/scaling-vision-with-sparse-mixture-of.html 1 comment
- More Efficient In-Context Learning with GLaM – Google AI Blog https://ai.googleblog.com/2021/12/more-efficient-in-context-learning-with.html 0 comments
- [1704.06363] Hard Mixtures of Experts for Large Scale Weakly Supervised Vision https://arxiv.org/abs/1704.06363 0 comments