Hacker News
- Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022) https://huggingface.co/google/switch-c-2048 36 comments
Linking pages
- Large AI models could soon become even larger much faster https://the-decoder.com/large-ai-models-could-soon-become-even-larger-much-faster/ 4 comments
- Fast Inference of Mixture-of-Experts Language Models with Offloading https://browse.arxiv.org/html/2312.17238v1 0 comments
- Understanding Mixture Of Experts (MoE): A Deep Dive Into Modern LLM Architecture https://skillupexchange.com/understanding-mixture-of-experts-moe-a-deep-dive-into-modern-llm-architecture/ 0 comments