Linking pages
- Google’s scalable supercomputers for machine learning, Cloud TPU Pods, are now publicly available in beta | Google Cloud Blog https://cloud.google.com/blog/products/ai-machine-learning/googles-scalable-supercomputers-for-machine-learning-cloud-tpu-pods-are-now-publicly-available-in-beta 71 comments
- Distributed Inference and Fine-tuning of Large Language Models Over The Internet https://browse.arxiv.org/html/2312.08361v1 22 comments
- How AI Training Scales https://blog.openai.com/science-of-ai/ 15 comments
- Introducing Character https://blog.character.ai/introducing-character/ 2 comments
- How to Go beyond Data Parallelism and Model Parallelism: Starting from GShard | by OneFlow | Medium https://oneflow2020.medium.com/how-to-go-beyond-data-parallelism-and-model-parallelism-talking-from-gshard-a45e20c1975d 1 comment
- MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism - NVIDIA ADLR https://nv-adlr.github.io/MegatronLM 1 comment