[2105.04663] GSPMD: General and Scalable Parallelization for ML Computation Graphs

Linking pages

the world’s largest distributed LLM training job on TPU v5e | Google Cloud Blog https://cloud.google.com/blog/products/compute/the-worlds-largest-distributed-llm-training-job-on-tpu-v5e 50 comments
It Looks Like You’re Trying To Take Over The World · Gwern.net https://www.gwern.net/fiction/Clippy 33 comments
How to Go beyond Data Parallelism and Model Parallelism: Starting from GShard | by OneFlow | Medium https://oneflow2020.medium.com/how-to-go-beyond-data-parallelism-and-model-parallelism-talking-from-gshard-a45e20c1975d 1 comment
Google wins MLPerf benchmarks with TPU v4 | Google Cloud Blog https://cloud.google.com/blog/products/ai-machine-learning/google-wins-mlperf-benchmarks-with-tpu-v4 0 comments
General and Scalable Parallelization for Neural Networks – Google AI Blog https://ai.googleblog.com/2021/12/general-and-scalable-parallelization.html 0 comments
Cloud TPU v4 Pods, large model training, MLPerf v1.1 | Google Cloud Blog https://cloud.google.com/blog/topics/tpus/google-showcases-cloud-tpu-v4-pods-for-large-model-training 0 comments
GitHub - merrymercy/awesome-tensor-compilers: A list of awesome compiler projects and papers for tensor computation and deep learning. https://github.com/merrymercy/awesome-tensor-compilers 0 comments
GitHub - apple/axlearn https://github.com/apple/axlearn 0 comments
Using Cloud TPU Multislice to scale AI workloads | Google Cloud Blog https://cloud.google.com/blog/products/compute/using-cloud-tpu-multislice-to-scale-ai-workloads 0 comments