Alpa: Automated Model-Parallel Deep Learning – Google AI Blog

Linking pages

GitHub - alpa-projects/alpa: Training and serving large-scale neural networks https://github.com/alpa-projects/alpa 1 comment

Linked pages

[2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
[1701.06538] Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538 125 comments
GitHub - google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more https://github.com/google/jax 99 comments
[2006.16668] GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding https://arxiv.org/abs/2006.16668 35 comments
Introducing GPipe, an Open Source Library for Efficiently Training Large-scale Neural Network Models – Google AI Blog http://ai.googleblog.com/2019/03/introducing-gpipe-open-source-library.html 12 comments
Transformer: A Novel Neural Network Architecture for Language Understanding – Google AI Blog https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html 3 comments
GitHub - alpa-projects/alpa: Training and serving large-scale neural networks https://github.com/alpa-projects/alpa 1 comment
Dynamic programming - Wikipedia https://en.wikipedia.org/wiki/Dynamic_programming#History 0 comments
General and Scalable Parallelization for Neural Networks – Google AI Blog https://ai.googleblog.com/2021/12/general-and-scalable-parallelization.html 0 comments
Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing – Google AI Blog https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html 0 comments
Data parallelism - Wikipedia https://en.wikipedia.org/wiki/Data_parallelism 0 comments
One-hot - Wikipedia https://en.wikipedia.org/wiki/One-hot 0 comments
Amazon EC2 P3 – Ideal for Machine Learning and HPC - AWS https://aws.amazon.com/ec2/instance-types/p3/ 0 comments