[1811.06965] GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Linking pages

How to Train Really Large Models on Many GPUs? | Lil'Log https://lilianweng.github.io/posts/2021-09-25-train-large/ 33 comments
Techniques for Training Large Neural Networks https://openai.com/blog/techniques-for-training-large-neural-networks/ 23 comments
GitHub - ntedgi/node-efficientnet: tensorflowJS implementation of EfficientNet 🚀 https://github.com/ntedgi/node-efficientnet 21 comments
Introducing GPipe, an Open Source Library for Efficiently Training Large-scale Neural Network Models – Google AI Blog http://ai.googleblog.com/2019/03/introducing-gpipe-open-source-library.html 12 comments
GitHub - shijianjian/EfficientNet-PyTorch-3D: A PyTorch implementation of EfficientNet https://github.com/shijianjian/EfficientNet-PyTorch-3D 4 comments
Ride the Hardware Lottery! - by Delip Rao https://pagestlabs.substack.com/p/ride-the-hardware-lottery 4 comments
OneFlow Made Training GPT-3 Easier（Part 1） | by OneFlow | Medium https://oneflow2020.medium.com/oneflow-made-training-gpt-3-easier-part-1-5b6b65d70d3c 1 comment
MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism - NVIDIA ADLR https://nv-adlr.github.io/MegatronLM 1 comment
GitHub - RedditSota/state-of-the-art-result-for-machine-learning-problems: This repository provides state of the art (SoTA) results for all machine learning problems. https://github.com/RedditSota/state-of-the-art-result-for-machine-learning-problems 0 comments
Google at NeurIPS 2019 – Google AI Blog https://ai.googleblog.com/2019/12/google-at-neurips-2019.html 0 comments
Exploring Massively Multilingual, Massive Neural Machine Translation – Google AI Blog https://ai.googleblog.com/2019/10/exploring-massively-multilingual.html 0 comments
Machine Learning Systems https://thegradient.pub/systems-for-machine-learning/ 0 comments
EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling – Google AI Blog https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html 0 comments
TensorFlow DTensor: Unified API for Distributed Deep Network Training https://www.infoq.com/news/2022/05/tensorflow-dtensor/ 0 comments
GitHub - qubvel/efficientnet: Implementation of EfficientNet model. Keras and TensorFlow Keras. https://github.com/qubvel/efficientnet 0 comments
Google Open-Sources GPipe Library for Training Large-Scale Neural Network Models | by Synced | SyncedReview | Medium https://medium.com/syncedreview/google-open-sources-gpipe-library-for-training-large-scale-neural-network-models-8b8ef324382c 0 comments
Five years of progress in GPTs - by Finbarr Timbers https://finbarrtimbers.substack.com/p/five-years-of-progress-in-gpts 0 comments
GitHub - RUCAIBox/LLMSurvey: The official GitHub page for the survey paper "A Survey of Large Language Models". https://github.com/RUCAIBox/LLMSurvey 0 comments
A Brief Overview of Parallelism Strategies in Deep Learning | Alex McKinney https://afmck.in/posts/2023-02-26-parallelism/ 0 comments
Pipeline-Parallelism: Distributed Training via Model Partitioning https://siboehm.com/articles/22/pipeline-parallel-training 0 comments