OneFlow Made Training GPT-3 Easier（Part 1） | by OneFlow | Medium - discu.eu

Linking pages

Correct Level of Abstraction for Distributed Deep Learning Frameworks（Part 2） | by OneFlow | Medium https://oneflow2020.medium.com/correct-level-of-abstraction-for-distributed-deep-learning-frameworks-part-2-9ce73898bb1e 1 comment

Linked pages

https://mobile.twitter.com/home 1880 comments
[2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
[1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
GitHub - Oneflow-Inc/oneflow: OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient. https://github.com/Oneflow-Inc/oneflow 3 comments
GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
[2101.06840] ZeRO-Offload: Democratizing Billion-Scale Model Training https://arxiv.org/abs/2101.06840 1 comment
[2104.04473] Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM https://arxiv.org/abs/2104.04473 1 comment
[1811.06965] GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism https://arxiv.org/abs/1811.06965 0 comments
[1604.06174] Training Deep Nets with Sublinear Memory Cost https://arxiv.org/abs/1604.06174 0 comments

