Linking pages
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- DeepSpeed/blogs/deepspeed-chat/README.md at master · microsoft/DeepSpeed · GitHub https://github.com/microsoft/DeepSpeed/blob/master/blogs/deepspeed-chat/README.md 55 comments
- GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
- NVIDIA and Microsoft Join Forces on Massive Cloud AI Computer - News https://www.allaboutcircuits.com/news/nvidia-and-microsoft-join-forces-massive-cloud-ai-computer/ 3 comments
- Yandex Publishes YaLM 100B. It’s the Largest GPT-Like Neural Network in Open Source | by Mikhail Khrushchev | Yandex | Medium https://medium.com/yandex/yandex-publishes-yalm-100b-its-the-largest-gpt-like-neural-network-in-open-source-d1df53d0e9a6 3 comments
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
- Exploring open-source capabilities in Azure AI | Azure Blog and Updates | Microsoft Azure https://azure.microsoft.com/blog/exploring-opensource-capabilities-in-azure-ai/?WT_mc_id=academic-0000-abartolo 1 comment
- Microsoft AI Releases 'DeepSpeed Compression': A Python-based Composable Library for Extreme Compression and Zero-Cost Quantization to Make Deep Learning Model Size Smaller and Inference Speed Faster - MarkTechPost https://www.marktechpost.com/2022/07/26/microsoft-ai-releases-deepspeed-compression-a-python-based-composable-library-for-extreme-compression-and-zero-cost-quantization-to-make-deep-learning-model-size-smaller-and-inference-speed-faste/ 0 comments
- EleutherAI Open-Sources 20 Billion Parameter AI Language Model GPT-NeoX-20B https://www.infoq.com/news/2022/04/eleutherai-gpt-neox/ 0 comments
- What is DeepSpeed? - by Michael Spencer https://datasciencelearningcenter.substack.com/p/what-is-deepspeed 0 comments
- Microsoft's ZeRO-2 Speeds up AI Training 10x https://www.infoq.com/news/2020/07/microsoft-ai-speedup/ 0 comments
- Accessible Multi-Billion Parameter Model Training with PyTorch Lightning + DeepSpeed | by PyTorch Lightning team | PyTorch Lightning Developer Blog https://medium.com/pytorch-lightning/accessible-multi-billion-parameter-model-training-with-pytorch-lightning-deepspeed-c9333ac3bb59 0 comments
- Efficient and Easy Training of Large AI Models — Introducing Colossal-AI | by HPC-AI Tech | Medium https://medium.com/@hpcaitech/efficient-and-easy-training-of-large-ai-models-introducing-colossal-ai-ab571176d3ed 0 comments
- Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
- DeepSpeed/blogs/deepspeed-chat at master · microsoft/DeepSpeed · GitHub https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat 0 comments
- Training a Large Language Model With Metaflow, Featuring Dolly | Outerbounds https://outerbounds.com/blog/train-dolly-metaflow/ 0 comments
- GitHub - Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model https://github.com/Hannibal046/Awesome-LLM 0 comments
- DeepSpeed’s Bag of Tricks for Speed & Scale: Part I | Kola Ayonrinde http://www.kolaayonrinde.com/blog/2023/07/14/deepspeed-train.html 0 comments
- [AINews] Claude 3 is officially America's Next Top Model • Buttondown https://buttondown.email/ainews/archive/ainews-claude-3-is-officially-americas-next-top/ 0 comments
Linked pages
- GitHub - yandex/YaLM-100B: Pretrained language model with 100B parameters https://github.com/yandex/YaLM-100B 902 comments
- Turing-NLG: A 17-billion-parameter language model by Microsoft - Microsoft Research https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/ 139 comments
- 20B-parameter Alexa model sets new marks in few-shot learning - Amazon Science https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning 87 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model - Microsoft Research https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ 11 comments
- GitHub - THUDM/GLM-130B: GLM-130B: An Open Bilingual Pre-Trained Model https://github.com/THUDM/GLM-130B 1 comment
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
- [2101.06840] ZeRO-Offload: Democratizing Billion-Scale Model Training https://arxiv.org/abs/2101.06840 1 comment
- Ultimate Guide To Scaling ML Models - Megatron-LM | ZeRO | DeepSpeed | Mixed Precision - YouTube https://youtu.be/hc0u4avAkuM 0 comments
- The Technology Behind BLOOM Training https://huggingface.co/blog/bloom-megatron-deepspeed 0 comments
- [2206.01859] Extreme Compression for Pre-trained Transformers Made Simple and Efficient https://arxiv.org/abs/2206.01859 0 comments
- [2104.07857] ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning https://arxiv.org/abs/2104.07857 0 comments
- DeepSpeed/blogs/deepspeed-chat at master · microsoft/DeepSpeed · GitHub https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat 0 comments