Hacker News
- Megatron-Turing NLG 530B, the World’s Largest Generative Language Model https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ 2 comments
- Microsoft and Nvidia have created a 530B parameter language model https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ 2 comments
- [D] Where did MT-NLG go wrong with their scaling experiments, comparing its capabilities to PaLM? https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ 6 comments (r/machinelearning)
Linking pages
- GPT-3 is No Longer the Only Game in Town https://lastweekin.ai/p/gpt-3-is-no-longer-the-only-game 215 comments
- The Illustrated Retrieval Transformer – Jay Alammar – Visualizing machine learning one concept at a time. http://jalammar.github.io/illustrated-retrieval-transformer/ 55 comments
- AI’s Smarts Now Come With a Big Price Tag | WIRED https://www.wired.com/story/ai-smarts-big-price-tag/ 6 comments
- Easily Build Your Own GPT from Scratch using AWS: A Comprehensive Guide for Domain Adaptation | by Arun Shankar | Jan, 2023 | Medium https://tinyurl.com/hvrjkm5r 5 comments
- Microsoft expands its AI-supercomputer lineup with general availability of the latest 80GB NVIDIA A100 GPUs in Azure, claims 4 spots on TOP500 supercomputers list | Azure Blog and Updates | Microsoft Azure https://azure.microsoft.com/en-us/blog/microsoft-expands-its-aisupercomputer-lineup-with-general-availability-of-the-latest-80gb-nvidia-a100-gpus-in-azure-claims/?WT.mc_id=academic-0000-abartolo 4 comments
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
- Modular: Increasing development velocity of giant AI models https://www.modular.com/blog/increasing-development-velocity-of-giant-ai-models 0 comments
- Latest News - DeepSpeed https://www.deepspeed.ai/ 0 comments
- Creating a GPT-Style Language Model for a Single Question - Unite.AI https://www.unite.ai/creating-a-gpt-style-language-model-for-a-single-question/ 0 comments
- Google Trains 280 Billion Parameter AI Language Model Gopher https://www.infoq.com/news/2022/01/deepmind-gopher/ 0 comments
- The Perils of Using Quotations to Authenticate NLG Content - Unite.AI https://www.unite.ai/the-perils-of-using-quotations-to-authenticate-nlg-content/ 0 comments
- Microsoft's Massive New Language AI Is Triple the Size of OpenAI’s GPT-3 https://singularityhub.com/2021/10/13/microsofts-massive-new-language-ai-is-triple-the-size-of-openais-gpt-3/ 0 comments
- GitHub - tomohideshibata/BERT-related-papers: BERT-related papers https://github.com/tomohideshibata/BERT-related-papers 0 comments