Hacker News
- Fully Sharded Data Parallel: Faster AI Training with Fewer GPUs https://engineering.fb.com/2021/07/15/open-source/fsdp/ 2 comments
Linking pages
- GitHub - openlm-research/open_llama: OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset https://github.com/openlm-research/open_llama 183 comments
- GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen 51 comments
- Transformer Math 101 | EleutherAI Blog https://blog.eleuther.ai/transformer-math/ 13 comments
- GitHub - Alpha-VLLM/LLaMA2-Accessory: An Open-source Toolkit for LLM Development https://github.com/Alpha-VLLM/LLaMA2-Accessory 3 comments
- Visualizing 6D Mesh Parallelism · main https://main-horse.github.io/posts/visualizing-6d/ 3 comments
- Aman's AI Journal • Primers • Overview of Large Language Models https://aman.ai/primers/ai/LLM/ 1 comment
- GitHub - upgundecha/applied-ai: A repository of curated use cases, articles, blogs, videos on how companies are using Artificial Intelligence and Machine Learning. https://github.com/upgundecha/applied-ai 1 comment
- Announcing Lightning 1.4. Lightning 1.4 Release adds TPU pods… | by PyTorch Lightning team | PyTorch Lightning Developer Blog https://devblog.pytorchlightning.ai/announcing-lightning-1-4-8cd20482aee9 0 comments
- The History of Open-Source LLMs: Early Days (Part One) https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-early 0 comments
- GitHub - stanford-crfm/haliax: Named Tensors for Legible Deep Learning in JAX https://github.com/stanford-crfm/haliax 0 comments
- Dolma, OLMo, and the Future of Open-Source LLMs https://cameronrwolfe.substack.com/p/dolma-olmo-and-the-future-of-open 0 comments