Hacker News
- Basic math related to computation and memory usage for transformers https://blog.eleuther.ai/transformer-math/ 13 comments
Linking pages
- The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI https://www.latent.space/p/transformers-math#details 66 comments
- A Visual Guide to Quantization - by Maarten Grootendorst https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization 29 comments
- Calculating GPU memory for serving LLMs | Substratus https://www.substratus.ai/blog/calculating-gpu-memory-for-llm/ 2 comments
- Everything about Distributed Training and Efficient Finetuning | Sumanth's Personal Website https://sumanthrh.com/post/distributed-and-efficient-finetuning/ 1 comment
- Transformer Memory Arithmetic: Understanding all the Bytes in nanoGPT https://erees.dev/transformer-memory/ 0 comments
- The Novice's LLM Training Guide https://rentry.co/llm-training 0 comments
- The Foundation Model Development Cheatsheet | EleutherAI Blog https://blog.eleuther.ai/fm-dev-cheatsheet/ 0 comments
Linked pages
- Transformer Inference Arithmetic | kipply's blog https://kipp.ly/blog/transformer-inference-arithmetic/ 4 comments
- Fully Sharded Data Parallel: faster AI training with fewer GPUs - Engineering at Meta https://engineering.fb.com/2021/07/15/open-source/fsdp/ 2 comments
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2001.08361] Scaling Laws for Neural Language Models https://arxiv.org/abs/2001.08361 0 comments
- GitHub - TimDettmers/bitsandbytes: 8-bit CUDA functions for PyTorch https://github.com/TimDettmers/bitsandbytes 0 comments