Transformer Math 101 | EleutherAI Blog - discu.eu

Hacker News

Basic math related to computation and memory usage for transformers https://blog.eleuther.ai/transformer-math/ 13 comments 19/4/2023

Linked pages

Transformer Inference Arithmetic | kipply's blog https://kipp.ly/blog/transformer-inference-arithmetic/ 4 comments
Fully Sharded Data Parallel: faster AI training with fewer GPUs | Engineering at Meta https://engineering.fb.com/2021/07/15/open-source/fsdp/ 2 comments
[2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
[2001.08361] Scaling Laws for Neural Language Models https://arxiv.org/abs/2001.08361 0 comments
GitHub - TimDettmers/bitsandbytes: 8-bit CUDA functions for PyTorch https://github.com/TimDettmers/bitsandbytes 0 comments
