Linking pages
Linked pages
- Transformer Inference Arithmetic | kipply's blog https://kipp.ly/blog/transformer-inference-arithmetic/ 4 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- bfloat16 floating-point format - Wikipedia https://en.wikipedia.org/wiki/Bfloat16_floating-point_format 1 comment
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2204.02311] PaLM: Scaling Language Modeling with Pathways https://arxiv.org/abs/2204.02311 0 comments
Related searches:
Search whole site: site:bounded-regret.ghost.io
Search title: How fast can we perform a forward pass?
See how to search.