Linking pages
- A guide to LLM inference and performance https://www.baseten.co/blog/llm-transformer-inference-guide/ 14 comments
- What to Expect From Retrieval-Augmented Generation and Self-hosted LLMs | MyScale | Blog https://myscale.com/blog/what-to-expect-rag/ 0 comments
- Understanding how LLM inference works with llama.cpp https://www.omrimallis.com/posts/understanding-how-llm-inference-works-with-llama-cpp/ 0 comments
- Transformer inference tricks - by Finbarr Timbers https://www.artfintel.com/p/transformer-inference-tricks 0 comments
- Where do LLMs spend their FLOPS? - by Finbarr Timbers https://www.artfintel.com/p/where-do-llms-spend-their-flops 0 comments
- Transformers Optimization: Part 1 - KV Cache | Rajan Ghimire https://r4j4n.github.io/blogs/posts/kv/ 0 comments
Linked pages
- Competitive programming with AlphaCode https://deepmind.com/blog/article/Competitive-programming-with-AlphaCode 584 comments
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/ 36 comments
- [1804.06826] Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking https://arxiv.org/abs/1804.06826 32 comments
- Making Deep Learning go Brrrr From First Principles https://horace.io/brrr_intro.html 20 comments
- Recommended GPU Instances - AWS Deep Learning AMIs https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html 9 comments
- GitHub - NVIDIA/FasterTransformer: Transformer related optimization, including BERT, GPT https://github.com/NVIDIA/FasterTransformer/ 1 comment
- Amazon EC2 P4d Instances - Amazon Web Services https://aws.amazon.com/ec2/instance-types/p4/ 0 comments
- [2112.00861] A General Language Assistant as a Laboratory for Alignment https://arxiv.org/abs/2112.00861 0 comments
- [2104.05158] Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models https://arxiv.org/abs/2104.05158 0 comments
Related searches:
Search whole site: site:kipp.ly
Search title: Transformer Inference Arithmetic | kipply's blog