Hacker News
- Transformer Inference Arithmetic (2022) https://kipp.ly/blog/transformer-inference-arithmetic 4 comments
Linking pages
- Transformer Math 101 | EleutherAI Blog https://blog.eleuther.ai/transformer-math/ 13 comments
- Why GPT-3.5 is (mostly) cheaper than Llama 2 https://www.cursor.so/blog/llama-inference 10 comments
- GitHub - 152334H/tortoise-tts-fast: Fast TorToiSe inference (5x or your money back!) https://github.com/152334H/tortoise-tts-fast 7 comments
- Transformer Inference Arithmetic | kipply's blog https://carolchen.me/blog/transformer-inference-arithmetic/ 2 comments
- How fast can we perform a forward pass? https://bounded-regret.ghost.io/how-fast-can-we-perform-a-forward-pass/ 0 comments
- Nintil - Set Sail For Fail? On AI risk https://nintil.com/ai-safety 0 comments
- Speeding up the GPT - KV cache | Becoming The Unbeatable https://immortal3.github.io/becoming-the-unbeatable/posts/gpt-kvcache/ 0 comments
- How is LLaMa.cpp possible? - by Finbarr Timbers https://finbarrtimbers.substack.com/p/how-is-llamacpp-possible 0 comments
- On Device AI – Double-Edged Sword https://www.semianalysis.com/p/on-device-ai-double-edged-sword 0 comments
- Dissecting Batching Effects in GPT Inference https://le.qun.ch/en/blog/2023/05/13/transformer-batching/ 0 comments
- Transformer Memory Arithmetic: Understanding all the Bytes in nanoGPT https://erees.dev/transformer-memory/ 0 comments
- The Novice's LLM Training Guide https://rentry.co/llm-training 0 comments
Linked pages
- Competitive programming with AlphaCode https://deepmind.com/blog/article/Competitive-programming-with-AlphaCode 584 comments
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/ 36 comments
- [1804.06826] Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking https://arxiv.org/abs/1804.06826 32 comments
- Making Deep Learning go Brrrr From First Principles https://horace.io/brrr_intro.html 20 comments
- Recommended GPU Instances - Deep Learning AMI https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html 6 comments
- GitHub - NVIDIA/FasterTransformer: Transformer related optimization, including BERT, GPT https://github.com/NVIDIA/FasterTransformer/ 1 comment
- Amazon EC2 P4d Instances - Amazon Web Services https://aws.amazon.com/ec2/instance-types/p4/ 0 comments
- [2112.00861] A General Language Assistant as a Laboratory for Alignment https://arxiv.org/abs/2112.00861 0 comments
- [2104.05158] Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models https://arxiv.org/abs/2104.05158 0 comments