Hacker News
- Understanding and coding the self-attention mechanism of large language models https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html 37 comments
- [P] Understanding & Coding the Self-Attention Mechanism of Large Language Models https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html 4 comments machinelearning
- Coding the Self-Attention Mechanism of Large Language Models in Python From Scratch https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html 3 comments python
Linking pages
- Researchers upend AI status quo by eliminating matrix multiplication in LLMs | Ars Technica https://arstechnica.com/information-technology/2024/06/researchers-upend-ai-status-quo-by-eliminating-matrix-multiplication-in-llms/2 92 comments
- Understanding and Coding Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs https://magazine.sebastianraschka.com/p/understanding-and-coding-self-attention 11 comments
- Deep Dips #3: Transformers - by Michael Lones https://open.substack.com/pub/fetchdecodeexecute/p/deep-dips-3-transformers?r=37tqze 1 comment
- The History of Open-Source LLMs: Early Days (Part One) https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-early 0 comments
- New Transformer architecture for powerful LLMs without GPUs | VentureBeat https://venturebeat.com/ai/new-transformer-architecture-could-enable-powerful-llms-without-gpus/ 0 comments
- Incremental Gambits and Premature Endgames | Matthew Lewis https://matthewlewis.xyz/blog/2024/08/29/incremental-gambits-and-premature-endgames.html 0 comments
Linked pages
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- Machine Learning Q… by Sebastian Raschka, PhD [PDF/iPad/Kindle] https://leanpub.com/machine-learning-q-and-ai 12 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- [2009.06732] Efficient Transformers: A Survey https://arxiv.org/abs/2009.06732 0 comments
- [1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate http://arxiv.org/abs/1409.0473 0 comments