Hacker News
- Understanding and Coding Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs https://magazine.sebastianraschka.com/p/understanding-and-coding-self-attention 11 comments
Linked pages
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- GitHub - rasbt/LLMs-from-scratch: Implementing a ChatGPT-like LLM from scratch, step by step https://github.com/rasbt/LLMs-from-scratch 98 comments
- Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html 44 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- [2009.06732] Efficient Transformers: A Survey https://arxiv.org/abs/2009.06732 0 comments
- [1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate https://arxiv.org/abs/1409.0473 0 comments
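For orientation, the core operation that the linked article and repository build on is the scaled dot-product attention from "Attention Is All You Need" (arXiv:1706.03762). The snippet below is a minimal PyTorch sketch of that formula, softmax(QK^T / sqrt(d_k)) V, with illustrative dimensions and weight matrices chosen for this example; it is not the article's own code.

```python
# Minimal scaled dot-product self-attention sketch (illustrative, not the article's code).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

d_model, d_k = 16, 8          # embedding size and key/query size (assumed values)
x = torch.randn(5, d_model)   # 5 dummy token embeddings

# Learnable projections in a real model; random here for the sketch.
W_q = torch.randn(d_model, d_k)
W_k = torch.randn(d_model, d_k)
W_v = torch.randn(d_model, d_k)

Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / d_k**0.5          # pairwise query-key similarities, scaled by sqrt(d_k)
weights = F.softmax(scores, dim=-1)  # attention weights; each row sums to 1
context = weights @ V                # one context vector per input token
print(context.shape)                 # torch.Size([5, 8])
```

Multi-head, cross-, and causal attention, covered in the linked article, are variations on this same computation (multiple parallel projections, keys/values from a second sequence, and a triangular mask on the scores, respectively).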