Hacker News
- Why the original transformer figure is wrong, and some other tidbits about LLMs https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure 49 comments
- [P] Why the Original Transformer Figure Is Wrong, And Some Other Interesting Tidbits https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure 11 comments (r/machinelearning)
Linked pages
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- Understanding Large Language Models - by Sebastian Raschka https://magazine.sebastianraschka.com/p/understanding-large-language-models 53 comments
- [2102.11174] Linear Transformers Are Secretly Fast Weight Programmers https://arxiv.org/abs/2102.11174 2 comments
- [1801.06146] Universal Language Model Fine-tuning for Text Classification https://arxiv.org/abs/1801.06146 0 comments
- Neural nets learn to program neural nets with fast weights (1991) https://people.idsia.ch/~juergen/fast-weight-programmer-1991-transformer.html 0 comments
- [2112.11446] Scaling Language Models: Methods, Analysis & Insights from Training Gopher https://arxiv.org/abs/2112.11446 0 comments