[1706.03762] Attention Is All You Need - discu.eu

Hacker News

The paper that made ChatGPT possible https://arxiv.org/abs/1706.03762 55 comments 3/2/2023

Attention Is All You Need (Neural Networks) https://arxiv.org/abs/1706.03762 3 comments 13/6/2017

Reddit

Learnable matrices in sequence without nonlinearity - reasons? [R] https://arxiv.org/pdf/1706.03762 30 comments 30/4/2025 machinelearning
ELI5: computational complexity of Transformer models https://arxiv.org/pdf/1706.03762.pdf 2 comments 2/7/2023 languagetechnology
[D] If you had to pick 10-20 significant papers that summarize the research trajectory of AI from the past 100 years what would they be https://arxiv.org/abs/1706.03762 82 comments 7/12/2022 machinelearning
Tools to draw NN https://arxiv.org/abs/1706.03762 3 comments 6/2/2022 deeplearning
How are inputs fed to a Transformer? https://arxiv.org/pdf/1706.03762.pdf 5 comments 2/12/2019 artificial
