Attention? Attention! | Lil'Log

Linking pages

What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
From Autoencoder to Beta-VAE | Lil'Log https://web.archive.org/web/20241202042731/https:/lilianweng.github.io/posts/2018-08-12-vae 67 comments
Normcore LLM Reads · GitHub https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e 54 comments
The Transformer Family Version 2.0 | Lil'Log https://lilianweng.github.io/posts/2023-01-27-the-transformer-family-v2/ 46 comments
GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
DeepSeek V3 and the cost of frontier AI models https://www.interconnects.ai/p/deepseek-v3-and-the-actual-cost-of 9 comments
Meta-Learning: Learning to Learn Fast | Lil'Log https://lilianweng.github.io/posts/2018-11-30-meta-learning/ 0 comments

Linked pages

A Turing Machine Overview http://aturingmachine.com/ 340 comments
WaveNet: A Generative Model for Raw Audio https://deepmind.com/blog/wavenet-generative-model-raw-audio/ 288 comments
Understanding LSTM Networks -- colah's blog https://colah.github.io/posts/2015-08-Understanding-LSTMs/ 64 comments
Turing machine - Wikipedia http://en.wikipedia.org/wiki/Turing_machine#Concurrency 62 comments
[1410.5401] Neural Turing Machines http://arxiv.org/abs/1410.5401 40 comments
Attention is All you Need https://papers.nips.cc/paper/7181-attention-is-all-you-need 30 comments
Von Neumann architecture - Wikipedia https://en.wikipedia.org/wiki/Von_Neumann_architecture#Von_Neumann_bottleneck 27 comments
A (Long) Peek into Reinforcement Learning | Lil'Log https://lilianweng.github.io/posts/2018-02-19-rl-overview/ 24 comments
Attention and Memory in Deep Learning and NLP https://dennybritz.com/posts/wildml/attention-and-memory-in-deep-learning-and-nlp/ http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/ 7 comments
[1512.03385] Deep Residual Learning for Image Recognition http://arxiv.org/abs/1512.03385 6 comments
Learning to Learn – The Berkeley Artificial Intelligence Research Blog http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/ 0 comments
[1711.07971] Non-local Neural Networks https://arxiv.org/abs/1711.07971 0 comments
[1805.08318] Self-Attention Generative Adversarial Networks https://arxiv.org/abs/1805.08318 0 comments
GitHub - tensorflow/nmt: TensorFlow Neural Machine Translation Tutorial https://github.com/tensorflow/nmt 0 comments
[1511.06434] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks http://arxiv.org/abs/1511.06434 0 comments
[1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate http://arxiv.org/abs/1409.0473 0 comments