Hacker News
- The Illustrated Transformer (2018) https://jalammar.github.io/illustrated-transformer/ 11 comments
- The Illustrated Transformer https://jalammar.github.io/illustrated-transformer/ 4 comments
- The Illustrated Transformer http://jalammar.github.io/illustrated-transformer/ 2 comments
- Output Token Dimensionality in Transformer Decoders - how does the transformer "get rid" of the dimension "number of previously outputted tokens" https://jalammar.github.io/illustrated-transformer/ 5 comments languagetechnology
- Do not understand how query, key and value matrices are generated in multi-headed self attention. https://jalammar.github.io/illustrated-transformer/ 5 comments learnmachinelearning
- ML Visualization Software Question https://jalammar.github.io/illustrated-transformer/ 4 comments deeplearning
- ML Visualization Software https://jalammar.github.io/illustrated-transformer/ 4 comments datascience
Linking pages
- We come to bury ChatGPT, not to praise it. https://www.danmcquillan.org/chatgpt.html 1328 comments
- imaginAIry/README.md at master · brycedrennan/imaginAIry · GitHub https://github.com/brycedrennan/imaginAIry 288 comments
- AI Canon | Andreessen Horowitz https://a16z.com/2023/05/25/ai-canon/ 219 comments
- Tempering Expectations for GPT-3 and OpenAI’s API | Max Woolf's Blog https://minimaxir.com/2020/07/gpt3-expectations/ 189 comments
- What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
- AlphaFold 2 is here: what’s behind the structure prediction miracle | Oxford Protein Informatics Group https://www.blopig.com/blog/2021/07/alphafold-2-is-here-whats-behind-the-structure-prediction-miracle/ 93 comments
- Lessons Learned from two years as a Data Scientist https://dawndrain.github.io/braindrain/two_years.html 83 comments
- Building Custom Deep Learning Based OCR models https://nanonets.com/blog/attention-ocr-for-text-recogntion/ 69 comments
- The Illustrated Word2vec – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-word2vec/ 58 comments
- The Illustrated Retrieval Transformer – Jay Alammar – Visualizing machine learning one concept at a time. http://jalammar.github.io/illustrated-retrieval-transformer/ 55 comments
- Understanding Large Language Models - by Sebastian Raschka https://magazine.sebastianraschka.com/p/understanding-large-language-models 53 comments
- Modular: AI's compute fragmentation: what matrix multiplication teaches us https://www.modular.com/blog/ais-compute-fragmentation-what-matrix-multiplication-teaches-us 44 comments
- Transformers from scratch | peterbloem.nl http://peterbloem.nl/blog/transformers 40 comments
- The fall of RNN / LSTM. We fell for Recurrent neural networks… | by Eugenio Culurciello | Towards Data Science https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0 27 comments
- Understanding Large Language Models -- A Transformative Reading List https://sebastianraschka.com/blog/2023/llm-reading-list.html 26 comments
- Transformers are Graph Neural Networks https://thegradient.pub/transformers-are-graph-neural-networks/ 25 comments
- Techniques for Training Large Neural Networks https://openai.com/blog/techniques-for-training-large-neural-networks/ 23 comments
- A Visual Intro to NumPy and Data Representation – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/visual-numpy/ 22 comments
- ML Resources https://sgfin.github.io/learning-resources/ 21 comments
- Transformers for software engineers - Made of Bugs https://blog.nelhage.com/post/transformers-for-software-engineers/ 20 comments
Linked pages
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- Visual Information Theory -- colah's blog https://colah.github.io/posts/2015-09-Visual-Information/ 72 comments
- Stanford CS 224N | Natural Language Processing with Deep Learning https://web.stanford.edu/class/cs224n/ 28 comments
- [1706.05137] One Model To Learn Them All https://arxiv.org/abs/1706.05137 3 comments
- The Annotated Transformer https://nlp.seas.harvard.edu/2018/04/03/attention.html 3 comments
- Transformer: A Novel Neural Network Architecture for Language Understanding – Google AI Blog https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html 3 comments
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ 1 comment
- Kullback-Leibler Divergence Explained — Count Bayesie https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained 0 comments
- [1801.10198] Generating Wikipedia by Summarizing Long Sequences https://arxiv.org/abs/1801.10198 0 comments
- GitHub - tensorflow/tensor2tensor: Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. https://github.com/tensorflow/tensor2tensor 0 comments
- Train and run machine learning models faster | Cloud TPU | Google Cloud https://cloud.google.com/tpu/ 0 comments
Title: The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time.