Hacker News
- Understanding Large Language Models – A Transformative Reading List https://sebastianraschka.com/blog/2023/llm-reading-list.html 16 comments
- [P] Understanding Large Language Models -- A Transformative Reading List https://sebastianraschka.com/blog/2023/llm-reading-list.html 7 comments machinelearning
- Understanding Large Language Models -- A Transformative Reading List https://sebastianraschka.com/blog/2023/llm-reading-list.html 3 comments learnmachinelearning
Linking pages
- AI's Carbon Footprint: Understanding and Reducing the Environmental Impact of Large Models https://theaiobserverx.substack.com/p/ais-carbon-footprint-understanding 1 comment
- GitHub - fabiochiusano/ai-news-tracker: ~300 news for quickly getting up-to-date with the generative AI landscape https://github.com/fabiochiusano/ai-news-tracker 0 comments
- GitHub - fabiochiusano/Awesome-AI-News: ~300 news for quickly getting up-to-date with the generative AI landscape https://github.com/fabiochiusano/Awesome-AI-News/tree/main 0 comments
Linked pages
- GitHub - karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs. https://github.com/karpathy/nanoGPT 366 comments
- [2205.01068] OPT: Open Pre-trained Transformer Language Models https://arxiv.org/abs/2205.01068 318 comments
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- Large language models generate functional protein sequences across diverse families | Nature Biotechnology https://www.nature.com/articles/s41587-022-01618-2 50 comments
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/ 36 comments
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 25 comments
- [2212.14034] Cramming: Training a Language Model on a Single GPU in One Day https://arxiv.org/abs/2212.14034 25 comments
- Introduction to Deep Learning https://sebastianraschka.com/blog/2021/dl-course.html 19 comments
- Highly accurate protein structure prediction with AlphaFold | Nature https://www.nature.com/articles/s41586-021-03819-2 9 comments
- Transformer models: an introduction and catalog — 2023 Edition - AI, software, tech, and people, not in that order… by X https://amatriain.net/blog/transformer-models-an-introduction-and-catalog-2d1e9039f376/ 4 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2009.06732] Efficient Transformers: A Survey https://arxiv.org/abs/2009.06732 0 comments
- [1907.11692] RoBERTa: A Robustly Optimized BERT Pretraining Approach https://arxiv.org/abs/1907.11692 0 comments
- [1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate http://arxiv.org/abs/1409.0473 0 comments
- [2203.02155] Training language models to follow instructions with human feedback https://arxiv.org/abs/2203.02155 0 comments
- [2211.05100] BLOOM: A 176B-Parameter Open-Access Multilingual Language Model https://arxiv.org/abs/2211.05100 0 comments