Hacker News
- Understanding large language models: A cross-section of the relevant literature https://magazine.sebastianraschka.com/p/understanding-large-language-models 31 comments
- Understanding Large Language Models -- A Cross-Section of the Most Relevant Literature To Get Up to Speed https://magazine.sebastianraschka.com/p/understanding-large-language-models 4 comments learnmachinelearning
- [P] Understanding Large Language Models -- a collection of the most relevant papers https://magazine.sebastianraschka.com/p/understanding-large-language-models 18 comments machinelearning
Linking pages
- Finetuning Large Language Models - by Sebastian Raschka https://magazine.sebastianraschka.com/p/finetuning-large-language-models 72 comments
- Why the Original Transformer Figure Is Wrong, and Some Other Interesting Historical Tidbits About LLMs https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure 60 comments
- The Tech Buffet #16: Quickly Evaluate your RAG Without Manually Labeling Test Data https://thetechbuffet.substack.com/p/evaluate-rag-with-synthetic-data 0 comments
- (Opinionated) Guide to ML Engineer Job Hunting | Yuan Meng https://www.yuan-meng.com/posts/mle_interviews/ 0 comments
Linked pages
- GitHub - karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs. https://github.com/karpathy/nanoGPT 366 comments
- [2205.01068] OPT: Open Pre-trained Transformer Language Models https://arxiv.org/abs/2205.01068 318 comments
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- dolly/data at master · databrickslabs/dolly · GitHub https://github.com/databrickslabs/dolly/tree/master/data 89 comments
- [2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
- Large language models generate functional protein sequences across diverse families | Nature Biotechnology https://www.nature.com/articles/s41587-022-01618-2 50 comments
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/ 36 comments
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 25 comments
- [2212.14034] Cramming: Training a Language Model on a Single GPU in One Day https://arxiv.org/abs/2212.14034 25 comments
- Introduction to Deep Learning https://sebastianraschka.com/blog/2021/dl-course.html 19 comments
- [2009.01325] Learning to summarize from human feedback https://arxiv.org/abs/2009.01325 12 comments
- [2304.13712] Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond https://arxiv.org/abs/2304.13712 12 comments
- Highly accurate protein structure prediction with AlphaFold | Nature https://www.nature.com/articles/s41586-021-03819-2 9 comments
- [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
- [1602.01783] Asynchronous Methods for Deep Reinforcement Learning https://arxiv.org/abs/1602.01783 7 comments
- [2211.01786] Crosslingual Generalization through Multitask Finetuning https://arxiv.org/abs/2211.01786 7 comments
- [2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling https://arxiv.org/abs/2304.01373 7 comments
- [1909.08593] Fine-Tuning Language Models from Human Preferences https://arxiv.org/abs/1909.08593 5 comments
- Transformer models: an introduction and catalog — 2023 Edition - AI, software, tech, and people, not in that order… by X https://amatriain.net/blog/transformer-models-an-introduction-and-catalog-2d1e9039f376/ 4 comments