[1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Linking pages

What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
Does GPT-2 Know Your Phone Number? – The Berkeley Artificial Intelligence Research Blog https://bair.berkeley.edu/blog/2020/12/20/lmmem/ 155 comments
Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes – Google Research Blog https://blog.research.google/2023/09/distilling-step-by-step-outperforming.html 123 comments
Exploring Transfer Learning with T5: the Text-To-Text Transfer Transformer – Google AI Blog https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html 66 comments
The Illustrated Retrieval Transformer – Jay Alammar – Visualizing machine learning one concept at a time. http://jalammar.github.io/illustrated-retrieval-transformer/ 55 comments
GitHub - xenova/transformers.js: State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! https://github.com/xenova/transformers.js 55 comments
Deep language algorithms predict semantic comprehension from brain activity | Scientific Reports https://www.nature.com/articles/s41598-022-20460-9 46 comments
GitHub - lucidrains/x-transformers: A simple but complete full-attention transformer with a set of promising experimental features from various papers https://github.com/lucidrains/x-transformers 40 comments
A New Lens on Understanding Generalization in Deep Learning – Google AI Blog https://ai.googleblog.com/2021/03/a-new-lens-on-understanding.html 35 comments
GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
Transformers are Graph Neural Networks https://thegradient.pub/transformers-are-graph-neural-networks/ 25 comments
MOMENT: A Foundation Model for Time Series Forecasting, Classification, Anomaly Detection and Imputation https://aihorizonforecast.substack.com/p/moment-a-foundation-model-for-time 25 comments
FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention | PyTorch https://pytorch.org/blog/flexattention/ 24 comments
Transformers are Graph Neural Networks | NTU Graph Deep Learning Lab https://graphdeeplearning.github.io/post/transformers-are-gnns/ 19 comments
More Efficient NLP Model Pre-training with ELECTRA – Google AI Blog https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html 12 comments
GitHub - JUSTSUJAY/ML-Research-Papers https://github.com/JUSTSUJAY/ML-Research-Papers 10 comments
Google Research, 2022 & beyond: Language, vision and generative models – Google AI Blog https://ai.googleblog.com/2023/01/google-research-2022-beyond-language.html 5 comments
Evaluating long context large language models https://www.artfish.ai/p/long-context-llms 4 comments
Towards Reliability in Deep Learning Systems – Google AI Blog https://ai.googleblog.com/2022/07/towards-reliability-in-deep-learning.html 3 comments
GitHub - onnx/models: A collection of pre-trained, state-of-the-art models in the ONNX format https://github.com/onnx/models 3 comments