Hacker News
- Outperforming larger language models with less training data and smaller models https://blog.research.google/2023/09/distilling-step-by-step-outperforming.html 123 comments
Linked pages
- Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance – Google AI Blog https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html 279 comments
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- Exploring Transfer Learning with T5: the Text-To-Text Transfer Transformer – Google AI Blog https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html 66 comments
- [2305.02301] Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes https://arxiv.org/abs/2305.02301 56 comments
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 25 comments
- [1503.02531] Distilling the Knowledge in a Neural Network https://arxiv.org/abs/1503.02531 5 comments
- [2201.11903] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models https://arxiv.org/abs/2201.11903 1 comment
- [1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 1 comment
- [1801.06146] Universal Language Model Fine-tuning for Text Classification https://arxiv.org/abs/1801.06146 0 comments
- Vertex AI | Google Cloud https://cloud.google.com/vertex-ai 0 comments
- [1910.14599] Adversarial NLI: A New Benchmark for Natural Language Understanding https://arxiv.org/abs/1910.14599 0 comments