Hacker News
Linked pages
- [2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
- [2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling https://arxiv.org/abs/2304.01373 7 comments
- [2210.11416] Scaling Instruction-Finetuned Language Models https://arxiv.org/abs/2210.11416 5 comments
- [2210.09261] Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them https://arxiv.org/abs/2210.09261 1 comment
- [2009.03300] Measuring Massive Multitask Language Understanding https://arxiv.org/abs/2009.03300 0 comments
- [2304.09151] UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining https://arxiv.org/abs/2304.09151 0 comments
- GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of autoregressive language models. https://github.com/EleutherAI/lm-evaluation-harness 0 comments
- [2302.13971] LLaMA: Open and Efficient Foundation Language Models https://arxiv.org/abs/2302.13971 0 comments