Hacker News
- SlimPajama: A 627B token cleaned and deduplicated version of RedPajama https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama 7 comments
Linking pages
- Researchers upend AI status quo by eliminating matrix multiplication in LLMs | Ars Technica https://arstechnica.com/information-technology/2024/06/researchers-upend-ai-status-quo-by-eliminating-matrix-multiplication-in-llms/2 92 comments
- GitHub - pjlab-sys4nlp/llama-moe: ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training https://github.com/pjlab-sys4nlp/llama-moe 6 comments
- How to train a Million Context LLM — with Mark Huang of Gradient.ai https://www.latent.space/p/gradient 1 comment
- BTLM-3B-8K: 7B Performance in a 3 Billion Parameter Model - Cerebras https://www.cerebras.net/machine-learning/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/ 0 comments
- Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning https://xiamengzhou.github.io/sheared-llama/ 0 comments
- Cloud Intelligence at the speed of 5000 tok/s - with Ce Zhang and Vipul Ved Prakash of Together AI https://www.latent.space/p/together 0 comments
Linked pages
- Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs https://www.mosaicml.com/blog/mpt-7b 11 comments
- [2303.09540] SemDeDup: Data-efficient learning at web-scale through semantic deduplication https://arxiv.org/abs/2303.09540 0 comments
- GitHub - togethercomputer/RedPajama-Data: The RedPajama-Data repository contains code for preparing large datasets for training large language models. https://github.com/togethercomputer/RedPajama-Data 0 comments
- [2306.01116] The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only https://arxiv.org/abs/2306.01116 0 comments
- [2302.13971] LLaMA: Open and Efficient Foundation Language Models https://arxiv.org/abs/2302.13971 0 comments