Hacker News
- SlimPajama: A 627B token cleaned and deduplicated version of RedPajama https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama 7 comments
Linking pages
- Researchers upend AI status quo by eliminating matrix multiplication in LLMs | Ars Technica https://arstechnica.com/information-technology/2024/06/researchers-upend-ai-status-quo-by-eliminating-matrix-multiplication-in-llms/2 92 comments
- GitHub - pjlab-sys4nlp/llama-moe: ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training https://github.com/pjlab-sys4nlp/llama-moe 6 comments
- How to train a Million Context LLM — with Mark Huang of Gradient.ai https://www.latent.space/p/gradient 1 comment
- BTLM-3B-8K: 7B Performance in a 3 Billion Parameter Model - Cerebras https://www.cerebras.net/machine-learning/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/ 0 comments
- Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning https://xiamengzhou.github.io/sheared-llama/ 0 comments
- Cloud Intelligence at the speed of 5000 tok/s - with Ce Zhang and Vipul Ved Prakash of Together AI https://www.latent.space/p/together 0 comments
Linked pages
- Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs https://www.mosaicml.com/blog/mpt-7b 11 comments
- [2303.09540] SemDeDup: Data-efficient learning at web-scale through semantic deduplication https://arxiv.org/abs/2303.09540 0 comments
- GitHub - togethercomputer/RedPajama-Data: The RedPajama-Data repository contains code for preparing large datasets for training large language models. https://github.com/togethercomputer/RedPajama-Data 0 comments
- [2306.01116] The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only https://arxiv.org/abs/2306.01116 0 comments
- [2302.13971] LLaMA: Open and Efficient Foundation Language Models https://arxiv.org/abs/2302.13971 0 comments