Hacker News
3T Token Open Corpus for Language Model Pretraining
https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-corpus-9a0ff4b8da64
5 comments
18/8/2023