Hacker News
- FineWeb: 15T tokens of the finest data the web has to offer https://huggingface.co/datasets/HuggingFaceFW/fineweb 3 comments
Linking pages
- GitHub - SalvatoreRa/ML-news-of-the-week: A collection of the best ML and AI news every week (research, news, resources) https://github.com/SalvatoreRa/ML-news-of-the-week 8 comments
- 15 trillion token dataset took down HuggingFace. - Promptzone https://www.promptzone.com/promptzone/15-trillion-token-dataset-took-down-huggingface-935 0 comments
- How much LLM training data is there, in the limit? – Educating Silicon https://www.educatingsilicon.com/2024/05/09/how-much-llm-training-data-is-there-in-the-limit/ 0 comments
- GitHub - hiyouga/LLaMA-Factory: A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024) https://github.com/hiyouga/LLaMA-Factory 0 comments