Hacker News
- DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL https://pretty-radio-b75.notion.site/DeepScaleR-Surpassing-O1-Preview-with-a-1-5B-Model-by-Scaling-RL-19681902c1468005bed8ca303013a4e2 126 comments
Linked pages
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B · Hugging Face https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 17 comments
- DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL | Notion https://pretty-radio-b75.notion.site/DeepScaleR-Surpassing-O1-Preview-with-a-1-5B-Model-by-Scaling-RL-19681902c1468005bed8ca303013a4e2 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - agentica-project/deepscaler: Democratizing Reinforcement Learning for LLMs