Hacker News
Linked pages
- [2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 1060 comments
- https://unsloth.ai/blog/deepseekr1-dynamic 350 comments
- GitHub - huggingface/open-r1: Fully open reproduction of DeepSeek-R1 https://github.com/huggingface/open-r1 327 comments
- [2403.09629] Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking https://arxiv.org/abs/2403.09629 271 comments
- 7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient | Notion https://hkust-nlp.notion.site/simplerl-reason 217 comments
- [2412.06769] Training Large Language Models to Reason in a Continuous Latent Space https://arxiv.org/abs/2412.06769 114 comments
- GitHub Star History https://star-history.com/#microsoft/playwright&cypress-io/cypress&Date 78 comments
- [2501.00663] Titans: Learning to Memorize at Test Time https://arxiv.org/abs/2501.00663 52 comments
- [2501.04519] rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking https://arxiv.org/abs/2501.04519 35 comments
- [2411.10440] LLaVA-o1: Let Vision Language Models Reason Step-by-Step https://arxiv.org/abs/2411.10440 32 comments
- GitHub - Jiayi-Pan/TinyZero: Clean, minimal, accessible reproduction of DeepSeek R1-Zero https://github.com/Jiayi-Pan/TinyZero 27 comments
- Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial https://www.philschmid.de/mini-deepseek-r1 15 comments
- [2206.02336] On the Advance of Making Language Models Better Reasoners https://arxiv.org/abs/2206.02336 12 comments
- [2406.07394] Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B https://arxiv.org/abs/2406.07394 11 comments
- [2311.11829] System 2 Attention (is something you might need too) https://arxiv.org/abs/2311.11829 9 comments
- [2412.16145] Offline Reinforcement Learning for LLM Multi-Step Reasoning https://arxiv.org/abs/2412.16145 9 comments
- [2203.14465] STaR: Bootstrapping Reasoning With Reasoning https://arxiv.org/abs/2203.14465 5 comments
- [2404.12253] Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing https://arxiv.org/abs/2404.12253 4 comments
- [2410.01707] Interpretable Contrastive Monte Carlo Tree Search Reasoning https://arxiv.org/abs/2410.01707 4 comments
- [2502.04327] Value-Based Deep RL Scales Predictably https://arxiv.org/abs/2502.04327 4 comments