Hacker News
- "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't", Dang et al. 2025 https://arxiv.org/abs/2503.16219 2 comments reinforcementlearning
- "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning", Guo et al. 2025 {DeepSeek} https://arxiv.org/abs/2501.12948#deepseek 2 comments reinforcementlearning
- Meet ReSearch: A Novel AI Framework that Trains LLMs to Reason with Search via Reinforcement Learning without Using Any Supervised Data on Reasoning Steps https://www.marktechpost.com/2025/03/31/meet-research-a-novel-ai-framework-that-trains-llms-to-reason-with-search-via-reinforcement-learning-without-using-any-supervised-data-on-reasoning-steps/ 2 comments machinelearningnews
- [R] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 3 comments machinelearning
- LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 36 comments learnmachinelearning
- LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 2 comments deeplearning
- [R] Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners https://arxiv.org/abs/2410.08037v1 4 comments machinelearning
- "The Problem with Reasoners: Praying for Transfer Learning", Aidan McLaughlin (will more RL fix o1-style LLMs?) https://aidanmclaughlin.notion.site/reasoners-problem 4 comments reinforcementlearning