Hacker News
- D1: Scaling Reasoning in Diffusion LLMs via Reinforcement Learning https://dllm-reasoning.github.io/ 0 comments
- Learning to Reason with LLMs https://openai.com/index/learning-to-reason-with-llms/ 1261 comments
- "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't", Dang et al. 2025 https://arxiv.org/abs/2503.16219 2 comments reinforcementlearning
- "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning", Guo et al. 2025 {DeepSeek} https://arxiv.org/abs/2501.12948#deepseek 2 comments reinforcementlearning
- Meet ReSearch: A Novel AI Framework that Trains LLMs to Reason with Search via Reinforcement Learning without Using Any Supervised Data on Reasoning Steps https://www.marktechpost.com/2025/03/31/meet-research-a-novel-ai-framework-that-trains-llms-to-reason-with-search-via-reinforcement-learning-without-using-any-supervised-data-on-reasoning-steps/ 2 comments machinelearningnews
- [R] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 3 comments machinelearning
- LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 36 comments learnmachinelearning
- LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 2 comments deeplearning
- [R] Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners https://arxiv.org/abs/2410.08037v1 4 comments machinelearning
- LLMs Can Now Learn to Try Again: Researchers from Menlo Introduce ReZero, a Reinforcement Learning Framework That Rewards Query Retrying to Improve Search-Based Reasoning in RAG Systems https://www.marktechpost.com/2025/04/18/llms-can-now-learn-to-try-again-researchers-from-menlo-introduce-rezero-a-reinforcement-learning-framework-that-rewards-query-retrying-to-improve-search-based-reasoning-in-rag-systems/ 2 comments machinelearningnews
- "The Problem with Reasoners: Praying for Transfer Learning", Aidan McLaughlin (will more RL fix o1-style LLMs?) https://aidanmclaughlin.notion.site/reasoners-problem 4 comments reinforcementlearning