Learning to Reason with LLMs - discu.eu

Hacker News

D1: Scaling Reasoning in Diffusion LLMs via Reinforcement Learning https://dllm-reasoning.github.io/ 0 comments 8/5/2025

Learning to Reason with LLMs https://openai.com/index/learning-to-reason-with-llms/ 1261 comments 12/9/2024

Reddit

"Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't", Dang et al. 2025 https://arxiv.org/abs/2503.16219 2 comments 31/3/2025 reinforcementlearning
"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning", Guo et al. 2025 {DeepSeek} https://arxiv.org/abs/2501.12948#deepseek 2 comments 25/1/2025 reinforcementlearning
Meet ReSearch: A Novel AI Framework that Trains LLMs to Reason with Search via Reinforcement Learning without Using Any Supervised Data on Reasoning Steps https://www.marktechpost.com/2025/03/31/meet-research-a-novel-ai-framework-that-trains-llms-to-reason-with-search-via-reinforcement-learning-without-using-any-supervised-data-on-reasoning-steps/ 2 comments 1/4/2025 machinelearningnews
[R] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 3 comments 25/1/2025 machinelearning
LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 36 comments 18/12/2024 learnmachinelearning
LLMs Can’t Learn Maths & Reasoning, Finally Proved! But they can answer correctly using Heuristics https://medium.com/aiguys 2 comments 18/12/2024 deeplearning
[R] Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners https://arxiv.org/abs/2410.08037v1 4 comments 11/10/2024 machinelearning
LLMs Can Now Learn to Try Again: Researchers from Menlo Introduce ReZero, a Reinforcement Learning Framework That Rewards Query Retrying to Improve Search-Based Reasoning in RAG Systems https://www.marktechpost.com/2025/04/18/llms-can-now-learn-to-try-again-researchers-from-menlo-introduce-rezero-a-reinforcement-learning-framework-that-rewards-query-retrying-to-improve-search-based-reasoning-in-rag-systems/ 2 comments 19/4/2025 machinelearningnews
"The Problem with Reasoners: Praying for Transfer Learning", Aidan McLaughlin (will more RL fix o1-style LLMs?) https://aidanmclaughlin.notion.site/reasoners-problem 4 comments 21/1/2025 reinforcementlearning